Script to find out who used nasty swear words

pull/3/head
Figg 6 months ago
commit a0cbc508cc
2 changed files with 30 additions and 0 deletions
  1. million/analyze/wordFinder.py (+13 -0)
  2. scripts/find_gromots.py (+17 -0)

million/analyze/wordFinder.py (+13 -0)

@@ -0,0 +1,13 @@
+import re
+from typing import List
+from million.model.message import Message
+
+
+def _wordFilter(msg: Message, regexs: List[str]) -> bool:
+    # A message with no text content never matches.
+    return bool(msg.content) and any(re.search(rgx, msg.content) for rgx in regexs)
+
+
+def findWords(messages: List[Message], words: List[str]) -> List[Message]:
+    rWords = [r"\b" + word + r"\b" for word in words]  # whole-word patterns
+    return [m for m in messages if _wordFilter(m, rWords)]
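
To illustrate how the `\b`-wrapped patterns built in `findWords` behave, here is a minimal self-contained sketch. The `Message` dataclass and the `find_words` helper are hypothetical stand-ins written for this example; the real `million.model.message.Message` is not shown in this commit and may differ.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for million.model.message.Message."""
    sender_name: str
    content: Optional[str] = None


def find_words(messages: List[Message], words: List[str]) -> List[Message]:
    # Same idea as findWords: wrap each regex fragment in word boundaries.
    patterns = [re.compile(r"\b" + word + r"\b") for word in words]
    return [
        m for m in messages
        if m.content and any(p.search(m.content) for p in patterns)
    ]


messages = [
    Message("Alice", "quel con"),             # whole word: matches
    Message("Bob", "le contenu est pret"),    # "con" inside a longer word: no match
    Message("Chloe", "Merde alors"),          # capitalised: no match (case-sensitive)
    Message("David", None),                   # no text content: skipped
]

hits = find_words(messages, ["merde", "con(ne)?"])
print([m.sender_name for m in hits])  # → ['Alice']
```

The word boundaries keep `con(ne)?` from firing on words such as "contenu". Matching is case-sensitive as written, so "Merde" is missed; passing `re.IGNORECASE` to `re.compile` would be one way to change that, if desired.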

scripts/find_gromots.py (+17 -0)

@@ -0,0 +1,17 @@
+
+from million.analyze.wordFinder import findWords
+from million.parse.fb_exports import FacebookExportParser
+
+
+DATA_PATH = './data/'
+
+parser = FacebookExportParser()
+
+export = parser.parse(DATA_PATH)
+
+grosMots = ['merde', 'putain', 'bite', 'nichon', 'con(ne)?', 'baiser?']
+
+msgGromots = findWords(export.messages, grosMots)
+
+for gromot in msgGromots:
+    print(f"{gromot.sender_name} : {gromot.content}")
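
Since the script's stated goal is to find out who used the swear words, a per-sender tally is one possible complement to printing each message. The sketch below is not part of this commit: `SAMPLE` and `tally_by_sender` are hypothetical names, and plain `(sender_name, content)` pairs stand in for `export.messages` so that the example runs without the Facebook export parser.

```python
import re
from collections import Counter
from typing import Iterable, List, Optional, Tuple

# Hypothetical (sender_name, content) pairs standing in for export.messages.
SAMPLE: List[Tuple[str, Optional[str]]] = [
    ("Alice", "putain de bug"),
    ("Bob", "tout va bien"),
    ("Alice", "quelle conne"),
    ("Bob", "merde"),
]


def tally_by_sender(
    pairs: Iterable[Tuple[str, Optional[str]]], words: List[str]
) -> Counter:
    # Build the same whole-word patterns as findWords.
    patterns = [re.compile(r"\b" + w + r"\b") for w in words]
    counts: Counter = Counter()
    for sender, content in pairs:
        # Count each matching message once, however many words it contains.
        if content and any(p.search(content) for p in patterns):
            counts[sender] += 1
    return counts


counts = tally_by_sender(SAMPLE, ["merde", "putain", "con(ne)?"])
for sender, n in counts.most_common():
    print(f"{sender} : {n}")
```

With the sample data this prints `Alice : 2` followed by `Bob : 1`. Counting messages rather than individual word occurrences is a design choice made for brevity here.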
