A project to implement automatic crawling and analyzing of lecture posters at Southern University of Science and Technology.
SID | Name |
---|---|
12112729 | 杨烜 |
12110813 | 刘圣鼎 |
12110817 | 张展玮 |
12111427 | 勾业备 |
.
├── convert.py
├── get_info.py
├── get.py
├── update.py
├── visualize,py
├── GUI.py
├── README.md
└── stop_words_CN.txts
-
Python 3.9
-
numpy 1.24.0
-
pandas 2.0.0
-
matplotlib 3.7.0
-
beautifulsoup4 4.12.0
-
fake_useragent 1.4.0
-
tqdm 4.66.0
-
paddleocr 2.7.0
-
wordcloud 1.9.0
-
jieba 0.42.0
-
ttkbootstrap 1.10.0
python GUI.py
Crawl the poster and generate information:
- Choose the number of posters to crawl and analyze(one page has the latest 6 posters).
- Choose the chatgpt or Wenxinyi model we provided.
- Select the output path for the poster and analysis data.
- Click "导出".
Generate poster word cloud data:
- Select the entry path for the poster information form.
- Select word cloud and hot word output path.
- Add stop word (if not, leave it blank)
- Click "导出".