/python-do-differernt-csv

COVID-19 rumor

Primary language: Jupyter Notebook. License: MIT.

Data Collection (web crawler)

  • snopes.py by Tianqi
    • Collects data from the websites www.snopes.com and qc.wa.news.cn (deprecated)

Data Analysis (analysis of the CSV data)

data_process.ipynb is a Jupyter Notebook.

Data

  • news

    • news.csv and a subfolder for each news item
    • Number of subfolder records: 3936
  • twitter

    • Twitter.csv and a subfolder for each tweet
    • Number of subfolder records: 1383
  • en_dup.csv

    • Number of records: 7179.
    • Part of the data was collected manually by keyword search from sources such as twitter.com.
    • Data from www.snopes.com and qc.wa.news.cn was collected by snopes.py.
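
The record counts listed above can be checked with a short script. The following is a minimal sketch, not part of the repository: the helper names count_csv_records and count_subfolders are hypothetical, and it assumes that each CSV file starts with a header row and that the news and twitter folders sit next to en_dup.csv. It uses the csv module rather than counting lines, because a quoted field may span several lines.

```python
import csv
from pathlib import Path


def count_csv_records(csv_path):
    """Return the number of non-empty data rows, excluding the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row (assumed to be present)
        return sum(1 for row in reader if row)


def count_subfolders(directory):
    """Return the number of immediate subdirectories of `directory`."""
    return sum(1 for p in Path(directory).iterdir() if p.is_dir())


if __name__ == "__main__":
    # These paths are assumptions about the repository layout.
    csv_file = Path("en_dup.csv")
    if csv_file.is_file():
        print("en_dup.csv records:", count_csv_records(csv_file))
    for folder in (Path("news"), Path("twitter")):
        if folder.is_dir():
            print(folder.name, "subfolders:", count_subfolders(folder))
```

If the files use a different encoding or have no header row, the encoding argument and the header skip would need to be adjusted.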

Acknowledgement

Reposted from: https://github.com/MickeysClubhouse/COVID-19-rumor-dataset