
Deriving linguistic features from Reddit data 😇



🏈 Goals: find significant linguistic differences between people with mental disorders and people without


  1. Crawling the dataset from Reddit
  • codes/crawling-reddit.ipynb

  • with the Pushshift API and psaw from PyPI (pip install psaw)
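
A minimal sketch of what the crawling step might look like, assuming psaw's `PushshiftAPI.search_submissions` interface. The helper name `fetch_submissions`, the field list, and the subreddit used in the usage note are illustrative assumptions, not taken from codes/crawling-reddit.ipynb. The API client is passed in so the function can be exercised without network access; access rules for Pushshift may also have changed since this project was written, so the call may not work as-is today.

```python
from typing import Any, Dict, List

# Fields requested from Pushshift (an assumed, illustrative selection).
FIELDS = ["id", "author", "subreddit", "title", "selftext", "created_utc"]


def fetch_submissions(api: Any, subreddit: str, limit: int = 100) -> List[Dict[str, Any]]:
    """Collect up to `limit` submissions from one subreddit as plain dicts.

    `api` is expected to behave like psaw.PushshiftAPI: search_submissions()
    yields objects whose `d_` attribute holds the raw fields as a dict.
    """
    results = api.search_submissions(subreddit=subreddit, filter=FIELDS, limit=limit)
    rows = []
    for post in results:
        record = post.d_
        # Missing fields (for example deleted authors) become None.
        rows.append({field: record.get(field) for field in FIELDS})
    return rows
```

With psaw installed, this would be called as `fetch_submissions(PushshiftAPI(), "depression", limit=5)` after `from psaw import PushshiftAPI`; the resulting list of dicts can be loaded into a DataFrame or written to CSV for the later steps.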

  1. Sentiment analysis with labMT
  • codes/labMT_sanity_check.ipynb
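
As a rough illustration of labMT-style scoring: each word carries a happiness rating on a 1 to 9 scale, and a text's score is the frequency-weighted average of its rated words, usually after dropping near-neutral words around 5. The sketch below assumes that scheme. `TOY_SCORES` is a made-up stand-in for the real labMT word list, and `happiness_score` is a hypothetical helper, not code from codes/labMT_sanity_check.ipynb.

```python
import re
from collections import Counter
from typing import Dict, Optional

# Made-up ratings for illustration only; NOT actual labMT values.
TOY_SCORES: Dict[str, float] = {
    "happy": 8.0, "love": 8.5, "sad": 2.0, "alone": 2.5, "the": 5.0,
}


def happiness_score(text: str, scores: Dict[str, float],
                    neutral_band: float = 1.0, center: float = 5.0) -> Optional[float]:
    """Frequency-weighted mean rating of the rated, non-neutral words in `text`.

    Words whose rating lies strictly within `neutral_band` of `center` are skipped.
    Returns None when no word in the text contributes a rating.
    """
    counts = Counter(re.findall(r"[a-z']+", text.lower()))
    total, n = 0.0, 0
    for word, freq in counts.items():
        score = scores.get(word)
        if score is None or abs(score - center) < neutral_band:
            continue  # unrated or near-neutral word
        total += score * freq
        n += freq
    return total / n if n else None


print(happiness_score("I feel sad and alone", TOY_SCORES))
```

Swapping `TOY_SCORES` for the real labMT ratings would give per-post scores that can be averaged per subreddit.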
  1. Linguistic Analysis with LIWC
  • word count, words per sentence, words longer than six letters

  • sentiment analysis, sentiment classification

  • pronoun analysis

  • time-oriented analysis

  • sanity check with results/compare_with_LIWC.ipynb

  1. Linguistic Analysis with a LIWC replacement
  • codes/final_liwc_alike.R & codes/final_replace_LIWC.ipynb

  • written to avoid the cost of LIWC and to understand clearly how each measure is computed
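
To make the LIWC-style measures listed above concrete, here is a small sketch of how a few of them could be computed by hand: word count, words per sentence, the share of words longer than six letters, and first-person singular pronoun use. The function name, the tokenization rules, and the short pronoun list are assumptions for illustration; they are not the LIWC dictionary and not the code in codes/final_replace_LIWC.ipynb.

```python
import re
from typing import Dict

# A small, assumed pronoun list; the real LIWC category is larger.
FIRST_PERSON_SINGULAR = {"i", "me", "my", "mine", "myself"}


def liwc_like_features(text: str) -> Dict[str, float]:
    """Compute a handful of LIWC-style surface features for one text."""
    # Naive sentence split on ., ! and ? (an assumption, not LIWC's rule).
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[a-z']+", text.lower())
    wc = len(words)
    if wc == 0:
        return {"word_count": 0, "words_per_sentence": 0.0,
                "six_letter_pct": 0.0, "i_pct": 0.0}
    # Count letters only, so apostrophes do not inflate word length.
    long_words = sum(len(w.replace("'", "")) > 6 for w in words)
    i_words = sum(w in FIRST_PERSON_SINGULAR for w in words)
    return {
        "word_count": wc,
        "words_per_sentence": wc / max(len(sentences), 1),
        "six_letter_pct": 100.0 * long_words / wc,
        "i_pct": 100.0 * i_words / wc,
    }


print(liwc_like_features("I feel tired. My thoughts are overwhelming!"))
```

Percentages rather than raw counts keep posts of different lengths comparable, which is also how LIWC reports most of its categories. Contractions such as "I'm" are not matched by this simple pronoun list.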

  1. Analysis
  • codes/final_analysis.ipynb

  • results/analysis_between_subreddits.ipynb & results/analysis_between_recognize_or_not.ipynb
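
The README does not say which statistical test the analysis notebooks use. As one possible way to check whether a feature differs significantly between two groups of posts, the sketch below runs a two-sided permutation test on the difference in group means using only the standard library. The function name and the feature values in the example are hypothetical, not results from this project.

```python
import random
from statistics import mean
from typing import Sequence


def permutation_test(a: Sequence[float], b: Sequence[float],
                     n_permutations: int = 10000, seed: int = 0) -> float:
    """Two-sided permutation p-value for the difference in means of a and b."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_permutations):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical per-post pronoun percentages for two groups of posts.
group_a = [9.1, 8.7, 9.5, 8.9, 9.3, 9.0]
group_b = [5.2, 4.8, 5.5, 5.0, 4.9, 5.1]
print(permutation_test(group_a, group_b, n_permutations=2000))
```

A permutation test makes no normality assumption, which suits skewed text features; when many features are compared at once, a multiple-comparison correction would also be needed.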