/suicide-risk-sentiment

Files related to the evaluation of sentiment analysis lexicons via a corpus study of suicide attempt-related clinical notes

Sentiment lexicons for suicide risk assessment

This repository contains files related to a study of 6 sentiment lexicons used for suicide risk assessment, including all code used to extract sentiment words from the eHOST-IT case-control cohort of CRIS clinical notes. The 6 lexicons are not provided and must be downloaded separately. The scripts listed in this README reference four of them by name: AFINN, EmoLex, Pattern, and SentiWordNet 3.0.

The scripts are as follows:

  • emotions.py: code to prepare data and lexicons for experiments and extract sentiment words.
  • emotions_afinn.py: code to extract sentiment words using AFINN.
  • emotions_emolex.py: code to extract sentiment words using EmoLex.
  • emotions_pattern.py: code (Python 2.7) to extract sentiment words using the Pattern lexicon.
  • emotions_pattern_p36.py: code (Python 3.6) to extract sentiment words from previously tokenised text using Pattern.
  • emotions_swn.py: code to extract sentiment words using the NLTK interface for SentiWordNet 3.0.
  • sentiment_extraction.py: code to calculate frequency statistics and test the statistical significance of cross-corpus frequency differences (Mann-Whitney U test).
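
The general workflow these scripts describe (look up lexicon words in each note, compute per-note frequencies, then compare two groups with a Mann-Whitney U test) can be sketched as follows. This is a minimal illustration, not the code in this repository: the lexicon `TOY_LEXICON`, the example sentences, and all function names are hypothetical, and the p-value uses a plain normal approximation without tie or continuity correction. In practice a library routine such as `scipy.stats.mannwhitneyu` would likely be preferable.

```python
import math
import re
from collections import Counter

# Hypothetical toy lexicon (word -> polarity score), invented for illustration.
# The real lexicons must be downloaded separately and are not reproduced here.
TOY_LEXICON = {"hopeless": -3, "sad": -2, "worried": -2, "calm": 2, "happy": 3}


def tokenise(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def extract_sentiment_words(text, lexicon):
    """Count how often each lexicon word occurs in the text."""
    return Counter(tok for tok in tokenise(text) if tok in lexicon)


def relative_frequency(text, lexicon):
    """Share of tokens that are lexicon words (0.0 for empty text)."""
    tokens = tokenise(text)
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in lexicon) / len(tokens)


def mann_whitney_u(sample_a, sample_b):
    """Two-sided Mann-Whitney U test via the normal approximation.

    Returns (U statistic for sample_a, approximate p-value).
    Assumption: no tie or continuity correction is applied.
    """
    n1, n2 = len(sample_a), len(sample_b)
    # U counts the pairs where a beats b; ties count as half a win.
    u = sum(
        1.0 if a > b else 0.5 if a == b else 0.0
        for a in sample_a
        for b in sample_b
    )
    mean = n1 * n2 / 2
    sd = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean) / sd
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return u, p_value


if __name__ == "__main__":
    # Invented sentences standing in for two groups of notes.
    group_a = ["Patient feels hopeless and sad.", "Very worried and sad today."]
    group_b = ["Patient is calm.", "Routine review, no concerns raised."]
    freqs_a = [relative_frequency(note, TOY_LEXICON) for note in group_a]
    freqs_b = [relative_frequency(note, TOY_LEXICON) for note in group_b]
    u_stat, p = mann_whitney_u(freqs_a, freqs_b)
    print("U =", u_stat, "approx. p =", round(p, 3))
```

With samples this small the normal approximation is unreliable; the sketch only shows the shape of the computation.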