
List of resources for researchers starting out in sentiment analysis

Must-read-sentiment Analysis

List of resources for sentiment analysis in general, including resources for the Arabic language. The list is under continuous update.

Papers:

Sentiment Classification :

  1. Turney 2002 : Thumbs up or thumbs down?
  • one of the oldest notable works in sentiment classification
  • Turney uses two seed words, “excellent” and “poor”, and pointwise mutual information (PMI) to do unsupervised sentiment classification
  1. Pang, Lee 2002 : Thumbs up?: sentiment classification using machine learning techniques
  • published a movie review dataset that is still widely used
  • used machine learning classifiers (Naive Bayes, maximum entropy, SVM) evaluated with 3-fold cross-validation
  • features were basic bag of words (unigrams and/or bigrams), using word presence or word frequency (no TF-IDF)
  • additional features: top unigrams, adjectives, word position
  • compared results against a baseline using features manually selected by two human annotators
  • accuracy of the baseline (manually selected features): ~60-70%
  • accuracy of the ML classifiers: ~80-83%
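
The PMI idea in Turney 2002 can be sketched in a few lines. This is a minimal sketch under stated assumptions, not Turney's actual setup: the paper scores two-word phrases using search-engine hit counts with a NEAR operator, while the hypothetical helpers below (`semantic_orientation`, `classify_review`) score single words, count co-occurrence within the same document of a small local corpus, and smooth every count by 0.01.

```python
import math

def semantic_orientation(word, documents, pos_seed="excellent", neg_seed="poor",
                         smoothing=0.01):
    """SO(word) = PMI(word, pos_seed) - PMI(word, neg_seed).

    Assumption: "near" means "appears in the same document".
    """
    near_pos = near_neg = hits_pos = hits_neg = 0
    for doc in documents:
        tokens = set(doc.lower().split())
        has_word = word in tokens
        if pos_seed in tokens:
            hits_pos += 1
            near_pos += int(has_word)
        if neg_seed in tokens:
            hits_neg += 1
            near_neg += int(has_word)
    # The difference of the two PMI values reduces to one log ratio.
    # Smoothing (an assumption here) avoids log(0) and division by zero.
    ratio = ((near_pos + smoothing) * (hits_neg + smoothing)) / (
        (near_neg + smoothing) * (hits_pos + smoothing))
    return math.log2(ratio)

def classify_review(review, documents):
    """Average the orientation of every word; assumes a non-empty review."""
    scores = [semantic_orientation(w, documents) for w in review.lower().split()]
    average = sum(scores) / len(scores)
    return "positive" if average > 0 else "negative"

# Toy corpus, purely illustrative.
corpus = [
    "excellent service and friendly staff",
    "friendly staff excellent food",
    "poor service and rude staff",
    "rude waiter poor food",
]
print(classify_review("friendly staff", corpus))
print(classify_review("rude waiter", corpus))
```

No labeled training data is needed: the only supervision is the choice of the two seed words.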

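The supervised setup in Pang, Lee 2002 (bag-of-words presence features, a classifier, 3-fold cross-validation) can be sketched as follows. This is a rough illustration under stated assumptions, not the paper's code: it uses only a simple Naive Bayes with add-one smoothing, whitespace tokenization, and a fixed interleaved fold split, and all function names and the toy reviews are hypothetical.

```python
import math
from collections import Counter

def extract_features(text, use_bigrams=False):
    """Presence features: the set of unigrams (and optionally bigrams)."""
    tokens = text.lower().split()
    feats = set(tokens)
    if use_bigrams:
        feats.update(zip(tokens, tokens[1:]))
    return feats

def train_naive_bayes(samples):
    """samples: list of (feature_set, label) pairs."""
    label_counts, feat_counts, vocab = Counter(), {}, set()
    for feats, label in samples:
        label_counts[label] += 1
        feat_counts.setdefault(label, Counter()).update(feats)
        vocab.update(feats)
    return label_counts, feat_counts, vocab

def predict(model, feats):
    label_counts, feat_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        denom = sum(feat_counts[label].values()) + len(vocab)
        score = math.log(n / total)
        for f in feats:
            if f in vocab:  # features unseen in training are ignored
                score += math.log((feat_counts[label][f] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def cross_validate(texts, labels, k=3):
    """Mean accuracy over k folds; assumes at least k samples."""
    data = [(extract_features(t), y) for t, y in zip(texts, labels)]
    accuracies = []
    for fold in range(k):
        test = data[fold::k]  # every k-th sample is held out
        train = [d for i, d in enumerate(data) if i % k != fold]
        model = train_naive_bayes(train)
        correct = sum(predict(model, feats) == y for feats, y in test)
        accuracies.append(correct / len(test))
    return sum(accuracies) / k

# Toy reviews, purely illustrative.
reviews = [
    "a great and moving film",
    "great acting and a moving story",
    "a great story with moving acting",
    "a dull and boring film",
    "boring acting and a dull story",
    "a dull story with boring acting",
]
sentiments = ["pos", "pos", "pos", "neg", "neg", "neg"]
print(cross_validate(reviews, sentiments, k=3))
```

Switching from presence to frequency features would mean replacing the `set` in `extract_features` with a `Counter` and weighting each feature by its count.
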
Sentiment Analysis in Arabic Language:

  1. Abbasi et al. Sentiment Analysis in Multiple Languages: Feature Selection for Opinion Classification in Web Forums
  • a well-categorized literature review of commonly selected features, techniques, and application domains in sentiment analysis
  • uses an Entropy Weighted Genetic Algorithm (EWGA) for feature selection over the previously mentioned feature types
  1. El-Beltagy, Samhaa R., and Ahmed Ali. "Open issues in the sentiment analysis of arabic social media: A case study." Innovations in Information Technology (IIT), 2013 9th International Conference on. IEEE, 2013.
  • overview of the main issues and obstacles in Arabic social media sentiment analysis
  • semi-automatic generation of an Egyptian dialect sentiment lexicon with ~4k entries (link available in the paper) using conjunctions
  • evaluation of the generated lexicon
  1. Abdul-Mageed & Diab. "AWATIF: A Multi-Genre Corpus for Modern Standard Arabic Subjectivity and Sentiment Analysis." LREC. 2012.
  • main usefulness: good guidelines for annotating sentiment datasets
  • multi-genre annotated corpus of Modern Standard Arabic for SA
  • built from different resources including Penn Arabic Treebank, Wikipedia Talk Pages and Web forums
  • manually annotated :
    • with guidelines or simple guidelines
    • by trained annotators or via crowdsourcing
    • elaborates on the importance of guidelines and annotator training for producing dependable annotations
    • dataset not publicly available
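
The conjunction-based lexicon generation mentioned for El-Beltagy and Ali can be illustrated with a toy sketch. It is not the paper's procedure: it only shows the general idea that words joined by "and" tend to share polarity while words joined by "but" tend to differ. The function name, the English stand-in conjunctions, and the example sentences are all assumptions, and in a semi-automatic setting every generated entry would still need manual review.

```python
def expand_lexicon(seed_lexicon, sentences, same=("and",), opposite=("but",)):
    """Propagate polarity (+1 / -1) from seed words across conjunctions.

    Assumption: the words immediately left and right of a conjunction
    are the candidate sentiment words.
    """
    lexicon = dict(seed_lexicon)
    changed = True
    while changed:  # repeat until no new word is labeled
        changed = False
        for sentence in sentences:
            tokens = sentence.lower().split()
            for i in range(1, len(tokens) - 1):
                if tokens[i] in same:
                    sign = 1
                elif tokens[i] in opposite:
                    sign = -1
                else:
                    continue
                left, right = tokens[i - 1], tokens[i + 1]
                for known, unknown in ((left, right), (right, left)):
                    if known in lexicon and unknown not in lexicon:
                        lexicon[unknown] = sign * lexicon[known]
                        changed = True
    return lexicon

seeds = {"good": 1, "bad": -1}
text = ["good and tasty", "tasty but pricey", "bad and stale"]
print(expand_lexicon(seeds, text))
```

The loop repeats until nothing changes, so a word labeled in one pass ("tasty") can label another word ("pricey") in the same or a later pass.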

Books:

  1. Bing Liu : Sentiment Analysis and Opinion Mining. The book is a thorough literature review of the various issues in sentiment analysis. It is useful for getting the big picture of the field and for finding related work on any specific issue.

Courses:

  1. [NLP - Stanford, Dan Jurafsky & Christopher Manning](https://www.coursera.org/course/nlp)

Datasets:

English Datasets :

Arabic Datasets & Lexicons :

People :

Glossary :

The Natural Language Processing Dictionary : a glossary with definitions of a wide range of terms used in natural language processing, very useful when reading papers.

Miscellaneous:

  1. Chris Manning : deep learning without magic part 1 : main interesting points :

Contributing:

Feel free to send a pull request with any updates you think are worth adding.