master-project-CSS

Archived organizational email datasets have long been considered a valuable resource for textual analysis tasks such as topic modelling and sentiment analysis. However, most experiments are performed on synthetic data because an adequate real-life benchmark is lacking. The Enron email dataset is a boon for such research. In this report I examine differences in the behaviour of employees through topic modelling and sentiment/emotion analysis, and I try to gauge the ethical stance of employees by detecting moral language.

The project has three main goals:

  • Sentiment Analysis: Inspect whether sentiment and emotion change between the periods before and after the discovery of the scandal.
  • Topic Modelling: Analyse which topics emerge from the email content of a level-A executive and of a low-level employee, and whether those topics are similar or different.
  • Moral Language Inspection: Inspect whether an employee's email content before and around the time of the scandal contains moral language.
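
The before/after comparison in the first goal can be sketched roughly as follows. This is only a minimal illustration, not the method used in the report: the cutoff date, the tiny word lexicon, and the function names `score` and `compare_periods` are all assumptions made for the example, and a real analysis would use an established sentiment or emotion resource.

```python
from datetime import date
from statistics import mean

# Hypothetical toy lexicon, for illustration only.
TOY_LEXICON = {
    "good": 1, "great": 1, "thanks": 1,
    "bad": -1, "loss": -1, "worried": -1,
}

# Assumed cutoff separating "before" from "after"; the report may use another date.
CUTOFF = date(2001, 10, 1)


def score(text):
    """Average lexicon score over the matched tokens; 0.0 if nothing matches."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    hits = [TOY_LEXICON[t] for t in tokens if t in TOY_LEXICON]
    return mean(hits) if hits else 0.0


def compare_periods(emails, cutoff=CUTOFF):
    """Take (sent_date, text) pairs and return (mean_before, mean_after)."""
    before = [score(text) for sent, text in emails if sent < cutoff]
    after = [score(text) for sent, text in emails if sent >= cutoff]
    return (
        mean(before) if before else 0.0,
        mean(after) if after else 0.0,
    )
```

Calling `compare_periods` on a list of `(date, text)` pairs returns one average score per period, so a shift in sentiment shows up as a difference between the two numbers. The same split by date could be reused for the moral language goal by swapping the lexicon.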