SentimentAnalysis

Sentiment Analysis project for the Machine Learning university course

Supervised Topic-Based Message Polarity Classification using Cognitive Computing

This project is part of SemEval-2017 Task 4.

Paper published

A paper about this work has been published in the proceedings of the 4th Workshop on Sentic Computing, Sentiment Analysis, Opinion Mining, and Emotion Detection (EMSASW 2018).

Project tasks

Subtasks B-C: Topic-Based Message Polarity Classification:

Given a message and a topic, classify the message on a
  • B) two-point scale: positive or negative sentiment towards that topic
  • C) five-point scale: sentiment conveyed by the message towards that topic

Project structure

The project is organized as follows:

  • are_project_B.py : script that runs task B
  • are_project_C.py : script that runs task C
  • utilities.py : helper library with functions shared by both tasks
  • train_BD.tsv : training dataset for task B
  • train_CE.tsv : training dataset for task C
  • test_BD.tsv : test dataset for task B
  • test_CE.tsv : test dataset for task C

Dataset structure

The following table shows the columns of the dataset, with a few example records:

| Tweet id | Topic | Tweet classification | Tweet text |
|---|---|---|---|
| 522712800595300352 | aaron rodgers | neutral | I just cut a 25 second audio clip of Aaron Rodgers talking about Jordy Nelson's grandma's pies. Happy Thursday. |
| 523065089977757696 | aaron rodgers | negative | @Espngreeny I'm a Fins fan, it's Friday, and Aaron Rodgers is still giving me nightmares 5 days later. I wished it was a blowout. |
| 522477110049644545 | aaron rodgers | positive | Aaron Rodgers is really catching shit for the fake spike Sunday night.. Wtf. It worked like magic. People just wanna complain about the L. |
| 522551832476790784 | aaron rodgers | neutral | If you think the Browns should or will trade Manziel you're an idiot. Aaron Rodgers sat behind Favre for multiple years. |
| 522887492333084674 | aaron rodgers | neutral | Green Bay Packers: Five keys to defeating the Panthers in week seven: Aaron Rodgers On Sunday, ... http://t.co/anCHQjSLh9 #NFL #Packers |
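
As a rough illustration of how records with these four columns could be read, here is a minimal sketch using only the Python standard library. It assumes the .tsv files are tab-separated, have no header row, and list the columns in the order shown above; none of this is confirmed by the project files. The names `TweetRecord` and `read_records` are hypothetical and are not taken from utilities.py.

```python
import csv
import io
from typing import NamedTuple


class TweetRecord(NamedTuple):
    """One dataset row (hypothetical helper type, not part of the project)."""
    tweet_id: str  # kept as a string: it is an identifier, not a quantity
    topic: str
    label: str
    text: str


def read_records(handle):
    """Yield one TweetRecord per tab-separated line, skipping malformed rows."""
    # QUOTE_NONE keeps stray quote characters in tweets as literal text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) != 4:  # assumption: exactly four columns per record
            continue
        yield TweetRecord(*(field.strip() for field in row))


# Two rows from the table above, written as an in-memory TSV for the demo.
sample = io.StringIO(
    "522712800595300352\taaron rodgers\tneutral\t"
    "I just cut a 25 second audio clip of Aaron Rodgers talking about "
    "Jordy Nelson's grandma's pies. Happy Thursday.\n"
    "523065089977757696\taaron rodgers\tnegative\t"
    "@Espngreeny I'm a Fins fan, it's Friday, and Aaron Rodgers is still "
    "giving me nightmares 5 days later. I wished it was a blowout.\n"
)

records = list(read_records(sample))
for record in records:
    print(record.tweet_id, record.topic, record.label)
```

To read a real file, the same function could be given an open file handle, for example `open("train_BD.tsv", encoding="utf-8", newline="")`, provided the file matches the assumptions above.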

Tasks resolution approach

  1. The data is preprocessed.
  2. Every record is associated with categories and concepts obtained from IBM Watson.
  3. Various classifiers are trained to obtain the best possible scores on the measures required by the challenge.
  4. The best results are selected.
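
Steps 1, 3 and 4 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code. The README does not say which preprocessing rules, classifiers, or selection score were used, so the normalisation rules, the use of macro-averaged recall as the score, and the model names below are all hypothetical. The IBM Watson enrichment of step 2 is left out, since its details are not described here.

```python
import re
from collections import defaultdict

URL_RE = re.compile(r"https?://\S+")
MENTION_RE = re.compile(r"@\w+")
WHITESPACE_RE = re.compile(r"\s+")


def preprocess(text):
    """Replace URLs and @mentions with placeholders, drop '#', lowercase."""
    text = URL_RE.sub("<url>", text)
    text = MENTION_RE.sub("<user>", text)
    text = text.replace("#", "")  # keep the hashtag word, drop the marker
    return WHITESPACE_RE.sub(" ", text).strip().lower()


def macro_recall(gold, predicted):
    """Mean of per-class recall over the classes that occur in gold."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for g, p in zip(gold, predicted):
        totals[g] += 1
        if g == p:
            hits[g] += 1
    if not totals:
        raise ValueError("gold labels must not be empty")
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def pick_best(gold, predictions_by_model):
    """Return (name, score) of the model with the highest macro recall."""
    scores = {name: macro_recall(gold, preds)
              for name, preds in predictions_by_model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


cleaned = preprocess("On Sunday, ... http://t.co/anCHQjSLh9 #NFL #Packers")

# Toy gold labels and predictions from two hypothetical models.
gold = ["positive", "positive", "positive", "negative"]
predictions = {
    "majority_baseline": ["positive", "positive", "positive", "positive"],
    "model_b": ["positive", "negative", "positive", "negative"],
}
best_name, best_score = pick_best(gold, predictions)
print(cleaned)
print(best_name, round(best_score, 3))
```

A macro-averaged score is used in this sketch because it weights every class equally, so a model that always predicts the majority class is not rewarded. The official measures for each subtask should be taken from the challenge description.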

Results of the studied case

The results of this research have been written up in a paper submitted to the Workshop on Sentic Computing, Sentiment Analysis, Opinion Mining, and Emotion Detection.