
🤓Knowledge Extraction from Databases and Hypertext Data - Assignment #1: Sentiment


EXDBI Sentiment

Assignment: Application of various supervised machine learning methods, in combination with different text document models, to sentiment extraction.

Introduction

This semester assignment is not intended as research or as a survey of journal papers. It only demonstrates the use of relevant methods on text data from a research experiment (50 days, at least 20 tweets per day), with the aim of extracting sentiment as precisely as possible. More specifically, the main goal was to gain general knowledge of and experience with the Bag-of-Words and Word2Vec models used for sentiment extraction. Many materials served as an introduction to the topic, such as the excellent Kaggle tutorial Bag of Words Meets Bags of Popcorn, the Scikit-learn Python library Tutorials, and many other pages, code sources, and electronic materials that led to this work.
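For readers new to the Bag-of-Words model, the idea can be illustrated with a minimal standard-library sketch. This is not the code from this repository: the sample tweets and the helper names `tokenize`, `build_vocabulary`, and `bag_of_words` are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep only runs of ASCII letters."""
    return re.findall(r"[a-z]+", text.lower())


def build_vocabulary(documents):
    """Map every distinct token to a column index, in sorted order."""
    tokens = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: index for index, tok in enumerate(tokens)}


def bag_of_words(text, vocabulary):
    """Return a count vector; tokens outside the vocabulary are ignored."""
    counts = Counter(tokenize(text))
    # Dicts preserve insertion order, so columns follow the vocabulary order.
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical sample tweets (not taken from the assignment data).
tweets = ["I love this movie", "I hate this movie", "love love love"]
vocab = build_vocabulary(tweets)
vectors = [bag_of_words(tweet, vocab) for tweet in tweets]
```

Each tweet becomes a fixed-length vector of word counts that a supervised classifier can consume. Word order is discarded, which is the main limitation that models such as Word2Vec try to address.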

Original Code

The original code for Bag-of-Words and Word2Vec is in DeepLearningMovies. It is mostly reused, with improvements or adaptations for the purposes of the assignment. The Tf-Idf code is inspired by the Kaggle forum discussion Beat the benchmark with 'shallow' learning (0.95 LB).
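As a rough illustration of what Tf-Idf weighting does, here is a minimal sketch using the plain textbook formula (term frequency times the logarithm of inverse document frequency). It is an assumption made for illustration, not the formula used in this repository or in the Kaggle post. Library implementations such as Scikit-learn's use smoothed variants, so their numbers differ. The function name `tf_idf` and the sample documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {token: weight} dict per document (unsmoothed textbook tf-idf)."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each token appears.
    doc_freq = Counter(tok for tokens in tokenized for tok in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            tok: (count / total) * math.log(n_docs / doc_freq[tok])
            for tok, count in counts.items()
        })
    return weights


# Hypothetical sample documents.
docs = ["good movie", "bad movie"]
weights = tf_idf(docs)
```

A word that occurs in every document, such as "movie" here, receives a weight of zero, while words that distinguish documents, such as "good" and "bad", keep a positive weight.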

Assignment paper

The final version of the assignment paper is available on Google Docs: EXDBI - Sentiment