Sentiment Analysis on Movie Comments
A project for the International Summer Research Program at the University of California, San Diego
- A contest on Kaggle
- Using Linear Regression
- With Natural Language Processing methods, such as stemming and bigrams
- Preprocess - Lowercase, remove punctuation, stem, filter out stopwords.
- Feature -
(1) Count the frequency of each word.
(2) Keep the n most frequent words.
(3) Represent each sentence as a feature vector of its counts for those n words.
- Machine Learning - Linear Regression.
The model is a parameter vector theta of size n, one weight per kept word.
- Predict.
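
The steps above can be sketched end to end in plain Python. This is only a minimal illustration under stated assumptions, not the project's actual code: the stopword list, the `crude_stem` helper (a rough stand-in for a real stemmer such as Porter), the gradient-descent settings, and the toy sentences and labels are all made up for the example. It uses single words only and leaves out bigrams.

```python
import re
from collections import Counter

def crude_stem(word):
    """Rough suffix stripping; a hypothetical stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "ly", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

# Tiny illustrative stopword list (an assumption), stored in stemmed form
# so it can be applied after stemming, in the order the steps are listed.
STOPWORDS = {crude_stem(w) for w in
             ("a", "an", "the", "is", "was", "it", "this", "and", "of", "to")}

def preprocess(sentence):
    """Lowercase, remove punctuation, stem, then filter out stopwords."""
    words = re.sub(r"[^a-z0-9\s]", " ", sentence.lower()).split()
    stems = [crude_stem(w) for w in words]
    return [s for s in stems if s not in STOPWORDS]

def build_vocabulary(sentences, n):
    """Count every word and keep the n most frequent ones."""
    counts = Counter(tok for s in sentences for tok in preprocess(s))
    return [word for word, _ in counts.most_common(n)]

def featurize(sentence, vocabulary):
    """Represent a sentence by its count of each vocabulary word."""
    counts = Counter(preprocess(sentence))
    return [float(counts[word]) for word in vocabulary]

def train(X, y, lr=0.05, epochs=2000):
    """Fit theta (one weight per feature) by batch gradient descent on squared error."""
    theta = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for features, target in zip(X, y):
            error = sum(t * f for t, f in zip(theta, features)) - target
            for j, f in enumerate(features):
                grad[j] += error * f
        theta = [t - lr * g / len(X) for t, g in zip(theta, grad)]
    return theta

def predict(sentence, vocabulary, theta):
    """Score a sentence as the dot product of theta and its feature vector."""
    return sum(t * f for t, f in zip(theta, featurize(sentence, vocabulary)))

# Made-up toy data: 1.0 = positive comment, 0.0 = negative comment.
train_sentences = ["A great, great movie!", "Loved this film",
                   "A terrible movie.", "Boring and terrible film"]
labels = [1.0, 1.0, 0.0, 0.0]

vocabulary = build_vocabulary(train_sentences, n=6)
X = [featurize(s, vocabulary) for s in train_sentences]
theta = train(X, labels)
print(predict("great movie", vocabulary, theta))
print(predict("terrible film", vocabulary, theta))
```

Because linear regression outputs a real number rather than a class, a threshold (for example 0.5 on a 0/1 label scale) would still be needed to turn the score into a positive or negative label; that threshold is a design choice the steps above leave open.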