Decision_Support_System-Projects

1. Information Retrieval

Ran queries against documents from a US government website, indexed with the Whoosh library. Improved Whoosh's baseline performance by at least 30% using NLTK tools (e.g. a stop-word filter, an intra-word filter, a lowercase filter, and NLTK's stemmers and lemmatizers).
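The filter chain described above can be sketched without any dependencies. This is an illustration only, not the project's code: the stop list is a tiny hand-written set, and `crude_stem` is a hypothetical suffix stripper standing in for a real stemmer such as NLTK's `PorterStemmer`.

```python
import re

# Tiny illustrative stop list (assumption); NLTK ships a much larger one.
STOP_WORDS = {"a", "an", "and", "of", "the", "to", "in", "for", "on", "is"}

# Suffixes removed by the stand-in stemmer, tried in this order.
SUFFIXES = ("ing", "ed", "es", "s")


def crude_stem(token):
    """Strip one common suffix, keeping at least three characters.

    Hypothetical stand-in for a real stemmer; far less accurate.
    """
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def analyze(text):
    """Lowercase, split inside hyphenated words, drop stop words, then stem."""
    # Lowercasing plus splitting on non-alphanumerics covers the
    # lowercase and intra-word steps in one pass.
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]
```

For example, `analyze("Applying for Student-Loans")` yields `["apply", "student", "loan"]`. The same function has to be applied to both documents and queries so that their terms match.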

2. Machine Learning

Classified news posts into 20 newsgroups using logistic regression and Naive Bayes. Improved performance by trying different feature sets, feature encodings, amounts of training data, and hyperparameters. Feature encodings included binary, TF, and TF-IDF. Tried both the softmax and the one-vs-all approaches to multi-class classification.
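The three encodings differ only in the weight given to each term. A minimal sketch, assuming documents are already tokenized and using the plain, unsmoothed `log(N / df)` form of IDF (libraries often use smoothed variants, so their numbers will differ); the function name `encode` is illustrative:

```python
import math
from collections import Counter


def encode(docs, scheme="tfidf"):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        if scheme == "binary":
            # 1 if the term occurs at all, regardless of how often.
            vec = {t: 1.0 for t in counts}
        elif scheme == "tf":
            # Raw term count within the document.
            vec = {t: float(c) for t, c in counts.items()}
        elif scheme == "tfidf":
            # Term count scaled down for terms found in many documents.
            vec = {t: c * math.log(n_docs / df[t]) for t, c in counts.items()}
        else:
            raise ValueError(f"unknown scheme: {scheme}")
        vectors.append(vec)
    return vectors
```

With this IDF form, a term that appears in every document gets a TF-IDF weight of zero, which is why TF-IDF tends to suppress words that do not help separate the classes.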

3. Recommender Systems

Built movie recommendation systems based on popularity, user average, collaborative filtering (user-user and item-item, each with cosine, Euclidean, and Manhattan similarity), content-based filtering, and Matchbox. Evaluated them using RMSE, precision at k (P@k), and recall at k (R@k). The best recommender, “User-Cosine”, achieved a mean RMSE of 1.017, a mean P@5 of 0.56, and a mean R@5 of 0.49.
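As a rough sketch of the pieces behind those numbers, the functions below compute user-user cosine similarity and the three evaluation metrics. The data layout (ratings as `{item: rating}` dicts) and the choice to compare users only on co-rated items are assumptions, not necessarily what the project did.

```python
import math


def cosine_similarity(ratings_a, ratings_b):
    """Cosine similarity between two users, computed over co-rated items."""
    common = set(ratings_a) & set(ratings_b)
    if not common:
        return 0.0
    dot = sum(ratings_a[i] * ratings_b[i] for i in common)
    norm_a = math.sqrt(sum(ratings_a[i] ** 2 for i in common))
    norm_b = math.sqrt(sum(ratings_b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def precision_recall_at_k(ranked_items, relevant_items, k):
    """Return (P@k, R@k) for a ranked list and a set of relevant items."""
    hits = sum(1 for item in ranked_items[:k] if item in relevant_items)
    precision = hits / k
    recall = hits / len(relevant_items) if relevant_items else 0.0
    return precision, recall
```

A mean P@5 of 0.56 then reads as: on average, just under three of the five recommended movies were relevant to the user.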

4. Text Mining and Sentiment Analysis

Used the VADER library to perform sentiment analysis. Applied frequency, mutual information, and pointwise mutual information measures to words and phrases to analyze reviews from TripAdvisor.
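Pointwise mutual information scores a word pair by how much more often it occurs together than chance would predict: PMI(x, y) = log2(p(x, y) / (p(x) p(y))). The sketch below covers only this measure (not VADER or mutual information), and assumes phrases are adjacent word pairs with probabilities estimated from raw counts; the project may have defined phrases differently.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for adjacent word pairs."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = len(tokens) - 1
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bigrams            # joint probability of the pair
        p_x = unigrams[w1] / n_unigrams     # marginal of the first word
        p_y = unigrams[w2] / n_unigrams     # marginal of the second word
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores
```

PMI favors rare pairs, so in practice it is usually combined with a minimum frequency threshold before ranking phrases.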

5. Social Network Analysis

Performed social network analysis on Twitter data.