


Yandex Data Science Projects

Data analysis and data science projects from the Yandex Bootcamp

| Project name | Description | Skills | Libraries |
| --- | --- | --- | --- |
| Car price | Factors influencing car prices posted on a website | Data cleaning, dealing with missing values | pandas, numpy, matplotlib |
| Mobile tariff analysis | Revenue comparison between mobile plans | EDA, hypothesis testing | pandas, numpy, scipy, matplotlib |
| Games ratings analysis | Video game ratings analysis and regional consumer profiles based on trends | Data cleaning, merging, visualization, and analysis using hypothesis tests | pandas, scipy, seaborn, matplotlib |
| Phone plan recommendation | Phone plan recommendation based on customer behavior using machine learning models | Machine learning, model evaluation | pandas, sklearn |
| Bank customer churn prediction | Prediction of bank customer churn based on customer behavior to develop customer retention strategies | Supervised learning models (logistic regression, decision trees, and random forest) with class imbalance handling (up- and down-sampling), model evaluation | pandas, sklearn |
| Oil well prospection | Data-driven selection of profitable regions for drilling oil wells | Prediction using linear regression; bootstrap to generate a random distribution for risk assessment | pandas, scipy, sklearn |
| Gold extraction model | Linear regression models to predict the amount of gold extracted at different stages of the purification process, with the aim of optimizing production | Linear regression, data visualization, custom evaluation metrics | pandas, sklearn, seaborn |
| Insurance benefits prediction | Prediction of insurance benefits using masked data for personal data protection | Linear regression on masked data, applying linear algebra | pandas, numpy, math, sklearn |
| Car price predictive models | Prediction of second-hand car selling prices using regression models | Regression models, comparison of model quality in terms of fitting time and RMSE | pandas, sklearn, LightGBM, CatBoost |
| Cab orders prediction | Cab order prediction at an airport using time series | Time series, regression models | pandas, sklearn, statsmodels |
| Movie reviews sentiment analysis | Automatic classification of movie reviews using machine learning sentiment analysis | Text preprocessing, regular expressions, TF-IDF, BERT embeddings, ML | sklearn, spaCy, NLTK, transformers |
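
The insurance benefits project lists linear regression on masked data using linear algebra. The project's actual masking method is not described here, so the sketch below assumes one common approach: multiplying the feature matrix by a random invertible matrix. The data, the matrix `P`, and the helper `fit_predict` are all hypothetical and serve only to illustrate why such masking leaves least-squares predictions unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for customer features (hypothetical, not project data).
X = rng.normal(size=(200, 4))
true_w = np.array([1.5, -2.0, 0.5, 3.0])
y = X @ true_w + rng.normal(scale=0.1, size=200)

# Assumed masking step: multiply features by a random invertible matrix.
P = rng.normal(size=(4, 4))
assert np.linalg.matrix_rank(P) == 4  # P must be invertible
X_masked = X @ P


def fit_predict(features, target):
    """Ordinary least squares with an intercept; returns fitted predictions."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return design @ coef


pred_plain = fit_predict(X, y)
pred_masked = fit_predict(X_masked, y)

# The two sets of predictions should agree up to floating-point error.
max_diff = float(np.max(np.abs(pred_plain - pred_masked)))
print(f"max abs difference in predictions: {max_diff:.2e}")
```

Because `X @ P` spans the same column space as `X` when `P` is invertible, the least-squares fit is the same; only the learned weights change (they become `P^-1` times the original weights).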