Primary language: Jupyter Notebook

Yandex-Practicum

Data Science course projects

These projects were completed during the "Data Scientist" training program at Yandex-Practicum. Below is a list of the projects, each with a brief description and the libraries used.

Link: https://practicum.yandex.ru/data-scientist/

| Project name | Description | Libraries used |
| --- | --- | --- |
| Age detection (CV) | A machine learning model that estimates a person's age from a photo. | pandas, tensorflow, matplotlib |
| Credits | Analysis of borrower reliability. | pandas, numpy, pymystem3 |
| Real estate | Exploratory analysis of apartment prices in St. Petersburg and neighboring settlements, based on several years of data. | pandas, numpy, matplotlib |
| Telecom | Analysis of 2018 customer behavior on two tariff plans, "Smart" and "Ultra", based on a sample of 500 users of the Megaline company. | pandas, numpy, seaborn, scipy |
| Tariff recommendation | Comparison of binary classification models for choosing the optimal tariff for customers of the mobile operator "Megaline". | pandas, numpy, matplotlib, sklearn, pylab |
| Client churn | A binary classification model for predicting customer churn at Beta-Bank. | pandas, numpy, matplotlib, sklearn, plotly |
| Study of oil producing regions | Oil sample data from three regions, with measurements of oil quality and reserve volume, was examined. A machine learning model was built to determine the region where extraction would bring the greatest profit. Potential profits and risks were analyzed using the Bootstrap technique. | pandas, numpy, sklearn |
| Linear algebra | The task is to protect the data of clients of the insurance company "Though the Flood". A data transformation method was developed that makes it difficult to recover personal information from the data, and its correctness is justified. | pandas, numpy, sklearn |
| Car prices | "Not beaten, not beautiful", a used car sales service, is developing an application to attract new customers, in which users can quickly find out the market value of their car. | pandas, numpy, matplotlib, seaborn, sklearn, lightgbm, catboost |
| Taxi order forecast | Several machine learning models for predicting the number of taxi orders in the next hour are explored. | pandas, numpy, matplotlib, statsmodels, seaborn, sklearn, lightgbm, catboost, xgboost |
| Toxic comments | Text classification models that detect toxic comments are explored. | pandas, numpy, matplotlib, sklearn, torch, transformers, nltk |
| Air transportation (SQL) | Analysis of passenger demand for flights to cities that host the largest cultural festivals. The data source was an airline database. | pandas, numpy, matplotlib, scipy |
| Console games research | Analysis of worldwide computer game sales, user and expert ratings, genres, and platforms, based on historical data from open sources up to 2016. | pandas, numpy, seaborn, scipy, plotly |
| Gold recovery process | Analysis of data on metal concentrations at different stages of mining and ore refining. | pandas, numpy, matplotlib, sklearn, plotly |
| Final project | Predicting customer churn for the telecom operator "Notasingledisconnec.com". | pandas, numpy, matplotlib, seaborn, scipy, plotly, sklearn |

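The oil producing regions project mentions analyzing profits and risks with the Bootstrap technique. As an illustration only, here is a minimal sketch of how a bootstrap estimate of mean total profit, a 95% interval, and the risk of a loss could be computed with numpy. The helper name `bootstrap_profit`, its parameters, and the synthetic profit values are assumptions made for this example and are not taken from the project notebooks.

```python
import numpy as np


def bootstrap_profit(profits, n_resamples=1000, sample_size=None, seed=0):
    """Resample per-well profits with replacement and summarize total profit.

    Hypothetical helper: the name, defaults, and returned summary are
    illustrative assumptions, not the project's actual code.
    """
    rng = np.random.default_rng(seed)
    profits = np.asarray(profits, dtype=float)
    if sample_size is None:
        sample_size = profits.size

    # Draw all resamples at once: one row of indices per bootstrap replicate.
    idx = rng.integers(0, profits.size, size=(n_resamples, sample_size))
    totals = profits[idx].sum(axis=1)

    # Percentile interval and the share of replicates that end in a loss.
    lower, upper = np.quantile(totals, [0.025, 0.975])
    return {
        "mean": totals.mean(),
        "ci_95": (lower, upper),
        "loss_risk": (totals < 0).mean(),
    }


# Synthetic per-well profits, generated purely for demonstration.
demo_profits = np.random.default_rng(42).normal(loc=1.0, scale=5.0, size=500)
result = bootstrap_profit(demo_profits)
print(result)
```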
Certificate (PDF version):

Data Scientist — certificate

Training course:

Data Scientist — training course