/DS_portfolio

Selected machine learning projects

Primary language: Jupyter Notebook

Intro

This repository collects projects that contain something interesting and useful.

Each project is a .ipynb file with:

  • formal task statement
  • EDA
  • working code
  • text explanations
  • visualized results
  • conclusions and/or plans for improving the result

Large projects

| Project | Methods | Stack |
| --- | --- | --- |
| Full face recognition pipeline | Face detection (YOLO), face landmark coordinate regression, face alignment, face embedding DNN, cosine similarity analysis, Telegram bot implementation | python, pytorch, albumentations, timm, aiogram, cv2 |
| DCGAN for face generation | Hand-written DCGAN, custom Gaussian noise layers, gradient penalty, label regularization (soft, noisy, and flipped labels), specific metrics: Fréchet inception distance and leave-one-out 1-NN classification, epoch animation | python, pytorch, matplotlib, albumentations, timm |
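The cosine similarity step listed for the face recognition pipeline can be illustrated with a minimal sketch. It is not taken from the project notebook: the function names `cosine_similarity` and `identify`, the toy 3-dimensional embeddings, and the 0.5 threshold are all assumptions chosen for illustration, and the sketch uses only the standard library rather than pytorch.

```python
import math
from typing import Dict, Optional, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def identify(
    query: Sequence[float],
    gallery: Dict[str, Sequence[float]],
    threshold: float = 0.5,  # hypothetical value; a real one is tuned on validation data
) -> Tuple[Optional[str], float]:
    """Return the best-matching gallery name, or None if no score reaches the threshold."""
    best_name: Optional[str] = None
    best_score = -1.0  # lowest possible cosine similarity
    for name, embedding in gallery.items():
        score = cosine_similarity(query, embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        return None, best_score
    return best_name, best_score


# Toy embeddings (hypothetical); real face embeddings have hundreds of dimensions.
gallery = {"alice": [1.0, 0.0, 0.0], "bob": [0.0, 1.0, 0.0]}
name, score = identify([0.9, 0.1, 0.0], gallery)
print(name, round(score, 3))
```

In a setup like this, the threshold trades false accepts against false rejects, so it would normally be chosen from the score distribution on held-out pairs.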

Selected study projects

| Project | Methods | Stack |
| --- | --- | --- |
| Person Age by Photo (Yandex Practicum) | Pretrained EfficientNetV2, strong augmentation, result analysis | python, pytorch, matplotlib, albumentations, timm, alternative code in keras |
| Image Autoencoders (DL school MIPT) | CNN, latent vectors, latent space sampling, VAE, Conditional VAE, KL divergence loss | python, pytorch, sklearn, matplotlib |
| Toxic Comment Classification (Yandex Practicum) | Text preprocessing, TF-IDF, lemmatization, word2vec, fine-tuned BERT, weighted loss | python, pandas, numpy, pytorch, BERT, sklearn, optuna, matplotlib, seaborn, wordcloud, nltk, enchant, spacy, gensim |
| Semantic Segmentation of Skin Lesions (DL school MIPT) | CNN, hand-written U-Net and SegNet (based on VGG16), image augmentations, custom loss functions | python, pytorch, sklearn, albumentations, matplotlib |
| Gold Recovery Efficiency Model (Yandex Practicum) | EDA, preprocessing, feature selection by phik correlation, feature dimensionality reduction, gradient boosting, hyperparameter tuning | python, pandas, seaborn, sklearn, pipeline, optuna, phik, xgboost |
| Simpsons Classification (DL school MIPT) | CNN, transfer learning, image augmentation, class imbalance, extended result visualization | python, pytorch, sklearn, matplotlib |
| Taxi Demand Time Series (Yandex Practicum) | Exponential smoothing, SARIMA, linear regression, gradient boosting, hybrid boosting, hyperparameter tuning, neural networks: dense, LSTM | python, pandas, numpy, seaborn, sklearn, pipeline, statsmodels, pytorch, lightgbm, optuna, prophet, phik, pmdarima |
| Used Car Price Prediction (Yandex Practicum) | EDA, deep preprocessing, feature selection, gradient boosting, hyperparameter tuning, fully connected neural network, residual analysis, postcode coordinates, feature importance (permutation) | python, pandas, numpy, seaborn, sklearn, pipeline, pytorch, lightgbm, xgboost, catboost, optuna, phik |
| Bank Customer Churn (Yandex Practicum) | Class imbalance, t-SNE visualization, gradient boosting, hyperparameter tuning, cross-validation ROC curves, feature engineering, wrapper class, stacking, feature importance (permutation) | python, pandas, numpy, seaborn, sklearn, pipeline, lightgbm, xgboost, optuna |
| Titanic Analysis (Kaggle) | Advanced analysis, advanced visualization, gradient boosting, hyperparameter tuning | python, pandas, numpy, seaborn, plotly, sklearn, catboost, optuna |
| Selecting Oil Well Location (Yandex Practicum) | Synthetic data, bootstrap, QQ-plot, confidence intervals | python, pandas, numpy, seaborn, sklearn, pipeline |
| Video Game Market Analysis (Yandex Practicum) | Deep data analysis, advanced visualization, hypothesis testing | python, pandas, numpy, seaborn, plotly, scipy |
| Tariff Recommendation for a Mobile Operator Client (Yandex Practicum) | Classic ML models: logistic regression with feature engineering, SVM, naive Bayes classifier, decision trees, random forest, gradient boosting, model stacking | python, pandas, sklearn, numpy, seaborn, xgboost |
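Several of the study projects rely on resampling, for example the bootstrap and confidence intervals listed for Selecting Oil Well Location. The following is a minimal sketch of a percentile bootstrap confidence interval for a mean. It is not the notebook's code: the function name `bootstrap_mean_ci`, the default parameters, and the sample values are hypothetical, and it uses the standard library instead of numpy.

```python
import random
import statistics
from typing import Sequence, Tuple


def bootstrap_mean_ci(
    values: Sequence[float],
    n_resamples: int = 1000,
    confidence: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of `values`."""
    if not values:
        raise ValueError("values must not be empty")
    rng = random.Random(seed)  # local generator keeps the result reproducible
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    alpha = (1.0 - confidence) / 2.0
    # Clamp the indices so that rounding never leaves the valid range.
    lower_idx = max(0, int(alpha * n_resamples))
    upper_idx = min(n_resamples - 1, int((1.0 - alpha) * n_resamples) - 1)
    return means[lower_idx], means[upper_idx]


# Hypothetical sample, for illustration only.
sample = [10.0, 12.0, 9.0, 11.0, 13.0, 8.0, 12.0, 10.0]
low, high = bootstrap_mean_ci(sample)
print(f"95% CI for the mean: [{low:.2f}, {high:.2f}]")
```

The percentile method is the simplest variant. With small or skewed samples, a bias-corrected interval may be a better choice.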

Contacts

E-mail: sergey.troschiev@gmail.com

Telegram: https://t.me/sergey_doc