/projects

My ML and data analytics projects and scripts from my studies and hobby work. Feel free to use the code if you find it interesting.

Primary language: Jupyter Notebook. License: The Unlicense.

Data science and data analytics projects and scripts


Bachelor thesis

Stacking, or stacked generalization, is a technique in ensemble learning where multiple base models, or "weak learners," are trained and combined to form a metamodel with improved predictive power. The ISOVIS research group at LNU has created StackGenVis, a visual analytics system that helps users optimize performance metrics, manage input data, including selecting features, and choose top-performing algorithms. The current version of StackGenVis uses a single Linear Regression metamodel. This work aims to investigate the impact of alternative metamodels on the predictive performance of StackGenVis using provided data and charts for comparison.
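The comparison described above can be sketched with scikit-learn. This is only an illustration, not the StackGenVis code or the thesis code: the base models, the candidate metamodels, the synthetic regression data, and the helper name compare_metamodels are all assumptions chosen to keep the example small.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor


def compare_metamodels(X, y, metamodels, cv=3):
    """Return the mean cross-validated R^2 of one stack per candidate metamodel.

    compare_metamodels is a hypothetical helper, not part of StackGenVis.
    """
    scores = {}
    for name, final_estimator in metamodels.items():
        # Assumed base models; the same pair is used for every metamodel
        # so that only the metamodel differs between stacks.
        base_models = [
            ("knn", KNeighborsRegressor(n_neighbors=5)),
            ("tree", DecisionTreeRegressor(max_depth=4, random_state=0)),
        ]
        stack = StackingRegressor(
            estimators=base_models,
            final_estimator=final_estimator,
            cv=cv,
        )
        fold_scores = cross_val_score(stack, X, y, cv=cv, scoring="r2")
        scores[name] = float(fold_scores.mean())
    return scores


# Synthetic placeholder data, not the data used in the thesis.
X, y = make_regression(n_samples=200, n_features=8, noise=10.0, random_state=0)

metamodels = {
    "linear_regression": LinearRegression(),  # the baseline metamodel
    "ridge": Ridge(alpha=1.0),
    "random_forest": RandomForestRegressor(n_estimators=30, random_state=0),
}

scores = compare_metamodels(X, y, metamodels)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: mean R^2 = {score:.3f}")
```

Keeping the base models fixed and swapping only final_estimator isolates the effect of the metamodel, which is the question the thesis asks.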


Script used and tested in the Numer.ai ML competition; see https://docs.numer.ai/tournament/learn for additional details.

The notebook includes the following steps:

  • Feature selection using univariate selection, then fitting a model with each feature importance used as a selection threshold
  • Model testing and parameter tuning for Lasso Regression, Ridge Regression, XGBoost and ElasticNet models
  • Model ensembling using stacked generalization
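The feature selection step can be sketched roughly as follows. This is not the notebook's code: it assumes scikit-learn, uses synthetic data instead of the Numer.ai dataset, and swaps in a RandomForestRegressor as the importance source in place of XGBoost to avoid an extra dependency. The values k=8 and alpha=1.0 are arbitrary placeholders.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.feature_selection import SelectFromModel, SelectKBest, f_regression
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Synthetic placeholder data, not the tournament data.
X, y = make_regression(
    n_samples=300, n_features=12, n_informative=4, noise=5.0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Step 1: univariate selection keeps the k features with the best F-scores.
univariate = SelectKBest(score_func=f_regression, k=8).fit(X_train, y_train)
X_train_u = univariate.transform(X_train)
X_test_u = univariate.transform(X_test)

# Step 2: fit one model to obtain feature importances, then reuse each
# importance value as a threshold and refit on the surviving features.
forest = RandomForestRegressor(n_estimators=50, random_state=0)
forest.fit(X_train_u, y_train)

results = []
for threshold in np.unique(forest.feature_importances_):  # sorted ascending
    selector = SelectFromModel(forest, threshold=threshold, prefit=True)
    X_tr = selector.transform(X_train_u)
    X_te = selector.transform(X_test_u)
    model = Ridge(alpha=1.0).fit(X_tr, y_train)
    score = r2_score(y_test, model.predict(X_te))
    results.append((float(threshold), X_tr.shape[1], float(score)))

for threshold, n_features, score in results:
    print(f"threshold={threshold:.4f}  features={n_features}  R^2={score:.3f}")
```

Each row shows how the held-out score changes as the threshold rises and fewer features survive, which is one way to choose a feature subset before tuning and stacking the models.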


Hobby projects

Analysis performed to explore the correlation (if any) between the USD/RUB exchange rate and the Brent oil price. Intended as a home project for a DS course. The script includes exploratory analysis steps, data wrangling steps and a check of simple machine learning models.
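A minimal sketch of the correlation check, assuming NumPy. The numbers below are made-up placeholders, not real market data, and the helpers pearson and pct_change are hypothetical names that do not come from the project's script. Correlating period-over-period changes as well as raw levels is a common precaution, because two trending series can appear correlated on levels alone.

```python
import numpy as np


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length series."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.corrcoef(a, b)[0, 1])


def pct_change(series):
    """Relative change between consecutive observations."""
    s = np.asarray(series, dtype=float)
    return s[1:] / s[:-1] - 1.0


# Placeholder values for illustration only (not real prices or rates).
brent = [100.0, 90.0, 80.0, 85.0, 70.0, 60.0]
usd_rub = [35.0, 38.0, 45.0, 43.0, 55.0, 62.0]

level_corr = pearson(brent, usd_rub)
change_corr = pearson(pct_change(brent), pct_change(usd_rub))

print(f"correlation of levels:  {level_corr:.3f}")
print(f"correlation of changes: {change_corr:.3f}")
```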


Supervised Learning techniques

Notebooks created to better understand the material.



Unsupervised Learning techniques

Notebooks created to better understand the material.