Data Science Projects

Repository containing data science projects I completed for academic and self-learning purposes. The projects are presented as Jupyter Notebooks with their accompanying datasets (CSV files).

Content:

  • Machine Learning

    • Principal Component Analysis with NumPy: In this project, I apply PCA to a dataset without using popular machine learning libraries such as scikit-learn or statsmodels. The goal is to gain a deeper understanding of PCA fundamentals using only functions from the NumPy library.

    • Shopper Segmentation (Unsupervised Learning): The objective of this project is to segment the shoppers in a given dataset. Three unsupervised machine learning algorithms are used: K-Means, Agglomerative Clustering and DBSCAN. The notebook ends with an evaluation of the models that compares metrics such as ARS (Adjusted Rand Score), NMI (Normalized Mutual Information) and Average Score.

    • Online News Popularity Prediction (Supervised Learning): The objective of this project is to predict the popularity of articles published by the Mashable website. The machine learning algorithms used are Random Forest, Support Vector Classification and K-Nearest Neighbors (KNN).

    • Prediction of Admissions to Master's Degree Programs (Supervised Learning): Using a linear regression model, this project predicts the chance of admission of foreign students to master's degree programs at American colleges.

Tools: Python 3, Scikit-learn, pandas, numpy, matplotlib and seaborn
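
As a rough illustration of the NumPy-only approach described in the PCA project above, here is a minimal sketch. It is not the notebook's actual code: the function name `pca`, the synthetic data and the choice of an eigendecomposition of the covariance matrix (rather than, say, an SVD) are assumptions made for this example.

```python
import numpy as np


def pca(X, n_components):
    """Project X (n_samples x n_features) onto its top principal components.

    Hypothetical helper written for this README, not taken from the notebook.
    """
    # Center each feature at zero mean.
    X_centered = X - X.mean(axis=0)
    # Covariance matrix of the features (columns are the variables).
    cov = np.cov(X_centered, rowvar=False)
    # eigh handles symmetric matrices and returns eigenvalues in ascending order.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # Reorder so the directions with the largest variance come first.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]
    # Keep the leading eigenvectors as the principal axes.
    components = eigenvectors[:, :n_components]
    # Share of the total variance captured by each kept component.
    explained_ratio = eigenvalues[:n_components] / eigenvalues.sum()
    return X_centered @ components, components, explained_ratio


# Synthetic data, for illustration only (not one of the repository's datasets).
rng = np.random.default_rng(0)
mixing = np.array([[3.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.0, 0.0, 0.2]])
X = rng.normal(size=(200, 3)) @ mixing

scores, components, ratio = pca(X, n_components=2)
```

Here `scores` holds the data projected onto the two leading components, `components` holds the corresponding axes as columns, and `ratio` gives the fraction of total variance each one explains.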

  • Data Analytics, Visualization and Miscellaneous

Tools: Python 3, pandas, matplotlib, BeautifulSoup

Author

Wendy Navarrete