/DSA

Data Science and Analytics Study

Primary language: Jupyter Notebook. License: MIT.

This is a repository created for Data Science and Analytics exercises.

MBA - Content related to the USP-ESALQ Data Science & Analytics MBA (2024-2025)

1 - Outliers - Exercise showing how to use the interquartile range (IQR) to filter outliers out of a data set, comparing the boxplot before and after the removal. It also shows how to identify exactly which points were removed.
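
The IQR filter described in item 1 can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function name `iqr_filter`, the sample data, and the conventional multiplier `k=1.5` are all assumptions made for this example.

```python
import numpy as np

def iqr_filter(values, k=1.5):
    """Split values into kept points and outliers using the IQR rule.

    Hypothetical helper for illustration; k=1.5 is the usual boxplot
    convention, assumed here rather than taken from the notebook.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # True for points inside the fences, False for outliers
    mask = (values >= lower) & (values <= upper)
    return values[mask], values[~mask], (lower, upper)

# Made-up sample with one obvious outlier
data = [10, 12, 11, 13, 12, 11, 14, 95]
kept, removed, bounds = iqr_filter(data)
print("fences:", bounds)
print("removed points:", removed)
```

Returning the removed points alongside the kept ones is what makes it possible to pinpoint exactly which observations were filtered out.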

2 - K-means for auto binning - Uses scikit-learn's KMeans model (sklearn.cluster.KMeans) to cluster observations, so that the desired number of bins is selected automatically, in a balanced way, without manual choices. The notebook progresses from manual binning to automatic KMeans binning.
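
One way the KMeans binning in item 2 might look is sketched below. It is an assumption-laden example rather than the notebook's implementation: the helper `kmeans_bins`, the `ages` sample, and the choice to place bin edges at the midpoints between sorted cluster centers are all hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_bins(values, n_bins, random_state=0):
    """Derive bin edges for a 1-D variable from KMeans cluster centers.

    Hypothetical helper: edges are midpoints between adjacent centers.
    """
    x = np.asarray(values, dtype=float).reshape(-1, 1)
    model = KMeans(n_clusters=n_bins, n_init=10, random_state=random_state).fit(x)
    centers = np.sort(model.cluster_centers_.ravel())
    edges = (centers[:-1] + centers[1:]) / 2
    # Label each value with the index of the bin it falls into (0..n_bins-1)
    labels = np.digitize(x.ravel(), edges)
    return labels, edges

# Made-up ages forming three visible groups
ages = [18, 19, 21, 22, 40, 41, 43, 44, 70, 72, 75]
labels, edges = kmeans_bins(ages, n_bins=3)
print("bin edges:", edges)
print("bin labels:", labels)
```

scikit-learn also provides `KBinsDiscretizer` with `strategy="kmeans"`, which may be a convenient alternative when the binning needs to sit inside a preprocessing pipeline.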

3 - Segmentation Analysis - Applies the KMeans and RandomForest ML algorithms to a case where you need to segment clients and predict conversions.
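
A rough sketch of how the two algorithms in item 3 could be combined is shown below. Everything in it is assumed for illustration: the data is synthetic, the feature names (`visits`, `spend`) and the conversion rule are invented, and feeding the segment label to the classifier is just one possible design, not necessarily what the notebook does.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic client data (hypothetical features, not from the repository)
rng = np.random.default_rng(0)
n = 300
visits = rng.poisson(5, n)
spend = rng.gamma(2.0, 50.0, n)
X = np.column_stack([visits, spend])
# Invented conversion rule with noise, only to have a label to predict
converted = (visits + spend / 50 + rng.normal(0, 1, n) > 9).astype(int)

# Step 1: segment clients with KMeans on standardized features
X_scaled = StandardScaler().fit_transform(X)
segments = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X_scaled)

# Step 2: predict conversion with a random forest, using the segment as an extra feature
features = np.column_stack([X, segments])
X_train, X_test, y_train, y_test = train_test_split(
    features, converted, test_size=0.25, random_state=0, stratify=converted
)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print("segment sizes:", np.bincount(segments))
print("held-out accuracy:", accuracy)
```

In a real analysis the scaler and the clustering would normally be fit on the training split only, to avoid leaking information from the test set; they are fit on all rows here purely to keep the sketch short.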