This is a repository created for Data Science and Analytics exercises.
MBA - Content related to USP-ESALQ Data Science & Analytics MBA (2024-2025)
1 - Outliers - Exercise showing how to use the IQR (interquartile range) to filter outliers out of a data set, comparing the boxplots before and after the removal. Also shows how to pinpoint exactly which points have been removed.
2 - K-means for auto binning - Uses a scikit-learn KMeans model (sklearn.cluster.KMeans) to cluster observations, so that the boundaries for the desired number of bins are selected automatically, in a balanced format, without manual choices. The notebook progresses from manual binning to automatic KMeans binning.
3 - Segmentation Analysis - Applies the KMeans and RandomForest ML algorithms to a case where clients need to be segmented and their conversions predicted.
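
For exercise 1, the IQR rule can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the function name `iqr_filter`, the conventional multiplier `k=1.5`, and the sample data are all assumptions made for the example.

```python
import numpy as np

def iqr_filter(values, k=1.5):
    """Split values into kept points and outliers using the IQR rule.

    Hypothetical helper: k=1.5 is the usual boxplot whisker multiplier.
    """
    values = np.asarray(values, dtype=float)
    q1, q3 = np.percentile(values, [25, 75])
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # True for points inside the [lower, upper] fence
    mask = (values >= lower) & (values <= upper)
    return values[mask], values[~mask], (lower, upper)

# Made-up sample with two obvious extreme points
data = [10, 12, 11, 13, 12, 11, 95, 12, 10, -40]
kept, removed, bounds = iqr_filter(data)
print("fence:", bounds)
print("removed points:", removed)
```

Returning the removed points alongside the kept ones is what makes it possible to inspect exactly which observations were dropped; `data` and `kept` could then be passed to a plotting library to compare the two boxplots.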
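
For exercise 2, one possible way to derive bins from KMeans on a single numeric column is sketched below. The helper `kmeans_bins` and the sample `ages` are hypothetical, and the notebook may structure this differently. In one dimension, the boundary between two neighbouring clusters is the midpoint between their centres, which is what the sketch uses as bin edges.

```python
import numpy as np
from sklearn.cluster import KMeans

def kmeans_bins(values, n_bins, random_state=0):
    """Assign each value to one of n_bins bins found by 1-D KMeans.

    Hypothetical helper: returns bin indices ordered from the lowest
    to the highest cluster centre, plus the edges between the bins.
    """
    values = np.asarray(values, dtype=float)
    km = KMeans(n_clusters=n_bins, n_init=10, random_state=random_state)
    labels = km.fit_predict(values.reshape(-1, 1))

    # KMeans labels are arbitrary, so re-rank them by centre value
    centers = km.cluster_centers_.ravel()
    order = np.argsort(centers)
    rank = np.empty_like(order)
    rank[order] = np.arange(n_bins)

    sorted_centers = centers[order]
    edges = (sorted_centers[:-1] + sorted_centers[1:]) / 2
    return rank[labels], edges

# Made-up sample with three visible groups
ages = [18, 19, 21, 22, 40, 41, 43, 44, 70, 72, 75]
bins, edges = kmeans_bins(ages, n_bins=3)
print("bin per observation:", bins)
print("bin edges:", edges)
```

scikit-learn also ships `KBinsDiscretizer` with `strategy="kmeans"`, which may be a more convenient option when only the discretized column is needed.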
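
For exercise 3, the general shape of a segmentation-plus-prediction workflow might look like the sketch below. Everything here is an assumption for illustration: the data is randomly generated, the feature names (`spend`, `visits`) are invented, and feeding the KMeans segment into the RandomForest as an extra feature is just one possible design, not necessarily the one used in the notebook.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300

# Synthetic client features (hypothetical): monthly spend and site visits
spend = rng.normal(100, 30, n)
visits = rng.poisson(5, n)
X = np.column_stack([spend, visits])

# Synthetic target: conversion is made more likely by frequent visits
converted = (visits + rng.normal(0, 1.5, n) > 5).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, converted, test_size=0.25, random_state=0, stratify=converted
)

# Segment clients with KMeans, fitted on scaled training data only
scaler = StandardScaler().fit(X_train)
km = KMeans(n_clusters=3, n_init=10, random_state=0)
seg_train = km.fit_predict(scaler.transform(X_train))
seg_test = km.predict(scaler.transform(X_test))

# Predict conversion with RandomForest, using the segment as an extra feature
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(np.column_stack([X_train, seg_train]), y_train)
accuracy = clf.score(np.column_stack([X_test, seg_test]), y_test)
print("test accuracy:", accuracy)
```

Fitting the scaler and the clustering on the training split only keeps information from the test clients out of the model, which gives a more honest accuracy estimate.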