Machine_learning_workflow

A collection of scripts and notebooks useful when conducting a machine learning project.

Part 1: Preprocessing

I. EDA (Exploratory Data Analysis)

Pandas profiling
Map columns
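
As a rough illustration of these two steps, the sketch below builds a small per-column overview with plain pandas (a stand-in for a full profiling report, not the profiling package itself) and then renames columns through a mapping. The DataFrame, the column names, and `column_map` are all hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for a real project dataset.
df = pd.DataFrame(
    {
        "Q1_age": [23, 35, np.nan, 41],
        "Q2_income": [52000.0, 61000.0, 58000.0, np.nan],
        "Q3_region": ["north", "south", "south", None],
    }
)

# Per-column overview: dtype, missing count, and number of distinct values.
overview = pd.DataFrame(
    {
        "dtype": df.dtypes.astype(str),
        "n_missing": df.isna().sum(),
        "n_unique": df.nunique(),
    }
)
print(overview)

# Map raw column names to readable ones (the mapping is illustrative).
column_map = {"Q1_age": "age", "Q2_income": "income", "Q3_region": "region"}
df = df.rename(columns=column_map)
print(df.columns.tolist())
```

Keeping the mapping in a single dictionary makes it easy to reuse the same names across notebooks.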

II. Cleaning

Get unique values and replace invalid ones with np.nan so the column can be converted to numeric
Get column names
Set correct schema/datatypes. Get rid of invalid characters that might be problematic
Get rid of "code" values such as 999
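
The cleaning steps above could look roughly like the following sketch. It assumes a hypothetical DataFrame whose numeric columns arrive as text, and it assumes that 999 is the only "code" value; real data will likely need its own list of codes and its own rules for invalid characters.

```python
import pandas as pd

# Hypothetical raw data: numbers stored as text, a 999 "code" value,
# and column names containing spaces and symbols.
raw = pd.DataFrame(
    {
        "Age (years)": ["23", "999", "41", "n/a"],
        "Income $": ["52,000", "61,000", "999", "58,000"],
    }
)

# List the column names, then strip characters that tend to cause trouble.
print(raw.columns.tolist())
clean = raw.copy()
clean.columns = (
    clean.columns.str.strip()
    .str.lower()
    .str.replace(r"[^0-9a-z]+", "_", regex=True)
    .str.strip("_")
)

# Inspect the unique values of each column to spot invalid entries.
for col in clean.columns:
    print(col, clean[col].unique())

# Remove thousands separators, then coerce to numeric; anything that is
# not a number (such as "n/a") becomes NaN.
for col in clean.columns:
    as_text = clean[col].str.replace(",", "", regex=False)
    clean[col] = pd.to_numeric(as_text, errors="coerce")

# Blank out the assumed "code" value 999 (masked cells become NaN).
clean = clean.mask(clean == 999)
print(clean)
```

Coercing first and masking second means the comparison with 999 happens on numbers rather than on strings.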

III. Imputation
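
This section does not name a specific imputation method, so the following is only one possible sketch: it fills numeric columns with the median and other columns with the most frequent value. The data and column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing value per column.
df = pd.DataFrame(
    {
        "age": [23.0, np.nan, 41.0, 35.0],
        "region": ["north", "south", None, "south"],
    }
)

imputed = df.copy()
for col in imputed.columns:
    if pd.api.types.is_numeric_dtype(imputed[col]):
        # Numeric column: use the median of the observed values.
        fill_value = imputed[col].median()
    else:
        # Non-numeric column: use the most frequent observed value.
        fill_value = imputed[col].mode().iloc[0]
    imputed[col] = imputed[col].fillna(fill_value)

print(imputed)
```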

IV. Aggregation

Sum across different columns that have certain value
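
One reading of this step is a row-wise count of how many columns hold a given value. The sketch below follows that interpretation; the column names, `target_columns`, and `target_value` are hypothetical.

```python
import pandas as pd

# Hypothetical survey answers, one column per question.
df = pd.DataFrame(
    {
        "q1": ["yes", "no", "yes"],
        "q2": ["yes", "yes", "no"],
        "q3": ["no", "no", "yes"],
    }
)

target_columns = ["q1", "q2", "q3"]
target_value = "yes"

# For each row, count how many of the chosen columns equal the target value.
df["n_yes"] = (df[target_columns] == target_value).sum(axis=1)
print(df)
```

The comparison produces a boolean table, and summing along `axis=1` adds the matches across columns for each row.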

Part 2: Implementation

I. Machine learning algorithms

II. mlflow tracking

III. Econometric/Statistical models

IV. Hyperparameter tuning loop

V. Metrics
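
To give a feel for the hyperparameter tuning loop and the metrics step, here is a minimal NumPy-only sketch. It is not the repository's implementation: the synthetic data, the ridge regression model, the candidate `alpha` values, and the helper names `fit_ridge` and `rmse` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression data (illustrative only).
X = rng.normal(size=(200, 5))
true_coef = np.array([1.5, -2.0, 0.0, 0.5, 3.0])
y = X @ true_coef + rng.normal(scale=0.5, size=200)

# Simple hold-out split: first 150 rows for training, the rest for validation.
X_train, X_val = X[:150], X[150:]
y_train, y_val = y[:150], y[150:]


def fit_ridge(X, y, alpha):
    """Closed-form ridge regression coefficients (no intercept)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ y)


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


# Tuning loop: fit once per candidate value and record the validation metric.
results = []
for alpha in [0.01, 0.1, 1.0, 10.0, 100.0]:
    coef = fit_ridge(X_train, y_train, alpha)
    results.append({"alpha": alpha, "rmse": rmse(y_val, X_val @ coef)})

# Pick the candidate with the lowest validation error.
best = min(results, key=lambda r: r["rmse"])
print(best)
```

Collecting each run as a dictionary keeps the parameters and the metric together, which is also the shape an experiment tracker would typically record.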

dorissuzukiesmerio/Machine_learning_workflow
