MH6151-project

NTU-SPMS-MH6151 Data Mining project

1. Set-up

Install the dependencies:

pip install -r requirements.txt

Run a model with python modelling.py --model_name <model_name> --output_file <path>, saving the output under ./outputs. For example, to run the random forest classifier and save its output:

python modelling.py --model_name random_forest --output_file outputs/random_forest.txt
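For readers who want to see how such a command line could be parsed, here is a minimal sketch using argparse. It is not the project's actual modelling.py: only the three flag names come from this README, while the function name build_parser, the help texts, and the choice to make both --model_name and --output_file required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the flags named in this README (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Train and evaluate one model.")
    # Assumption: both options are mandatory; the real script may define defaults.
    parser.add_argument("--model_name", required=True, help="e.g. random_forest")
    parser.add_argument("--output_file", required=True, help="file to write results to")
    # A boolean switch: False unless the flag is present on the command line.
    parser.add_argument("--oversampling", action="store_true",
                        help="oversample the training data before fitting")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--model_name", "random_forest", "--output_file", "outputs/random_forest.txt"]
)
print(args.model_name, args.output_file, args.oversampling)
```

Using action="store_true" keeps the switch optional, so the same command works with or without it.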

To oversample the training data, add the --oversampling option to the command:

python modelling.py --model_name random_forest --output_file outputs/random_forest.txt --oversampling
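This README does not say which oversampling method the project uses. As an illustration only, the sketch below assumes plain random oversampling: minority-class rows are duplicated at random until every class is as large as the biggest one. The function name random_oversample and the toy data are hypothetical, and only the standard library is used.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until all classes are equally large."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    counts = Counter(labels)
    target = max(counts.values())  # size of the largest class
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Sample with replacement; k is 0 for the majority class, so nothing is added.
        out_rows.extend(rng.choices(pool, k=target - n))
        out_labels.extend([cls] * (target - n))
    return out_rows, out_labels


# Toy training set: three rows of class 0, one row of class 1.
X_train = [[0.1], [0.2], [0.3], [0.9]]
y_train = [0, 0, 0, 1]
X_res, y_res = random_oversample(X_train, y_train)
print(Counter(y_res))
```

As stated above, the step applies to the training data only; resampling the evaluation data would distort the reported metrics.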

The scripts folder has shell scripts for the runs without and with oversampling:

scripts/modelling.sh && scripts/modelling_oversampling.sh

On Windows, run the batch files instead:

.\scripts\modelling.bat
.\scripts\modelling_oversampling.bat

Then run modelling_insights.py and save its output to outputs/performance.txt:

python modelling_insights.py > outputs/performance.txt