This project was created in fulfilment of the Masters in AI study unit 'ICS5110 - Applied Machine Learning'. It explores the application of machine learning algorithms to predict the population of Maltese regions based on demographic factors. A full report about the project can be found in ICS5110-Report.pdf.
ICS5110-Applied-ML
├── NSO_Population_Sex_dataset
│ ├── NSO_DF_TOT_POP_BY_REG_DIST_LOC_1.5.csv
│ ├── NSO_POPULATION_DATA_CLEANED.csv
│ └── NSO_POPULATION_DATA_PREFEATURE.csv
├── DecisionTreeRegressor.py
├── LinearRegression.py
├── RandomForestRegressor.py
├── Utils.py
├── app.py
├── index.html
├── main.ipynb
├── requirements.txt
└── README-gradio.md
- NSO_Population_Sex_dataset: Contains the original dataset from the National Statistics Office (NSO), with demographic data broken down by gender and LAU. Two cleaned versions of the dataset are also included; these are used by the machine learning algorithms and the Gradio web app.
- DecisionTreeRegressor.py: Custom Decision Tree implementation.
- LinearRegression.py: Custom Linear Regression implementation.
- RandomForestRegressor.py: Custom Random Forest implementation.
- Utils.py: Utility functions used in main.ipynb, particularly the RMSE metric.
- app.py: Gradio web app implementation.
- index.html: Contains the iframe embedding the Gradio web app running on HuggingFace onto GitHub Pages.
- main.ipynb: Runs the models implemented, divided into data preprocessing, model training, and model evaluation.
- README-gradio.md: The README.md file required by HuggingFace, containing information about the implementation, including name, SDK version, and files to run. To run on HuggingFace, rename this file to README.md.
- requirements.txt: Lists the Python libraries required to run the Gradio web app: gradio, pandas, numpy, and sklearn.
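Utils.py is described above as providing the RMSE metric. As a rough illustration of what that metric computes, here is a minimal sketch in plain Python. The function name `rmse` and the sample values are assumptions made for this example; the project's own implementation in Utils.py may look different.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Square each residual, average them, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up numbers purely for illustration (not NSO data).
observed = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(rmse(observed, predicted))
```

Lower values indicate predictions closer to the observed data, and the result is in the same unit as the target (here, a population count).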
- Data Preprocessing: The original dataset from Malta's National Statistics Office (NSO) is cleaned and preprocessed to prepare it for machine learning algorithms.
- Machine Learning Algorithms: Three machine learning algorithms are implemented: decision tree regression, random forest regression, and linear regression.
- Model Evaluation: The performance of each algorithm is evaluated using various metrics, including RMSE.
- Web Application: A user-friendly implementation of the model evaluation is created through Gradio and hosted through HuggingFace.
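To give a feel for what a custom regression implementation involves, the sketch below fits a single-feature linear regression using the closed-form least squares solution. This is an assumption-laden illustration only: the names `fit_line` and `predict` and the toy data are hypothetical, and the project's LinearRegression.py may use a different formulation (for example, multiple features or an iterative solver).

```python
from typing import Sequence, Tuple


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least squares for y = slope * x + intercept (one feature)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of cross-deviations / sum of squared x-deviations
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    # The fitted line always passes through the point of means.
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x: float, slope: float, intercept: float) -> float:
    """Evaluate the fitted line at x."""
    return slope * x + intercept


# Toy values lying exactly on y = 2x + 1 (not NSO data).
slope, intercept = fit_line([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(slope, intercept, predict(4.0, slope, intercept))
```

In a workflow like the one outlined above, a model would be fitted on a training split and its predictions on a held-out split compared against the observed values.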
A Gradio web app was created to provide a user-friendly interface for predicting population based on selected demographic factors.
- Report: Can be found above as ICS5110-Report.pdf
- HuggingFace: huggingface.co/spaces/ICS5110/Gradio-Web-Tool
- GitHub: nathanportelli.github.io/ICS5110-Applied-ML
Install the required libraries using pip:
pip install -r requirements.txt
Run:
python app.py
Open the web app in your browser using the URL displayed in the terminal.
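For orientation, a Gradio app of this kind generally wraps a prediction function in an interface object and launches a local server. The sketch below is not the contents of app.py: the function `predict_population`, the single "Year" input, and the placeholder coefficients are all hypothetical stand-ins for the project's trained models and real inputs.

```python
# All names and numbers here are hypothetical placeholders, not the
# project's actual model, inputs, or coefficients.
PLACEHOLDER_SLOPE = 10.0
PLACEHOLDER_INTERCEPT = 0.0


def predict_population(year: float) -> float:
    """Stand-in for a trained model's prediction."""
    return PLACEHOLDER_SLOPE * year + PLACEHOLDER_INTERCEPT


def build_demo():
    """Wrap the prediction function in a simple Gradio interface."""
    # Imported lazily so the prediction logic can be used without Gradio.
    import gradio as gr

    return gr.Interface(
        fn=predict_population,
        inputs=gr.Number(label="Year"),
        outputs=gr.Number(label="Predicted population"),
    )
```

Calling `build_demo().launch()` starts the local server and prints the URL to open in a browser.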