/Python-ML-Time-Series-AirSafe

Greetings! This repository showcases the continuous assessment for CCT College Dublin's "Machine Learning for Business" course, specifically focusing on the application of machine learning models to time series data. In this project, we applied a total of 8 time series models to gain comprehensive insights into the dataset.

Primary language: Jupyter Notebook

Machine-Learning-for-Business

CCT College Dublin Continuous Assessment

Assignment Title: MLBus_HDipData_CA1

Student Full Name & Number: Natalia de Oliveira Rodrigues (2023112) and Heitor de Araujo Filho (2023098)

Assessment Task: Students are advised to review and adhere to the submission requirements documented after the assessment task.

This is a group-based project (maximum 2 students) using the Python programming language. Develop and deploy machine learning models in one of the following areas only, and analyse the results:
– Covid-19 datasets
– Transport datasets
– Energy
– Stock market datasets, from this website only: https://data.world/datasets/finance

After cleaning, the dataset should have at least 8000 rows and 10 columns (the variables may be, for example, categorical, continuous, or discrete). There is no upper bound. The type of question(s) you formulate for the project will depend on the domain of the dataset your group chooses.
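The size requirement above can be checked programmatically once cleaning is done. The following is a minimal sketch, assuming the cleaned records are held as a list of dictionaries (one per row). The function name `meets_size_requirement` and its parameters are hypothetical and are not taken from the project notebooks.

```python
def meets_size_requirement(rows, min_rows=8000, min_columns=10):
    """Return True if the cleaned data has enough rows and columns.

    `rows` is assumed to be a list of dicts, one dict per cleaned record.
    The thresholds default to the minimums stated in the brief.
    """
    if len(rows) < min_rows:
        return False
    # Collect every column name that appears in at least one record.
    columns = set()
    for row in rows:
        columns.update(row.keys())
    return len(columns) >= min_columns


# Toy data: 8000 records with 10 columns each (illustrative values only).
example_rows = [{f"col_{c}": r for c in range(10)} for r in range(8000)]
print(meets_size_requirement(example_rows))
```

With a pandas DataFrame the same check could be done by inspecting its `shape` attribute.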

Project questions could be (this is a small sample of suggested questions; other questions may be more appropriate to your project):
– How to measure similarity or dissimilarity between different clusters?
– Which clustering solution do you prefer, and why?
– How to analyse and investigate an inflation rate for a specific product?

Your group may start with a simple approach based on the project objectives and then enhance the work using distinct approaches. The group needs to consider the following instructions (a to d) during the development of this group project.

a) Give a logical justification for the specific choice of machine learning approaches (supervised/unsupervised) for the chosen problem and dataset(s). Justify the rationale for using the project management framework/activities (CRISP-DM, KDD, or SEMMA).

b) Machine learning models can be used for prediction, classification, clustering, and time series analysis. Your group should plan on trying multiple approaches (at least two), with proper parameter selection using hyperparameter tuning and, if essential, a comparison between the chosen modelling approaches.

c) You/your group should train the machine learning models, then test and further validate them. Compare the outcomes of two or more ML models using a table or graph visualization. Your group may employ dimensionality reduction methods to prepare the dataset, based on your project requirements.

d) Depending on the complexity of the problem, develop clustering profiles that clearly describe the characteristics of the specific data within each cluster.

Your group will present its findings and defend the results in the report (MS Doc). Your report should capture the following aspects that are relevant to your project investigations.

i) Motivation, description of the problem domain, and justification of the project objectives in the above-mentioned areas. (15 marks group)

ii) Characterization and normalization of data if required. For supervised learning, train and test the ML models on three different splits and discuss the variation in the accuracy/score obtained from the models. For unsupervised learning, use appropriate metrics to justify your results. (25 marks group)

iii) Interpret and justify the results based on the problem specification or project objectives. Comments on and descriptions of the Python code, and the conclusions of the project, should be included in the report as well as in the Jupyter notebook. Citations and references should be in the Harvard style. (20 marks group)

iv) Each team member presents a PowerPoint presentation of their work (maximum 5 slides) to emphasize their distinctive contributions, based on their involvement in the project's conceptual understanding, code development, and deployment. (20 marks individual)

v) Each team member fully describes their individual contributions to the project in a reflective journal of 500 to 700 words, using images, diagrams, figures, and visualizations to elaborate on their work. (20 marks individual)
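Instruction c) and report item ii) ask for models to be evaluated on three different splits and compared in a table. The following is a minimal sketch of that workflow for time series data, using only the Python standard library. It makes several assumptions: the 70/30, 80/20, and 90/10 fractions are illustrative choices, the two forecasters are simple baselines standing in for the project's actual models, and all function names are hypothetical. The split is chronological rather than random, so that the test period always follows the training period.

```python
from statistics import mean


def chronological_split(series, train_fraction):
    """Split in time order: the earliest observations train, the rest test."""
    cut = round(len(series) * train_fraction)
    return series[:cut], series[cut:]


def mae(actual, predicted):
    """Mean absolute error between two equally long sequences."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def naive_forecast(train, horizon):
    """Baseline 1: repeat the last observed training value."""
    return [train[-1]] * horizon


def mean_forecast(train, horizon):
    """Baseline 2: repeat the mean of the training values."""
    return [mean(train)] * horizon


def compare_across_splits(series, fractions=(0.7, 0.8, 0.9)):
    """Score both baselines on each split and return one row per split."""
    results = []
    for fraction in fractions:
        train, test = chronological_split(series, fraction)
        results.append({
            "train_fraction": fraction,
            "naive_mae": mae(test, naive_forecast(train, len(test))),
            "mean_mae": mae(test, mean_forecast(train, len(test))),
        })
    return results


# Toy upward-trending series (illustrative values, not project data).
series = [float(i) for i in range(1, 21)]
results = compare_across_splits(series)

# Print a small comparison table.
print(f"{'train':>6} {'naive MAE':>10} {'mean MAE':>10}")
for row in results:
    print(f"{row['train_fraction']:>6} {row['naive_mae']:>10.2f} {row['mean_mae']:>10.2f}")
```

In a real notebook the two baselines would be replaced by the fitted models, and the resulting rows could be loaded into a DataFrame or plotted to show how the score varies with the split.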