/Test-Debug-Test-Debug

University Project Backend testing

Primary LanguagePythonGNU General Public License v3.0GPL-3.0

Machine Learning and Deep Learning Model Selection Tool

This project encompasses a suite of Python scripts designed for loading datasets, preprocessing data, training various machine learning and deep learning models, and making predictions based on user inputs. It serves as an end-to-end solution for training and utilizing predictive models.

Description

The tool consists of several Python scripts, each handling a specific aspect of the machine learning pipeline:

  • data_preprocessing.py: Manages the loading, cleaning, and preprocessing of the dataset.
  • model_training.py: Handles the training of both traditional machine learning and deep learning models.
  • user_prediction.py: Facilitates making predictions based on user-provided inputs using the trained models.
  • main.py: Acts as the orchestrator for the entire process, tying together the functionalities of the other scripts.

Installation

Before running the project, ensure that you have Python installed on your system. Then, install the required libraries using the following command:

pip install pandas scikit-learn tensorflow numpy


These libraries include Pandas for data manipulation, Scikit-learn for machine learning algorithms, TensorFlow for deep learning models, and NumPy for numerical operations.
Usage

Follow these steps to run the project:

    Clone the repository or download all the Python scripts.
    Install the necessary Python libraries as indicated in the Installation section.
    Run main.py to execute the tool. This script will use the dataset located at the specified URL, preprocess the data, train the models, and make a sample prediction.

Components
data_preprocessing.py

This script is responsible for loading a specified dataset from a URL. It performs essential data cleaning steps, such as handling missing values, and preprocesses the data by encoding categorical variables and normalizing numerical features.
model_training.py

In this script, various machine learning models such as Logistic Regression, Decision Tree, and Random Forest are trained. It also includes the training of a simple neural network model using TensorFlow.
user_prediction.py

This script allows for making predictions with the trained models based on user inputs. The user input should be formatted according to the model's requirements, typically as a dictionary matching the dataset's feature structure.
main.py

The main script orchestrates the workflow by utilizing the functionalities provided in the other scripts. It oversees the process from data preprocessing to making predictions.
Future Developments

Suggestions for future enhancements include:

    Developing a graphical user interface or a web interface for easier interaction.
    Incorporating more sophisticated machine learning and deep learning models.
    Implementing comprehensive model evaluation and comparison metrics.
    Expanding the preprocessing capabilities to handle more diverse datasets.

Contributions

Contributions to this project are welcome. If you have suggestions for improvement or have identified bugs, please feel free to contribute.