Not everyone has the machine learning (ML) knowledge needed to recognize patterns in a large dataset. For a non-expert, it is difficult to train a model on a dataset, choose an algorithm that can reason over and learn from that data, and carry out supporting operations such as preprocessing, imputation, and visualization. A system that accepts data and returns an appropriate diagnosis along with an automatically selected model would be useful, especially in modern industrial sectors. This platform lets users with no background in ML, its models, or its operations predict and analyze data with ease.
The platform eases the work of data and business analysts by letting them draw inferences from data without writing code. It will provide complete data handling capabilities: the ETL (Extract, Transform, and Load) pipelines needed to build the dataset, automated data profiling and cleaning, analysis of how the data changes over time, and data quality improvement. As the next step, neural networks will be generated automatically to perform classification or regression on any target variable in the dataset. The platform will also create and suggest visualizations for the given dataset that help users understand the data and drive decisions.
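
To make the profiling and cleaning step more concrete, here is a minimal sketch using only the Python standard library. It is not the platform's actual implementation: the function name `profile_and_impute`, the choice of mean imputation, and the rule that a column counts as numeric only when every non-empty cell parses as a float are all assumptions made for illustration.

```python
import csv
import io
import statistics


def profile_and_impute(csv_text):
    """Profile each column of a CSV string and mean-impute numeric gaps.

    Hypothetical helper for illustration; not part of the repository.
    Returns (profile, rows), where numeric cells are converted to floats.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    profile = {}
    if not rows:
        return profile, rows

    for column in rows[0].keys():
        raw = [row[column] for row in rows]
        # Empty strings and None (short rows) both count as missing.
        present = [value for value in raw if value not in ("", None)]
        missing = len(raw) - len(present)

        try:
            numbers = [float(value) for value in present]
        except ValueError:
            numbers = None  # at least one cell is not a number

        if numbers:
            mean = statistics.fmean(numbers)
            profile[column] = {
                "type": "numeric",
                "missing": missing,
                "min": min(numbers),
                "max": max(numbers),
                "mean": mean,
            }
            # Assumed strategy: replace each missing cell with the column mean.
            for row in rows:
                cell = row[column]
                row[column] = mean if cell in ("", None) else float(cell)
        else:
            profile[column] = {
                "type": "categorical",
                "missing": missing,
                "unique": len(set(present)),
            }
    return profile, rows
```

Calling `profile_and_impute("age,city\n30,Pune\n,Mumbai\n40,Pune\n")` reports one missing value in `age` and fills it with the mean of the remaining values. A real pipeline would likely offer several imputation strategies and handle categorical gaps as well.
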
- Allow the user to input a dataset in CSV format.
- Perform basic operations on the input dataset, such as identifying columns and their data types, computing statistics like mean, min, and max, and grouping.
- Impute missing values in a given dataset.
- Prepare the input dataset by applying preprocessing techniques such as outlier handling, one-hot encoding, and feature scaling.
- Develop an algorithm for Automatic Model Selection, using a genetic approach that automatically and efficiently finds the most suitable neural network model for a given dataset.
- Develop an automatic data visualization algorithm that shows the top-k visualizations for a given dataset.
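
The genetic approach to automatic model selection listed above can be sketched roughly as follows. This is an illustrative outline, not the project's algorithm: the search space, the truncation selection with elitism, the uniform crossover, the mutation rate, and every function name here are assumptions. In the real platform the fitness of a candidate would presumably be a validation score obtained by training the network it describes; `toy_fitness` is only a stand-in so the sketch runs without an ML library.

```python
import random

# Hypothetical search space; the real platform may tune different parameters.
SEARCH_SPACE = {
    "hidden_layers": [1, 2, 3, 4],
    "units": [8, 16, 32, 64, 128],
    "activation": ["relu", "tanh", "sigmoid"],
    "learning_rate": [0.1, 0.01, 0.001],
}


def random_genome(rng):
    """Sample one candidate network configuration."""
    return {key: rng.choice(options) for key, options in SEARCH_SPACE.items()}


def crossover(parent_a, parent_b, rng):
    """Uniform crossover: each gene comes from one parent at random."""
    return {key: rng.choice([parent_a[key], parent_b[key]]) for key in SEARCH_SPACE}


def mutate(genome, rng, rate=0.2):
    """Resample each gene with probability `rate`."""
    return {
        key: rng.choice(SEARCH_SPACE[key]) if rng.random() < rate else value
        for key, value in genome.items()
    }


def genetic_search(fitness, population_size=12, generations=10, seed=0):
    """Evolve configurations and return the fittest one found."""
    rng = random.Random(seed)
    population = [random_genome(rng) for _ in range(population_size)]
    for _ in range(generations):
        # Keep the better half unchanged (elitism), breed the rest from it.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: population_size // 2]
        children = []
        while len(parents) + len(children) < population_size:
            parent_a, parent_b = rng.sample(parents, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), rng))
        population = parents + children
    return max(population, key=fitness)


def toy_fitness(genome):
    """Stand-in score; a real fitness would train and validate the model."""
    score = -abs(genome["hidden_layers"] - 2) - abs(genome["units"] - 32) / 32
    score += 1.0 if genome["activation"] == "relu" else 0.0
    score -= abs(genome["learning_rate"] - 0.01) * 10
    return score


best = genetic_search(toy_fitness)
print(best)
```

Because the better half of each generation is carried over unchanged, the best score never decreases from one generation to the next. Since training a network per candidate is expensive, a practical version would also cache fitness values rather than re-evaluating the same configuration.
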
- VSCode
- MongoDB
Clone the repo
git clone https://github.com/deepanshu2506/auto-ml.git
Install the dependencies by running:
pip3 install -r requirements.txt
yarn install
flask run
cd ./frontend
yarn start
Frontend: React
Backend: Flask (Python)
Database: MongoDB