Application URL: InsurancePremiumPredictor
This app predicts insurance premium prices from input data.
This project was built with the following technologies, tools, and resources:
- Python: 3.7
- Machine Learning
- Jupyter Notebook
- HTML/CSS
- Docker
- Git
- CI/CD Pipeline
- Heroku
Create a conda environment
conda create -p venv python==3.7 -y
Activate the conda environment
conda activate venv/
Install the dependencies from the requirements file
pip install -r requirements.txt
- Add files to git
git add .
or, for a single file:
git add <file_name>
- To check the git status
git status
- To list all versions (commits) maintained by git
git log
- To create a version (commit all staged changes)
git commit -m "message"
- To send versions/changes to GitHub
git push origin main
- Data Ingestion
- Data Validation
- Data Transformation
- Model Training
- Model Evaluation
- Model Deployment
- Data ingestion is the process in which unstructured data is extracted from one or multiple sources and then prepared for training machine learning models.
- Data validation is an integral part of an ML pipeline. It checks the quality of the source data before a new model is trained.
- It focuses on checking that the statistics of the new data are as expected (e.g. feature distributions, number of categories).
- Data transformation is the process of converting raw data into a format or structure that would be more suitable for model building.
- It is an essential step in feature engineering that makes it easier to discover insights.
- Model training in machine learning is the process in which a machine learning (ML) algorithm is fed with sufficient training data to learn from.
- Model evaluation is the process of using different evaluation metrics to understand a machine learning model’s performance, as well as its strengths and weaknesses.
- Model evaluation is important to assess the efficacy of a model during initial research phases, and it also plays a role in model monitoring.
- Deployment is the method by which we integrate a machine learning model into a production environment to make practical business decisions based on data.
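
The stages above can be sketched end to end in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the project's actual code: the columns `age`, `smoker`, and `premium`, the sample rows, and every function name are hypothetical, and ordinary least squares stands in for whatever model the project really trains. Deployment is left out because the serving code is not described here.

```python
import csv
import io
import math

# Hypothetical sample data; the project's real dataset and columns may differ.
RAW_CSV = """age,smoker,premium
25,no,2000
30,yes,4500
35,no,3000
40,yes,5500
45,no,4000
50,yes,6500
55,no,5000
60,yes,7500
"""
REQUIRED_COLUMNS = ("age", "smoker", "premium")


def ingest(source):
    """Data ingestion: read raw rows from a CSV source."""
    return list(csv.DictReader(source))


def validate(rows):
    """Data validation: check schema and value ranges before training."""
    if not rows:
        raise ValueError("no rows ingested")
    for row in rows:
        missing = [c for c in REQUIRED_COLUMNS if row.get(c) in (None, "")]
        if missing:
            raise ValueError(f"missing values for {missing}")
        if row["smoker"] not in ("yes", "no"):
            raise ValueError(f"unexpected category: {row['smoker']}")
        if not 0 < float(row["age"]) < 120:
            raise ValueError(f"age out of range: {row['age']}")
    return rows


def transform(rows):
    """Data transformation: encode categories as numbers."""
    features = [(float(r["age"]), 1.0 if r["smoker"] == "yes" else 0.0)
                for r in rows]
    targets = [float(r["premium"]) for r in rows]
    return features, targets


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def train(features, targets):
    """Model training: ordinary least squares via the normal equations."""
    x = [(1.0,) + f for f in features]  # prepend the intercept term
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(weights, feature):
    """Apply the fitted weights (intercept first) to one feature tuple."""
    return weights[0] + sum(w * v for w, v in zip(weights[1:], feature))


def evaluate(weights, features, targets):
    """Model evaluation: root mean squared error on held-out rows."""
    errors = [predict(weights, f) - t for f, t in zip(features, targets)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


rows = validate(ingest(io.StringIO(RAW_CSV)))
features, targets = transform(rows)
# Hold out the last two rows for evaluation.
weights = train(features[:6], targets[:6])
rmse = evaluate(weights, features[6:], targets[6:])
print("weights:", [round(w, 2) for w in weights], "rmse:", round(rmse, 4))
```

Each function maps to one stage, so a stage can be swapped out (for example, replacing `train` with a library model) without touching the others. Evaluating on held-out rows rather than the training rows keeps the error estimate honest.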