This repository contains code for predicting health insurance charges using linear regression. The model is built on the medical.csv dataset, which is included in the repository.
Health insurance charges can vary significantly with factors such as age, BMI, smoking habits, and region. Predicting these charges accurately can help insurance companies and individuals make informed decisions about coverage and premiums.
In this project, we use linear regression to predict health insurance charges from the features in the medical.csv dataset.
The medical.csv dataset contains the following columns:
- Age: Age of the individual
- Sex: Gender of the individual (male/female)
- BMI: Body Mass Index of the individual
- Children: Number of children/dependents covered by the insurance
- Smoker: Whether the individual is a smoker or not (yes/no)
- Region: The region where the individual resides
- Charges: Health insurance charges incurred by the individual
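Because Sex, Smoker, and Region are text columns, they have to be converted to numbers before a linear regression can use them. The following is a minimal sketch of one common way to do that (one-hot encoding with pandas), not necessarily the approach taken in the notebook. The helper name prepare_features is hypothetical, the column names are taken from the list above (the headers in medical.csv may differ in case), and the sample rows are made up for illustration.

```python
import pandas as pd

# Column names as listed in this README; the actual CSV headers are an assumption.
CATEGORICAL = ["Sex", "Smoker", "Region"]
TARGET = "Charges"


def prepare_features(df: pd.DataFrame):
    """Split a frame into a numeric feature matrix X and the target y."""
    # One-hot encode the text columns; drop_first removes one redundant
    # dummy column per category.
    X = pd.get_dummies(
        df.drop(columns=[TARGET]), columns=CATEGORICAL, drop_first=True
    )
    y = df[TARGET]
    return X.astype(float), y


# Made-up rows for illustration only.
# With the real data: df = pd.read_csv("medical.csv")
sample = pd.DataFrame({
    "Age": [19, 33, 45, 52],
    "Sex": ["female", "male", "male", "female"],
    "BMI": [27.9, 22.7, 30.1, 35.2],
    "Children": [0, 1, 2, 3],
    "Smoker": ["yes", "no", "no", "yes"],
    "Region": ["southwest", "northwest", "southeast", "northeast"],
    "Charges": [16000.0, 4500.0, 8200.0, 40000.0],
})

X, y = prepare_features(sample)
print(X.columns.tolist())
```

Dropping the first level of each category keeps the dummy columns from being perfectly collinear with the intercept, which makes the fitted coefficients easier to interpret.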
To run the code in this repository, you'll need:
- Python 3.x
- Jupyter Notebook
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn
You can install the required libraries using pip:
pip install pandas numpy matplotlib seaborn scikit-learn
- Clone the repository:
git clone https://github.com/VijayMakkad/Health-Insurance-Prediction
- Navigate to the repository directory:
cd Health-Insurance-Prediction
- Open the Jupyter Notebook file insurance_prediction_LinearR.ipynb:
jupyter notebook insurance_prediction_LinearR.ipynb
- Follow the instructions provided in the notebook to train the linear regression model and make predictions.
The notebook evaluates the trained linear regression model on how well it predicts health insurance charges from the provided features. The results and evaluation metrics are discussed in detail there.
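For readers who want a sense of what such an evaluation looks like, here is a minimal sketch using scikit-learn. It runs on synthetic stand-in data, not medical.csv, so the numbers it prints say nothing about the notebook's actual results. The choice of R^2 and RMSE as metrics, the 80/20 split, and the coefficients used to generate the data are all assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT medical.csv): charges rise with age and BMI,
# jump for smokers, and include random noise.
rng = np.random.default_rng(0)
n = 200
age = rng.integers(18, 65, size=n)
bmi = rng.normal(30, 5, size=n)
smoker = rng.integers(0, 2, size=n)
X = np.column_stack([age, bmi, smoker])
y = 250 * age + 300 * bmi + 20000 * smoker + rng.normal(0, 2000, size=n)

# Hold out 20% of the rows so the metrics reflect unseen data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)

# R^2: share of variance explained; RMSE: typical error in the target's units.
r2 = r2_score(y_test, y_pred)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"R^2: {r2:.3f}  RMSE: {rmse:.2f}")
```

With the real dataset, X and y would come from the encoded medical.csv columns instead of the generated arrays.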
This project is licensed under the MIT License - see the LICENSE file for details.