Car.ly - Vehicle Insurance Claim Fraud Detection -- Machine Learning Strategies for Detecting Vehicle Insurance Claim Fraud

Primary language: Jupyter Notebook. License: MIT.

🚗 Abstract

Vehicle insurance claim fraud is costly for insurance companies, which creates a need for effective detection and prevention methods. Fraudulent claims range from staged accidents to false reports of damage or injury. To combat this, insurers use data analytics and machine learning algorithms. These approaches examine data points such as claim histories, vehicle and driver information, and accident details to uncover potential fraud. By looking for patterns in this data, insurers can flag suspicious activity. Machine learning algorithms take the analysis a step further by detecting complex patterns that traditional methods might miss. For instance, they can analyze social media data to identify fraudulent behavioral patterns. Adopting these technologies lets insurers protect their financial interests proactively while still processing legitimate claims on time.

🚕 Objective

  • The goal is to develop and implement a fraud detection system that leverages advanced technologies such as data analytics, machine learning algorithms, and predictive modeling.
  • Such a system can analyze large volumes of claims data to identify patterns and anomalies that indicate potential fraud.

🛠 Methods Used

  • The data was analyzed, checked for missing values, and separated into categorical and continuous features.
  • Unnecessary columns were removed from the dataset.
  • Some columns were converted into meaningful numerical values.
  • Object-type columns were grouped and processed.
  • Relationships were established between PolicyType and two other columns.
  • The BasePolicy column was recreated and compared with PolicyType, revealing insightful results.
  • Columns whose values were given as ranges were assigned the desired single values.
  • Exploratory Data Analysis was performed on the dataset.
  • The dataset was split into training and test data.
  • Models were trained with several algorithms:
    • Support Vector Classifier
    • Naive Bayes Classifier
    • KNN Classifier
    • Decision Tree Classifier
    • Random Forest Classifier
    • XGBoost Classifier
    • Logistic Regression
  • The performance of each model was evaluated, and the algorithm with the highest accuracy was identified.
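
The steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: it runs on synthetic data from `make_classification` instead of the project dataset, and the helper names `range_to_midpoint` and `compare_classifiers`, the midpoint rule for range values, the 80/20 stratified split, and the feature scaling for some models are all assumptions made for the example. XGBoost is included only if the `xgboost` package happens to be installed.

```python
import re

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def range_to_midpoint(label):
    """Map a range label such as '31 to 35' to one number.

    Hypothetical rule: average every number found in the label.
    The project may have chosen different values.
    """
    numbers = [float(n) for n in re.findall(r"\d+(?:\.\d+)?", str(label))]
    if not numbers:
        raise ValueError(f"no number found in {label!r}")
    return sum(numbers) / len(numbers)


def compare_classifiers(X, y, random_state=0):
    """Fit each classifier on a train split and return test accuracy per model."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, stratify=y, random_state=random_state
    )
    # Scaling is an assumption here; it usually helps distance- and
    # margin-based models such as KNN, SVC, and logistic regression.
    models = {
        "Support Vector Classifier": make_pipeline(StandardScaler(), SVC()),
        "Naive Bayes Classifier": GaussianNB(),
        "KNN Classifier": make_pipeline(StandardScaler(), KNeighborsClassifier()),
        "Decision Tree Classifier": DecisionTreeClassifier(random_state=random_state),
        "Random Forest Classifier": RandomForestClassifier(random_state=random_state),
        "Logistic Regression": make_pipeline(
            StandardScaler(), LogisticRegression(max_iter=1000)
        ),
    }
    try:
        # Optional dependency: skipped silently when xgboost is not installed.
        from xgboost import XGBClassifier

        models["XGBoost Classifier"] = XGBClassifier(random_state=random_state)
    except ImportError:
        pass

    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = accuracy_score(y_test, model.predict(X_test))
    return scores


# Synthetic, imbalanced stand-in for the claims data (not the real dataset).
X, y = make_classification(
    n_samples=500, n_features=10, weights=[0.9, 0.1], random_state=0
)
scores = compare_classifiers(X, y)
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```

Sorting the returned dictionary by score gives the kind of ranking used to pick the highest-accuracy model. The numbers printed here come from synthetic data and say nothing about the results on the project dataset.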

🧿 Results Analysis and Conclusion

Of all the algorithms tested, K Nearest Neighbours achieved the highest accuracy, closely followed by the Support Vector Classifier and Random Forest. This project provided me with valuable insights into applying different classification algorithms. The dataset was extensive, with over 10,000 records and more than 30 columns, which contributed to the high accuracy achieved.

🗿 Authors and Developers


Abhishek Sharma

Ishita Pahari

Udayan Misra

Digbijoy Dasgupta

Shreya Bose

Raktim Karmakar

© 2023 Abhishek Sharma
