CMS Medicare Fraud Detection

Primary language: Jupyter Notebook

CMS-Medicare-Data-FRAUD-Detection

Healthcare fraud is a major problem that causes substantial monetary losses in Medicare/Medicaid and the insurance industry. The Centers for Medicare and Medicaid Services (CMS) has operated the Medicare Part D program since 2006 and works to detect and prevent fraud, waste, and abuse in it. With traditional methods, however, fraud detection is carried out by human experts on random samples. As a result, the samples may be misleading and manual detection is costly. According to an Office of Inspector General (OIG) report, Medicare fraud has increased rapidly since 2006. Fraud patterns fall into the following four types:

• Fraud by service providers (doctors, hospitals, pharmacies)
• Fraud by insurance subscribers (patients or patients' employers)
• Fraud by insurance carriers
• Conspiracy fraud (involving all parties)

Along with fraud, the OIG report describes the following patterns:

• Spending on commonly abused drugs and opioids has grown faster than spending for all Part D drugs
• Pharmacies with questionable billing raise concerns about pharmacy-related fraud schemes
• Geographic hotspots for certain non-controlled drugs point to possible fraud and abuse

This project uses machine learning to address the problems above and to detect fraudulent Medicare claims in the CMS open datasets and other open data.

The objectives of this project are:

• Build a simple data model that shows the relationships among the different datasets and identifies the key feature sets for fraud detection
• Build a comprehensive machine learning model that detects fraud patterns based on different features: service providers (doctors, pharmacies), insurance subscribers (patients), geo-demographics, and prescriptions of commonly abused drugs
• Set up benchmark metrics to measure and evaluate the experimental results
• Deliver a market-ready product
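As a rough illustration of the benchmark-metrics objective, the sketch below computes precision, recall, and F1 for the fraud class. The README does not say which metrics the project uses, so this choice is an assumption; it is motivated by the fact that fraud cases are usually rare, which makes plain accuracy uninformative. The function name `fraud_benchmark`, the `BenchmarkMetrics` class, and the label lists are hypothetical and do not come from the project's notebooks.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkMetrics:
    precision: float  # share of flagged claims that are truly fraudulent
    recall: float     # share of fraudulent claims that were flagged
    f1: float         # harmonic mean of precision and recall


def fraud_benchmark(y_true, y_pred):
    """Compute precision, recall, and F1 for the fraud class (label 1).

    Assumes binary labels: 1 = fraudulent, 0 = legitimate.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud, flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud, missed

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return BenchmarkMetrics(precision, recall, f1)


# Hypothetical labels for ten claims (made up for illustration only).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 0]
y_pred = [0, 0, 1, 0, 0, 0, 1, 1, 0, 0]

metrics = fraud_benchmark(y_true, y_pred)
print(metrics)
```

In this made-up example the model flags three claims, two of them correctly, and misses one fraudulent claim, so precision and recall are both 2/3. A model that predicted "legitimate" for every claim would be 70% accurate on the same data while detecting no fraud at all, which is why class-specific metrics are a reasonable starting point for a benchmark.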