PII Data Scrubber

  • Version 1.2 - Added scrubbing of price information ($ amounts, including within sentences).
  • Version 1.1 - Added scrubbing of Hong Kong Identity Card (HKID) numbers (including within sentences).

This demo repo shows how to use Azure Text Analytics to scrub personally identifiable information (PII) from data with Python in a Jupyter Notebook. PII scrubbing is an important part of data wrangling and data cleaning.
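As a rough illustration of the Azure call involved, here is a minimal sketch that uses the `azure-ai-textanalytics` SDK's `recognize_pii_entities` method and its `redacted_text` result field. It is not necessarily how the notebook in this repo does it. The helper names `make_client` and `scrub_with_azure`, the batch size of 5, and the choice to return `None` for failed documents are all assumptions made for this example.

```python
from typing import List, Optional

# Assumed per-request document limit; check the current service limits.
BATCH_SIZE = 5


def make_client(endpoint: str, key: str):
    """Build a TextAnalyticsClient (hypothetical helper).

    Needs the azure-ai-textanalytics package; the import is kept local so
    the rest of this sketch can be read and tested without it installed.
    """
    from azure.ai.textanalytics import TextAnalyticsClient
    from azure.core.credentials import AzureKeyCredential

    return TextAnalyticsClient(endpoint=endpoint, credential=AzureKeyCredential(key))


def scrub_with_azure(client, texts: List[str], language: str = "en") -> List[Optional[str]]:
    """Return the service-redacted version of each text (hypothetical helper).

    A document the service rejects yields None rather than the original
    text, so unscrubbed PII is never passed through by accident.
    """
    scrubbed: List[Optional[str]] = []
    for start in range(0, len(texts), BATCH_SIZE):
        batch = texts[start:start + BATCH_SIZE]
        results = client.recognize_pii_entities(batch, language=language)
        for doc in results:
            scrubbed.append(None if doc.is_error else doc.redacted_text)
    return scrubbed
```

To use it, one would call `make_client` with the endpoint and key of an Azure Language (Text Analytics) resource and pass the resulting client, together with a list of strings, to `scrub_with_azure`.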

The sample PII data (data/pii-sample-data.csv) contains dummy variations of Visa, Mastercard, and American Express card numbers, phone numbers, names, addresses, email addresses, Hong Kong Identity Card (HKID) numbers (including within sentences), and price information (as $ amounts, including within sentences).
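HKID numbers and $ amounts embedded in sentences can also be handled with plain pattern matching. The sketch below is an assumption about one way to do that with regular expressions, not a description of the notebook's actual code. The patterns, the `[HKID]` and `[PRICE]` placeholders, and the `scrub_text` name are all hypothetical. The HKID pattern assumes the common layout of one or two letters, six digits, and a check digit (0-9 or A) that may be wrapped in parentheses; it does not validate the check digit.

```python
import re

# Assumed HKID layout: 1-2 letters, 6 digits, check digit optionally in parentheses.
HKID_PATTERN = re.compile(r"\b[A-Za-z]{1,2}\d{6}(?:\([0-9Aa]\)|[0-9Aa]\b)")

# Assumed price layout: "$", digits with optional thousands separators and decimals.
PRICE_PATTERN = re.compile(r"\$\s?\d+(?:,\d{3})*(?:\.\d+)?")


def scrub_text(text: str) -> str:
    """Replace HKID numbers and $ amounts in text with placeholders."""
    text = HKID_PATTERN.sub("[HKID]", text)
    text = PRICE_PATTERN.sub("[PRICE]", text)
    return text


if __name__ == "__main__":
    print(scrub_text("My HKID is A123456(7) and I paid $1,299.50."))
```

Because the price pattern only consumes a comma or period when digits follow it, sentence punctuation after an amount (as in "$100, then") is left in place.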
