Primary language: Python. License: GNU Affero General Public License v3.0 (AGPL-3.0).

anonymization-app

Remove personally identifiable information from text, at scale, using AI.

See it in action and test it yourself on the demo version.

The API is publicly accessible at https://anonymization-app.azurewebsites.net.

Description

Synopsis: a Dockerized Python API that removes personally identifiable information (PII) from text.

Workflow: POST some text and receive it back without PII.

API Usage

See the documentation.
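
The documentation is the authoritative source for routes and payloads. As a rough sketch of the POST-and-receive workflow, the snippet below builds a JSON POST request against the public URL using only the Python standard library. The route `/anonymize` and the JSON field `text` are hypothetical placeholders chosen for illustration, not confirmed parts of the API, and the response is returned as a raw string because its format is not described here.

```python
import json
import urllib.request

BASE_URL = "https://anonymization-app.azurewebsites.net"  # public URL from this README

# ASSUMPTION: the route and the JSON field name below are hypothetical
# placeholders; check the API documentation for the real ones.
ENDPOINT = "/anonymize"
TEXT_FIELD = "text"


def build_request(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying the text."""
    payload = json.dumps({TEXT_FIELD: text}).encode("utf-8")
    return urllib.request.Request(
        url=base_url + ENDPOINT,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def anonymize(text: str, timeout: float = 30.0) -> str:
    """Send the request and return the raw response body as a string."""
    with urllib.request.urlopen(build_request(text), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

Once the placeholders are replaced with the documented route and field name, calling `anonymize("My name is John Smith.")` would send the text and return whatever body the API responds with.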

Models

All models rely on neural networks trained to perform Named Entity Recognition (NER). Specifically, they look for person names in text. The following models are currently supported:

  • ensemble (default and recommended): use all available models
  • presidio: regex-based rules plus spaCy models for NER. Built and maintained by Microsoft.
  • BERT: BERT model, fine-tuned for NER. Open-source, hosted by Hugging Face.
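
How the ensemble option combines its models is not spelled out here. One plausible reading of "use all available models" is to mask any span flagged by at least one model. The sketch below illustrates that union strategy under this assumption; the two detectors are toy regex stand-ins (hypothetical, not the real presidio or BERT models), and the `<PERSON>` mask is likewise an illustrative choice.

```python
import re
from typing import Callable, List, Tuple

Span = Tuple[int, int]  # (start, end) character offsets, end exclusive
Detector = Callable[[str], List[Span]]


def merge_spans(spans: List[Span]) -> List[Span]:
    """Merge overlapping or touching spans into a sorted, disjoint list."""
    merged: List[Span] = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def ensemble_anonymize(
    text: str, detectors: List[Detector], mask: str = "<PERSON>"
) -> str:
    """Mask every span flagged by at least one detector (union strategy)."""
    spans = merge_spans([s for detect in detectors for s in detect(text)])
    out, cursor = [], 0
    for start, end in spans:
        out.append(text[cursor:start])  # keep the text before the span
        out.append(mask)                # replace the span itself
        cursor = end
    out.append(text[cursor:])           # keep the remaining tail
    return "".join(out)


# Toy stand-ins for real NER models (hypothetical, for illustration only).
def detector_a(text: str) -> List[Span]:
    return [m.span() for m in re.finditer(r"John Smith", text)]


def detector_b(text: str) -> List[Span]:
    return [m.span() for m in re.finditer(r"Smith|Alice", text)]


print(ensemble_anonymize("John Smith met Alice.", [detector_a, detector_b]))
# prints: <PERSON> met <PERSON>.
```

Taking the union favors recall: a name missed by one model is still masked if another catches it, at the cost of also inheriting each model's false positives.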

Setup

From the project root, run locally with: pipenv run python main.py

Deploy with Azure Web Apps to serve publicly, for example as explained here.