NER-Detection-and-Anonymization

NER Detection

There are various libraries for named entity recognition (NER): Stanford CoreNLP, NLTK, spaCy, and Rasa NLU.

There are also models that we can train on our own data: SyntaxNet, CRF, LSTM, and BERT.

I have used spaCy here because it gives good results, is easy to use, and is fast. However, it cannot reliably recognize named entities that start with a lowercase letter, so for those I have used the caseless model of StanfordNERTagger.
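A minimal sketch of how the results of the two taggers could be combined, assuming both are reduced to (text, label) pairs. The helper names `spacy_entities` and `merge_entities` are hypothetical and are not taken from this repository, and the case-insensitive merge rule is an assumption, not necessarily what the project does.

```python
from typing import Iterable, List, Tuple

Entity = Tuple[str, str]  # (surface text, label)


def spacy_entities(doc) -> List[Entity]:
    """Collect (text, label) pairs from a spaCy Doc via doc.ents."""
    return [(ent.text, ent.label_) for ent in doc.ents]


def merge_entities(primary: Iterable[Entity],
                   fallback: Iterable[Entity]) -> List[Entity]:
    """Keep every primary entity and add fallback entities whose text is new.

    Texts are compared case-insensitively, so an entity already found by the
    primary tagger is not added a second time by the caseless tagger.
    """
    merged = list(primary)
    seen = {text.lower() for text, _ in merged}
    for text, label in fallback:
        if text.lower() not in seen:
            merged.append((text, label))
            seen.add(text.lower())
    return merged
```

With spaCy, `doc` would come from something like `spacy.load("en_core_web_sm")(text)` (the model name is an assumption), and the fallback list would come from the caseless Stanford tagger. Note that the two taggers use different label sets, so the labels may need to be normalized afterwards.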

I have also used CoreNLP to detect dates and mobile numbers.
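To illustrate what this step looks for, here is a minimal sketch that uses plain regular expressions as a stand-in for CoreNLP. It is not the CoreNLP approach used in this project. The supported formats (numeric dates and 10-digit mobile numbers with an optional country code) and the function names are assumptions chosen for the example.

```python
import re

# Assumed date formats: 12/05/2020, 12-05-2020, 2020-05-12
DATE_RE = re.compile(r"\b(?:\d{1,2}[/-]\d{1,2}[/-]\d{4}|\d{4}-\d{2}-\d{2})\b")

# Assumed mobile format: 10 digits, optionally preceded by "+<country code>"
# and a single space or hyphen. The lookarounds reject longer digit runs.
MOBILE_RE = re.compile(r"(?<!\d)(?:\+\d{1,3}[ -]?)?\d{10}(?!\d)")


def find_dates(text):
    """Return all substrings that look like numeric dates."""
    return DATE_RE.findall(text)


def find_mobile_numbers(text):
    """Return all substrings that look like mobile numbers."""
    return MOBILE_RE.findall(text)
```

A rule-based tagger such as CoreNLP also handles written-out dates like "5 May 2020", which these patterns do not cover.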

Data Anonymization

As far as I know, there are two options for anonymizing the data.

1. Core Python programming
2. Faker library

I have used the Faker library to anonymize the data, that is, to replace the detected values with fake ones.
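A minimal sketch of how detected entities could be swapped for Faker values, assuming the detection step yields (text, label) pairs. The function names, the label names (`PERSON`, `ORG`, `GPE`, `DATE`, `PHONE`), and the choice of Faker provider for each label are assumptions made for illustration.

```python
import re
from typing import Callable, Dict, Iterable, Tuple


def faker_providers() -> Dict[str, Callable[[], str]]:
    """Map entity labels to Faker providers (the label names are assumptions)."""
    from faker import Faker  # third-party package: pip install Faker

    fake = Faker()
    return {
        "PERSON": fake.name,
        "ORG": fake.company,
        "GPE": fake.city,
        "DATE": fake.date,
        "PHONE": fake.phone_number,
    }


def anonymize(text: str,
              entities: Iterable[Tuple[str, str]],
              providers: Dict[str, Callable[[], str]]):
    """Replace each detected entity with a fake value.

    The same original string always maps to the same fake value, so repeated
    mentions stay consistent. Returns the new text and the mapping used.
    """
    mapping: Dict[str, str] = {}
    for surface, label in entities:
        if surface not in mapping and label in providers:
            mapping[surface] = providers[label]()
    if not mapping:
        return text, mapping
    # Longest strings first so "John Smith" wins over "John"; a single regex
    # pass avoids re-replacing text inside an already inserted fake value.
    ordered = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(s) for s in ordered))
    return pattern.sub(lambda m: mapping[m.group(0)], text), mapping
```

Typical use would be `anonymize(text, entities, faker_providers())`. Keeping the returned mapping makes it possible to check which original value was replaced by which fake one.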

Work with Word Document

There are also two options for working with Word documents.

1. docx (python-docx)
2. win32com.client

I have used the win32com client because it is easy to use and can read and write the whole document while preserving the source formatting.
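A rough sketch of find-and-replace in a Word document through win32com, assuming Windows with Microsoft Word and the pywin32 package installed. The function name `replace_in_word_document` and its parameters are hypothetical, and this is not necessarily how the project applies its replacements.

```python
import os

WD_FIND_CONTINUE = 1  # Word constant wdFindContinue
WD_REPLACE_ALL = 2    # Word constant wdReplaceAll


def replace_in_word_document(src_path, dst_path, replacements):
    """Open a document in Word, replace each key with its value, save a copy."""
    import win32com.client  # third-party (pywin32); needs Windows and Word

    word = win32com.client.Dispatch("Word.Application")
    try:
        word.Visible = False
        # Word expects absolute paths.
        doc = word.Documents.Open(os.path.abspath(src_path))
        try:
            for old, new in replacements.items():
                # Positional arguments follow Word's Find.Execute order:
                # FindText, MatchCase, MatchWholeWord, MatchWildcards,
                # MatchSoundsLike, MatchAllWordForms, Forward, Wrap, Format,
                # ReplaceWith, Replace
                doc.Content.Find.Execute(old, True, False, False, False,
                                         False, True, WD_FIND_CONTINUE,
                                         False, new, WD_REPLACE_ALL)
            doc.SaveAs(os.path.abspath(dst_path))
        finally:
            doc.Close(False)  # the copy is already saved; discard other changes
    finally:
        word.Quit()
```

Because Word itself performs the replacement, the fonts and layout of the surrounding text are left as they were. Word limits the length of the search string, so very long entities may need a different approach.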