Is there a way to deidentify websites? Right now it looks like the pretrained ner_deid_large NER model doesn't catch them
egenc opened this issue · 1 comments
egenc commented
Websites: we sometimes have text at the end of our reports saying where to send feedback, and the website address contains identifying information about the hospital.
egenc commented
I solved it by checking these links:
https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tuto[…]reprocessing_with_SparkNLP_Annotators_Transformers.ipynb
https://colab.research.google.com/github/JohnSnowLabs/spark-nlp-workshop/blob/master/tuto[…]gs/Healthcare/1.2.Contextual_Parser_Rule_Based_NER.ipynb
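For anyone landing here later: the second notebook is about rule-based NER, and the general idea is to add a regex rule for URL-like spans next to the pretrained model. Below is a minimal plain-Python sketch of that idea only. It does not use the Spark NLP API, and the pattern, the `mask_urls` name, and the `<URL>` placeholder are all assumptions made for illustration, not something taken from the notebooks.

```python
import re

# Hypothetical pattern (an assumption, not from the notebooks):
# matches http(s):// URLs and bare "www." hosts up to the next whitespace.
URL_PATTERN = re.compile(r"(?:https?://|\bwww\.)[^\s<>\"']+", re.IGNORECASE)


def mask_urls(text, placeholder="<URL>"):
    """Replace URL-like spans in text with a placeholder."""

    def _replace(match):
        url = match.group(0)
        # Keep sentence punctuation that trails the URL outside the masked span.
        stripped = url.rstrip(".,;:!?)")
        return placeholder + url[len(stripped):]

    return URL_PATTERN.sub(_replace, text)


if __name__ == "__main__":
    report = "Please send feedback via https://www.example-hospital.org/feedback."
    print(mask_urls(report))
```

In a real pipeline the same kind of pattern would presumably go into a rule-based annotator and its output would be merged with the `ner_deid_large` chunks before deidentification; the notebooks above are the place to check how that is actually configured.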