De-identifying Spanish medical texts

Primary language: Python. License: MIT.



DiSMed - De-identifying Spanish Medical texts

DiSMed is a de-identification methodology for Spanish medical texts based on Named Entity Recognition (NER). It is built on spaCy and, in part, on the networks designed by Guillaume Genthial and implemented in TensorFlow 1. DiSMed includes both the Python code and the curated dataset; the dataset is available on request under a research use agreement.
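As a rough illustration of what NER-based de-identification involves, the sketch below shows only the masking step that follows entity recognition. It is not DiSMed's own code: the function `mask_entities`, the labels `NAME`, `DATE` and `LOC`, and the sample sentence are hypothetical. It assumes that a NER model has already produced character-offset spans, similar to what spaCy exposes through `ent.start_char`, `ent.end_char` and `ent.label_`.

```python
from typing import List, Tuple

# (start_char, end_char, label); hypothetical format mirroring spaCy entity offsets
Span = Tuple[int, int, str]


def mask_entities(text: str, spans: List[Span]) -> str:
    """Replace each labelled character span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if start < cursor:
            # Skip spans that overlap one that was already masked.
            continue
        pieces.append(text[cursor:start])  # untouched text before the entity
        pieces.append(f"[{label}]")        # placeholder instead of the entity
        cursor = end
    pieces.append(text[cursor:])           # remaining text after the last entity
    return "".join(pieces)


# Made-up sentence and hand-written spans; in practice a NER model supplies them.
report = "Paciente Marta Ruiz, visto el 12/03/2019 en Valencia."
spans = [(9, 19, "NAME"), (30, 40, "DATE"), (44, 52, "LOC")]
print(mask_entities(report, spans))
```

With spaCy, the spans could be collected from a processed document as `[(e.start_char, e.end_char, e.label_) for e in doc.ents]`, provided the loaded pipeline includes a trained NER component.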

Data access can be requested at BIMCV.

The results are published in the Journal of Biomedical Semantics: De-identifying Spanish medical texts - named entity recognition applied to radiology reports.