/bio-medical_ner

This repo contains all the data and code necessary to reproduce the NER experiments on several open bio-medical corpora.

License: Apache License 2.0

Evaluation of YASET on various bio-medical datasets

This repo describes the code and the processes used to evaluate Yaset, a neural model for NER, on different datasets. These experiments and their interpretation are described in detail in the following paper: Tourille et al., 2018.

Corpora

The different corpora are:

Each corpus folder contains a README.md, commented Jupyter notebooks, a data folder, and a partial summary of the results in a json directory.
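As a minimal sketch of how the result summaries could be inspected outside the notebooks, the snippet below reads every JSON file found in a directory. The function name `load_result_summaries` is hypothetical and not part of this repo, and the actual file names and schema of the summaries are not assumed here; the notebooks remain the reference.

```python
import json
from pathlib import Path


def load_result_summaries(json_dir):
    """Return {file stem: parsed JSON content} for every *.json file in json_dir.

    Hypothetical helper, not part of this repo: it makes no assumption about
    the structure of the summaries, it only parses whatever JSON it finds.
    """
    summaries = {}
    for path in sorted(Path(json_dir).glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            summaries[path.stem] = json.load(handle)
    return summaries
```

Calling it on the json directory of one corpus folder returns a dictionary keyed by file name (without extension), which can then be printed or compared across corpora.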

Utils

The auxiliary functions used in the notebooks are described in utils_paper.

Embeddings

The origin of the word embeddings used by the model is documented in en_word_emb, which also contains a notebook describing how they are constructed.

Requirements

Install the Python dependencies with: pip install -r requirements.txt

Install Yaset by following its installation instructions.

Citation

Tourille, Julien, Matthieu Doutreligne, Olivier Ferret, Aurélie Névéol, Nicolas Paris, and Xavier Tannier. "Evaluation of a Sequence Tagging Tool for Biomedical Texts." Proceedings of the 9th International Workshop on Health Text Mining and Information Analysis (LOUHI 2018), 2018, pages 193–203.