facebookresearch/Clinical-Trial-Parser

the order of the labels

I ran pytext train < src/resources/config/ner.json.

The training data has multiple labels. When I run pytext predict, the result shows the word classifications as numerical labels. How do you match the numbers to the original text labels?

Experimentally, I can tell that 1=chronic_disease, 7=cancer, and 8=age, but where can I look this up? By the way, this order disagrees with the one given in bin/README.md.

That logic is in src/ie/ner.py. Running ./script/ie_parse.sh should reproduce the results in this repo. The following lines in the ie_parse script do the named entity recognition (NER) portion of the IE parser:

export PYTHONPATH="$(pwd)/src"
python src/ie/ner.py -m bin/ner.c2 -i data/output/ie_extracted_clinical_trials.tsv -o data/output/ie_ner_clinical_trials.tsv
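
For anyone who only needs to decode the numbers by hand, here is a minimal sketch of the kind of lookup being asked about. It is not the code in src/ie/ner.py. The table holds only the three ids reported in the question (1, 7, 8), which were found experimentally, and the names OBSERVED_LABELS, label_name, and decode are hypothetical. The authoritative order should come from the trained model or from src/ie/ner.py.

```python
# Hypothetical lookup table: only the three ids reported in the question.
# These were observed experimentally and are NOT read from src/ie/ner.py.
OBSERVED_LABELS = {1: "chronic_disease", 7: "cancer", 8: "age"}


def label_name(label_id, table=OBSERVED_LABELS):
    """Return the text label for a numeric id, or a placeholder if unknown."""
    return table.get(label_id, f"unknown_{label_id}")


def decode(label_ids, table=OBSERVED_LABELS):
    """Map a sequence of numeric label ids to text labels."""
    return [label_name(i, table) for i in label_ids]


if __name__ == "__main__":
    # Id 3 is not in the table, so it comes back as a placeholder.
    print(decode([1, 7, 8, 3]))
```

Ids missing from the table come back as unknown_<id> rather than raising, so gaps in the mapping stay visible in the output.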