CoNLL reader function setting document = sentence

Question

Dekermanjian opened this issue 3 years ago · 1 comment

CoNLL().readDataset() is not working as expected. The document is identical to the sentence: documents are not being built from the -DOCSTART- -X- -X- O marker lines.

I am not sure whether this issue affects the training of a NerDL model. However, it makes it impossible to refer back to the specific document (as opposed to the sentence) in which an entity is detected. To reproduce, run the example notebook provided here: https://github.com/JohnSnowLabs/spark-nlp-workshop/blob/master/jupyter/training/english/dl-ner/ner_dl.ipynb and inspect the columns after reading the CoNLL data.
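
To illustrate the grouping I expected, here is a minimal sketch in plain Python, independent of Spark NLP, that splits a CoNLL file into documents on the -DOCSTART- lines. It is not how the library reads the data. The function name group_sentences_by_document and the sample text are hypothetical, and the sketch assumes sentences are separated by blank lines, with the token in the first column and the NER tag in the last.

```python
from typing import Iterable, List, Tuple

DOCSTART = "-DOCSTART-"  # document marker used in CoNLL-2003 style files

Sentence = List[Tuple[str, str]]  # (token, ner_tag) pairs


def group_sentences_by_document(lines: Iterable[str]) -> List[List[Sentence]]:
    """Hypothetical helper: return documents -> sentences -> (token, tag)."""
    documents: List[List[Sentence]] = []
    current_doc = None
    current_sentence: Sentence = []

    def flush_sentence() -> None:
        # Attach the sentence collected so far to the current document.
        nonlocal current_doc, current_sentence
        if current_sentence:
            if current_doc is None:  # file has no leading -DOCSTART- line
                current_doc = []
                documents.append(current_doc)
            current_doc.append(current_sentence)
            current_sentence = []

    for raw in lines:
        line = raw.strip()
        if line.startswith(DOCSTART):
            flush_sentence()
            current_doc = []  # a marker line opens a new document
            documents.append(current_doc)
        elif not line:
            flush_sentence()  # a blank line closes the sentence
        else:
            fields = line.split()
            # Assumption: token is the first field, NER tag is the last.
            current_sentence.append((fields[0], fields[-1]))
    flush_sentence()
    return documents


# Made-up sample in CoNLL-2003 layout, for illustration only.
sample = """-DOCSTART- -X- -X- O

EU NNP B-NP B-ORG
rejects VBZ B-VP O

Peter NNP B-NP B-PER
Blackburn NNP I-NP I-PER

-DOCSTART- -X- -X- O

BRUSSELS NNP B-NP B-LOC
""".splitlines()

docs = group_sentences_by_document(sample)
for doc_id, doc in enumerate(docs):
    print(doc_id, len(doc), "sentence(s)")
```

With a document index like this, each sentence can be mapped back to the document it came from, which is what I cannot do with the columns returned by the reader.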


Answer 1 · 2021-07-14T17:49:47.000Z

Not the right repo for this issue.