reader.py misses the <EX_ENAMEX>-annotated entities in the LeMonde corpus
pjox opened this issue · 3 comments
pjox commented
Hello,
I recently found that the reader.py script does not parse the entities annotated as <EX_ENAMEX> in the French LeMonde corpus. As a result, it misses some entities and also cuts some sentences short, because it ignores the text inside these tags.
This should not have a big impact, as the number of entities annotated this way is rather small, but it would still be nice to patch this little bug. 😄
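To illustrate the behaviour I would expect, here is a minimal sketch. It is not the actual reader.py code. It assumes that each sentence is wrapped in a hypothetical <sentence> element and that both <ENAMEX> and <EX_ENAMEX> carry the entity class in a `type` attribute. The function name `read_sentence` and the sample sentence are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: both tag names mark entities and should be handled identically.
ENTITY_TAGS = {"ENAMEX", "EX_ENAMEX"}


def read_sentence(xml_fragment):
    """Return (tokens, labels) for one sentence, using BIO labels.

    Hypothetical helper: text inside entity tags is kept, so the
    sentence is not truncated and no entity is dropped.
    """
    root = ET.fromstring(xml_fragment)
    tokens, labels = [], []

    def add(text, label):
        # Whitespace tokenization, for illustration only.
        for i, token in enumerate((text or "").split()):
            tokens.append(token)
            if label is None:
                labels.append("O")
            else:
                labels.append(("B-" if i == 0 else "I-") + label)

    add(root.text, None)
    for child in root:
        # itertext() also collects text from nested elements, so nested
        # annotations are flattened into the outer entity.
        inner = "".join(child.itertext())
        if child.tag in ENTITY_TAGS:
            add(inner, child.get("type", "UNK"))
        else:
            add(inner, None)
        # Text following the closing tag belongs to the sentence too.
        add(child.tail, None)
    return tokens, labels


sample = (
    '<sentence>Le président de <EX_ENAMEX type="Company">Renault</EX_ENAMEX> '
    'a rencontré <ENAMEX type="Person">Jean Dupont</ENAMEX> .</sentence>'
)
tokens, labels = read_sentence(sample)
for token, label in zip(tokens, labels):
    print(token, label)
```

With a reader that skips <EX_ENAMEX>, the token "Renault" would be missing from the output entirely, which is the sentence-cutting effect described above.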
Thanks!
kermitt2 commented
kermitt2 commented
merged...