zhongkaifu/RNNSharp

A few questions regarding how to exploit RNNSharp

njfm0001 opened this issue · 0 comments

Hi,

I really appreciate your work; thank you very much for providing this model in C#. For context, I will be using RNNSharp in my research, and I would like to learn as much as possible about how to use it for NER. I would greatly appreciate it if you could answer a few questions that I have not found sufficiently explained in the README or in other users' issues:

  1. What is the validated corpus used when encoding a model? How does it differ from the training corpus file?

  2. Is there a quicker way to prepare training files in the format required for encoding? For example, I would like to use a CoNLL training corpus, but because its format is different, reformatting it manually takes a huge amount of time. How should I go about this?

  3. How do I do gazetteer matching? Say I have a list of location named entities from GeoNames and I want the algorithm to integrate them and match named entities against the list during decoding. Is there a guide on how to do this?

  4. How can I feed features into the algorithm, such as handcrafted linguistic features (capitalization, linguistic pattern combinations and the like), to build a hybrid linguistic and deep-learning-based NER system?

  5. How can I compute evaluation metrics such as precision, recall and F1-score on a test dataset?
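
For reference on question 5, entity-level precision, recall and F1 are commonly computed by comparing the set of predicted entity spans with the set of gold entity spans. The following is a minimal Python sketch of that idea, not anything provided by RNNSharp: it assumes the gold and predicted labels are available as BIO-style tag sequences (an assumption about the tag scheme), and the function names `extract_spans` and `precision_recall_f1` are hypothetical.

```python
def extract_spans(tags):
    """Return the set of (start, end, type) entity spans in a BIO tag sequence.

    `end` is exclusive. An "I-X" tag that does not continue a span of
    type X is treated leniently as the start of a new span.
    """
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel closes any span still open at the end.
    for i, tag in enumerate(list(tags) + ["O"]):
        continues = tag.startswith("I-") and etype == tag[2:]
        if not continues:
            if etype is not None:
                spans.add((start, i, etype))
                start, etype = None, None
            if tag.startswith("B-") or tag.startswith("I-"):
                start, etype = i, tag[2:]
    return spans


def precision_recall_f1(gold_tags, pred_tags):
    """Entity-level precision, recall and F1 using exact span matching."""
    gold = extract_spans(gold_tags)
    pred = extract_spans(pred_tags)
    true_positives = len(gold & pred)
    precision = true_positives / len(pred) if pred else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    gold = ["B-LOC", "I-LOC", "O", "B-PER"]
    pred = ["B-LOC", "I-LOC", "O", "O"]
    print(precision_recall_f1(gold, pred))
```

To score a whole test set with this sketch, the per-sentence tag lists can be concatenated with an "O" tag between sentences so that no span crosses a sentence boundary.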

Thank you very much in advance, and have a nice day.