LiberAI/NSpM

How to deal with out of vocabulary words?

qasim9872 opened this issue · 0 comments

Hi,

I recently used the technique discussed in this project to transform a natural language sentence into a SPARQL query. Based on this, I built an end-to-end question answering system as part of my final-year project. The system works well for known resource names; however, for questions that contain out-of-vocabulary words (resource names or words that are not part of the training data), it does not predict an accurate query.

The Neural Machine Translation for Query Construction paper says that external pre-trained word embeddings help deal with vocabulary mismatch. I am not sure how this would be implemented. Could you provide any insight? I have already finished the project, but I would still like to learn about this.
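
To make the question concrete, here is a minimal sketch of what I imagine this could mean. It is only my guess, not the paper's method or anything from this repository. The vectors in `PRETRAINED` are made up, `TRAIN_VOCAB` is a hypothetical training vocabulary, and `nearest_known_word` is a helper name I invented. The idea is that a word unseen during training still has a pre-trained vector, so it can be related to the closest word the model did see.

```python
import numpy as np

# Toy stand-in for pre-trained embeddings (in practice these would be loaded
# from a file of pre-trained vectors). The numbers are invented for illustration.
PRETRAINED = {
    "river": np.array([0.9, 0.1, 0.0]),
    "lake": np.array([0.8, 0.2, 0.1]),
    "actor": np.array([0.0, 0.9, 0.3]),
}

# Hypothetical set of words that appeared in the NMT training data.
TRAIN_VOCAB = ["river", "actor"]


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def nearest_known_word(word, train_vocab=TRAIN_VOCAB, vectors=PRETRAINED):
    """Map a word to the most similar word seen in training.

    Returns the word itself if it is already known, and None if the word
    has no pre-trained vector at all (so nothing can be inferred about it).
    """
    if word in train_vocab:
        return word
    if word not in vectors:
        return None
    return max(train_vocab, key=lambda known: cosine(vectors[word], vectors[known]))


# "lake" never appeared in training, but its vector is close to "river".
print(nearest_known_word("lake"))
```

In a real model I assume the same effect would come from initializing the encoder's embedding layer with the pre-trained vectors rather than from an explicit lookup like this. I also suspect this only helps with ordinary words, not with unseen resource names, which is part of what I am asking about.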

The project I created is available on GitHub and can be found here if you would like to take a look. A deployed version of the system can also be found here.

Thanks for the help in advance.