Here you can find a simple Flask API with two endpoints, /ping and /query. Given a query containing a question, /query returns the 10 questions from the index that are most similar to it.
In the first stage, the 100 questions closest to the input question (by Euclidean distance) are retrieved from the FAISS index. Each question in the index is represented as the mean of its words' GloVe embeddings, weighted by TF-IDF.
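As a rough illustration of the representation and search described above, the sketch below builds TF-IDF-weighted mean vectors and ranks them by Euclidean distance in pure Python. It is a stand-in, not the repository's code: the brute-force `nearest` function takes the place of the FAISS index, the toy 2-dimensional vectors take the place of GloVe, and the names `idf_weights`, `embed`, and `nearest` as well as the smoothed IDF formula are assumptions made for the example.

```python
import math
from collections import Counter


def idf_weights(corpus_tokens):
    """Smoothed inverse document frequency for every token in the corpus."""
    n_docs = len(corpus_tokens)
    doc_freq = Counter(tok for doc in corpus_tokens for tok in set(doc))
    return {tok: math.log((1 + n_docs) / (1 + df)) + 1.0
            for tok, df in doc_freq.items()}


def embed(tokens, vectors, idf, dim):
    """TF-IDF-weighted mean of the word vectors of the known tokens."""
    counts = Counter(t for t in tokens if t in vectors and t in idf)
    total = [0.0] * dim
    weight_sum = 0.0
    for tok, tf in counts.items():
        weight = tf * idf[tok]
        weight_sum += weight
        for i, x in enumerate(vectors[tok]):
            total[i] += weight * x
    if weight_sum == 0.0:
        return total  # no known tokens: fall back to the zero vector
    return [x / weight_sum for x in total]


def nearest(query_vec, index_vecs, k):
    """Ids of the k index vectors closest to query_vec (Euclidean distance)."""
    ranked = sorted(index_vecs,
                    key=lambda qid: math.dist(query_vec, index_vecs[qid]))
    return ranked[:k]


# Toy data (hypothetical): 2-d "word vectors" and a three-question corpus.
vectors = {"life": [1.0, 0.0], "lesson": [0.0, 1.0],
           "sky": [-1.0, 0.0], "blue": [0.0, -1.0]}
corpus = [["life", "lesson"], ["sky", "blue"], ["life"]]
idf = idf_weights(corpus)
index = {qid: embed(doc, vectors, idf, 2) for qid, doc in enumerate(corpus)}
query_vec = embed(["lesson", "life"], vectors, idf, 2)
closest = nearest(query_vec, index, k=2)
```

In the real service `k` would be 100 and the search would be delegated to FAISS, which avoids the linear scan used here.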
These 100 nearest neighbors are passed to the pre-trained KNRM model, and after reranking the 10 most similar questions are returned.
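The two-stage flow (cheap candidate retrieval, then a more expensive reranker) can be sketched as below. This is only an outline under stated assumptions: `retrieve_and_rerank`, `first_n`, and `token_overlap` are hypothetical names, the candidate function stands in for the FAISS lookup, and the token-overlap scorer is a deliberately trivial placeholder for the KNRM model, not an approximation of it.

```python
def retrieve_and_rerank(query, candidates_fn, score_fn,
                        n_candidates=100, top_k=10):
    """Fetch n_candidates with candidates_fn, rescore them, keep the best top_k."""
    candidates = candidates_fn(query, n_candidates)  # list of (id, text) pairs
    scored = [(score_fn(query, text), qid, text) for qid, text in candidates]
    # Higher score means more relevant; the sort is stable for equal scores.
    scored.sort(key=lambda item: item[0], reverse=True)
    return [[qid, text] for _, qid, text in scored[:top_k]]


def tokenize(text):
    return set(text.lower().replace("?", "").split())


def token_overlap(query, text):
    """Placeholder scorer (Jaccard overlap); the service uses KNRM instead."""
    a, b = tokenize(query), tokenize(text)
    return len(a & b) / len(a | b) if a | b else 0.0


# Toy index (hypothetical ids) and a candidate function that ignores the query.
toy_index = [
    ("1", "What is the best lesson in life?"),
    ("2", "Why is the sky blue?"),
    ("3", "What is your biggest life lesson?"),
]


def first_n(query, n):
    return toy_index[:n]


top = retrieve_and_rerank("What is main life lesson?", first_n,
                          token_overlap, n_candidates=100, top_k=2)
```

Keeping the two stages behind plain callables makes it easy to swap either one without touching the other.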
- clone the repo
- download the data to the repo folder
$ pip install -r requirements.txt
2*. Train the model yourself (optional; it takes about 20 minutes, and pretrained weights are already in the model_state folder)
$ python train_model.py
$ chmod u+x run.sh
$ ./run.sh
Request structure:
{
"queries" : ["How old is Barack Obama?", ..., "Why is the sky blue?"]
}
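A request with this body could be built and sent with the standard library, as in the sketch below. Several details are assumptions, since the text does not state them: the base URL `http://localhost:5000` (the actual host and port depend on run.sh), the use of the POST method, the JSON content type, and the helper names `build_query_request` and `send_query`.

```python
import json
import urllib.request


def build_query_request(queries, base_url="http://localhost:5000"):
    """Build a request for /query carrying {"queries": [...]} as JSON.

    base_url and the POST method are assumptions; adjust them to the
    address and method the running service actually uses.
    """
    body = json.dumps({"queries": list(queries)}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send_query(queries, base_url="http://localhost:5000", timeout=30):
    """Send the request and decode the JSON response (needs a running server)."""
    request = build_query_request(queries, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_query_request(["What is main life lesson?"])
```

With the service running, `send_query(["What is main life lesson?"])` would return the decoded response as a Python dict.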
Response structure:
{
"lang_check" : [true, ..., false], # one flag per query: whether English was detected
"suggestions" : [
[[<question id in index>, <similar question>], ..., [<question id in index>, <similar question>]], # suggestions for one query, ordered from most to least relevant
...
[[<question id in index>, <similar question>], ..., [<question id in index>, <similar question>]]
]
}
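On the client side, a response of this shape can be lined up with the original queries as sketched below. The helper name `pair_suggestions` is hypothetical, and two assumptions are made that the description does not confirm: that both lists are aligned with the queries by position, and that the entry for a query that failed the language check may be empty or null.

```python
def pair_suggestions(queries, response):
    """Attach the language flag and suggestions to each query, by position."""
    flags = response["lang_check"]
    suggestions = response["suggestions"]
    if not (len(queries) == len(flags) == len(suggestions)):
        raise ValueError("response lists are not aligned with the queries")
    paired = []
    for query, is_english, entry in zip(queries, flags, suggestions):
        paired.append({
            "query": query,
            "is_english": bool(is_english),
            # Assumed: a missing or null entry means "no suggestions".
            "suggestions": [(qid, text) for qid, text in (entry or [])],
        })
    return paired


sample_response = {
    "lang_check": [True, False],
    "suggestions": [
        [["1096", "What is the best lesson in life?"]],
        None,
    ],
}
paired = pair_suggestions(["What is main life lesson?", "Bonjour"],
                          sample_response)
```

Each element of `paired` then carries everything needed to display one query together with its ranked suggestions.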
query:
{"queries" : ["What is main life lesson?"]}
answer:
{
"lang_check": [
true
],
"suggestions": [
[
[
"78367",
"What are some life lesson you can give me?"
],
[
"1096",
"What is the best lesson in life?"
],
[
"438530",
"What is your biggest life lesson?"
],
[
"444219",
"What is the most important lesson in life?"
],
[
"101153",
"What's the most important life lesson?"
],
[
"152190",
"What's the most important lesson about life?"
],
[
"125074",
"What is the biggest lesson that life has taught you?"
],
[
"268276",
"What is the most crucial lesson life has taught you so far?"
],
[
"165146",
"What is your life lesson that you learn from today?"
],
[
"438531",
"Can you tell me about your biggest life lesson?"
]
]
]
}