AhmadSakor/falcon

Number of resources

nnadine25 opened this issue · 7 comments

Hi, I would like to ask about the total number of resources in the LC-QuAD dataset, to be sure of the number when we calculate precision and recall for entity linking. I found 6621 resources.

Hi,
thanks for your interest in our work.
We didn't count the number of resources for the LC-QuAD dataset.
We calculated P and R for each question separately and then divided the sum by the number of questions. I think measuring P and R for each question separately makes more sense, because entity mentions may be harder to link in some questions than in others. The number of questions is:
500 questions for this dataset
https://github.com/AhmadSakor/falcon/blob/master/datasets/lcquad_qaldformat.zip
and 3253 questions for this dataset
https://github.com/AhmadSakor/falcon/blob/master/datasets/LC-QUAD3253.csv

The second dataset is a subset of the first dataset, as published in https://jens-lehmann.org/files/2018/www_qa_pipelines.pdf

Maybe you can extract the number of resources from our evaluation files here:
https://github.com/AhmadSakor/falcon/blob/master/evaluation_results/results_LCQUAD35_final.csv
https://github.com/AhmadSakor/falcon/blob/master/evaluation_results/results_LCQUAD_final.csv
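
The per-question averaging described above (compute P and R for each question, then divide the sums by the number of questions) can be sketched roughly as follows. This is only an illustration, not the code used for the paper: the function name `macro_average` and the example scores are made up here.

```python
def macro_average(per_question_scores):
    """Average per-question (precision, recall) pairs over all questions.

    `per_question_scores` is a list of (precision, recall) tuples,
    one tuple per question. The name and input layout are hypothetical.
    """
    if not per_question_scores:
        raise ValueError("need at least one question")
    n = len(per_question_scores)
    avg_p = sum(p for p, _ in per_question_scores) / n
    avg_r = sum(r for _, r in per_question_scores) / n
    return avg_p, avg_r


# Made-up scores for three questions: (precision, recall)
scores = [(1.0, 1.0), (0.5, 1.0), (0.0, 0.0)]
macro_p, macro_r = macro_average(scores)
print(macro_p, macro_r)
```

With this kind of averaging every question carries the same weight, no matter how many resources it contains, so a dataset-wide resource count is not needed.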

Thank you.

And how do you calculate precision and recall for each question?

I think it works like this: if there is an exact match with LC-QuAD you add 1, and if a mention is extracted according to the field "intermediary_question" you add 1?

Do you use "intermediary_question" in LC-QuAD to determine the number of entity mentions that should be linked?

We extracted the URIs from the provided SPARQL query for each question.
Then, if the URI of an entity or relation extracted by Falcon matches one mentioned in the SPARQL query, we consider it a relevant resource.

Based on the relevant and the extracted resources, we calculate P and R using the following formulas (each resource plays the role of a "document" in the standard definitions):

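$$P = \frac{|\{\text{relevant resources}\} \cap \{\text{extracted resources}\}|}{|\{\text{extracted resources}\}|}$$

$$R = \frac{|\{\text{relevant resources}\} \cap \{\text{extracted resources}\}|}{|\{\text{relevant resources}\}|}$$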
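
For a single question, the calculation described above could look roughly like this. It is a minimal sketch under assumptions: the function name `precision_recall` and the example URIs are hypothetical, and returning 0.0 when a set is empty is a choice made here, since the thread does not say how that case is handled.

```python
def precision_recall(relevant, extracted):
    """Precision and recall for one question.

    relevant:  URIs taken from the gold SPARQL query
    extracted: URIs returned by the linker for the same question
    """
    relevant, extracted = set(relevant), set(extracted)
    hits = len(relevant & extracted)  # URIs that match exactly
    # Assumption: an empty set yields 0.0 instead of a division error.
    precision = hits / len(extracted) if extracted else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical example: one of two extracted URIs matches the gold query.
gold = {
    "http://dbpedia.org/resource/Barack_Obama",
    "http://dbpedia.org/ontology/spouse",
}
found = {
    "http://dbpedia.org/resource/Barack_Obama",
    "http://dbpedia.org/ontology/birthPlace",
}
p, r = precision_recall(gold, found)
print(p, r)
```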

Thank you.
Can you tell me the regex used to extract the relevant resources from the SPARQL query?

You can use the function "read_LCQUAD" in this script:
https://github.com/AhmadSakor/falcon/blob/master/evaluation/evaluation_paper.py
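
For readers who only want the general idea, here is a small sketch of pulling the URIs out of a SPARQL query with a regular expression. It is not the `read_LCQUAD` function from the script above: the pattern, the helper name `extract_uris`, and the example query are assumptions for illustration. It also keeps every full URI written in angle brackets, so any filtering the real script may apply is not reflected here.

```python
import re

# Assumption: resources appear as full http(s) URIs in angle brackets.
URI_PATTERN = re.compile(r"<(https?://[^<>\s]+)>")


def extract_uris(sparql):
    """Return the set of bracketed http(s) URIs found in a SPARQL query."""
    return set(URI_PATTERN.findall(sparql))


# Hypothetical query in the style of the dataset.
query = (
    "SELECT DISTINCT ?uri WHERE { "
    "<http://dbpedia.org/resource/Barack_Obama> "
    "<http://dbpedia.org/ontology/spouse> ?uri }"
)
uris = extract_uris(query)
print(sorted(uris))
```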

Thank you. I use the PyCharm IDE, and when I pass in all the questions, the IDE freezes and closes. What did you do to make sure the evaluation completes, and how long did it take?