Source code of paper "Systematic Assessment of Factual Knowledge in Large Language Models" - EMNLP Findings 2023
Our framework contains four main components:
kg
: declare how to read and preprocess knowledge graphextractor
: to extract question triplets, nodes and relation summary from the knowledge graph.generator
: to generate questions/answers from extracted tripletsevaluator
: to evaluate LLM's response Checkcertlm/registry.py
for the list of supported extractors, generators and evaluators
Please check this document for steps to reproduce our experiments in the paper.
For a new KG dataset, it should extend the certlm.kg_dataset.BaseKG
class and implement following methods:
load_relation()
: load all available relations toself.relations
get_input_relation_file(relation)
: return path to input relation file for a given relationget_relation_name(relation)
: get relation labelget_relation_type(relation)
: return relation type: 1-1, N-1, N-M
Current supported KG:
The preprocessed data can be found here.
Example of T-REx relation
{
"relation": "P19",
"template": "[X] was born in [Y]",
"label": "place of birth",
"description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
"type": "N-1"
}
A triplet extractor for a KG should extend the certlm.extractors.BaseExtractor
class and implement following methods:
extract_relation(relation, relation_input_file, output)
: extract relation data stored inrelation_input_file
and save tooutput
file. In addition to theoutput
file, a relation summary file is also created to index all the subject, object nodes which will be useful for evaluating N-M relations.get_input_question_files(data_dir, relation)
: return path to questions and summary extracted from the given relation
Note that, we should standardize the relation info to follow this format for reusability
The question output is a jsonl file where each line is a json with following structure:
{
"subject_label": subject_label,
"object_label": object_label,
"object_uri": object_uri,
"subject_uri": subject_uri,
"relation_info": {
"relation": "P19",
"template": "[X] was born in [Y]",
"label": "place of birth",
"description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
"type": "N-1",
"subject_symbol": "[X]",
"object_symbol": "[Y]"
}
}
The relation summary should have following structure
{
"relation_info": {
"relation": "P19",
"template": "[X] was born in [Y]",
"label": "place of birth",
"description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
"type": "N-1",
"subject_symbol": "[X]",
"object_symbol": "[Y]"
},
"node_summary": {
"objects": {
"uri1": {
"object_label": object_label,
"object_uri": object_uri,
"subjects": [array of subject uri]
},
...
}
},
"subjects": {
"uri1": {
"subject_label": subject_label,
"subject_uri": subject_uri,
"objects": [array of object uri]
},
...
}
}
}
Example of running command
python run_certlm.py --step extract \
--kg trex \
--data-dir ./examples/TREx \
--data-file relations.jsonl \
--output-dir ./output
We support the following generator: template
, llm-mask
, llm-question
. Each generator needs to implement the following methods
generate_questions(triplet_input_file, relation_summary_file, output_file)
# question
prompts = [
{"role": "system", "content": prompt},
{"role": "user", "content": content}
]
expected_answers = [{"uri": a[0], "label": a[1]} for a in answers]
question_record = {
"question_id": question_id,
"prompts": prompts,
"answers": expected_answers,
}
json_record = json.dumps(question_record)
fout.write(json_record + '\n')
Example of running command
python run_certlm.py --step question_generate \
--kg trex \
--generator masking \
--data-dir ./examples/TREx \
--data-file relations.jsonl \
--gen-input-dir ./output/tuples
If this repo is useful for your own research, please cite us with the following bibtex entry
@article{luo2023systematic,
title={Systematic Assessment of Factual Knowledge in Large Language Models},
author={Luo, Linhao and Vu, Thuy-Trang and Phung, Dinh and Haffari, Gholamreza},
journal={Findings of EMNLP},
year={2023}
}