Source code of paper "Systematic Assessment of Factual Knowledge in Large Language Models" - EMNLP Findings 2023

LLM-facteval

Our framework contains four main components:

  • kg: declares how to read and preprocess a knowledge graph
  • extractor: extracts question triplets, nodes and a relation summary from the knowledge graph
  • generator: generates questions/answers from the extracted triplets
  • evaluator: evaluates the LLM's responses

Check certlm/registry.py for the list of supported extractors, generators and evaluators.

Reproducibility

Please check this document for steps to reproduce our experiments in the paper.

Adding new KGs to the framework

Knowledge Graph (KG)

A new KG dataset should extend the certlm.kg_dataset.BaseKG class and implement the following methods:

  • load_relation(): load all available relations to self.relations
  • get_input_relation_file(relation): return path to input relation file for a given relation
  • get_relation_name(relation): get relation label
  • get_relation_type(relation): return relation type: 1-1, N-1, N-M
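As an illustration, the four methods above could be implemented roughly as follows. This is only a sketch: `BaseKG` here is a stand-in stub so the example runs on its own, and the real `certlm.kg_dataset.BaseKG` may have a different constructor or further requirements. `ToyKG`, its `data_dir` argument, the hard-coded relation table and the one-file-per-relation layout are all assumptions made for the example.

```python
import os


class BaseKG:
    """Stand-in for certlm.kg_dataset.BaseKG (assumption: the real class differs)."""

    def __init__(self, data_dir):
        self.data_dir = data_dir
        self.relations = {}
        self.load_relation()

    def load_relation(self):
        raise NotImplementedError


class ToyKG(BaseKG):
    """Hypothetical KG with a hard-coded relation table."""

    # Hypothetical metadata; a real dataset would read this from disk.
    _RELATIONS = {
        "P19": {"label": "place of birth", "type": "N-1"},
    }

    def load_relation(self):
        # Populate self.relations with every available relation.
        self.relations = dict(self._RELATIONS)

    def get_input_relation_file(self, relation):
        # Assumed layout: one JSONL file per relation inside data_dir.
        return os.path.join(self.data_dir, f"{relation}.jsonl")

    def get_relation_name(self, relation):
        return self.relations[relation]["label"]

    def get_relation_type(self, relation):
        # One of "1-1", "N-1" or "N-M".
        return self.relations[relation]["type"]


kg = ToyKG("data")
print(kg.get_relation_name("P19"), kg.get_relation_type("P19"))
```

A real dataset class would presumably also need to be registered in certlm/registry.py; the exact mechanism is not shown here.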

Currently supported KGs:

The preprocessed data can be found here.

T-REx

Example of T-REx relation

{
  "relation": "P19",
  "template": "[X] was born in [Y]",
  "label": "place of birth",
  "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
  "type": "N-1"
}
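A relation record like the one above can be sanity-checked before use. The sketch below is not part of the framework: `parse_relation`, `REQUIRED_KEYS` and `RELATION_TYPES` are hypothetical names, and treating all five fields as mandatory is an assumption based on the example. The description string is shortened for readability.

```python
import json

# Assumption: every field in the example record is required.
REQUIRED_KEYS = {"relation", "template", "label", "description", "type"}
RELATION_TYPES = {"1-1", "N-1", "N-M"}


def parse_relation(line):
    """Parse one JSON line and check it has the fields of the example record."""
    record = json.loads(line)
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if record["type"] not in RELATION_TYPES:
        raise ValueError(f"unknown relation type: {record['type']}")
    return record


line = json.dumps({
    "relation": "P19",
    "template": "[X] was born in [Y]",
    "label": "place of birth",
    "description": "birth location (shortened for this example)",
    "type": "N-1",
})
relation = parse_relation(line)
print(relation["relation"], relation["type"])
```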

Extractor

A triplet extractor for a KG should extend the certlm.extractors.BaseExtractor class and implement the following methods:

  • extract_relation(relation, relation_input_file, output): extract the relation data stored in relation_input_file and save it to the output file. In addition to the output file, a relation summary file is created that indexes all subject and object nodes; this index is useful for evaluating N-M relations.
  • get_input_question_files(data_dir, relation): return path to questions and summary extracted from the given relation

Note that the relation info should be standardized to follow the relation_info format shown below for reusability.

The question output is a JSONL file in which each line is a JSON object with the following structure:

{
  "subject_label": subject_label,
  "object_label": object_label,
  "object_uri": object_uri,
  "subject_uri": subject_uri,
  "relation_info": {
    "relation": "P19",
    "template": "[X] was born in [Y]",
    "label": "place of birth",
    "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
    "type": "N-1",
    "subject_symbol": "[X]",
    "object_symbol": "[Y]"
  }
}
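To show how the subject_symbol and object_symbol fields relate to the template, here is a minimal sketch that turns one question record into a statement or a masked statement. `fill_template` and the `[MASK]` token are hypothetical and are not claimed to match what the framework's generators do; the URIs in the sample record are made up.

```python
def fill_template(record, mask_object=False, mask_token="[MASK]"):
    """Substitute the subject (and optionally the object) into the relation template."""
    info = record["relation_info"]
    statement = info["template"].replace(info["subject_symbol"], record["subject_label"])
    # Either reveal the object or hide it behind a mask token (assumed token).
    obj = mask_token if mask_object else record["object_label"]
    return statement.replace(info["object_symbol"], obj)


record = {
    "subject_label": "Albert Einstein",
    "object_label": "Ulm",
    "object_uri": "obj-1",    # made-up URI
    "subject_uri": "subj-1",  # made-up URI
    "relation_info": {
        "relation": "P19",
        "template": "[X] was born in [Y]",
        "label": "place of birth",
        "type": "N-1",
        "subject_symbol": "[X]",
        "object_symbol": "[Y]",
    },
}

print(fill_template(record))
print(fill_template(record, mask_object=True))
```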

The relation summary should have the following structure:

{
  "relation_info":  {
      "relation": "P19",
      "template": "[X] was born in [Y]",
      "label": "place of birth",
      "description": "most specific known (e.g. city instead of country, or hospital instead of city) birth location of a person, animal or fictional character",
      "type": "N-1",
      "subject_symbol": "[X]",
      "object_symbol": "[Y]"
    },
  "node_summary": {
    "objects": {
      "uri1": {
        "object_label": object_label,
        "object_uri": object_uri,
        "subjects": [array of subject uri]
      },
      ...
    },
    "subjects": {
      "uri1": {
        "subject_label": subject_label,
        "subject_uri": subject_uri,
        "objects": [array of object uri]
      },
      ...
    }
  }
}
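The two URI-keyed indexes in the summary can be derived from the question records. The following is a sketch under assumptions, not the framework's extractor code: `build_node_summary` is a hypothetical helper, and the sample URIs and the second subject are made up.

```python
def build_node_summary(records):
    """Index subjects by object URI and objects by subject URI."""
    objects, subjects = {}, {}
    for r in records:
        obj = objects.setdefault(r["object_uri"], {
            "object_label": r["object_label"],
            "object_uri": r["object_uri"],
            "subjects": [],
        })
        obj["subjects"].append(r["subject_uri"])

        subj = subjects.setdefault(r["subject_uri"], {
            "subject_label": r["subject_label"],
            "subject_uri": r["subject_uri"],
            "objects": [],
        })
        subj["objects"].append(r["object_uri"])
    return {"objects": objects, "subjects": subjects}


records = [
    {"subject_label": "Albert Einstein", "subject_uri": "subj-1",
     "object_label": "Ulm", "object_uri": "obj-1"},
    {"subject_label": "Hypothetical Person", "subject_uri": "subj-2",
     "object_label": "Ulm", "object_uri": "obj-1"},
]
summary = build_node_summary(records)
print(summary["objects"]["obj-1"]["subjects"])
```

With such an index, all valid objects for a subject (or subjects for an object) can be looked up in one step, which is what makes N-M relations checkable.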

Example command:

python run_certlm.py --step extract \
    --kg trex \
    --data-dir ./examples/TREx \
    --data-file relations.jsonl \
    --output-dir ./output  

Generator

We support the following generators: template, llm-mask, llm-question. Each generator needs to implement the following methods:

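  • generate_questions(triplet_input_file, relation_summary_file, output_file): generate questions from the extracted triplets and write one JSON record per line to the output file, as in the following excerpt:

import json

# Excerpt: prompt, content, answers, question_id and fout are
# defined by the surrounding generator code.
prompts = [
    {"role": "system", "content": prompt},
    {"role": "user", "content": content},
]
# Each answer is a (uri, label) pair.
expected_answers = [{"uri": a[0], "label": a[1]} for a in answers]
question_record = {
    "question_id": question_id,
    "prompts": prompts,
    "answers": expected_answers,
}
json_record = json.dumps(question_record)
fout.write(json_record + "\n")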
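For a self-contained view of this record format, the sketch below writes one question to an in-memory buffer and reads it back. `write_question` and `read_questions` are hypothetical helpers, and the prompt text, question ID and URI are made up for illustration.

```python
import io
import json


def write_question(fout, question_id, system_prompt, user_content, answers):
    """Write one question as a JSON line; answers are (uri, label) pairs."""
    record = {
        "question_id": question_id,
        "prompts": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_content},
        ],
        "answers": [{"uri": uri, "label": label} for uri, label in answers],
    }
    fout.write(json.dumps(record) + "\n")


def read_questions(fin):
    """Read back every non-empty JSON line as a dict."""
    return [json.loads(line) for line in fin if line.strip()]


buffer = io.StringIO()
write_question(
    buffer,
    "q1",
    "Answer the question.",
    "Where was Albert Einstein born?",
    [("obj-1", "Ulm")],
)
buffer.seek(0)
questions = read_questions(buffer)
print(questions[0]["answers"])
```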

Example command:

python run_certlm.py --step question_generate \
    --kg trex \
    --generator masking \
    --data-dir ./examples/TREx \
    --data-file relations.jsonl \
    --gen-input-dir ./output/tuples 

Citation

If this repo is useful for your own research, please cite us with the following BibTeX entry:

@article{luo2023systematic,
  title={Systematic Assessment of Factual Knowledge in Large Language Models},
  author={Luo, Linhao and Vu, Thuy-Trang and Phung, Dinh and Haffari, Gholamreza},
  journal={Findings of EMNLP},
  year={2023}
}