
Introduces the Hypothetical Terms (HypoTermQA 🥶) dataset to benchmark the hallucination tendency of LLMs.


🥶 HypoTermQA

Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs


📘 Introduction

The HypoTermQA repository contains:

  • 📊 The HypoTermQA benchmarking dataset
  • 💻 Sample code for using the HypoTermQA dataset with LLMs
  • 🧪 Sample code for evaluating the hallucination tendency of LLMs
  • 📜 Sample code for reproducing the paper
  • 🔄 Intermediate results of the dataset generation process
  • 📈 Intermediate results of the LLM evaluation process

📜 Publication

This repository contains the implementation of our research presented in the following paper:

HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs

The paper was presented at EACL SRW 2024. You can see the poster we presented below:

[Poster image]

📝 Citation

Our paper will be published in the proceedings of EACL SRW 2024. The citation details will be updated once the proceedings are published. Please check back for updates.

In the meantime, you can cite our work as follows:

@misc{uluoglakci2024hypotermqa,
  title={HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs},
  author={Cem Uluoglakci and Tugba Taskaya Temizel},
  year={2024},
  howpublished={To appear in the proceedings of EACL SRW 2024}
}

🚀 Usage and Examples

This repository contains several examples that demonstrate different aspects of the HypoTermQA dataset and its usage with LLMs.
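As a rough illustration of what using the dataset with an LLM can look like, the sketch below sends questions to a model served by a local Ollama instance (Ollama is listed under Requirements). It is not the repository's own code. The helper names `build_payload`, `ask`, and `answer_all` are hypothetical, the model name `llama2` is only a placeholder, and the address assumes Ollama's default local port 11434. The questions themselves must come from the dataset files; their field names are not assumed here.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(question, model="llama2"):
    """Build a non-streaming request body for Ollama's /api/generate endpoint.

    The model name is a placeholder; use whichever model has been pulled.
    """
    return {"model": model, "prompt": question, "stream": False}


def ask(question, model="llama2", url=OLLAMA_URL, timeout=120):
    """Send one question to the model and return its text answer."""
    data = json.dumps(build_payload(question, model)).encode("utf-8")
    request = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        # With "stream": False the server replies with a single JSON object
        # whose "response" field holds the generated text.
        return json.loads(response.read().decode("utf-8"))["response"]


def answer_all(questions, ask_fn=ask):
    """Pair each question with the model's answer, keeping the input order."""
    return [{"question": q, "answer": ask_fn(q)} for q in questions]
```

With a server running, `answer_all(questions)` returns one record per question, where `questions` is a list of question strings read from the dataset. Passing a different `ask_fn` makes it possible to try the loop without a model, or to swap in another backend.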

📋 Requirements

  1. Python 3.10.5
  2. Ollama container
  3. MySQL Server
  4. MongoDB
  5. Milvus
  6. PyTorch (https://pytorch.org/get-started/locally/)
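Before running the examples, it can help to confirm that the interpreter and the Python-side dependencies are in place. The following is a minimal sketch, not part of the repository: `check_environment` is a hypothetical helper, it treats any 3.10.x interpreter as matching the pinned 3.10.5 (an assumption), and it only looks for `torch` because that is the one Python package named in the list above. The services (Ollama, MySQL, MongoDB, Milvus) are not checked.

```python
import importlib.util
import sys

# Major and minor version from the requirements list; accepting any 3.10.x
# patch release is an assumption made for this sketch.
REQUIRED_PYTHON = (3, 10)

# Only PyTorch is named in the requirements list; requirements.txt remains
# the authoritative list of packages.
MODULES_TO_CHECK = ("torch",)


def check_environment(modules=MODULES_TO_CHECK, required=REQUIRED_PYTHON):
    """Return a dict reporting the interpreter version and module availability."""
    report = {
        "python_ok": sys.version_info[:2] == required,
        "python_version": ".".join(str(part) for part in sys.version_info[:3]),
    }
    for name in modules:
        # find_spec returns None when a top-level module cannot be located.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for key, value in check_environment().items():
        print(f"{key}: {value}")
```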

🔧 Set Up

Follow these steps to set up the development environment:

  1. Create a virtual environment:

    For Unix or macOS, run:

    python3 -m venv venv

    For Windows, run:

    python -m venv venv
  2. Activate the virtual environment:

    On Unix or macOS, run:

    source venv/bin/activate

    On Windows, run:

    .\venv\Scripts\activate
  3. Install the required packages:

    pip install -r requirements.txt
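The same three steps can also be scripted with Python's standard `venv` module, which may be convenient in automated setups. This is a sketch under assumptions rather than a script shipped with the repository: `venv_python` and `set_up` are hypothetical names, and the environment directory `venv` and the file `requirements.txt` simply mirror the commands above.

```python
import os
import subprocess
import venv
from pathlib import Path


def venv_python(env_dir, os_name=os.name):
    """Return the path of the interpreter inside a virtual environment."""
    env_dir = Path(env_dir)
    if os_name == "nt":  # Windows layout: venv\Scripts\python.exe
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"  # Unix and macOS layout


def set_up(env_dir="venv", requirements="requirements.txt"):
    """Create the environment and install the listed packages into it."""
    # Step 1: create the environment, including pip.
    venv.create(env_dir, with_pip=True)
    python = venv_python(env_dir)
    # Steps 2 and 3: running pip through the environment's own interpreter
    # installs into that environment, so no shell activation is needed.
    subprocess.run(
        [str(python), "-m", "pip", "install", "-r", requirements],
        check=True,
    )
    return python
```

Calling `set_up()` from the repository root creates `venv` and installs the packages. Activation with the commands above is still needed for interactive shell use.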

👋 Contact

If you have any questions or issues, or need support with the project, you can get in touch with us.

Please feel free to report any bugs or issues; we appreciate your feedback!

🙏 Acknowledgements

The computational experiments conducted with open LLMs in this study were fully performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources).