Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs
- Introduction
- Publication
- Citation
- Usage and Examples
- Requirements
- Set Up
- Contact
- Acknowledgements
The HypoTermQA repository contains:
- The HypoTermQA Benchmarking Dataset (View Dataset)
- Sample code for using the HypoTermQA Dataset with LLMs (View Code)
- Sample code for evaluating the hallucination tendency of LLMs (View Code)
- Sample code for reproducing the paper (View Code)
- Intermediate results of the dataset generation process (View Results)
- Intermediate results of the LLM evaluation process (View Results)
This repository contains the implementation of our research presented in the following paper:
HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs
The paper was presented at EACL SRW 2024, where we also presented a poster.
Our paper will be published in the proceedings of EACL SRW 2024. The citation details will be updated once the proceedings are published. Please check back for updates.
In the meantime, you can cite our work as follows:
@misc{uluoglakci2024hypotermqa,
title={HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs},
author={Cem Uluoglakci and Tugba Taskaya Temizel},
year={2024},
howpublished={To appear in the proceedings of EACL SRW 2024}
}
This repository contains several examples that demonstrate different aspects of the HypoTermQA Dataset and its usage with LLMs:
- Using the HypoTermQA Dataset with LLMs: These examples show how to run language models on the HypoTermQA Dataset.
- Evaluating Hallucination Tendency of LLMs: These examples demonstrate how to evaluate the hallucination tendency of language models using our dataset.
- Reproducing the Paper: These examples provide the code necessary to reproduce the hypothetical terms dataset creation process.
- Python 3.10.5
- Ollama container
- MySQL Server
- MongoDB
- Milvus
- PyTorch (https://pytorch.org/get-started/locally/)
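Before running the examples, it can help to confirm that the interpreter and the backing services are available. The following is a minimal sketch, not part of the repository: the helper names (`python_version_ok`, `port_open`, `check_services`) are hypothetical, and the ports are the stock defaults of each service rather than values taken from this project's configuration, so adjust them to your deployment.

```python
import socket
import sys

# Assumption: stock default ports for each service, not values read from
# this repository's configuration. Change them to match your setup.
DEFAULT_PORTS = {
    "Ollama": 11434,
    "MySQL": 3306,
    "MongoDB": 27017,
    "Milvus": 19530,
}


def python_version_ok(required=(3, 10)):
    """Return True if the interpreter's major.minor equals `required`."""
    return sys.version_info[:2] == tuple(required)


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution errors.
        return False


def check_services(host="localhost", ports=DEFAULT_PORTS):
    """Map each service name to whether its port accepts connections."""
    return {name: port_open(host, port) for name, port in ports.items()}


# Example usage (hypothetical host):
#   print("Python 3.10:", python_version_ok())
#   for name, ok in check_services("localhost").items():
#       print(name, "reachable" if ok else "not reachable")
```

An open port only shows that something is listening; it does not verify credentials, schemas, or that the expected models have been pulled into Ollama.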
Follow these steps to set up the development environment:
- Create a virtual environment:
  On Unix or macOS, run:
    python3 -m venv venv
  On Windows, run:
    python -m venv venv
- Activate the virtual environment:
  On Unix or macOS, run:
    source venv/bin/activate
  On Windows, run:
    .\venv\Scripts\activate
- Install the required packages:
    pip install -r requirements.txt
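The same steps can also be scripted, which is convenient in CI or when setting up several machines. The sketch below uses the standard-library `venv` module; the helper names `create_env` and `pip_install_command` are hypothetical and not part of this repository.

```python
import os
import venv
from pathlib import Path


def create_env(target="venv", with_pip=True):
    """Create a virtual environment at `target` and return its path."""
    path = Path(target)
    venv.create(path, with_pip=with_pip)
    return path


def env_python(env_path):
    """Return the path of the interpreter inside the environment."""
    if os.name == "nt":
        return Path(env_path) / "Scripts" / "python.exe"
    return Path(env_path) / "bin" / "python"


def pip_install_command(env_path, requirements="requirements.txt"):
    """Build the command that installs `requirements` into the environment."""
    return [str(env_python(env_path)), "-m", "pip", "install", "-r", requirements]


# Example usage, run from the repository root:
#   import subprocess
#   env = create_env("venv")
#   subprocess.run(pip_install_command(env), check=True)
```

Calling the environment's own interpreter with `-m pip` avoids the activation step, since activation only adjusts the shell's `PATH`.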
If you have questions or need support with the project, you can reach us here:
- GitHub: GitHub Profile
- LinkedIn: LinkedIn Profile
Please feel free to report any bugs or issues. We appreciate your feedback!
The computational experiments conducted with open LLMs in this study were fully performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources).