🥶 HypoTermQA

Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs

📚 Table of contents

📘 Introduction
📜 Publication
📝 Citation
🚀 Usage and Examples
📋 Requirements
🔧 Set Up
👋 Contact
🙏 Acknowledgements

📘 Introduction

The HypoTermQA repository contains:

📊 The HypoTermQA Benchmarking Dataset View Dataset
💻 The sample code to use the HypoTermQA Dataset on LLMs View Code
🧪 The sample code to evaluate hallucination tendency of LLMs View Code
📜 The sample code to reproduce the paper View Code
🔄 Intermediate results of the dataset generation process View Results
📈 Intermediate results of the LLM evaluation process View Results

📜 Publication

This repository contains the implementation of our research presented in the following paper:

HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs

The paper was presented at EACL SRW 2024.

📝 Citation

Our paper will be published in the proceedings of EACL SRW 2024. The citation details will be updated once the proceedings are published. Please check back for updates.

In the meantime, you can cite our work as follows:

@misc{uluoglakci2024hypotermqa,
  title={HypoTermQA: Hypothetical Terms Dataset for Benchmarking Hallucination Tendency of LLMs},
  author={Cem Uluoglakci and Tugba Taskaya Temizel},
  year={2024},
  howpublished={To appear in the proceedings of EACL SRW 2024}
}

🚀 Usage and Examples

This repository contains several examples that demonstrate different aspects of the HypoTermQA Dataset and its usage with LLMs:

💻 Using the HypoTermQA Dataset with LLMs: These examples show how to use the HypoTermQA Dataset with Language Models.
🧪 Evaluating Hallucination Tendency of LLMs: These examples demonstrate how to evaluate the hallucination tendency of Language Models using our dataset.
📜 Reproducing the Paper: These examples provide the code necessary to reproduce the hypothetical dataset creation process described in the paper.
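
As a rough illustration of the evaluation idea, hallucination tendency can be summarized as the share of responses that received a given label. The sketch below is not the repository's evaluation code: the helper `label_rates` and the label names (`valid`, `hallucination`, `irrelevant`) are hypothetical placeholders, and the sample labels are made up for illustration.

```python
from collections import Counter


def label_rates(labels):
    """Return the share of each label in `labels` as a dict of floats."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        # No responses were labeled, so there are no rates to report.
        return {}
    return {label: count / total for label, count in counts.items()}


# Made-up labels for illustration only; these are not results from the paper.
example_labels = ["valid", "hallucination", "hallucination", "irrelevant"]
rates = label_rates(example_labels)
print(rates["hallucination"])  # 0.5
```

In practice the labels would come from the evaluation step in the repository's sample code rather than from a hard-coded list.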

📋 Requirements

Python 3.10.5
Ollama container
MySQL Server
MongoDB
Milvus
PyTorch (https://pytorch.org/get-started/locally/)
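
One way to confirm that the interpreter matches the listed Python version is a small check like the one below. This is only a sketch: the helper name `meets_requirement` is hypothetical, and accepting any 3.10.x release at patch level 5 or higher is an assumption, not something the repository states.

```python
import sys

# Python version listed under Requirements.
REQUIRED = (3, 10, 5)


def meets_requirement(version=None, required=REQUIRED):
    """Return True if `version` shares major.minor with `required`
    and has a patch level at least as high (an assumed policy)."""
    v = tuple(version if version is not None else sys.version_info[:3])
    return v[:2] == required[:2] and v[2] >= required[2]


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "OK" if meets_requirement() else "differs from the listed requirement"
    print(f"Python {current}: {status}")
```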

🔧 Set Up

Follow these steps to set up the development environment:

Create a virtual environment:

For Unix or macOS, run:
```
python3 -m venv venv
```
For Windows, run:
```
python -m venv venv
```
Activate the virtual environment:

On Unix or macOS, run:
```
source venv/bin/activate
```
On Windows, run:
```
.\venv\Scripts\activate
```
Install the required packages:
```
pip install -r requirements.txt
```
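
After installation, a short script can confirm that the virtual environment is active and that a package is importable. This is a minimal sketch, not part of the repository: the helper names `in_virtualenv` and `is_installed` are hypothetical, and `torch` is used only because PyTorch is listed under Requirements.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter is running inside a venv.

    Inside a venv, sys.prefix points at the environment while
    sys.base_prefix points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    print("Virtual environment active:", in_virtualenv())
    print("torch importable:", is_installed("torch"))
```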

👋 Contact

If you have questions, run into issues, or need support with the project, you can get in touch with us:

GitHub: GitHub Profile
LinkedIn: LinkedIn Profile

Please feel free to report any bugs or issues; we appreciate your feedback!

🙏 Acknowledgements

The computational experiments conducted with open LLMs in this study were fully performed at TUBITAK ULAKBIM, High Performance and Grid Computing Center (TRUBA resources).

cemuluoglakci/HypoTermQA