For building models only
- The Llama 2 model requires access permission to its repository. Hence, you need a Hugging Face account with an access token (can be created here). Fill out this Meta-AI form and request permission to the models here.
- The Alpaca model requires a ChatNoir API token, which can be requested here.
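Access to gated Hugging Face repositories works by sending the token as a bearer token with each API request. A minimal sketch of what happens under the hood (the environment variable name `HF_TOKEN` is an assumption; `make auth` may store the token elsewhere):

```python
import os

def hf_auth_headers(env_var: str = "HF_TOKEN") -> dict:
    """Build the Authorization header Hugging Face expects for gated repos."""
    token = os.environ.get(env_var)
    if not token:
        raise RuntimeError(f"Set {env_var} to your Hugging Face access token")
    return {"Authorization": f"Bearer {token}"}
```

In practice, libraries such as `huggingface_hub` pick the stored token up automatically; this only illustrates the mechanism.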
Then

```shell
make auth
```

This will also download the dataset.
```shell
make clean install
git submodule init
git submodule update
```
- Configure which dataset, model, and prompt type should be used:

```shell
make configure  # creates run.yml
```
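The generated `run.yml` bundles the choices above. Its exact schema is defined by `make configure`; the fragment below is only an illustration of the kind of settings it holds (all keys and values are assumptions, not the actual file):

```yaml
# hypothetical run.yml — the actual keys come from `make configure`
dataset: corpus-webis-follow-up-questions-24
model: llama2
prompt_type: zero-shot
```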
- Activate the virtual environment and run the experiment:

```shell
source venv/bin/activate
python src/python/generate_followup_questions.py --config run.yml
```
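The `--config run.yml` pattern can be sketched as follows. This is a hypothetical minimal loader, not the repository's actual code; the script's real option handling may differ:

```python
import argparse

def parse_config_path(argv=None) -> str:
    """Parse the --config option the experiment scripts use."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", required=True, help="path to run.yml")
    args = parser.parse_args(argv)
    return args.config
```

For example, `parse_config_path(["--config", "run.yml"])` returns `"run.yml"`.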
This computation may take a while depending on your hardware. A GPU is preferred for this experiment.
```shell
source venv/bin/activate
python src/python/compute_automatic_comparison.py
```

```shell
source venv/bin/activate
python src/python/compute_human_assessment.py
```

```shell
source venv/bin/activate
python src/python/compute_user_model.py
```

```shell
source venv/bin/activate
python src/python/compute_leading_bigrams_frequency.py
```
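A leading-bigram frequency here presumably means how often each opening two-word sequence starts a generated question. A minimal sketch of that computation (the actual script may tokenize and normalize differently):

```python
from collections import Counter

def leading_bigram_frequencies(questions):
    """Count the first two whitespace tokens of each question (lowercased)."""
    counts = Counter()
    for q in questions:
        tokens = q.lower().split()
        if len(tokens) >= 2:
            counts[(tokens[0], tokens[1])] += 1
    return counts
```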
```shell
# setup
Rscript -e 'dir.create(Sys.getenv("R_LIBS_USER"), showWarnings=FALSE);install.packages("irr", lib=Sys.getenv("R_LIBS_USER"))'
# run
cat data/corpus-webis-follow-up-questions-24/simulation-annotations.json.gz \
  | gunzip \
  | python3 src/python/parse-label-studio-human-assessment-for-kappa.py /dev/stdin \
  | sed 's/not_generic/specific/' \
  > data/simulation-single-annotations.tsv
./src/r/kappa.R data/simulation-single-annotations.tsv
```
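The R script relies on the `irr` package; for intuition, Cohen's kappa for two annotators can be sketched in a few lines. This is a simplified illustration, not the repository's implementation:

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    assert len(labels_a) == len(labels_b) and labels_a
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)
```

For instance, with labels `["specific", "generic", "specific", "generic"]` and `["specific", "generic", "generic", "generic"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.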
Edit `src/python/save.py`, then run it with the Hugging Face offline flags set:

```shell
CUDA_VISIBLE_DEVICES="" HF_DATASETS_OFFLINE=1 TRANSFORMERS_OFFLINE=1 python src/python/save.py
```
In llama.cpp:

```shell
python convert-hf-to-gguf.py <model directory>
```