LAMB: Location-Aware Modular Bi-encoder for Tourism Question Answering

This repo provides the source code & data of our paper: Location-Aware Modular Bi-encoder for Tourism Question Answering (AACL 2023).

Dependencies

  • python>=3.6
  • torch>=1.11.0
  • transformers>=4.19.2
  • datasets>=2.2.2
  • huggingface-hub>=0.6.0
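
For example, assuming a working Python environment with pip, the dependencies can be installed as:

    pip install "torch>=1.11.0" "transformers>=4.19.2" "datasets>=2.2.2" "huggingface-hub>=0.6.0"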

Quick Start for Training and Inference

Download the data

You can download the model parameters and dataset from here. Replace the repository's entire data folder with the downloaded one.
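
For reference, the training command below expects the downloaded folder to contain at least the pretrained location module at the path passed to --l_encoder (the remaining contents come with the download):

    data
    └── loc_module
        └── loc_2layer.pth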

Training

To train the LAMB model, enter the src directory and run LAMB_Exec.py.

The example command below trains the model with a negative size of 8 and an effective batch size of 8 (a per-step batch size of 4 with 2 gradient accumulation steps); this configuration fits on a single Tesla V100 16 GB GPU. To reproduce the best results reported in the paper, use a 32 GB or 40 GB GPU with the corresponding experimental setup.

    python LAMB_Exec.py \
        --q_encoder distilbert-base-uncased \
        --e_encoder distilbert-base-uncased \
        --l_encoder ../data/loc_module/loc_2layer.pth \
        --data_dir ../data \
        --location_module text \
        --batch_size 4 \
        --gradient_accumulation_steps 2 \
        --samples_per_qa 8 \
        --hard_negatives_per_qa 5 \
        --s1_train_epochs 5 \
        --s2_train_epochs 5 \
        --output_dir ../output
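
For intuition, below is a minimal sketch of how a bi-encoder scores a question against a candidate answer passage using the distilbert-base-uncased encoders named above. It is illustrative only: mean pooling and dot-product scoring are common bi-encoder choices rather than necessarily the paper's exact setup, the QA pair is hypothetical, and LAMB additionally incorporates the location module. See src/LAMB_Exec.py for the actual implementation.

    import torch
    from transformers import AutoTokenizer, AutoModel

    # Question and entity encoders share the same base model here,
    # matching the command above.
    tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
    q_encoder = AutoModel.from_pretrained("distilbert-base-uncased")
    e_encoder = AutoModel.from_pretrained("distilbert-base-uncased")

    def embed(encoder, text):
        # Mean-pool the token embeddings into one vector (a common
        # bi-encoder pooling choice; the paper's pooling may differ).
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        with torch.no_grad():
            hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, dim)
        mask = inputs["attention_mask"].unsqueeze(-1)     # (1, seq_len, 1)
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

    # Hypothetical tourism QA pair, for illustration only.
    q_vec = embed(q_encoder, "Which museums near the harbour are free to enter?")
    e_vec = embed(e_encoder, "The Maritime Museum by the harbour offers free admission.")
    score = (q_vec * e_vec).sum().item()  # dot-product relevance score
    print(score)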