Code to support Kent K. Chang, Danica Chen, David Bamman, “Dramatic Conversation Disentanglement.” ACL 2023 Findings.
N.B. the code is not thoroughly cleaned. Please contact kentkchang [at] berkeley.edu if you have any questions.
pip install -r requirements.txt
- BERT baseline
accelerate launch --config_file accelerate_config.yaml train_baseline.py \
--encoder_name DialogueLineEncoder \
--epochs 6 \
--batch_size 4 \
--use_tqdm True \
--train_file train.tsv \
--dev_file dev.tsv \
--model_name bert-base-cased \
--num_negative_examples 5
- 6-way classifier, based on Structural Characterization for Dialogue Disentanglement (Ma et al., ACL 2022) -- original implementation here
accelerate launch --config_file accelerate_config.yaml train_4DD.py \
--epochs 10 \
--batch_size 2 \
--train_file train.tsv \
--dev_file dev.tsv \
--max_previous_utterance 6 \
--model_name bert-base-cased \
--model_output output \
--log_output log \
--use_tqdm True
- Featurized
accelerate launch --config_file accelerate_config.yaml train_linear.py \
--encoder_name LogisticRegression \
--epochs 6 \
--batch_size 4 \
--train_file train.tsv \
--dev_file dev.tsv \
--model_name bert-base-cased \
--num_negative_examples 5
- Multi-task
accelerate launch --config_file accelerate_config.yaml train_multitask.py \
--encoder_name MultitaskDialogueEncoder \
--epochs 10 \
--batch_size 4 \
--train_file train.tsv \
--dev_file dev.tsv \
--model_name bert-base-cased \
--cluster_weight 0.5 \
--num_negative_examples 5
Download the trained model (6-way classifier, 4DD
) here and put dcd_pytorch_model-01062023-221722-epoch4.bin
in trained_models
.
accelerate launch --config_file accelerate_config_2proc.yaml run_test-take5-inference_4DD.py \
--batch_size 12 \
--max_previous_utterance 6 \
--model_path dcd_pytorch_model-01062023-221722-epoch4.bin \
--model_folder trained_models \
--model_name bert-base-cased \
--test_folder data/ \
--preds_output_folder output/