# Part-of-speech tagging using BERT
## Download BERT models

```
./scripts/getmodels.sh
```
## Experiment with FinBERT cased and TDT data

```
MODELDIR="models/bert-base-finnish-cased"
DATADIR="data/tdt"

python3 train.py \
    --vocab_file "$MODELDIR/vocab.txt" \
    --bert_config_file "$MODELDIR/bert_config.json" \
    --init_checkpoint "$MODELDIR/bert-base-finnish-cased" \
    --data_dir "$DATADIR" \
    --learning_rate 5e-5 \
    --num_train_epochs 3 \
    --predict test \
    --output pred.tsv

python3 scripts/mergepos.py "$DATADIR/test.conllu" pred.tsv > pred.conllu
python3 scripts/conll18_ud_eval.py -v "$DATADIR/gold-test.conllu" pred.conllu
```
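The merge step writes the predicted tags from `pred.tsv` back into the UPOS column of the tokenized CoNLL-U file. A minimal sketch of that operation, assuming one tag per basic token line (the actual `scripts/mergepos.py` may differ; `merge_pos` is an illustrative name):

```python
def merge_pos(conllu_lines, pred_tags):
    """Replace the UPOS column (field 4) of basic token lines with predicted tags.

    Comment lines, blank sentence separators, multiword-token ranges (e.g. 1-2),
    and empty nodes (e.g. 1.1) are passed through unchanged, since they carry
    no single predicted tag.
    """
    tags = iter(pred_tags)
    out = []
    for line in conllu_lines:
        if line.startswith('#') or not line.strip():
            out.append(line)
            continue
        fields = line.rstrip('\n').split('\t')
        if fields[0].isdigit():          # basic token line: plain integer ID
            fields[3] = next(tags)       # UPOS is the 4th CoNLL-U column
        out.append('\t'.join(fields))
    return out
```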
## Manually annotated data

(A small part of this data is found in `data/ud-treebanks-v2.2/`.)

```
curl --remote-name-all https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-2837/ud-treebanks-v2.2.tgz
tar xvzf ud-treebanks-v2.2.tgz
```
## Predictions from CoNLL'18 participants

(A small part of this data is found in `data/official-submissions/`.)

```
curl --remote-name-all https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11234/1-2885/conll2018-test-runs.tgz
tar xvzf conll2018-test-runs.tgz
```
## Evaluation script

```
wget https://universaldependencies.org/conll18/conll18_ud_eval.py \
    -O scripts/conll18_ud_eval.py
```
## Gold data

```
for t in tdt ftb pud; do
    mkdir data/$t
    for f in data/ud-treebanks-v2.2/*/fi_${t}-ud-*.conllu; do
        s=$(echo "$f" | perl -pe 's/.*\/.*-ud-(.*)\.conllu/$1/')
        egrep '^([0-9]+'$'\t''|[[:space:]]*$)' $f | cut -f 2,4 \
            > data/$t/$s.tsv
    done
    cut -f 2 data/$t/test.tsv | egrep -v '^[[:space:]]*$' | sort | uniq \
        > data/$t/labels.txt
    mv data/$t/test.tsv data/$t/gold-test.tsv
    cp data/ud-treebanks-v2.2/*/fi_${t}-ud-test.conllu data/$t/gold-test.conllu
done
```
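The `egrep`/`cut` pipeline above keeps only basic token lines (plain integer IDs) and blank sentence separators, then extracts the FORM and UPOS columns. The same filtering expressed in Python, as an illustrative sketch rather than repo code:

```python
import re

def conllu_to_tsv(lines):
    """Mirror the egrep/cut pipeline: keep basic token lines and blank
    sentence separators, emitting FORM<TAB>UPOS per token.

    Comment lines, multiword-token ranges (1-2), and empty nodes (1.1)
    are dropped, exactly as the '^[0-9]+\\t' pattern does.
    """
    out = []
    for line in lines:
        if not line.strip():
            out.append('')               # blank line = sentence boundary
        elif re.match(r'^\d+\t', line):  # plain integer ID followed by a tab
            f = line.rstrip('\n').split('\t')
            out.append(f'{f[1]}\t{f[3]}')
    return out
```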
PUD doesn't have train and dev sets; use TDT's instead:

```
for s in train dev; do
    cp data/tdt/$s.tsv data/pud
done
```
## Test data with predicted tokens

```
for t in tdt ftb pud; do
    cp data/official-submissions/Uppsala-18/fi_$t.conllu data/$t/test.conllu
    egrep '^([0-9]+'$'\t''|[[:space:]]*$)' data/$t/test.conllu \
        | cut -f 2 | perl -pe 's/(\S+)$/$1\tX/' > data/$t/test.tsv
done
```
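The `perl` one-liner appends a tab and a placeholder `X` tag to every token line, since the tagger's input format expects a label column even at prediction time; blank sentence separators are left untouched. A Python equivalent (illustrative, not part of the repo):

```python
def add_dummy_labels(token_lines):
    """Append a placeholder 'X' tag to every non-blank token line,
    matching perl -pe 's/(\\S+)$/$1\\tX/'."""
    return [f'{t}\tX' if t.strip() else t for t in token_lines]
```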
Best UPOS result for each Finnish treebank in CoNLL'18, from https://universaldependencies.org/conll18/results-upos.html:

- fi_ftb: 1. HIT-SCIR (Harbin): 96.70
- fi_pud: 1. LATTICE (Paris): 97.65
- fi_tdt: 1. HIT-SCIR (Harbin): 97.30
The scripts run here are specific to a particular Slurm system configuration. You will need to edit them to match your setup if you want to rerun this.

```
./slurm/run-parameter-selection.sh
python3 slurm/select_params.py logs/*.out | cut -f 1-12 > slurm/selected-params.tsv
./slurm/run-selected-params.sh
python3 slurm/summarize_test.py logs/*.out | cut -f 2,4,11-14 > results.tsv
```
This should give approximately the following results:

| Model           | Corpus | Mean  |
|-----------------|--------|-------|
| FinBERT cased   | FTB    | 98.39 |
| FinBERT uncased | FTB    | 98.28 |
| biBERT70        | FTB    | 98.30 |
| biBERT80        | FTB    | 98.16 |
| M-BERT cased    | FTB    | 95.87 |
| M-BERT uncased  | FTB    | 96.00 |
| FinBERT cased   | PUD    | 98.08 |
| FinBERT uncased | PUD    | 97.94 |
| biBERT70        | PUD    | 98.08 |
| biBERT80        | PUD    | 98.07 |
| M-BERT cased    | PUD    | 97.58 |
| M-BERT uncased  | PUD    | 97.48 |
| FinBERT cased   | TDT    | 98.23 |
| FinBERT uncased | TDT    | 98.12 |
| biBERT70        | TDT    | 98.17 |
| biBERT80        | TDT    | 98.14 |
| M-BERT cased    | TDT    | 96.97 |
| M-BERT uncased  | TDT    | 96.59 |
In `scripts/`:

- `venv.sh`: creates the virtual environment on puhti.

In `slurm/`:

- `batch-dev-pos.sh`: runs more than one set of parameters per sbatch submission, so that time is not wasted queuing.
- `batch-run-parameter-selection.sh`: used for submitting jobs for `batch-dev-pos.sh`.
- `print.sh`: prints a file with all the parameters to be searched, one line per job. Used by `batch-run-parameter-selection.sh`.
- `summarize.py`: function `batch_read_logs` added for processing files where all the results have been collected in one file.
- `select_params.py`: changed to use `batch_read_logs` instead of `read_logs`.
- `summarize_test.py`: changed to use `batch_read_logs` instead of `read_logs`.
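The idea behind `batch_read_logs` can be sketched as follows. The log-line format here (whitespace-separated `key=value` fields after a marker such as `DEV-RESULT`) is an assumption for illustration; the actual format in the stdout logs may differ:

```python
def batch_read_logs(lines, marker='DEV-RESULT'):
    """Yield one record per result line in a file where the results of many
    runs have been concatenated, skipping everything that lacks the marker.

    Assumes 'key=value' fields after the marker (illustrative format).
    """
    for line in lines:
        if marker in line:
            fields = line.split(marker, 1)[1].split()
            yield {k: v for k, v in
                   (f.split('=', 1) for f in fields if '=' in f)}
```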
- Parameter search on the dev set
  - Use `print.sh` to print out all the combinations.
  - Use `batch-dev-pos.sh` and `batch-run-parameter-selection.sh` to run the experiments.
- Collecting results
  - Grep for 'DEV-RESULT' and 'accuracy' in the stdout log files and collect the results into a TSV file, `dev-results.tsv`.
  - Use `select_params.py` to select the parameters for the test set:

    ```
    python3 slurm/select_params.py slurm/dev-results.tsv | cut -f 1-12 > slurm/selected-params.tsv
    ```

- Run the selected parameters on the test set
  - Use `batch-test-pos.sh` and `batch-run-selected-params.sh`.
- Organizing the results
  - Grep for 'TEST-RESULT':

    ```
    grep -h 'TEST-RESULT' logs/*.out > delme.test-result
    grep biBERT70 delme.test-result > delme.70
    python3 slurm/summarize_test.py delme.70
    rm delme.*
    ```
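`summarize_test.py` aggregates those TEST-RESULT records per model, e.g. into the mean accuracies tabulated above. A grouping-and-averaging sketch of that step (the record field names `model` and `accuracy` are assumptions; the real script is not shown here):

```python
from collections import defaultdict

def mean_accuracy_by_model(records):
    """Group result records by model name and average their accuracies."""
    groups = defaultdict(list)
    for r in records:
        groups[r['model']].append(float(r['accuracy']))
    return {model: sum(accs) / len(accs) for model, accs in groups.items()}
```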
Batch size 20 with 4 epochs runs out of memory (requires 6 GB of memory).

Check for out-of-memory errors:

```
grep -i 'status: out of' logs/batch-*.err
```
Parameter search on the dev set: jobs are batched according to the parameter combinations `{16,20}{biBERT70,biBERT80}{[23],4}` (batch size, model, epochs).