Grey literature answer quality / user reputation measurement with BERT and DistilBERT.
- `--data_dir` and `--labels` are always required.
- Default sequence is `'TQA'`. Use `--sequence` to change.
- Default model is `'bert'`. Use `--model` to change.
- Default device is `'cpu'`. Use `--device` to change.
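
The required flags and defaults listed above could be declared roughly as follows. This is a minimal sketch using `argparse`, not the actual contents of `main.py`: the function name `build_parser` and the help strings are assumptions made for illustration, and other options that `main.py` accepts (such as `--crop`) are left out because their semantics are not described here.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags and defaults listed above."""
    parser = argparse.ArgumentParser(
        description="Answer quality / user reputation measurement (sketch)."
    )
    # Always required.
    parser.add_argument("--data_dir", required=True, help="directory holding the data")
    parser.add_argument("--labels", required=True, help="name of the label column")
    # Optional, with the documented defaults.
    parser.add_argument("--sequence", default="TQA", help="input sequence type")
    parser.add_argument("--model", default="bert", help="model to use")
    parser.add_argument("--device", default="cpu", help="device to run on")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs standalone.
args = build_parser().parse_args(
    ["--data_dir=data/dp", "--labels=sum_class", "--device=cuda"]
)
print(args.sequence, args.model, args.device)  # prints: TQA bert cuda
```

Omitting `--data_dir` or `--labels` makes `argparse` exit with a usage error, which matches the "always required" behaviour described above.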
To see all arguments and options:
python3 main.py --help
Example run:
python3 main.py --data_dir='data/dp' --labels='sum_class' --device='cuda' --crop=0.25
⚠️ Here, the directory passed as `--data_dir` must include `raw.csv`, which will be divided into train, dev and test sets and stored under `data/dp/TQA` (since the default sequence is `'TQA'`).
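
To make the expected layout concrete, the split step can be pictured along these lines. This is only an illustrative sketch using the standard library, not the repository's actual splitting code: the function name `split_raw_csv`, the 80/10/10 proportions, the fixed seed, and the output file names `train.csv`, `dev.csv` and `test.csv` are all assumptions.

```python
import csv
import random
from pathlib import Path


def split_raw_csv(data_dir, sequence="TQA", dev_frac=0.1, test_frac=0.1, seed=0):
    """Split <data_dir>/raw.csv into train/dev/test files under <data_dir>/<sequence>.

    Hypothetical helper: fractions, seed and output file names are assumptions.
    """
    data_dir = Path(data_dir)
    with open(data_dir / "raw.csv", newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)  # keep the header row for every split
        rows = list(reader)

    # Shuffle deterministically so repeated runs give the same split.
    random.Random(seed).shuffle(rows)
    n_dev = int(len(rows) * dev_frac)
    n_test = int(len(rows) * test_frac)
    splits = {
        "dev": rows[:n_dev],
        "test": rows[n_dev:n_dev + n_test],
        "train": rows[n_dev + n_test:],
    }

    out_dir = data_dir / sequence
    out_dir.mkdir(parents=True, exist_ok=True)
    for name, part in splits.items():
        with open(out_dir / f"{name}.csv", "w", newline="", encoding="utf-8") as f:
            writer = csv.writer(f)
            writer.writerow(header)
            writer.writerows(part)

    # Report how many rows ended up in each split.
    return {name: len(part) for name, part in splits.items()}
```

Calling `split_raw_csv("data/dp")` under these assumptions would write the three files into `data/dp/TQA` and return the row count of each split.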
This is a modified version of the code in https://github.com/isspek/west_iyte_plausability_news_detection