models/bart-mq: finetuned version of facebook/bart-large-mnli on data/training.csv
models/deberta-mq: finetuned version of microsoft/deberta-large-mnli on data/training.csv
models/bart-adj: finetuned version of models/bart-mq on data/training-adj.csv
models/deberta-adj: finetuned version of models/deberta-mq on data/training-adj.csv
Creating the models
source .venv/bin/activate

# facebook/bart-large-mnli and microsoft/deberta-large-mnli will automatically
# be downloaded from huggingface.co when used

# models/bart-mq
python -m sarn.train --output-dir "models/bart-mq" --log-dir "logs/bart-mq" "facebook/bart-large-mnli" "data/training.csv"

# models/deberta-mq
python -m sarn.train --output-dir "models/deberta-mq" --log-dir "logs/deberta-mq" "microsoft/deberta-large-mnli" "data/training.csv"

# models/bart-adj
python -m sarn.train --output-dir "models/bart-adj" --log-dir "logs/bart-adj" "facebook/bart-large-mnli" "data/training-adj.csv"

# models/deberta-adj
python -m sarn.train --output-dir "models/deberta-adj" --log-dir "logs/deberta-adj" "microsoft/deberta-large-mnli" "data/training-adj.csv"
Model statistics
Accuracy on each evaluation set:

| Model | data/evaluation.csv | data/evaluation-adj.csv |
| --- | --- | --- |
| facebook/bart-large-mnli | 65.25% | 40.97% |
| microsoft/deberta-large-mnli | 71.19% | 47.22% |
| models/bart-mq | 57.63% | 34.72% |
| models/deberta-mq | 61.86% | 34.72% |
| models/bart-adj | 45.76% | 58.33% |
| models/deberta-adj | 42.37% | 57.64% |
ROC curves
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds.
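As a rough illustration of what "all classification thresholds" means, the sketch below sweeps a threshold over a set of scores and records the false positive rate and true positive rate at each step. The function name `roc_points` and the example labels and scores are made up for this illustration; this is not the code used to produce this project's curves.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, one per threshold.

    labels: 1 for positive examples, 0 for negative examples.
    scores: model scores, where a higher score means "more likely positive".
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative example")

    # Start at (0, 0): a threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    # Lowering the threshold past each distinct score admits more positive predictions.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


# Hypothetical data, for illustration only.
example_labels = [1, 1, 0, 1, 0, 0]
example_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
example_curve = roc_points(example_labels, example_scores)
for fpr, tpr in example_curve:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The curve always starts at (0, 0) and ends at (1, 1), because the lowest threshold classifies every example as positive.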
[...]
AUC stands for "Area under the ROC Curve." That is, AUC measures the entire two-dimensional area underneath the entire ROC curve (think integral calculus) from (0,0) to (1,1).
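One common way to approximate that area is the trapezoidal rule applied to the ROC points. The following is a minimal sketch under that assumption; `auc_trapezoid` and the sample points are hypothetical names and values chosen for illustration.

```python
def auc_trapezoid(points):
    """Approximate the area under a curve given as (x, y) points.

    For an ROC curve, x is the false positive rate and y is the true positive rate.
    """
    ordered = sorted(points)  # sort by x, then y, so the curve runs left to right
    area = 0.0
    for (x0, y0), (x1, y1) in zip(ordered, ordered[1:]):
        # Area of the trapezoid between two neighbouring points.
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical ROC points running from (0, 0) to (1, 1).
sample_curve = [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
sample_auc = auc_trapezoid(sample_curve)
print(f"AUC = {sample_auc:.2f}")
```

A diagonal line from (0, 0) to (1, 1) gives an area of 0.5, which corresponds to random guessing, and a curve that hugs the top-left corner approaches 1.0.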
[...]
AUC provides an aggregate measure of performance across all possible classification thresholds. One way of interpreting AUC is as the probability that the model ranks a random positive example more highly than a random negative example.
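The ranking interpretation can be checked directly by comparing every positive example against every negative example. The sketch below does this by brute force, counting ties as half a win (a common convention, assumed here); `auc_pairwise` and the data are hypothetical and only meant to make the interpretation concrete.

```python
def auc_pairwise(labels, scores):
    """Fraction of (positive, negative) pairs in which the positive scores higher."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive example ranked above the negative one
            elif p == n:
                wins += 0.5   # assumption: a tie counts as half a win
    return wins / (len(pos) * len(neg))


# Hypothetical data, for illustration only.
pair_labels = [1, 1, 0, 1, 0, 0]
pair_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
pair_auc = auc_pairwise(pair_labels, pair_scores)
print(f"AUC = {pair_auc:.3f}")
```

Here 8 of the 9 positive/negative pairs are ordered correctly, so the result is 8/9. The brute-force loop is quadratic in the number of examples, which is fine for small evaluation sets but slow for large ones.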