Automatic essay scoring using BERT embeddings
You can install the dependencies using Poetry:
$ poetry install
Alternatively, use the Docker image / Dockerfile. The Docker image has been tested mainly under Singularity.
Input data is a JSON array of objects [{...}, {...}] with one object per essay. Each object should have at least the following keys (see the example below):
- "essay": an array of strings, one per line of the essay, e.g. ["Lorem ipsum --", "dolor"]
- "lab_grade": the grade as a string, e.g. "3"
You can run the Snakefile with Snakemake. It should work locally with:
$ snakemake all -C RAW_DATASETS=/path/to/raw_datasets/
If using SLURM + Singularity, you can use singslurm2. You will need to make a configuration file for your cluster, clusc.json (a sketch is given after the command below), and then run:
CLUSTER_CONFIG=`pwd`/clusc.json \
SIF_PATH=/path/to/my.sif \
SNAKEFILE=/finnessayscore/workflow/Snakefile \
RESTART_TIMES=0 \
$SINGSLURM2/entrypoint.sh \
--use-singularity \
--singularity-args '"--nv"' \
all \
-C RAW_DATASETS=/path/to/raw_datasets/
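A minimal clusc.json might look like the following sketch, which assumes singslurm2 consumes a standard Snakemake cluster configuration; the resource keys and values here are illustrative and depend on your cluster:

{
  "__default__": {
    "time": "02:00:00",
    "mem": "16G",
    "partition": "gpu",
    "gres": "gpu:1"
  }
}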
Convert TKP2 exam data to the JSON format with:
$ python -m finnessayscore.process_tkp tkp.xls tkp.json
Some models need parsed data. In this case, further preprocessing should be done like so:
$ python -m finnessayscore.parse example.json example_parse.json
You will need to provide the grading scale of your dataset as a pickle file.
You can generate some standard grading scales with finnessayscore.mk_grade_pickle, e.g. for the TKP2 20-point scale:
$ python -m finnessayscore.mk_grade_pickle outof20 outof20.pkl
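The pickle just stores the grading scale, so you can sanity-check a generated file from a Python shell (the exact structure is whatever finnessayscore.mk_grade_pickle wrote):

$ python
>>> import pickle
>>> with open("outof20.pkl", "rb") as f:
...     print(pickle.load(f))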
Training:
$ python -m finnessayscore.train \
--epochs 1 \
--batch_size=5 \
--model_type whole_essay \
--data_dir /path/to/datadir
A confusion matrix and scores on the validation set are printed at the end of training.
View results on TensorBoard:
$ tensorboard --logdir lightning_logs/ --port <port_number>
Get explanation JSONs, using for example the TKP2 dataset:
$ python -m finnessayscore.explain.explain_trunc \
--gpu \
--model_type pedantic_trunc_essay_ord \
--class_nums /path/to/outof20.pkl \
--load_checkpoint /path/to/out/checkpoint.ckpt \
--data_dir /path/to/tkp2_exam.json
Optionally, you can use --exclude_upos to specify parts of speech to be put in the reference, i.e. ignored, by LIG (Layer Integrated Gradients). Commonly this would be PUNCT. In this case you must give a data_dir which contains dependency parse information.
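For example, assuming --exclude_upos takes UPOS tag names and the input has been run through finnessayscore.parse (the parsed file name here is hypothetical):

$ python -m finnessayscore.explain.explain_trunc \
--gpu \
--model_type pedantic_trunc_essay_ord \
--class_nums /path/to/outof20.pkl \
--load_checkpoint /path/to/out/checkpoint.ckpt \
--data_dir /path/to/tkp2_exam_parsed.json \
--exclude_upos PUNCT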
You can then view the explanations by modifying the explain-trunc.ipynb Jupyter notebook.