SOME: Reference-less Sub-Metrics Optimized for Manual Evaluations of Grammatical Error Correction
Paper: https://www.aclweb.org/anthology/2020.coling-main.573.pdf
- Python >= 3.6.0
- Pytorch >= 1.3.1
- transformers >= 3.0.2
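The README does not give an install command, so the following is just a typical setup matching the versions above:
pip install "torch>=1.3.1" "transformers>=3.0.2"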
- Download the trained models here.
- These models are trained on the TMU-GFM-Dataset.
These models can be downloaded by:
bash prepare.sh
tree gfm-models
# gfm-models
# ├── fluency
# │   ├── config.json
# │   ├── pytorch_model.bin
# │   ├── special_tokens_map.json
# │   ├── tokenizer_config.json
# │   ├── training_args.bin
# │   └── vocab.txt
# ├── grammer
# │   ├── ...
# └── meaning
#     ├── ...
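Each directory is a standard transformers checkpoint (the vocab.txt suggests a BERT-style model), so a single sub-model can presumably be loaded directly. A minimal sketch, which SOME_Wrapper below otherwise handles internally:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Hypothetical direct load of the fluency regressor for inspection.
tokenizer = AutoTokenizer.from_pretrained('gfm-models/fluency')
model = AutoModelForSequenceClassification.from_pretrained('gfm-models/fluency')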
from some_wrapper import SOME_Wrapper
some = SOME_Wrapper(
    g_dir='gfm-models/grammer',
    f_dir='gfm-models/fluency',
    m_dir='gfm-models/meaning',
    batch_size=5,
    weight_g=0.55,  # grammaticality weight
    weight_f=0.43,  # fluency weight
    weight_m=0.02   # meaning-preservation weight
)
srcs = [  # source (uncorrected) sentences; the errors are intentional
    'This is a sample sentence .',
    'This is an another sample sentene .'
]
trgs = [  # candidate corrections to be scored against the sources
    'This a is sample sentence .',
    'This is another sample sentence .'
]
scores = some.score(srcs, trgs)
print(scores) # [0.7722907622655234, 0.9522199455897014]
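The weights above imply that the final score is a weighted sum of the three sub-metric scores. A minimal sketch of that combination with made-up sub-scores (combine_scores is a hypothetical helper; the actual computation happens inside SOME_Wrapper.score):

def combine_scores(g, f, m, weight_g=0.55, weight_f=0.43, weight_m=0.02):
    # Illustrative only: each sub-model scores a (source, correction)
    # pair, and the three sub-scores are combined linearly.
    return weight_g * g + weight_f * f + weight_m * m

print(combine_scores(g=0.9, f=0.95, m=0.99))  # ~0.9233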
python some.py [hypothesis file] [source file] \
--g-dir [directory path of grammar model] \
--f-dir [directory path of fluency model] \
--m-dir [directory path of meaning model] > predict_score
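For example, with hypothetical file names hyp.txt and src.txt:
python some.py hyp.txt src.txt \
--g-dir gfm-models/grammer \
--f-dir gfm-models/fluency \
--m-dir gfm-models/meaning > predict_score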
More options can be found with python some.py -h.
The default weights of the three models are tuned with Kendall's tau on Grundkiewicz et al. (2015).
Details can be found in the paper.
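As an illustration of that kind of tuning (a sketch, not the paper's exact procedure), one could grid-search weights that maximize Kendall's tau against human scores, assuming scipy is available and tune_weights is a hypothetical helper:

from scipy.stats import kendalltau

def tune_weights(g_scores, f_scores, m_scores, human_scores, step=0.01):
    # Illustrative grid search over non-negative weights summing to 1.
    best_tau, best_weights = -1.0, None
    n = int(round(1 / step))
    for i in range(n + 1):
        for j in range(n + 1 - i):
            wg, wf = i * step, j * step
            wm = 1.0 - wg - wf
            combined = [wg * g + wf * f + wm * m
                        for g, f, m in zip(g_scores, f_scores, m_scores)]
            tau, _ = kendalltau(combined, human_scores)
            if tau > best_tau:
                best_tau, best_weights = tau, (wg, wf, wm)
    return best_weights, best_tau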