This repository provides a script for integrated evaluation of Natural Language Generation (NLG) metrics.
It builds on components of the CommonGen evaluation function, and you can modify the concept settings as needed.
NLG_metrics/
├── BARTScore
├── BERTScore
├── BLEU
├── CIDEr
├── METEOR
├── ROUGE
├── SPICE
├── result
├── test
├── Dockerfile # Docker execution file
├── install.sh # Script to install required packages
├── similarity.py # Integrated evaluation function
├── similarity.sh # Script to execute integrated evaluation
└── requirements.txt # List of required packages
To install the required packages, you can run the following commands:
conda create -n $YOUR_ENV$ python=3.8
conda activate $YOUR_ENV$
sh install.sh
You should also download the following file and move it into your SPICE/lib folder:
https://drive.google.com/file/d/1Hwu0qXV5s3hM1sq43fDUGdi_mlyXZHpK/view?usp=sharing
Please make sure to specify the paths and dataset file settings within the shell files before running the script.
To execute the integrated evaluation, run:
sh similarity.sh
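Under the hood, the integrated evaluation collects the individual metric scores and combines them into a single result. The sketch below illustrates one simple way this aggregation could look; the function name, metric values, and averaging scheme are all hypothetical and do not reflect the actual logic in similarity.py.

```python
# Hypothetical sketch of aggregating per-metric NLG scores into one value.
# The real integration lives in similarity.py; everything here is illustrative.

def integrate_scores(per_metric_scores):
    """Average per-metric scores (e.g. BLEU, ROUGE, BERTScore) into one value."""
    if not per_metric_scores:
        raise ValueError("no metric scores given")
    return sum(per_metric_scores.values()) / len(per_metric_scores)

# Illustrative values only, not real evaluation output.
scores = {"BLEU": 0.42, "ROUGE": 0.55, "METEOR": 0.38, "BERTScore": 0.89}
print(round(integrate_scores(scores), 4))  # -> 0.56
```

A plain average is just the simplest choice; a real pipeline might weight metrics differently or report them separately, as the per-metric folders in this repository suggest.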
@Code{
year={2023},
title={NLG_metric},
author={Jaehyung Seo},
affiliation={Korea University, NLP & AI LAB},
email={seojae777@korea.ac.kr}}
This script is based on CommonGen, BERTScore, and BARTScore. We thank the authors for their academic contributions.