Blanche is a library that collects the most important NLG evaluation metrics, along with some emerging ones.
Citations for every metric are given in the Blanche source code.
Note: FEQA and FactCC may run into problems during execution.
- Create a data folder (blanche/data)
- Put the references file and the predictions file in that folder
update_test_set("name of test set", "references file name", "predictions file name")
- Results are saved in a metrics directory named after the test set:
name_of_test_set_name_metrics
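
The data preparation steps above could be scripted roughly as follows. This is only a sketch: `prepare_test_set` is a hypothetical helper (not part of Blanche), the file names are placeholders, and the one-text-per-line layout is an assumption, since the file format Blanche expects is not stated here. The `update_test_set` call is left as a comment because its import path is not given.

```python
from pathlib import Path


def prepare_test_set(data_dir, references, predictions,
                     references_name="references.txt",
                     predictions_name="predictions.txt"):
    """Hypothetical helper: write references and predictions into data_dir.

    Assumption: one text per line, with line i of both files describing
    the same example. Check the Blanche source for the real format.
    """
    if len(references) != len(predictions):
        raise ValueError("references and predictions must have the same length")

    data_dir = Path(data_dir)
    # Create the data folder (e.g. blanche/data) if it does not exist yet.
    data_dir.mkdir(parents=True, exist_ok=True)

    ref_path = data_dir / references_name
    pred_path = data_dir / predictions_name
    ref_path.write_text("\n".join(references) + "\n", encoding="utf-8")
    pred_path.write_text("\n".join(predictions) + "\n", encoding="utf-8")
    return ref_path, pred_path


if __name__ == "__main__":
    prepare_test_set(
        "blanche/data",
        references=["a cat sits on the mat", "it is raining"],
        predictions=["the cat is on the mat", "it rains"],
    )
    # Next step, as described above (import path not shown in this README):
    # update_test_set("my_test_set", "references.txt", "predictions.txt")
```

Raising an error on mismatched lengths is a design choice of the sketch: it catches misaligned reference and prediction files before any metric is computed.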