bigscience-workshop/lm-evaluation-harness
A framework for few-shot evaluation of autoregressive language models.
Python · MIT License
Issues
- #158 Number of fewshot examples · opened by lpq29743 · 0 comments (sketch below)
- #159 Single reference target · opened by lpq29743 · 0 comments
- #161 AssertionError · opened by lpc-eol · 2 comments
- #148 max_length not set correctly · opened by hatimbr · 1 comment
- #106 OverflowError: math range error · opened by Muennighoff · 1 comment (sketch below)
- #157 Rouge score · opened by Muennighoff · 1 comment (sketch below)
- #156 lm_eval.list_model_apis() not found · opened by robertLiuLinFeng · 1 comment
- #155 Python 3.8 Support · opened by vrunm · 0 comments
- #154 translation evaluation error · opened by laozhanghahaha · 0 comments
- #147 Unknown issue with loading object · opened by pku-yao-cheng · 1 comment
- #146 cache not storing predictions · opened by rbawden · 0 comments
- #142 How is this evaluation done? · opened by a-cavalcanti · 1 comment
- #135 Space prepended for Seq2Seq · opened by Muennighoff · 1 comment (sketch below)
- #50 Implement COMET · opened by StellaAthena · 1 comment (sketch below)
- #51 Create a nice API for getting (response, label) pairs to plug into external libraries · opened by StellaAthena · 0 comments
- #59 BLEURT or BERTScore added to NLG datasets · opened by jordiclive · 1 comment (sketch below)
- #65 "copa+…As a result, C1 or C2?" prompting error · opened by StellaAthena · 9 comments
- #114 Bloom-tested dataset does not exist in this repo · opened by switiz · 1 comment
- #119 different score ranges are confusing · opened by Muennighoff · 2 comments
- #107 Selecting prompts · opened by Muennighoff · 3 comments
- #105 Multilingual prompts · opened by Muennighoff · 1 comment
- #35 Clean up interface for HF models · opened by StellaAthena · 6 comments
- #64 FLORES bugged with T5 · opened by StellaAthena · 0 comments
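
#158 asks how the number of few-shot examples is controlled. A minimal sketch, assuming the upstream EleutherAI-style evaluator API (`lm_eval.evaluator.simple_evaluate` with a `num_fewshot` argument) that this fork derives from; this fork's entry point, model names, and argument names may differ:

```python
# Sketch under assumptions: upstream EleutherAI-style API; this fork may differ.
from lm_eval import evaluator

results = evaluator.simple_evaluate(
    model="gpt2",                  # a model type registered in the harness
    model_args="pretrained=gpt2",  # forwarded to the model constructor
    tasks=["copa"],                # registered task names
    num_fewshot=5,                 # number of in-context examples per test query
)
print(results["results"])
```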
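#106's OverflowError: math range error is the classic failure mode of exponentiating a large loss when converting a negative log-likelihood to perplexity. A generic guard, not code from this repo:

```python
import math

def safe_perplexity(total_neg_log_likelihood: float) -> float:
    """Convert a negative log-likelihood (natural log) to perplexity,
    returning inf instead of raising OverflowError on extreme inputs."""
    try:
        return math.exp(total_neg_log_likelihood)
    except OverflowError:
        return float("inf")

print(safe_perplexity(10.0))    # ~22026.47
print(safe_perplexity(1000.0))  # inf; math.exp(1000.0) would raise OverflowError
```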
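#157 concerns ROUGE. A minimal sketch using Google's `rouge_score` package, one common backend; the issue does not specify which implementation the harness should adopt:

```python
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(
    target="the cat sat on the mat",     # reference text
    prediction="the cat is on the mat",  # model output
)
for name, s in scores.items():
    print(name, round(s.fmeasure, 3))  # each entry has precision/recall/fmeasure
```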
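#135's question about a prepended space matters because subword tokenizers encode "word" and " word" as different tokens, which shifts the log-likelihoods being compared. The effect with a GPT-2 tokenizer (the issue is about seq2seq models, whose tokenizers have their own whitespace conventions, but the mechanism is the same):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")

# The same word with and without a leading space maps to different token ids,
# so whether the harness prepends a space changes what the model is scored on.
print(tok.encode("Hello"))   # [15496] -> "Hello"
print(tok.encode(" Hello"))  # [18435] -> " Hello"
```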
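#50 asks for COMET. A minimal sketch of Unbabel's `unbabel-comet` package; the checkpoint name is illustrative, and the exact download/predict API varies slightly across releases:

```python
from comet import download_model, load_from_checkpoint

# Checkpoint name is illustrative; different releases ship different checkpoints.
model_path = download_model("Unbabel/wmt22-comet-da")
model = load_from_checkpoint(model_path)

data = [{
    "src": "Bonjour le monde.",  # source sentence
    "mt":  "Hello world.",       # machine translation to score
    "ref": "Hello, world.",      # human reference
}]
print(model.predict(data, batch_size=8, gpus=0))  # segment and system scores
```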
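#59 proposes BLEURT or BERTScore. BERTScore's `bert-score` package has a small surface area; BLEURT would look analogous through its own library:

```python
from bert_score import score

candidates = ["the cat is on the mat"]   # model outputs
references = ["the cat sat on the mat"]  # gold references

# Returns per-sentence precision, recall, and F1 tensors.
P, R, F1 = score(candidates, references, lang="en")
print(F1.mean().item())
```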