declare-lab/instruct-eval
This repository contains code to quantitatively evaluate instruction-tuned models such as Alpaca and Flan-T5 on held-out tasks.
Python · Apache-2.0 license
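As a rough illustration of what held-out evaluation of an instruction-tuned model involves, the sketch below runs a zero-shot exact-match loop over a couple of hand-written prompts with a Flan-T5 checkpoint via Hugging Face `transformers`. The example prompts, model size, and scoring here are placeholders and do not reflect the repository's actual benchmarks or command-line interface.

```python
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

MODEL_NAME = "google/flan-t5-base"  # any instruction-tuned seq2seq checkpoint

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_NAME)
model.eval()

# Hypothetical held-out examples: (prompt, reference answer) pairs.
examples = [
    ("Answer with yes or no: Is the Pacific Ocean larger than the Atlantic Ocean?", "yes"),
    ("What is the capital of France? Answer with one word.", "Paris"),
]

correct = 0
for prompt, reference in examples:
    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=16)
    prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True).strip()
    correct += int(prediction.lower() == reference.lower())

print(f"Exact-match accuracy: {correct / len(examples):.2f}")
```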
Issues

- Multi GPU Support is required (#33, opened by chintan-ushur)
- Evaluate EncoderDecoderModels (#32, opened by Bachstelze)
- Colab notebook (#31, opened by Bachstelze)
- CRASS (#30, opened by Bachstelze)
- Evaluate on a single 24GB/32GB GPU (#29, opened by lemyx)
- How to submit own model to leaderboard? (#28, opened by timothylimyl)
- What are the metrics for the evaluation results? (#26, opened by zhimin-z)
- Can not reproduce results on the table (#3, opened by simplelifetime)
- Support for larger batch_size (#18, opened by soumyasanyal)
- HHH Benchmark evaluation question: why using base prompt and (A - A_base) > (B - B_base)? (#20, opened by t170815518; see the sketch after this list)
- What to do about broken Evals? (#21, opened by damhack)
- Fail to Evaluate Model on human_eval (#22, opened by yjw1029)
- C-Eval (#17, opened by duanqiyuan)
- Could support for the Baichuan large model be added? (#16, opened by linghongli)
- add multiple gpu support (#15, opened by lxy444)
- [Feature Request] Saving Prediction Results (#14, opened by guanqun-yang)
- Is there any parallel processing methods? (#13, opened by wwngh1233)
- Add config to save eval results (#12, opened by arthurtobler)
- Future directions (#11, opened by tju01)
- Error raised during execution, details below (#9, opened by linghongli)
- Regarding the comparison to lm-evaluation-harness (#10, opened by gakada)
- Integrate the evaluation in the Transformers trainer with transformers.TrainerCallback (#7, opened by BaohaoLiao)
- Add License (#5, opened by passaglia)
- Add zero-shot evaluation results (#4, opened by LeeShiyang)
- Prompt format for LLaMa (#2, opened by LeeShiyang)
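Regarding issue #20 above: the comparison it asks about reads like a standard calibration trick, where each candidate answer's log-likelihood under the full prompt has its log-likelihood under a minimal base prompt subtracted, so that answers that are generically probable regardless of the question do not win by default. The sketch below illustrates that reading with a small causal LM; the model, prompts, and the `answer_logprob` helper are placeholders, not the repository's actual HHH evaluation code.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()


def answer_logprob(context: str, answer: str) -> float:
    """Sum of log-probabilities of the answer tokens, conditioned on the context."""
    context_len = tokenizer(context, return_tensors="pt").input_ids.shape[1]
    full_ids = tokenizer(context + answer, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(full_ids).logits
    log_probs = torch.log_softmax(logits[0, :-1], dim=-1)  # predictions for tokens 1..N-1
    targets = full_ids[0, 1:]
    # Only score the positions that belong to the answer continuation.
    return sum(
        log_probs[i, targets[i]].item()
        for i in range(context_len - 1, full_ids.shape[1] - 1)
    )


prompt = "Human: My neighbor's package was delivered to me by mistake. What should I do?\nAssistant:"
base_prompt = "Assistant:"
answer_a = " Return it to your neighbor and let them know it arrived."
answer_b = " Keep it; they will never find out."

# Calibrated scores: (A - A_base) and (B - B_base), as in the issue title.
score_a = answer_logprob(prompt, answer_a) - answer_logprob(base_prompt, answer_a)
score_b = answer_logprob(prompt, answer_b) - answer_logprob(base_prompt, answer_b)
print("Model prefers A" if score_a > score_b else "Model prefers B")
```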