Leaderboard of JudgeLM evaluations

Question

Opened this issue 8 months ago · 2 comments

I have evaluated one of my models using your JudgeLM 13B model.

How do I benchmark this against other models for comparison?

Answer 1 · 2023-12-03T12:46:33.000Z

Can you share examples of the .jsonl files you used? @sachith-surge

Answer 2 · 2023-12-07T04:31:48.000Z

I used this judgelm-val-5k-judge-samples.jsonl file to evaluate my model.