vectara/hallucination-leaderboard
Leaderboard Comparing LLM Performance at Producing Hallucinations when Summarizing Short Documents
Python · Apache-2.0
Issues
- 1
Reproducing HF model Summaries
#80 opened by Noor-Nizar - 2
Claude 3.5 Sonnet New
#77 opened by sushantnair - 2
Any arxiv paper or report?
#24 opened by zhimin-z - 1
- 2
Would you please provide a citation bibtex?
#26 opened by JacksonWuxs - 1
Can we reproduce the leadersboard?
#21 opened by amir-abdi - 2
Google PaLM reference
#8 opened by suddhasatwabhaumik - 1
Can you add a CITATION.cff for easy citation?
#19 opened by sigjhl - 4
instance level metric outputs
#6 opened by cabreraalex - 1
- 1
Claude 2.1 Benchmark Missing
#16 opened by lukestanley - 1
Generation parameters
#9 opened by qmdnls - 2
- 1
Google Palm API version?
#5 opened by zizhaozhang - 1
GPT 4-Turbo
#3 opened by orionsolidified