Official repo for the paper PHUDGE: Phi-3 as Scalable Judge. Evaluate your LLMs with or without a custom rubric or reference answer, using absolute or relative grading, and more. It also contains a list of available tools, methods, repos, and code for hallucination detection, LLM evaluation, grading, and related tasks.
Jupyter Notebook
- alberto-trxry
- Anindyadeep (@PremAI-io)
- balsulami
- bobflagg
- buildwithedward (Agilisium Consulting)
- chuan298
- dbnduchatrji
- dTony33
- evdcush
- ggrizzly
- hooman-bayer (Bayer Pharmaceuticals)
- imanousar (@Pfizer)
- imchkkim
- jinunyachhyon (IRIIS)
- Jizhongpeng (Shanghai)
- josecohenca
- jwynia (Happi)
- Kartikaggarwal98 (Washington DC)
- kimsh0507
- Lekssays (Imi-N-Tanoute, Morocco)
- luciolcv
- Lwachira
- MrAnayDongre (United States)
- nanoflooder
- onexixi
- pilarcode (GFT)
- polya20
- pthavarasa (Paris, France)
- riiduan (Milan)
- shawn2306 (Mumbai, India)
- SimranAnand1 (Vellore Institute of Technology (VIT))
- smellslikeml (SmellsLikeML)
- tongji1907
- ucalyptus2 (Africa)
- veeravignesh1 (Bangalore)
- xxrjun (National Central University)