llm-evaluation-toolkit

There are 8 repositories under the llm-evaluation-toolkit topic.

  • athina-ai/athina-evals

Python SDK for running evaluations on LLM-generated responses

Language: Python
  • Re-Align/just-eval

A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs (a generic judging sketch in this style appears after this list).

Language: Python
  • parea-ai/parea-sdk-py

    Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

Language: Python
  • zhuohaoyu/KIEval

    [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

Language: Python
  • scalexi/scalexi

scalexi is a versatile open-source Python library, optimized for Python 3.11+, that focuses on low-code development and fine-tuning of diverse Large Language Models (LLMs).

Language: Python
  • parea-ai/parea-sdk-ts

    TypeScript SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

Language: TypeScript
  • Agenta-AI/job_extractor_template

Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain (see the extraction sketch after this list).

Language: Python
  • EricLiclair/prayog-IndicInstruct

Indic evals for quantised models (AWQ / GPTQ / EXL2)

Language: Python
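
As a rough illustration of the GPT-based, multi-aspect judging style that just-eval describes, the sketch below scores a response on several aspects in a single judge call. It uses the plain openai Python client rather than the just-eval API; the aspect list, model name, and prompt wording are assumptions made for illustration only.

```python
# Hypothetical multi-aspect GPT-as-judge sketch; NOT the just-eval API.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

ASPECTS = ["helpfulness", "clarity", "factuality", "depth", "safety"]

def judge(instruction: str, response: str) -> dict:
    """Ask a GPT judge to score a response 1-5 on each aspect, with reasons."""
    prompt = (
        "Rate the response to the instruction on each aspect from 1 (worst) "
        "to 5 (best) and give a one-sentence reason per aspect.\n"
        f"Aspects: {', '.join(ASPECTS)}\n\n"
        f"Instruction:\n{instruction}\n\nResponse:\n{response}\n\n"
        'Answer as JSON: {"aspect": {"score": int, "reason": str}, ...}'
    )
    completion = client.chat.completions.create(
        model="gpt-4o-mini",  # illustrative model choice
        messages=[{"role": "user", "content": prompt}],
        response_format={"type": "json_object"},
        temperature=0,
    )
    return json.loads(completion.choices[0].message.content)

if __name__ == "__main__":
    scores = judge(
        "Explain what an LLM evaluation toolkit does.",
        "It automates scoring of model outputs against defined criteria.",
    )
    print(json.dumps(scores, indent=2))
```

Requesting JSON output and scoring all aspects in one call keeps the judgment interpretable (each score comes with a reason) while staying cheap; per-aspect calls are an alternative when aspects need different prompts.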
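In the same spirit, here is a minimal sketch of the extraction pattern the Agenta-AI job_extractor_template describes: pulling structured job fields out of a free-text posting via OpenAI function calling through LangChain. The field names, model choice, and sample posting are illustrative assumptions, not taken from the template itself.

```python
# Hypothetical job-description extraction sketch using LangChain + OpenAI
# function calling; field names and model are assumptions, not the template's.
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field

class JobInfo(BaseModel):
    """Structured fields to pull out of a free-text job posting."""
    title: str = Field(description="Job title")
    company: str = Field(description="Hiring company")
    location: str = Field(description="Job location, or 'remote'")
    skills: list[str] = Field(description="Required skills")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
# with_structured_output drives OpenAI function/tool calling under the hood
extractor = llm.with_structured_output(JobInfo)

posting = (
    "Acme Corp is hiring a Senior Python Engineer in Berlin. "
    "Must know FastAPI, PostgreSQL, and Docker."
)
print(extractor.invoke(posting))  # -> JobInfo(title=..., company=..., ...)
```

Binding a Pydantic schema to the model this way delegates field parsing and validation to the function-calling layer, so the application code only ever sees typed objects.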