llm-evaluation-toolkit

There are 9 repositories under the llm-evaluation-toolkit topic; a minimal, hypothetical sketch of what such an evaluation run involves follows the list.

  • JohnSnowLabs/langtest

    Deliver safe & effective language models

    Language: Python
  • athina-ai/athina-evals

    Python SDK for running evaluations on LLM-generated responses

    Language: Python
  • parea-ai/parea-sdk-py

    Python SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

    Language: Python
  • Re-Align/just-eval

    A simple GPT-based evaluation tool for multi-aspect, interpretable assessment of LLMs.

    Language: Python
  • zhuohaoyu/KIEval

    [ACL'24] A Knowledge-grounded Interactive Evaluation Framework for Large Language Models

    Language: Python
  • scalexi/scalexi

    scalexi is a versatile open-source Python library, optimized for Python 3.11+, that facilitates low-code development and fine-tuning of diverse Large Language Models (LLMs).

    Language: Python
  • Agenta-AI/job_extractor_template

    Template for an AI application that extracts job information from a job description using OpenAI functions and LangChain

    Language: Python
  • parea-ai/parea-sdk-ts

    TypeScript SDK for experimenting, testing, evaluating & monitoring LLM-powered applications - Parea AI (YC S23)

    Language: TypeScript
  • EricLiclair/prayog-IndicInstruct

    Indic evals for quantised models (AWQ / GPTQ / EXL2)

    Language: Python
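
Most of the SDKs above share the same basic shape: score each LLM response against a reference or a set of criteria, then aggregate the scores over a dataset. The sketch below is a minimal, hypothetical illustration of that loop only; the names EvalCase, exact_match, keyword_coverage, and run_eval are invented for this example and are not the API of any repository listed above.

# Hypothetical, minimal LLM-evaluation loop (not any listed SDK's API).
from dataclasses import dataclass
from typing import Callable

@dataclass
class EvalCase:
    prompt: str
    reference: str   # expected answer
    response: str    # the LLM output being evaluated

def exact_match(case: EvalCase) -> float:
    """1.0 when the response matches the reference exactly, else 0.0."""
    return 1.0 if case.response.strip() == case.reference.strip() else 0.0

def keyword_coverage(case: EvalCase) -> float:
    """Fraction of reference keywords that appear in the response."""
    keywords = set(case.reference.lower().split())
    hits = sum(1 for k in keywords if k in case.response.lower())
    return hits / len(keywords) if keywords else 0.0

def run_eval(cases: list[EvalCase],
             metrics: dict[str, Callable[[EvalCase], float]]) -> dict[str, float]:
    """Average each metric over the dataset; returns one score per metric."""
    return {name: sum(metric(c) for c in cases) / len(cases)
            for name, metric in metrics.items()}

if __name__ == "__main__":
    cases = [
        EvalCase(prompt="Capital of France?", reference="Paris", response="Paris"),
        EvalCase(prompt="2 + 2?", reference="4", response="The answer is 4."),
    ]
    print(run_eval(cases, {"exact_match": exact_match,
                           "keyword_coverage": keyword_coverage}))

The listed toolkits differ mainly in which metrics they ship (safety tests, multi-aspect GPT-based judging, knowledge-grounded interaction) and in whether they also handle experiment tracking and monitoring, but the score-and-aggregate core is the same.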