
🚀 Evaluate and Evolve. 🚀 YiVal is an open-source GenAI-Ops tool for tuning and evaluating prompts, configurations, and model parameters using customizable datasets, evaluation methods, and improvement strategies.


๐Ÿงš๐Ÿปโ€๏ธ YiVal

Website · Producthunt · Documentation

⚡ Build any Generative AI application with evaluation and improvement ⚡

👉 Follow us: Twitter | Discord


🤔 What is YiVal?

YiVal is a GenAI-Ops framework that lets you iteratively tune your Generative AI model metadata, parameters, prompts, and retrieval configurations all at once, using your preferred choice of test-dataset generation, evaluation algorithms, and improvement strategies.

Check out our quickstart guide! →

📣 What's Next?

Features expected in September:

  • Add ROUGE and BERTScore evaluators
  • Add support for Midjourney
  • Add support for LLaMA2-70B, LLaMA2-7B, and Falcon-40B
  • Support LoRA fine-tuning of open-source models
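As background for the planned ROUGE evaluator, here is a minimal sketch of what a ROUGE-L score computes: an F-score over the longest common subsequence (LCS) of candidate and reference tokens. This is an illustration of the metric only, not YiVal's implementation:

```python
# Illustrative ROUGE-L: LCS-based F-score between a candidate and a reference.
def lcs_length(a, b):
    # Classic dynamic-programming LCS over token sequences.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def rouge_l(candidate: str, reference: str) -> float:
    cand, ref = candidate.split(), reference.split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)

print(round(rouge_l("the cat sat on the mat", "the cat is on the mat"), 3))  # 0.833
```

Here the LCS is "the cat on the mat" (5 of 6 tokens on both sides), so precision = recall = 5/6 and the F-score is 5/6 ≈ 0.833. Production evaluators typically use a tested library (e.g. the `rouge_score` package) rather than hand-rolled code.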

🚀 Features

| | 🔧 Experiment Mode | 🤖 Agent Mode (Auto-prompting) |
| --- | --- | --- |
| Workflow | Define your AI/ML application ➡️ Define test dataset ➡️ Evaluate 🔄 Improve ➡️ Prompt-related artifacts built ✅ | Define your AI/ML application ➡️ Auto-prompting ➡️ Prompt-related artifacts built ✅ |
| Features | 🌟 Streamlined prompt development process<br>🌟 Support for multimedia and multimodal inputs<br>🌟 CSV upload and GPT-4-generated test data<br>🌟 Dashboard tracking latency, cost, and evaluator results<br>🌟 Human (RLHF) and algorithm-based improvers<br>🌟 Service with detailed web view<br>🌟 Customizable evaluators and improvers | 🌟 No-code Gen-AI application building<br>🌟 Watch your Gen-AI application be born and improve with just one click |
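The Experiment Mode workflow above (define your application, define test data, evaluate, then improve in a loop) can be sketched generically as follows. All function and variable names here are hypothetical placeholders for illustration, not YiVal's actual API:

```python
# Generic evaluate-improve loop illustrating the Experiment Mode workflow.
# All names are hypothetical placeholders, not YiVal's actual API.
def run_experiment(app, test_cases, evaluate, improve, rounds=3):
    """Score `app` on `test_cases`, let `improve` propose a variant each
    round, and keep the best-scoring version seen so far."""
    best_app, best_score = app, sum(evaluate(app, t) for t in test_cases)
    for _ in range(rounds):
        candidate = improve(best_app)
        score = sum(evaluate(candidate, t) for t in test_cases)
        if score > best_score:
            best_app, best_score = candidate, score
    return best_app, best_score

# Tiny demo: the "app" is a prompt string; the toy evaluator rewards length.
evaluate = lambda prompt, case: len(prompt)
improve = lambda prompt: prompt + "!"
best, score = run_experiment("Tell a story", ["case"], evaluate, improve)
print(best, score)  # Tell a story!!! 15
```

In a real run, `evaluate` would be an LLM-based or human evaluator and `improve` an algorithmic or RLHF-based improver, as described in the feature list.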

Model Support Matrix

We support 100+ LLMs (e.g., gpt-4, gpt-3.5-turbo, and llama).

The different model sources are shown below:

| Model | LLM Evaluate | Human Evaluate | Variation Generate | Custom Func |
| --- | --- | --- | --- | --- |
| OpenAI | ✅ | ✅ | ✅ | ✅ |
| Azure | ✅ | ✅ | ✅ | ✅ |
| TogetherAI | ✅ | ✅ | ✅ | ✅ |
| Cohere | ✅ | ✅ | ✅ | ✅ |
| Huggingface | ✅ | ✅ | ✅ | ✅ |
| Anthropic | ✅ | ✅ | ✅ | ✅ |
| MidJourney | | ✅ | ✅ | |

To support different models in a custom function (e.g., model comparison), follow our example.

To support different models in evaluators and generators, check our config.

Installation

```sh
pip install yival
```

Demo

Colab

| Demo | Supported Features | Colab Link |
| --- | --- | --- |
| 🐯 Animal story with MidJourney | Multimodal support for text and images | Open In Colab |
| 🌟 Model Comparison in QA ability | Easy model-evaluation comparison thanks to LiteLLM | Open In Colab |
| 🔥 Startup Company Headline Generation Bot | Automated prompt evolution | Open In Colab |
| 🧳 Build Your Customized Travel Guide Bot | Automated prompt generation with retrieval methods | Open In Colab |
| 📖 Enhance Model Translation Capabilities | Fine-tune the translation performance of LLaMA2 with Replicate | Open In Colab |

Multimodal Mode

YiVal has multimodal capabilities and handles AI-generated (AIGC) images well.

Find more information in the Animal story demo we provided.

```sh
yival run demo/configs/animal_story.yml
```


Basic Interactive Mode

To get started with a demo for basic interactive mode of YiVal, run the following command:

```sh
yival demo --basic_interactive
```

Once started, navigate to the following address in your web browser:

http://127.0.0.1:8073/interactive


For more details on this demo, check out the Basic Interactive Mode Demo.

Question Answering with expected result evaluator

```sh
yival demo --qa_expected_results
```

Once started, navigate to the following address in your web browser: http://127.0.0.1:8073/


For more details, check out the Question Answering with expected result evaluator.

Automatically generate prompts with evaluator

```sh
yival demo --auto_prompts
```

Once started, navigate to the following address in your web browser: http://127.0.0.1:8073/


Contributors

🌟 YiVal welcomes your contributions! 🌟

🥳 Thanks so much to all of our amazing contributors! 🥳

Paper Implementations

| Paper | Author | Topics | YiVal Contributor |
| --- | --- | --- | --- |
| Large Language Models Are Human-Level Prompt Engineers | Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han | YiVal Evolver, Auto-Prompting | @Tao Feng |
| BERTScore: Evaluating Text Generation with BERT | Tianyi Zhang, Varsha Kishore, Felix Wu | YiVal Evaluator, bertscore, rouge | @crazycth |
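For intuition, the core idea of the BERTScore paper is greedy cosine-similarity matching between token embeddings of the candidate and reference. The sketch below shows only that matching-and-aggregation step, using hand-made vectors; real BERTScore uses contextual BERT embeddings, and this is not YiVal's evaluator code:

```python
# Toy illustration of BERTScore's matching step: precision and recall from
# greedy max cosine similarity between token-embedding matrices.
import numpy as np

def bertscore_f1(cand_emb: np.ndarray, ref_emb: np.ndarray) -> float:
    # Normalize rows so dot products become cosine similarities.
    c = cand_emb / np.linalg.norm(cand_emb, axis=1, keepdims=True)
    r = ref_emb / np.linalg.norm(ref_emb, axis=1, keepdims=True)
    sim = c @ r.T                       # pairwise cosine-similarity matrix
    recall = sim.max(axis=0).mean()     # each reference token takes its best candidate match
    precision = sim.max(axis=1).mean()  # each candidate token takes its best reference match
    return 2 * precision * recall / (precision + recall)

# Identical embeddings score a perfect 1.0.
emb = np.array([[1.0, 0.0], [0.0, 1.0]])
print(round(float(bertscore_f1(emb, emb)), 3))  # 1.0
```

The greedy matching makes the score robust to paraphrase and word reordering, which is why the paper's evaluator correlates better with human judgment than exact n-gram overlap.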