⚡ Build any Generative AI application with evaluation and enhancement ⚡
YiVal is a versatile platform and framework that streamlines the evaluation
and enhancement of your Generative AI applications.
It empowers you to
generate better results, reduce latency, and decrease inference cost easily.
Depending on your knowledge and comfort level, YiVal will help you
simultaneously optimize prompts, model metadata, model parameters, and
retrieval configurations. You can easily customize your test data, evaluation
methods, and enhancement strategies, all in one place. Enhance and evaluate
everything with ease!
Check out our quickstart guide!
- Python Version: Ensure you have Python 3.10 or later installed.
- OpenAI API Key: Obtain an API key from OpenAI. Once you have the key, set it as an environment variable named `OPENAI_API_KEY`.
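The two prerequisites above can be checked before installing anything. The following is a minimal sketch and not part of YiVal: the helper name `check_prerequisites` is hypothetical, and it only tests that the variable is set, not that the key is valid.

```python
import os
import sys


def check_prerequisites(version_info=sys.version_info, environ=os.environ):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    # The prerequisites ask for Python 3.10 or later.
    if tuple(version_info[:2]) < (3, 10):
        problems.append("Python 3.10 or later is required")
    # The key must be exported as OPENAI_API_KEY; its validity is not checked.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print(problem)
```

Passing the version and environment as arguments keeps the check easy to exercise without changing the real interpreter or shell.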
Install the yival package directly using pip:

    pip install yival
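To confirm that the install landed in the active environment, the package version can be looked up with the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and it relies only on `importlib.metadata`, not on any YiVal API.

```python
from importlib import metadata


def installed_version(package="yival"):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    print(installed_version() or "yival is not installed in this environment")
```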
If you're looking to contribute or set up a development environment:
- Install Poetry: If you haven't already, install Poetry.
- Clone the Repository:

      git clone https://github.com/YiVal/YiVal.git
      cd YiVal

- Setup with Poetry: Initialize the Python virtual environment and install dependencies using Poetry. Run the command below from the `/YiVal` directory:

      poetry install --sync
After setting up, you can quickly get started with YiVal by generating datasets of random tech startup business names.
- Navigate to the yival Directory:

      cd /YiVal/src/yival

- Set OpenAI API Key: Replace `$YOUR_OPENAI_API_KEY` with your actual OpenAI API key.

      export OPENAI_API_KEY=$YOUR_OPENAI_API_KEY
- Define YiVal Configuration: Create a configuration file named `config_data_generation.yml` for automated test dataset generation with the following content:

      description: Generate test data
      dataset:
        data_generators:
          openai_prompt_data_generator:
            chunk_size: 100000
            diversify: true
            model_name: gpt-4
            input_function:
              description: # Description of the function
                Given a tech startup business, generate a corresponding landing page headline
              name: headline_generation_for_business
              parameters:
                tech_startup_business: str # Parameter name and type
            number_of_examples: 3
            output_csv_path: generated_examples.csv
        source_type: machine_generated
- Execute YiVal: Run the following command from within the `/YiVal/src/yival` directory:

      yival run config_data_generation.yml

- Check the Generated Dataset: The generated test dataset will be stored in `generated_examples.csv`.
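Once the run finishes, a quick look at the output file confirms that examples were written. The sketch below makes no assumption about the column layout of `generated_examples.csv`, which is not described here; it returns whatever header and rows it finds. `preview_csv` is a hypothetical helper name.

```python
import csv


def preview_csv(path="generated_examples.csv", limit=3):
    """Return the header row and up to `limit` data rows from a CSV file."""
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])  # empty list if the file has no rows
        rows = [row for _, row in zip(range(limit), reader)]
    return header, rows
```

Calling `preview_csv()` from the directory where `yival run` was executed should return the header and the first few generated examples; with `number_of_examples: 3` in the configuration, the default `limit` matches the requested count.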
Demo.mp4
Use Case Demo | Supported Features | Colab Link |
---|---|---|
🐯 Craft your AI story with ChatGPT and MidJourney | Multi-modal support: Design an AI-powered narrative using YiVal's multi-modal support for simultaneous text and images. It natively supports Reinforcement Learning from Human Feedback (RLHF) and Reinforcement Learning from AI Feedback (RLAIF). Please watch the video above for this use case. | |
🌟 Evaluate performance of multiple LLMs with your own Q&A test dataset | Conveniently evaluate and compare performance of your model of choice against 100+ models, thanks to LiteLLM. Analyze model performance benchmarks tailored to your customized test data or use case. | |
🔥 Startup Company Headline Generation Bot | Streamline generation of headlines for your startup with automated test data creation, prompt crafting, results evaluation, and performance enhancement via GPT-4. | |
🧳 Build a Customized Travel Guide Bot | Leverage automated prompts inspired by the travel community's most popular suggestions, such as those from awesome-chatgpt-prompts. | |
📖 Build a Cheaper Translator: Use GPT-3.5 to teach Llama2 to create a translator with lower inference cost | Using Replicate and GPT-3.5's test data, you can fine-tune Llama2's translation bot. Benefit from 18x savings while experiencing only a 6% performance decrease. | |
🤖️ Chat with Your Favorite Characters - 澹台烬(Dantan Ji) from《长月烬明》(Till the End of the Moon) | Bring your favorite characters to life through automated prompt creation and character script retrieval. | |
🔍 Evaluate Guardrails' performance in generating Python (.py) outputs | Guardrails: where are my guardrails? 😭 YiVal: I am here. ⭐️ The integrated evaluation experiment runs 80 LeetCode problems from a CSV file in two setups: GPT-4 with Guardrails and GPT-4 alone. With Guardrails, accuracy drops from 0.625 to 0.55, latency increases by 44%, and cost increases by 140%. Guardrails still has a long way to go from demo to production. | |
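
The figures in the last row can be unpacked with a little arithmetic. This sketch assumes that accuracy means the fraction of the 80 problems solved, which the table does not state explicitly; the variable names are illustrative only.

```python
def relative_change(before, after):
    """Fractional change from `before` to `after` (negative means a drop)."""
    return (after - before) / before


problems = 80                          # LeetCode problems in the experiment
acc_gpt4, acc_guardrail = 0.625, 0.55  # accuracies quoted in the table

# Under the stated assumption, accuracy times problem count gives solved problems.
solved_gpt4 = round(acc_gpt4 * problems)            # 50 problems
solved_guardrail = round(acc_guardrail * problems)  # 44 problems

# The absolute drop is 0.075; relative to the GPT-4 baseline it is about 12%.
accuracy_change = relative_change(acc_gpt4, acc_guardrail)
```

Read this way, the guardrail setup solves six fewer problems while also raising latency and cost.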
If you want to contribute to YiVal, be sure to review the contribution guidelines. We use GitHub issues for tracking requests and bugs. Please join YiVal's Discord channel for general questions and discussion.

Join our collaborative community, where your expertise as researchers and software engineers is highly valued! Contribute to our project and be part of an innovative space where every line of code and research insight fuels advancements in technology, fostering a future that is intelligently connected and universally accessible.
🌟 YiVal welcomes your contributions! 🌟
🥳 Thanks so much to all of our amazing contributors 🥳