Issues
Support for GPT-4o
#1529 opened by PrashantDixit0 - 1
TensorFlow fails while no TensorFlow expected to run at all
#1532 opened by artkpv - 1
Schelling point eval doesn't work
#1533 opened by johny-b - 1
Possibility to sell high quality benchmarks
#1437 opened by guliashvili - 1
What is this
#1527 opened by DXv-3 - 1
Getting started example doesn't work - oaieval attempts to update a None type object
#1515 opened by jswang - 4
When installing the project dependencies, I got: "ERROR: Could not build wheels for greenlet, which is required to install pyproject.toml-based projects"
#1513 opened by JuanmaMenendez - 2
Support for Azure OpenAI client
#1469 opened by pkt1583 - 0
Setting completion function args via CLI does not work
#1504 opened by LoryPack - 0
Support multiple completions for ModelbasedClassify
#1484 opened by tom-christie - 0
Eval-running often hangs on last sample
#1384 opened by sjadler2004 - 1
Local run doesn't save logs to disk
#1459 opened by charles-somm - 3
Tagged Release For 2.0.0
#1456 opened by michaelAlvarino - 0
`Failed to open: ../registry/data/social_iqa/few_shot.jsonl` with custom registry
#1394 opened by LoryPack - 3
Internationalization (i18n) support
#1239 opened by wangkunmin - 0
Request to change arithmetical_puzzles prompting
#1448 opened by ArcticBeat05 - 2
Error structure in `utils` after openai package upgrade
#1432 opened by inwaves - 3
oaieval --help errors for me
#1369 opened by sjadler2004 - 1
Do not back off on `openai.BadRequestError`
#1408 opened by johny-b - 0
Improvements to `Match`: case insensitive and strip
#1421 opened by LoryPack - 5
Using different models in evaluating a model-graded eval and in generating the completion
#1393 opened by LoryPack - 0
Evals broken with latest openai package v1.1.1
#1399 opened by ojaffe - 2
Expose run_id to code being run within an eval
#1264 opened by robatwilliams - 1
Having trouble building Evals locally? Try this.
#1340 opened by silverfoxf7 - 1
In the task "balance_chemical_equation", many instances have incorrect labels.
#1386 opened by dongZheX - 5
Multiple evals not found
#1379 opened by SUMEETRM - 0
Should random collection of values be supported?
#1382 opened by assert6 - 0
Context window of completion functions not accounted for
#1377 opened by pskl - 0
Use github.com/apssouza22/chatflow as a conversational layer. It would enable actual API requests to be carried out from natural language inputs.
#1362 opened by GiovanniSmokes - 1
Evaluate the cost of running tests
#1350 opened by onjas-buidl - 1
How to eval output against ideal_answer directly, without having to define the completion_fn?
#1342 opened by liuyaox - 0
Publish latest evals framework to PyPI
#1344 opened by robatwilliams - 0
Find claims from research paper
#1338 opened - 0
Accuracy Score
#1328 opened by jeyarajcs - 1
All evals currently in the repo appear only to have dev samples: is this correct?
#1319 opened by mesotron - 0
Please approve pull request, changes were made.
#1298 opened by nickabooch - 1
oaieval hangs a lot
#1292 opened by shamas- - 2
Meaning of "elsuite" folder name
#1282 opened by siftxxx - 0
Code Evals
#1275 opened by billxbf - 0
closedqa prompt is not adequate for gpt-4-0613
#1228 opened by JasonGross - 1
You should see GPT-4 API access enabled in your account in the next few days.
#1208 opened by verheesj