mccaffary/GPT-4-Domain-Specific-Language

Suggestion in the direction of OpenAI integration

UltimatePea opened this issue · 1 comment

Thank you for putting together this work.

OpenAI has announced Evals (https://github.com/openai/evals), where OpenAI will work to improve on user-submitted benchmarks. Perhaps we could submit a benchmark as follows: given an arbitrary grammar (possibly randomly generated), the model must synthesize sentences in that grammar and judge whether a particular sentence conforms to it.

That would serve as a "progress tracker" for GPT-4's full syntactic reasoning capabilities, running in parallel with the official APIs.
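To make the benchmark idea above concrete, here is a minimal sketch of how ground-truth items could be produced. Everything in it is an assumption for illustration: the toy grammar (a^n b^n in Chomsky normal form), the helper names `sample_sentence`, `accepts`, and `make_items`, and the choice of the CYK algorithm for the membership check. It is not tied to the Evals repository format, and a real submission would need randomly generated grammars rather than this fixed one.

```python
import random

# Hypothetical toy grammar in Chomsky normal form (CNF): every rule is either
# A -> B C (two nonterminals) or A -> 'a' (one terminal). It generates a^n b^n, n >= 1.
GRAMMAR = {
    "S": [("A", "B"), ("A", "T")],
    "T": [("S", "B")],
    "A": [("a",)],
    "B": [("b",)],
}


def _expand(grammar, symbol, rng, depth, max_depth):
    """Randomly expand one symbol into a list of terminal tokens."""
    if symbol not in grammar:  # anything without rules is treated as a terminal
        return [symbol]
    if depth > max_depth:
        raise ValueError("derivation too deep")
    tokens = []
    for child in rng.choice(grammar[symbol]):
        tokens.extend(_expand(grammar, child, rng, depth + 1, max_depth))
    return tokens


def sample_sentence(grammar, rng, start="S", max_depth=40):
    """Sample a sentence from the grammar, retrying if a derivation runs too deep."""
    while True:
        try:
            return _expand(grammar, start, rng, 0, max_depth)
        except ValueError:
            continue


def accepts(grammar, tokens, start="S"):
    """CYK membership test; assumes the grammar is in CNF."""
    n = len(tokens)
    if n == 0:
        return False
    # table[i][length] holds the nonterminals deriving tokens[i:i + length]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, tok in enumerate(tokens):
        for lhs, rules in grammar.items():
            if (tok,) in rules:
                table[i][1].add(lhs)
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                left = table[i][split]
                right = table[i + split][length - split]
                for lhs, rules in grammar.items():
                    for rule in rules:
                        if len(rule) == 2 and rule[0] in left and rule[1] in right:
                            table[i][length].add(lhs)
    return start in table[0][n]


def make_items(grammar, rng, n_items=10):
    """Build (sentence, is_grammatical) pairs; labels always come from CYK."""
    items = []
    for _ in range(n_items):
        tokens = sample_sentence(grammar, rng)
        if rng.random() < 0.5:
            rng.shuffle(tokens)  # perturbed sentence may or may not stay valid
        items.append((" ".join(tokens), accepts(grammar, tokens)))
    return items


if __name__ == "__main__":
    for sentence, label in make_items(GRAMMAR, random.Random(0), n_items=5):
        print(f"{sentence!r} -> {label}")
```

Each pair could then be turned into a prompt that states the grammar and asks the model whether the sentence conforms, with the model's yes/no answer scored against the CYK label. The synthesis half of the task can be scored the same way, by running `accepts` on whatever sentence the model produces.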

Thanks – I agree this could make an interesting evals project; indeed, I've been thinking of submitting some (admittedly different) evaluations to OpenAI.

Feel free to submit a pull request if you have any ideas!