Constrained decoding with Extended Backus-Naur Form (EBNF)
Confirm this is a feature request for the Node library and not the underlying OpenAI API.
- This is a feature request for the Node library
Describe the feature or improvement you're requesting
Similar to the current zodResponseFormat, but instead of using Zod schemas, developers would define output structures using EBNF.
Does OpenAI use a CFG internally? The Structured Outputs announcement suggests it does:
https://openai.com/index/introducing-structured-outputs-in-the-api/
"To do this, we convert the supplied JSON Schema into a context-free grammar (CFG)."
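To make the JSON Schema to CFG idea concrete, here is a minimal sketch that turns a flat object schema into a grammar in the notation used in this issue. It is only an illustration: the FlatSchema type and schemaToEbnf function are hypothetical names, and nothing here reflects how OpenAI actually performs the conversion.

```typescript
// Hypothetical, simplified schema: a flat object whose properties are strings or integers.
type FlatSchema = {
  type: "object";
  properties: Record<string, { type: "string" | "integer" }>;
};

// Emit a grammar in the `::=` notation used in this issue (illustrative only).
function schemaToEbnf(schema: FlatSchema): string {
  const fields = Object.keys(schema.properties).map((key) => {
    const rule = schema.properties[key].type === "string" ? "string" : "number";
    // Keys are emitted with double quotes so the generated text is valid JSON.
    return `'"${key}":' ${rule}`;
  });
  return [
    "json ::= object",
    `object ::= '{' ${fields.join(" ',' ")} '}'`,
    `string ::= '"' [a-zA-Z0-9_ ]+ '"'`,
    "number ::= [0-9]+",
  ].join("\n");
}

const personGrammar = schemaToEbnf({
  type: "object",
  properties: { name: { type: "string" }, age: { type: "integer" } },
});
```

Property order follows the order of the keys in the schema object, so the generated object rule fixes the field order as well as the field types.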
Implementing this feature would let developers constrain output to formats such as JSON, SVG, HTML, Git diff patches, PostScript, and CSV.
Here's an example using the OpenAI npm library with openai.beta.chat.completions.parse() and a new (proposed) ebnfResponseFormat helper.
JSON EBNF
const jsonEbnf = `
json ::= object | array
object ::= '{' pair (',' pair)* '}'
pair ::= string ':' value
array ::= '[' value (',' value)* ']'
value ::= string | number | object | array | 'true' | 'false' | 'null'
string ::= '"' [a-zA-Z0-9_]+ '"'
number ::= [0-9]+
`;
const completion = await openai.beta.chat.completions.parse({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content: "Generate a valid JSON object with name and age.",
},
{ role: "user", content: "Create an example." },
],
response_format: ebnfResponseFormat(jsonEbnf),
});
const result = completion.choices[0].message.parsed;
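Since ebnfResponseFormat does not exist in the library, here is a rough sketch of what such a helper might do on the client side: check that every rule referenced in the grammar is defined, then wrap the grammar in a payload. The { type: "ebnf", grammar } shape is purely an assumption for illustration. The API does not accept it today, and findUndefinedRules is a hypothetical helper.

```typescript
// Collect rule names that appear on a right-hand side but are never defined.
function findUndefinedRules(grammar: string): string[] {
  const rules = grammar
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.indexOf("::=") !== -1);
  const defined = new Set(
    rules.map((line) => line.slice(0, line.indexOf("::=")).trim())
  );
  const missing = new Set<string>();
  for (const line of rules) {
    const rhs = line
      .slice(line.indexOf("::=") + 3)
      .replace(/'[^']*'/g, " ") // drop quoted terminals
      .replace(/\[[^\]]*\]/g, " "); // drop character classes
    for (const name of rhs.match(/[A-Za-z_][A-Za-z0-9_]*/g) ?? []) {
      if (!defined.has(name)) missing.add(name);
    }
  }
  return Array.from(missing);
}

// HYPOTHETICAL: neither this helper nor the "ebnf" response format type
// exists in the openai package; the return shape is an assumption.
function ebnfResponseFormat(grammar: string): { type: "ebnf"; grammar: string } {
  const missing = findUndefinedRules(grammar);
  if (missing.length > 0) {
    throw new Error(`Undefined rules: ${missing.join(", ")}`);
  }
  return { type: "ebnf", grammar: grammar.trim() };
}
```

Failing fast on a typo in a rule name would avoid sending a grammar that the server could never satisfy.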
JSON EBNF with Specific Schema
const specificJsonEbnf = `
json ::= object
object ::= '{' '"name":' string ',' '"age":' number '}'
string ::= '"' [a-zA-Z0-9_ ]+ '"'
number ::= [0-9]+
`;
const specificCompletion = await openai.beta.chat.completions.parse({
model: "gpt-4o-mini",
messages: [
{
role: "system",
content:
"Generate a JSON object with the specific schema {name:string, age:number}.",
},
{ role: "user", content: "Create an example." },
],
response_format: ebnfResponseFormat(specificJsonEbnf),
});
const specificResult = specificCompletion.choices[0].message.parsed;
written with gpt-4o
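The schema-specific grammar describes a regular language, so its intent can be checked client side with an equivalent regular expression. The sketch below assumes the keys are double-quoted, as JSON requires, and uses a hypothetical parsePerson helper. Note that number ::= [0-9]+ admits leading zeros such as 007, which JSON.parse rejects, so the sketch guards the parse as well.

```typescript
// Each piece mirrors one grammar rule.
const stringRule = '"[a-zA-Z0-9_ ]+"'; // string ::= '"' [a-zA-Z0-9_ ]+ '"'
const numberRule = "[0-9]+"; // number ::= [0-9]+
const objectRule = new RegExp(
  `^\\{"name":${stringRule},"age":${numberRule}\\}$`
);

// Return the parsed object, or null if the text is outside the grammar
// or is not valid JSON (for example, a number with leading zeros).
function parsePerson(output: string): { name: string; age: number } | null {
  if (!objectRule.test(output)) return null;
  try {
    return JSON.parse(output) as { name: string; age: number };
  } catch {
    return null;
  }
}
```

For example, parsePerson('{"name":"Ada Lovelace","age":36}') returns the parsed object, while unquoted keys or stray whitespace yield null.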
Additional context
SVG EBNF
const svgEbnf = `
svg ::= '<svg' attribute* '>' content '</svg>'
attribute ::= ' ' [a-zA-Z]+ '="' [a-zA-Z0-9]+ '"'
content ::= '<circle' attribute* '/>' | '<rect' attribute* '/>'
`;
HTML EBNF
const htmlEbnf = `
html ::= '<html>' content '</html>'
content ::= '<head>' headContent '</head>' '<body>' bodyContent '</body>'
headContent ::= '<title>' text '</title>'
bodyContent ::= element*
element ::= '<div>' bodyContent '</div>' | '<p>' text '</p>'
text ::= [a-zA-Z0-9_ ]+
`;
Git Diff EBNF
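// String.raw keeps '\n' as a two-character escape inside the grammar text.
// Simplified: the index, '---', and '+++' header lines are omitted.
const gitDiffEbnf = String.raw`
diff ::= 'diff --git ' oldFile ' ' newFile '\n' chunk+
oldFile ::= 'a/' [a-zA-Z0-9./]+
newFile ::= 'b/' [a-zA-Z0-9./]+
chunk ::= '@@ ' oldRange ' ' newRange ' @@\n' changes
oldRange ::= '-' [0-9]+ ',' [0-9]+
newRange ::= '+' [0-9]+ ',' [0-9]+
changes ::= (addition | deletion | context)*
addition ::= '+' [a-zA-Z0-9_ ]+ '\n'
deletion ::= '-' [a-zA-Z0-9_ ]+ '\n'
context ::= ' ' [a-zA-Z0-9_ ]+ '\n'
`;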
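One detail worth checking for any grammar that contains '\n' terminals: an ordinary JavaScript template literal turns \n into a real line break, which would split the quoted terminal across two lines of grammar text. The short sketch below uses a made-up one-rule grammar (chunkEnd is a hypothetical rule name) to show the difference between a plain template literal and String.raw.

```typescript
// Plain template literal: \n is cooked into an actual newline character.
const cooked = `chunkEnd ::= '@@\n'`;

// String.raw: the backslash and the letter n are kept as two characters.
const raw = String.raw`chunkEnd ::= '@@\n'`;

// True when the grammar text still contains the literal two-character escape.
const cookedKeepsEscape = cooked.indexOf("\\n") !== -1;
const rawKeepsEscape = raw.indexOf("\\n") !== -1;
```

Whether a grammar parser expects the escape sequence or a raw newline inside a terminal depends on the parser, so this is something to confirm against whichever EBNF dialect ends up being supported.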
PostScript EBNF
const postscriptEbnf = `
postscript ::= '%!' commands
commands ::= command*
command ::= operand* operator
operator ::= [a-zA-Z]+
operand ::= number | string | array | name
name ::= '/' [a-zA-Z]+
array ::= '[' operand* ']'
number ::= [0-9]+('.'[0-9]+)?
string ::= '(' [a-zA-Z0-9 ]+ ')'
`;
Thanks for reporting!
This sounds like a feature request for the underlying OpenAI API and not the SDK, so I'm going to go ahead and close this issue.
Would you mind reposting at community.openai.com?