Native SDK support for Structured Outputs
Opened this issue · 15 comments
The Problem
We all hate trying to coax ChatGPT into adhering to a JSON schema. OpenAI decided to make that easier for us.
The flow:
- Declare the data transfer objects (DTOs) that you want from the model.
- Demand the response exactly in your format.
- Parse the response with code instead of a godawful regex/string extractor.
It would be really nice to have native support within this Ruby gem!
Prior Art
In python, a nice example with math Q&A
from pydantic import BaseModel
from openai import OpenAI
class Step(BaseModel):
    explanation: str
    output: str
class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str
client = OpenAI()
completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)
message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)
Potential Solve
Not sure what the ideal pydantic replacement would be, but perhaps dry-struct? DTO declarations could look like this:
require 'dry-struct'
require 'dry-types'
module Types
  include Dry.Types()
end
class Step < Dry::Struct
  attribute :explanation, Types::String
  attribute :output, Types::String
end
class MathResponse < Dry::Struct
  attribute :steps, Types::Array.of(Step)
  attribute :final_answer, Types::String
end
(Not sure if the gem should handle any schema validations, since that's purportedly OpenAI's job, but there's dry-validation if so.)
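For a DTO declaration like the one above to be usable as a response_format, something has to turn it into a JSON schema. A minimal sketch of that conversion, using plain Ruby hashes in place of dry-struct so it runs with the standard library only. The `JSON_TYPES` map and the `to_json_schema` / `schema_for` helpers are hypothetical names for illustration, not part of any gem:

```ruby
require 'json'

# Hypothetical mapping from Ruby classes to JSON Schema type names.
JSON_TYPES = { String => 'string', Integer => 'integer', Float => 'number' }.freeze

# Build an object schema from a { field_name => type } hash.
# Every field is marked required and extra properties are disallowed.
def to_json_schema(fields)
  {
    type: 'object',
    properties: fields.transform_values { |type| schema_for(type) },
    required: fields.keys.map(&:to_s),
    additionalProperties: false
  }
end

# A type is a Ruby class, a nested hash (object), or a one-element array (list).
def schema_for(type)
  case type
  when Hash  then to_json_schema(type)
  when Array then { type: 'array', items: schema_for(type.first) }
  else { type: JSON_TYPES.fetch(type) }
  end
end

STEP = { explanation: String, output: String }.freeze
MATH_RESPONSE = { steps: [STEP], final_answer: String }.freeze

schema = to_json_schema(MATH_RESPONSE)
puts JSON.pretty_generate(schema)
```

The resulting hash could be passed as the `schema:` value inside `response_format: { type: "json_schema", json_schema: { name: ..., strict: true, schema: ... } }`.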
The rest of OpenAI's math tutor example might look like
client = OpenAI::Client.new
completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor."},
      { role: "user", content: "solve 8x + 31 = 2"},
    ],
    response_format: MathResponse
  }
)
message = completion.dig("choices", 0, "message")
if message.parsed
  message.parsed.steps.each do |step|
    puts step.explanation
    puts step.output
  end
  puts message.parsed.final_answer
else
  puts message.refusal
end
I like dry-rb and use it daily, but I think it should be bare-bones and let the developer create abstractions on top of it using dry-rb, Sorbet, Active Record, etc. At least make passing and returning dry-rb objects optional.
Example:
client = OpenAI::Client.new
completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor." },
      { role: "user", content: "solve 8x + 31 = 2" },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "math_response",
        strict: true,
        schema: {
          type: "object",
          properties: {
            steps: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  explanation: {
                    type: "string",
                  },
                  output: {
                    type: "string",
                  },
                },
                required: ["explanation", "output"],
                additionalProperties: false,
              },
            },
            final_answer: {
              type: "string",
            },
          },
          required: ["steps", "final_answer"],
          additionalProperties: false,
        },
      },
    },
  }
)
And if response_format is passed, the library should use JSON.parse on the output.
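A minimal sketch of what that parsing step could look like on the caller's side today. `parsed_content` is a hypothetical helper, not part of ruby-openai, and the response below is a hand-written stand-in for a Chat Completions result rather than real API output:

```ruby
require 'json'

# Hypothetical helper: returns the parsed JSON content of the first choice,
# or nil when the model refused or sent no content.
def parsed_content(response)
  message = response.dig('choices', 0, 'message') || {}
  return nil if message['refusal']

  content = message['content']
  content && JSON.parse(content)
end

# Hand-written stand-in for a Chat Completions response.
response = {
  'choices' => [
    { 'message' => { 'content' => '{"steps":[],"final_answer":"x = -3.625"}', 'refusal' => nil } }
  ]
}

result = parsed_content(response)
puts result['final_answer']
```

Checking for a refusal before calling JSON.parse matters because a refused request carries no JSON content to parse.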
The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode
And the Node official library has support for Zod. So, making an argument against my previous comment, maybe it should support dry's Dry::Schema.JSON (optionally, I hope).
Here's a ruby script implementing StructuredOutputs::Schema that provides the functionality OpenAI added to their python SDK for super-simple structured output definitions. Replicates their cookbook example perfectly.
Key thing is this simplicity:
class MathReasoning < StructuredOutputs::Schema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end
schema = MathReasoning.new
result = client.parse(
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" }
  ],
  response_format: schema
)
To get this:
{
  "steps": [
    {
      "expression": "8x + 7 = -23",
      "explanation": "This is your starting equation."
    },
    {
      "expression": "8x = -23 - 7",
      "explanation": "Subtract 7 from both sides to isolate the term with 'x' on the left side."
    },
    {
      "expression": "8x = -30",
      "explanation": "Simplify the right side by calculating -23 - 7, which is -30."
    },
    {
      "expression": "x = -30 / 8",
      "explanation": "Divide both sides by 8 to solve for 'x'."
    },
    {
      "expression": "x = -3.75",
      "explanation": "Simplify the division to get the final value of 'x'. Alternatively, divide both sides of the equation 30 by 2 to simplify it down to 15, then divide again to get 15 divided by 4, which is -3.75."
    }
  ],
  "final_answer": "x = -3.75"
}
The response_format is very necessary. This script is excellent, and I hope it gets implemented in the project because it's extremely useful!
@alexrudall Do you think it could be included?
Hoping this gets included soon!
I support this. Is there a bounty?
I adapted @jeremedia's script to work with Rails. I renamed Schema to BaseSchema so as not to conflict with the existing schema.rb.
class BaseSchema
  MAX_OBJECT_PROPERTIES = 100
  MAX_NESTING_DEPTH = 5
  def initialize(name = nil, &block)
    # Use the provided name or derive from class name
    @name = name || self.class.name.split('::').last.downcase
    # Initialize the base schema structure
    @schema = {
      type: 'object',
      properties: {},
      required: [],
      additionalProperties: false,
      strict: true
    }
    @definitions = {}
    # Execute the provided block to define the schema
    instance_eval(&block) if block_given?
    validate_schema
  end
  # Convert the schema to a hash format
  def to_hash
    {
      name: @name,
      description: 'Schema for the structured response',
      schema: @schema.merge({ '$defs' => @definitions })
    }
  end
  private
  # Define a string property
  def string(name, enum: nil, description: nil)
    add_property(name, { type: 'string', enum:, description: }.compact)
  end
  # Define a number property
  def number(name)
    add_property(name, { type: 'number' })
  end
  # Define a boolean property
  def boolean(name)
    add_property(name, { type: 'boolean' })
  end
  # Define an object property
  def object(name, &block)
    properties = {}
    required = []
    BaseSchema.new.tap do |s|
      s.instance_eval(&block)
      properties = s.instance_variable_get(:@schema)[:properties]
      required = s.instance_variable_get(:@schema)[:required]
    end
    add_property(name, { type: 'object', properties:, required:, additionalProperties: false })
  end
  # Define an array property
  def array(name, items:)
    add_property(name, { type: 'array', items: })
  end
  # Define an anyOf property
  def any_of(name, schemas)
    add_property(name, { anyOf: schemas })
  end
  # Define a reusable schema component
  def define(name, &block)
    @definitions[name] = BaseSchema.new(&block).instance_variable_get(:@schema)
  end
  # Reference a defined schema component
  def ref(name)
    { '$ref' => "#/$defs/#{name}" }
  end
  # Add a property to the schema
  def add_property(name, definition)
    @schema[:properties][name] = definition
    @schema[:required] << name
  end
  # Validate the schema against defined limits
  def validate_schema
    properties_count = count_properties(@schema)
    raise 'Exceeded maximum number of object properties' if properties_count > MAX_OBJECT_PROPERTIES
    max_depth = calculate_max_depth(@schema)
    raise 'Exceeded maximum nesting depth' if max_depth > MAX_NESTING_DEPTH
  end
  # Count the total number of properties in the schema
  def count_properties(schema)
    return 0 unless schema.is_a?(Hash) && schema[:properties]
    count = schema[:properties].size
    schema[:properties].each_value do |prop|
      count += count_properties(prop)
    end
    count
  end
  # Calculate the maximum nesting depth of the schema
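  def calculate_max_depth(schema, current_depth = 1)
    return current_depth unless schema.is_a?(Hash) && schema[:properties]
    max_child_depth = schema[:properties].values.map do |prop|
      calculate_max_depth(prop, current_depth + 1)
    end.max
    # `max` is nil when the properties hash is empty (e.g. BaseSchema.new without a block),
    # so drop it before comparing to avoid an ArgumentError
    [current_depth, max_child_depth].compact.max
  end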
end
require 'json'
require 'dry-schema'
require 'ostruct'
# Client class for interacting with OpenAI API
class OpenAISchemaClient
  def initialize
    OpenAI.configure do |config|
      config.access_token = ENV['OPENPIPE_ACCESS_TOKEN']
      config.uri_base = 'https://app.openpipe.ai/api/v1'
      config.log_errors = true
    end
    @client = OpenAI::Client.new
  end
  # Send a request to OpenAI API and parse the response
  def parse(model:, messages:, response_format:)
    response = @client.chat(
      parameters: {
        model:,
        messages:,
        response_format: {
          type: 'json_schema',
          json_schema: response_format.to_hash
        }
      }
    )
    message = response['choices'][0]['message']
    # Check for a refusal before parsing: a refusal carries no JSON content,
    # and JSON.parse(nil) would raise
    if message['refusal']
      OpenStruct.new(refusal: message['refusal'], parsed: nil)
    else
      OpenStruct.new(refusal: nil, parsed: JSON.parse(message['content']))
    end
  end
end
# example usage:
# begin
#   # Create an OpenAI client
#   client = OpenAISchemaClient.new
#   # Create an instance of the MathReasoning schema
#   schema = MathReasoning.new
#   # Send a request to OpenAI API
#   result = client.parse(
#     model: 'gpt-4o-2024-08-06',
#     messages: [
#       { role: 'system', content: 'You are a helpful math tutor. Guide the user through the solution step by step.' },
#       { role: 'user', content: 'how can I solve 8x + 7 = -23' }
#     ],
#     response_format: schema
#   )
#   # Handle the response
#   if result.refusal
#     puts "The model refused to respond: #{result.refusal}"
#   else
#     puts JSON.pretty_generate(result.parsed)
#   end
# rescue StandardError => e
#   puts "Error: #{e}"
# end
class MathReasoning < BaseSchema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end
Any updates on this? Will be a great feature!
Based on https://gist.github.com/jeremedia/7e874bc6283a10ce8b4d2746413d3ce4#file-ruby-structured-outputs-v4-rb
I got this working. The API call is a bit different; my example is below:
require 'openai'
require_relative 'structured_outputs'
class WorkbankSchema < StructuredOutputs::Schema
  def initialize
    super do
      define :word do
        string :japanese
        string :romaji
        string :english
      end
      array :nouns, items: ref(:word)
      array :verbs, items: ref(:word)
      array :adjectives, items: ref(:word)
      array :adverbs, items: ref(:word)
    end
  end
end
class OpenAIService
  def self.client
    @client ||= OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
  end
  def self.generate_vocabulary(domain, word_count_per_category: 5)
    prompt = <<~PROMPT
      Generate a Japanese vocabulary wordbank for the domain: #{domain}
      Format the response as a JSON array of objects with the following structure:
      [
        {
          "word": "Japanese word",
          "category": "noun|verb|adjective|adverb"
        }
      ]
      Include #{word_count_per_category} words for each category (noun, verb, adjective, adverb) that are commonly used in #{domain}.
      Make sure the words are appropriate for the domain and useful for constructing basic sentences.
    PROMPT
    schema = WorkbankSchema.new
    response = client.chat(
      parameters: {
        model: "gpt-4o-2024-08-06",
        messages: [
          { role: "system", content: "You are a Japanese language teacher creating vocabulary lists." },
          { role: "user", content: prompt }
        ],
        response_format: { type: 'json_schema', json_schema: schema},
        temperature: 0.7
      }
    )
    begin
      JSON.parse(response.dig("choices", 0, "message", "content"))
    rescue JSON::ParserError => e
      puts "Error parsing OpenAI response: #{e.message}"
      nil
    rescue StandardError => e
      puts "Error in OpenAI request: #{e.message}"
      nil
    end
  end
end
This would be a really cool feature.
@alexrudall any updates here?
Responses API (with tools):
response = client.responses.create(
  parameters: {
    model: "gpt-4o",
    input: "solve 8x + 31 = 2",
    tools: [{ type: "mcp", server_url: "..." }],
    text: {
      format: {
        type: "json_schema",
        name: "math_response",
        schema: your_schema,
        strict: true
      }
    }
  }
)
The gem passes parameters directly to OpenAI's REST API - structured outputs work out of the box.
https://platform.openai.com/docs/guides/structured-outputs#examples
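Since the gem returns the raw response hash, the JSON text still has to be pulled out and parsed by the caller. A minimal sketch, assuming the response follows the shape documented for the Responses API (an `output` array whose `message` items hold `output_text` parts). `parsed_output` is a hypothetical helper, and the response below is a hand-written stand-in, not real API output:

```ruby
require 'json'

# Hypothetical helper: finds the first output_text part among the message
# items and parses it as JSON. Returns nil if there is none (e.g. a refusal).
def parsed_output(response)
  parts = (response['output'] || [])
          .select { |item| item['type'] == 'message' }
          .flat_map { |item| item['content'] || [] }
  text_part = parts.find { |part| part['type'] == 'output_text' }
  text_part && JSON.parse(text_part['text'])
end

# Hand-written stand-in for a response; the tool-call item is skipped.
response = {
  'output' => [
    { 'type' => 'mcp_call', 'name' => 'example_tool' },
    { 'type' => 'message', 'role' => 'assistant',
      'content' => [{ 'type' => 'output_text', 'text' => '{"final_answer":"x = -3.625"}' }] }
  ]
}

puts parsed_output(response)['final_answer']
```

Filtering on the item type matters when tools are enabled, because tool calls appear in the same `output` array as the final message.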