alexrudall/ruby-openai

Native SDK support for Structured Outputs

Opened this issue · 15 comments

The Problem

We all hate trying to coax ChatGPT into adhering to a JSON schema. OpenAI decided to make that easier for us.

The flow:

  • Declare the data transfer objects (DTOs) that you want from the model.
  • Demand the response exactly in your format.
  • Parse the response with code instead of godawful regex/string extraction.

It would be really nice to have native support within this ruby gem!

Prior Art

In Python, a nice example with a math Q&A:

from pydantic import BaseModel

from openai import OpenAI


class Step(BaseModel):
    explanation: str
    output: str


class MathResponse(BaseModel):
    steps: list[Step]
    final_answer: str


client = OpenAI()

completion = client.beta.chat.completions.parse(
    model="gpt-4o-2024-08-06",
    messages=[
        {"role": "system", "content": "You are a helpful math tutor."},
        {"role": "user", "content": "solve 8x + 31 = 2"},
    ],
    response_format=MathResponse,
)

message = completion.choices[0].message
if message.parsed:
    print(message.parsed.steps)
    print(message.parsed.final_answer)
else:
    print(message.refusal)

Potential Solve

Not sure what the ideal pydantic replacement would be, but perhaps dry-struct? DTO declarations could look like this:

require 'dry-struct'
require 'dry-types'

module Types
  include Dry.Types()
end

class Step < Dry::Struct
  attribute :explanation, Types::String
  attribute :output, Types::String
end

class MathResponse < Dry::Struct
  attribute :steps, Types::Array.of(Step)
  attribute :final_answer, Types::String
end

(Not sure if the gem should handle any schema validations, since that's purportedly OpenAI's job, but there's dry-validation if so; a rough sketch follows the example below.)

The rest of OpenAI's math tutor example might look like

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor."},
      { role: "user", content: "solve 8x + 31 = 2"},
    ],
    response_format: MathResponse
  }
)

message = completion.dig("choices", 0, "message")
if message.parsed
  message.parsed.steps.each do |step|
    puts step.explanation
    puts step.output
  end
  puts message.parsed.final_answer
else
  puts message.refusal
end
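
If the gem (or the caller) did want to double-check responses client-side, as the parenthetical above mentions, dry-validation could handle it. A rough sketch, assuming the model's content has already been JSON-parsed into a hash (MathResponseContract and parsed_response are hypothetical names, not part of any gem):

require 'dry/validation'

# Hypothetical belt-and-braces check; OpenAI's strict mode should already
# guarantee this shape, so this is purely optional.
class MathResponseContract < Dry::Validation::Contract
  params do
    required(:steps).array(:hash) do
      required(:explanation).filled(:string)
      required(:output).filled(:string)
    end
    required(:final_answer).filled(:string)
  end
end

result = MathResponseContract.new.call(parsed_response) # parsed_response is a Hash
puts result.errors.to_h unless result.success?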
bastos commented

I like dry-rb and use it daily, but I think it should be bare-bones and let the developer create abstractions on top of it using dry-rb, Sorbet, Active Record, etc. At least make passing and returning dry-rb objects optional.

Example:

client = OpenAI::Client.new

completion = client.chat(
  parameters: {
    model: "gpt-4o-2024-08-06",
    messages: [
      { role: "system", content: "You are a helpful math tutor." },
      { role: "user", content: "solve 8x + 31 = 2" },
    ],
    response_format: {
      type: "json_schema",
      json_schema: {
        name: "math_response",
        strict: true,
        schema: {
          type: "object",
          properties: {
            steps: {
              type: "array",
              items: {
                type: "object",
                properties: {
                  explanation: {
                    type: "string",
                  },
                  output: {
                    type: "string",
                  },
                },
                required: ["explanation", "output"],
                additionalProperties: false,
              },
            },
            final_answer: {
              type: "string",
            },
          },
          required: ["steps", "final_answer"],
          additionalProperties: false,
        },
      },
    },
  }
)

And if a json_schema response_format is passed, the library should run JSON.parse on the output.
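
A rough sketch of what that would save callers from writing by hand today, using the completion from the call above (the key names mirror the math_response schema):

require 'json'

# Hand-rolled today: pull out the message and JSON.parse its content.
# A native integration could do this automatically when type == "json_schema".
message = completion.dig("choices", 0, "message")

if message["refusal"]
  puts message["refusal"]
else
  parsed = JSON.parse(message["content"])
  parsed["steps"].each { |step| puts step["explanation"], step["output"] }
  puts parsed["final_answer"]
end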

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

bastos commented

The groq-ruby gem has an optional dry-schema integration: https://github.com/drnic/groq-ruby#using-dry-schema-with-json-mode

And the official Node library has support for Zod. So, arguing against my previous comment, maybe it should support dry-rb's Dry::Schema.JSON (optionally, I hope).
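
For illustration, here's a sketch of what that optional integration could look like, assuming dry-schema's json_schema extension (nothing below is in this gem today):

require 'dry/schema'

# dry-schema ships a json_schema extension that exports a JSON Schema hash.
Dry::Schema.load_extensions(:json_schema)

MathSchema = Dry::Schema.JSON do
  required(:steps).array(:hash) do
    required(:explanation).filled(:string)
    required(:output).filled(:string)
  end
  required(:final_answer).filled(:string)
end

# The gem could accept the schema object directly; today you build the wrapper
# yourself. Note that OpenAI's strict mode also wants additionalProperties: false
# on every object, which may need to be layered on top of the exported hash.
response_format = {
  type: "json_schema",
  json_schema: {
    name: "math_response",
    strict: true,
    schema: MathSchema.json_schema
  }
}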

Here's a Ruby script implementing StructuredOutputs::Schema that provides the functionality OpenAI added to their Python SDK for super-simple structured output definitions. It replicates their cookbook example perfectly.

The key thing is this simplicity:

class MathReasoning < StructuredOutputs::Schema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end

schema = MathReasoning.new
  
result = client.parse(
  model: "gpt-4o-2024-08-06",
  messages: [
    { role: "system", content: "You are a helpful math tutor. Guide the user through the solution step by step." },
    { role: "user", content: "how can I solve 8x + 7 = -23" }
  ],
  response_format: schema
)

To get this:

{
  "steps": [
    {
      "expression": "8x + 7 = -23",
      "explanation": "This is your starting equation."
    },
    {
      "expression": "8x = -23 - 7",
      "explanation": "Subtract 7 from both sides to isolate the term with 'x' on the left side."
    },
    {
      "expression": "8x = -30",
      "explanation": "Simplify the right side by calculating -23 - 7, which is -30."
    },
    {
      "expression": "x = -30 / 8",
      "explanation": "Divide both sides by 8 to solve for 'x'."
    },
    {
      "expression": "x = -3.75",
      "explanation": "Simplify the division to get the final value of 'x'. Alternatively, divide both sides of the equation 30 by 2 to simplify it down to 15, then divide again to get 15 divided by 4, which is -3.75."
    }
  ],
  "final_answer": "x = -3.75"
}

Support for response_format is very much needed. This script is excellent, and I hope it gets implemented in the project because it's extremely useful!
@alexrudall Do you think it could be included?

Strongly endorse @bastos's approach.

Hoping this gets included soon!

adenta commented

I support this. Is there a bounty?

adenta commented

I adapted @jeremedia's script to work with Rails. I renamed Schema to BaseSchema so as not to conflict with the existing schema.rb.

class BaseSchema
  MAX_OBJECT_PROPERTIES = 100
  MAX_NESTING_DEPTH = 5

  def initialize(name = nil, &block)
    # Use the provided name or derive from class name
    @name = name || self.class.name.split('::').last.downcase
    # Initialize the base schema structure
    @schema = {
      type: 'object',
      properties: {},
      required: [],
      additionalProperties: false
    }
    @definitions = {}
    # Execute the provided block to define the schema
    instance_eval(&block) if block_given?
    validate_schema
  end

  # Convert the schema to the json_schema wrapper hash OpenAI expects
  def to_hash
    {
      name: @name,
      description: 'Schema for the structured response',
      strict: true,
      schema: @schema.merge({ '$defs' => @definitions })
    }
  end

  private

  # Define a string property
  def string(name, enum: nil, description: nil)
    add_property(name, { type: 'string', enum:, description: }.compact)
  end

  # Define a number property
  def number(name)
    add_property(name, { type: 'number' })
  end

  # Define a boolean property
  def boolean(name)
    add_property(name, { type: 'boolean' })
  end

  # Define an object property
  def object(name, &block)
    properties = {}
    required = []
    BaseSchema.new.tap do |s|
      s.instance_eval(&block)
      properties = s.instance_variable_get(:@schema)[:properties]
      required = s.instance_variable_get(:@schema)[:required]
    end
    add_property(name, { type: 'object', properties:, required:, additionalProperties: false })
  end

  # Define an array property
  def array(name, items:)
    add_property(name, { type: 'array', items: })
  end

  # Define an anyOf property
  def any_of(name, schemas)
    add_property(name, { anyOf: schemas })
  end

  # Define a reusable schema component
  def define(name, &block)
    @definitions[name] = BaseSchema.new(&block).instance_variable_get(:@schema)
  end

  # Reference a defined schema component
  def ref(name)
    { '$ref' => "#/$defs/#{name}" }
  end

  # Add a property to the schema
  def add_property(name, definition)
    @schema[:properties][name] = definition
    @schema[:required] << name
  end

  # Validate the schema against defined limits
  def validate_schema
    properties_count = count_properties(@schema)
    raise 'Exceeded maximum number of object properties' if properties_count > MAX_OBJECT_PROPERTIES

    max_depth = calculate_max_depth(@schema)
    raise 'Exceeded maximum nesting depth' if max_depth > MAX_NESTING_DEPTH
  end

  # Count the total number of properties in the schema
  def count_properties(schema)
    return 0 unless schema.is_a?(Hash) && schema[:properties]

    count = schema[:properties].size
    schema[:properties].each_value do |prop|
      count += count_properties(prop)
    end
    count
  end

  # Calculate the maximum nesting depth of the schema
  def calculate_max_depth(schema, current_depth = 1)
    return current_depth unless schema.is_a?(Hash) && schema[:properties]

    max_child_depth = schema[:properties].values.map do |prop|
      calculate_max_depth(prop, current_depth + 1)
    end.max
    # compact: an empty properties hash leaves max_child_depth nil
    [current_depth, max_child_depth].compact.max
  end
end

require 'json'
require 'openai'
require 'ostruct'

# Client class for interacting with OpenAI API
class OpenAISchemaClient
  def initialize
    OpenAI.configure do |config|
      config.access_token = ENV['OPENPIPE_ACCESS_TOKEN']
      config.uri_base = 'https://app.openpipe.ai/api/v1'
      config.log_errors = true
    end
    @client = OpenAI::Client.new
  end

  # Send a request to OpenAI API and parse the response
  def parse(model:, messages:, response_format:)
    response = @client.chat(
      parameters: {
        model:,
        messages:,
        response_format: {
          type: 'json_schema',
          json_schema: response_format.to_hash
        }
      }
    )

    message = response['choices'][0]['message']

    if message['refusal']
      # Refusals come back with nil content, so don't try to parse it
      OpenStruct.new(refusal: message['refusal'], parsed: nil)
    else
      OpenStruct.new(refusal: nil, parsed: JSON.parse(message['content']))
    end
  end
end

# example usage:

# begin
#   # Create an OpenAI client
#   client = OpenAISchemaClient.new
#   # Create an instance of the MathReasoning schema
#   schema = MathReasoning.new

#   # Send a request to OpenAI API
#   result = client.parse(
#     model: 'gpt-4o-2024-08-06',
#     messages: [
#       { role: 'system', content: 'You are a helpful math tutor. Guide the user through the solution step by step.' },
#       { role: 'user', content: 'how can I solve 8x + 7 = -23' }
#     ],
#     response_format: schema
#   )

#   # Handle the response
#   if result.refusal
#     puts "The model refused to respond: #{result.refusal}"

#   else
#     puts JSON.pretty_generate(result.parsed)

#   end
# rescue StandardError => e
#   puts "Error: #{e}"
# end

class MathReasoning < BaseSchema
  def initialize
    super do
      define :step do
        string :explanation
        string :output
      end
      array :steps, items: ref(:step)
      string :final_answer
    end
  end
end

Any updates on this? It would be a great feature 🚀!

Based on https://gist.github.com/jeremedia/7e874bc6283a10ce8b4d2746413d3ce4#file-ruby-structured-outputs-v4-rb, I got this working. The API call is a bit different, but my example is below:

require 'openai'
require_relative 'structured_outputs'

class WorkbankSchema < StructuredOutputs::Schema
  def initialize
    super do
      define :word do
        string :japanese
        string :romaji
        string :english
      end
      array :nouns, items: ref(:word)
      array :verbs, items: ref(:word)
      array :adjectives, items: ref(:word)
      array :adverbs, items: ref(:word)
    end
  end
end

class OpenAIService
  def self.client
    @client ||= OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
  end

  def self.generate_vocabulary(domain, word_count_per_category: 5)
    prompt = <<~PROMPT
      Generate a Japanese vocabulary wordbank for the domain: #{domain}
      Format the response as a JSON array of objects with the following structure:
      [
        {
          "word": "Japanese word",
          "category": "noun|verb|adjective|adverb"
        }
      ]
      Include #{word_count_per_category} words for each category (noun, verb, adjective, adverb) that are commonly used in #{domain}.
      Make sure the words are appropriate for the domain and useful for constructing basic sentences.
    PROMPT

    schema = WorkbankSchema.new
    response = client.chat(
      parameters: {
        model: "gpt-4o-2024-08-06",
        messages: [
          { role: "system", content: "You are a Japanese language teacher creating vocabulary lists." },
          { role: "user", content: prompt }
        ],
        response_format: { type: 'json_schema', json_schema: schema},
        temperature: 0.7
      }
    )

    begin
      JSON.parse(response.dig("choices", 0, "message", "content"))
    rescue JSON::ParserError => e
      puts "Error parsing OpenAI response: #{e.message}"
      nil
    rescue StandardError => e
      puts "Error in OpenAI request: #{e.message}"
      nil
    end
  end
end
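
For context, invoking the service might look like this (just a sketch; the domain string is only an example):

wordbank = OpenAIService.generate_vocabulary("ordering food at a restaurant")
puts JSON.pretty_generate(wordbank) if wordbank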

This would be a really cool feature.

@alexrudall any updates here?

Responses API (with tools):

response = client.responses.create(
  parameters: {
    model: "gpt-4o",
    input: "solve 8x + 31 = 2",
    tools: [{ type: "mcp", server_url: "..." }],
    text: {
      format: {
        type: "json_schema",
        name: "math_response",
        schema: your_schema,
        strict: true
      }
    }
  }
)

The gem passes parameters directly to OpenAI's REST API, so structured outputs work out of the box.

https://platform.openai.com/docs/guides/structured-outputs#examples
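
And to consume the structured result, something like this works today (a sketch, assuming the usual Responses API output shape of message items containing output_text content):

require 'json'

# Find the assistant message among the output items (tool calls may precede it),
# grab its output_text content, and parse the JSON that the schema enforced.
message = response["output"].find { |item| item["type"] == "message" }
text    = message&.dig("content")&.find { |c| c["type"] == "output_text" }

if text
  math = JSON.parse(text["text"])
  puts math["final_answer"]
end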