Enabling JSON Format Responses for Image Inputs in Gemini 1.5

Question

kihapper opened this issue 8 months ago · 5 comments

The new Gemini 1.5 model has the capability to enforce JSON responses, but there is limited documentation available on how to implement this, particularly for obtaining JSON responses from image inputs.

Could you provide an example of how to achieve this?
Neither the linked REST example nor the official documentation made the process clear to me.

Answer 1 · 2024-04-16T15:12:40.000Z

I am calling it as shown below in Python, but I get this error message:
ValueError: Protocol message GenerationConfig has no "response_mime_type" field.

# Configure the JSON response (no safety settings are set here)
generation_config = {
    'response_mime_type': 'application/json'  # Add this line for JSON response
}

# Call generate_content with the updated configuration
response = model.generate_content(
    [prompt_variable, image_data],
    generation_config=generation_config,
    stream=False
)

response.resolve()

Answer 2 · 2024-04-16T18:35:33.000Z

Hi @kihapper, thanks for reporting the issue. We are working on this, and it should be fixed in the upcoming SDK release. I will update this thread once the SDK is published.

Answer 3 · 2024-04-17T07:56:51.000Z

@TYMichaelChen
Thanks for this! Do you know roughly when these SDKs will be rolled out for Python and Node.js?

Answer 4 · 2024-04-17T08:59:31.000Z

As Michael says, we're still working on the feature, though some support in the Python SDK is available at HEAD. You can run pip install git+https://github.com/google/generative-ai-python to get the latest version and try it out, but caveat emptor.

For the specific proto field you're trying to use, as long as you have google-ai-generativelanguage>=0.6.2, you should be good to go.
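To check whether your environment meets that requirement, something like the following might help. It is only a rough sketch: parse_version and has_min_version are hypothetical helper names, not part of any Google SDK, and the version parsing is deliberately simplistic (numeric parts only).

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.6.2' into (0, 6, 2); stops at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def has_min_version(package, minimum):
    """Return True if `package` is installed at `minimum` or newer."""
    try:
        installed = version(package)
    except PackageNotFoundError:
        # Package is not installed in this environment.
        return False
    return parse_version(installed) >= parse_version(minimum)


if __name__ == "__main__":
    # 0.6.2 is the minimum mentioned in this thread for response_mime_type.
    ok = has_min_version("google-ai-generativelanguage", "0.6.2")
    print("response_mime_type should be available:", ok)
```

If this prints False, upgrading the package with pip is probably the first thing to try.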

Answer 5 · 2024-04-17T14:18:55.000Z

The newest SDK on PyPI should now have the changes: https://pypi.org/project/google-generativeai/0.5.1/. Let us know if you are still seeing this issue.
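For anyone landing here looking for the image case from the original question, here is a minimal sketch of how it might look with the updated SDK. Several things are assumptions rather than confirmed in this thread: the model name gemini-1.5-pro-latest, the GOOGLE_API_KEY environment variable, the file name receipt.jpg, and the helper names, which are purely illustrative. It also assumes Pillow is installed for loading the image.

```python
import json
import os


def build_generation_config():
    """Generation config asking the model for a JSON response body."""
    return {"response_mime_type": "application/json"}


def parse_json_response(text):
    """Parse the model's text output as JSON; raises ValueError if it is not JSON."""
    return json.loads(text)


def describe_image_as_json(image_path, prompt, model_name="gemini-1.5-pro-latest"):
    """Send a prompt plus an image and return the parsed JSON reply."""
    # Imported lazily so the helpers above work without the SDK installed.
    import google.generativeai as genai
    import PIL.Image

    # Assumption: the API key is stored in the GOOGLE_API_KEY environment variable.
    genai.configure(api_key=os.environ["GOOGLE_API_KEY"])
    model = genai.GenerativeModel(model_name)  # model name is an assumption
    image = PIL.Image.open(image_path)

    response = model.generate_content(
        [prompt, image],
        generation_config=build_generation_config(),
    )
    return parse_json_response(response.text)


if __name__ == "__main__":
    # Hypothetical file name and prompt; replace with your own.
    result = describe_image_as_json(
        "receipt.jpg",
        "List the items in this image as JSON with keys 'name' and 'price'.",
    )
    print(result)
```

Describing the keys you want in the prompt itself tends to help, since response_mime_type only asks for JSON output and does not by itself say which fields to return.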