/Gemini

Primary LanguageJupyter Notebook

image

Welcome to the beginner's guide to using the Gemini AI API in Python! This guide will help you tap into Google's powerful Gemini large language models, enabling you to create intelligent applications that understand both text and images. With step-by-step instructions, you'll learn how to set up your development environment, access the API, and generate text responses from various inputs.


## Installation

To get started, you'll need to install the Python SDK for the Gemini API. It's conveniently packaged in the [`google-generativeai`](https://pypi.org/project/google-generativeai/) package. Use pip to install the dependency:

```bash
pip install -q -U google-generativeai

Getting Started

API Key Setup

Before you can use the Gemini API, you'll need an API key. You can obtain one easily by visiting the API Key Generation Page. Once you have your key, you can set it up in two ways:

  1. Environment Variable: Put your API key in the GOOGLE_API_KEY environment variable.
  2. Pass Key to SDK: Use genai.configure(api_key=YOUR_API_KEY) to pass your API key to the SDK.
import google.generativeai as genai

# Option 1: Use an environment variable
GOOGLE_API_KEY = 'YOUR_API_KEY'
os.environ['GOOGLE_API_KEY'] = GOOGLE_API_KEY

# Option 2: Pass key to SDK
genai.configure(api_key=GOOGLE_API_KEY)

Choose Your Model

You can select the appropriate Gemini model for your needs:

  • gemini-pro: Optimized for text-only prompts.
  • gemini-pro-vision: Optimized for text-and-images prompts.
# Example: Select the gemini-pro model
model = genai.GenerativeModel('gemini-pro')

Generate Text from Text Inputs

For text-only prompts, use the generate_content method with the gemini-pro model:

response = model.generate_content("What is the Gemini.AI?")

Response Formatting

You can access the response text using response.text. To display it in Markdown format, use the to_markdown function:

from IPython.display import Markdown
from IPython.display import display

display(Markdown(response.text))

Generate Text from Image and Text Inputs

Gemini's gemini-pro-vision model can handle multimodal prompts. To include an image, use the generate_content method:

import PIL.Image

image = PIL.Image.open('image.jpg')
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(image)

Multimodal Prompts

You can pass both text and images in a prompt by using a list:

response = model.generate_content(["Write a short description", image], stream=True)
response.resolve()

Next Steps

Congratulations! You've learned the basics of using the Gemini AI API with Python. In our upcoming articles, we'll explore advanced use cases and dive into its wide-ranging applications. Stay tuned for exciting insights into the world of Gemini AI API!

License

This project is licensed under the MIT License - see the LICENSE file for details.


Feel free to customize the content and add any additional sections or details relevant to your project. Don't forget to replace `'YOUR_API_KEY'` with your actual API key and include any images or files specific to your project.