Welcome to the beginner's guide to using the Gemini AI API in Python! This guide will help you tap into Google's powerful Gemini large language models, enabling you to create intelligent applications that understand both text and images. With step-by-step instructions, you'll learn how to set up your development environment, access the API, and generate text responses from various inputs.
## Installation
To get started, you'll need to install the Python SDK for the Gemini API. It's conveniently packaged in the [`google-generativeai`](https://pypi.org/project/google-generativeai/) package. Use pip to install the dependency:
```bash
pip install -q -U google-generativeai
Before you can use the Gemini API, you'll need an API key. You can obtain one easily by visiting the API Key Generation Page. Once you have your key, you can set it up in two ways:
- Environment Variable: Put your API key in the
GOOGLE_API_KEY
environment variable. - Pass Key to SDK: Use
genai.configure(api_key=YOUR_API_KEY)
to pass your API key to the SDK.
import google.generativeai as genai
# Option 1: Use an environment variable
GOOGLE_API_KEY = 'YOUR_API_KEY'
os.environ['GOOGLE_API_KEY'] = GOOGLE_API_KEY
# Option 2: Pass key to SDK
genai.configure(api_key=GOOGLE_API_KEY)
You can select the appropriate Gemini model for your needs:
gemini-pro
: Optimized for text-only prompts.gemini-pro-vision
: Optimized for text-and-images prompts.
# Example: Select the gemini-pro model
model = genai.GenerativeModel('gemini-pro')
For text-only prompts, use the generate_content
method with the gemini-pro
model:
response = model.generate_content("What is the Gemini.AI?")
You can access the response text using response.text
. To display it in Markdown format, use the to_markdown
function:
from IPython.display import Markdown
from IPython.display import display
display(Markdown(response.text))
Gemini's gemini-pro-vision
model can handle multimodal prompts. To include an image, use the generate_content
method:
import PIL.Image
image = PIL.Image.open('image.jpg')
model = genai.GenerativeModel('gemini-pro-vision')
response = model.generate_content(image)
You can pass both text and images in a prompt by using a list:
response = model.generate_content(["Write a short description", image], stream=True)
response.resolve()
Congratulations! You've learned the basics of using the Gemini AI API with Python. In our upcoming articles, we'll explore advanced use cases and dive into its wide-ranging applications. Stay tuned for exciting insights into the world of Gemini AI API!
This project is licensed under the MIT License - see the LICENSE file for details.
Feel free to customize the content and add any additional sections or details relevant to your project. Don't forget to replace `'YOUR_API_KEY'` with your actual API key and include any images or files specific to your project.