This guide provides a comprehensive walkthrough on how to set up and use the Google Generative AI (GenAI) library in Python. It covers everything from initial setup to advanced features, enabling you to integrate Google's powerful generative models into your applications.
Before you begin, ensure you have a recent version of Python installed (the library requires Python 3.9 or later). You can check your version by running:
python --version
in your terminal. If needed, download the latest version from python.org.
It's recommended to use a virtual environment to manage dependencies for your project. This helps avoid conflicts with other Python projects. Here's how to set one up:
Use the following command to create a virtual environment named genai_env:
python -m venv genai_env
On macOS and Linux, use:
source genai_env/bin/activate
On Windows, use:
genai_env\Scripts\activate
The primary package you need is google-generativeai, which allows you to interact with Google's Generative AI models. You should also install python-dotenv for managing your API key securely, and optionally Pillow if you plan to work with images.
Install these packages using pip:
pip install -U google-generativeai python-dotenv Pillow
To use the Google Generative AI API, you need to configure your API key. Here’s how you can do it:
Create a .env file: in your project directory, create a file named .env, then open it and add your Google API key as follows:
GOOGLE_API_KEY=your_google_api_key_here
Replace your_google_api_key_here with your actual API key. This ensures your API key remains secure and is not hard-coded in your scripts.
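As a quick sanity check, you can parse the .env file yourself to confirm the key line is present. This is only an illustrative sketch of the KEY=value format — in practice python-dotenv handles the parsing for you, and the key value below is a placeholder:

```python
from pathlib import Path

def read_env_file(path):
    """Parse KEY=value lines from a .env-style file into a dict."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if line and not line.startswith("#") and "=" in line:
            key, _, value = line.partition("=")
            values[key.strip()] = value.strip()
    return values

# Write a sample .env file for demonstration, then read it back
Path(".env").write_text("GOOGLE_API_KEY=your_google_api_key_here\n")
env = read_env_file(".env")
print("GOOGLE_API_KEY present:", "GOOGLE_API_KEY" in env)
```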
Next, you need to import the required libraries and configure the Generative AI model using your API key.
import google.generativeai as genai
import os
from dotenv import load_dotenv
# Load API key from .env file
load_dotenv()
google_api_key = os.getenv('GOOGLE_API_KEY')
# Configure the Generative AI model
genai.configure(api_key=google_api_key)
This code loads the API key from the .env file and configures the google-generativeai library to use this key.
With the model set up, you can now generate content using the API. Here’s a simple example to get you started:
# Create an instance of the GenerativeModel
model = genai.GenerativeModel('gemini-pro')
# Generate content based on a prompt
response = model.generate_content("Hello, can you introduce yourself?")
print(response.text)
This code generates a friendly introduction based on the provided prompt and prints the response.
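One caveat worth knowing: accessing response.text can raise a ValueError when the request was blocked by safety filters or no text candidate was returned. A small defensive helper makes scripts more robust; the FakeResponse class below is only a stand-in so the sketch runs without an API call:

```python
class FakeResponse:
    """Stand-in for the SDK's response object, for illustration only."""
    def __init__(self, text):
        self.text = text

def safe_text(response, fallback="(no text returned)"):
    """Return response.text, or a fallback when it is empty or raises."""
    try:
        return response.text or fallback
    except (AttributeError, ValueError):
        return fallback

print(safe_text(FakeResponse("Hello, I'm an AI assistant.")))
print(safe_text(FakeResponse("")))
```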
To create an interactive automated conversation, you can use a loop that continuously sends user inputs to the model and prints the responses. Here’s how you can do it:
# Create an instance of the GenerativeModel
model = genai.GenerativeModel('gemini-pro')
# Start a chat session
chat = model.start_chat()
def automated_conversation():
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            print("Conversation ended.")
            break
        response = chat.send_message(user_input)
        print("AI:", response.text)

# Start the automated conversation
automated_conversation()
This script initiates a chat session and allows for an ongoing conversation with the AI until the user types “exit”.
If you want to use the GenAI library for image-related tasks, you can load and process images using the Pillow library. Note that image input requires a vision-capable model such as gemini-1.5-flash; the text-only gemini-pro model will not accept images. Here's an example:
from PIL import Image

# Load an image
image_path = "path/to/your/image.jpg"
image = Image.open(image_path)

# Use a vision-capable model for image input
vision_model = genai.GenerativeModel('gemini-1.5-flash')

# Define a prompt for the image
prompt = "Describe the content of this image."

# Send the image and prompt to the API
response = vision_model.generate_content([prompt, image])

# Print the response
print("Image Analysis:")
print(response.text)
Here is a full example that includes all the steps mentioned above:
import google.generativeai as genai
import os
from dotenv import load_dotenv
from PIL import Image
# Load API key from .env file
load_dotenv()
google_api_key = os.getenv('GOOGLE_API_KEY')
# Configure the Generative AI model
genai.configure(api_key=google_api_key)
# Create an instance of the GenerativeModel
model = genai.GenerativeModel('gemini-pro')
# Generate content based on a prompt
response = model.generate_content("Hello, can you introduce yourself?")
print(response.text)
# Start a chat session
chat = model.start_chat()
def automated_conversation():
    while True:
        user_input = input("You: ")
        if user_input.lower() == "exit":
            print("Conversation ended.")
            break
        response = chat.send_message(user_input)
        print("AI:", response.text)
# Start the automated conversation
automated_conversation()
# Load an image and analyze it with a vision-capable model
image_path = "path/to/your/image.jpg"
try:
    image = Image.open(image_path)
    vision_model = genai.GenerativeModel('gemini-1.5-flash')
    # Define a prompt for the image
    prompt = "Describe the content of this image."
    # Send the image and prompt to the API
    response = vision_model.generate_content([prompt, image])
    # Print the response
    print("Image Analysis:")
    print(response.text)
except FileNotFoundError:
    print(f"Error: Image file not found at {image_path}")
except Exception as e:
    print(f"An error occurred: {e}")
This example sets up the environment, configures the API key, generates content based on a prompt, initiates an automated conversation, and demonstrates image analysis.
For long responses, you can stream the output:
response = model.generate_content(
    "Explain the theory of relativity in simple terms.",
    stream=True
)
for chunk in response:
    print(chunk.text, end="")
To count the tokens in a prompt, use the model's count_tokens method:
token_count = model.count_tokens("This is a sample prompt.")
print(f"Token Count: {token_count.total_tokens}")
Adjust safety settings to filter inappropriate content. Categories and thresholds use the API's enumeration names rather than numeric values:
response = model.generate_content(
    "Generate a story for children.",
    safety_settings={"HARM_CATEGORY_DANGEROUS_CONTENT": "BLOCK_MEDIUM_AND_ABOVE"}
)
print(response.text)
You can provide system instructions to guide the model's behavior, for example to respond in a specific tone or style. Pass them when constructing the model (supported on Gemini 1.5 and later models):
model = genai.GenerativeModel(
    'gemini-1.5-flash',
    system_instruction="You are a helpful and friendly AI assistant.")
chat = model.start_chat()
response = chat.send_message("Explain quantum computing in simple terms.")
print(response.text)
The library supports function calling: declare Python functions as tools when creating the model, and the model can invoke them to answer a question. Automatic function calling requires type annotations and a docstring on each tool:
def add_numbers(a: int, b: int) -> int:
    """Adds two numbers."""
    return a + b

model = genai.GenerativeModel('gemini-1.5-flash', tools=[add_numbers])
chat = model.start_chat(enable_automatic_function_calling=True)
response = chat.send_message("What is 2 + 2?")
print(response.text)
For applications requiring asynchronous operations, the library provides async counterparts such as generate_content_async, which you can drive with Python's asyncio library:
import asyncio

async def main():
    model = genai.GenerativeModel('gemini-pro')
    prompt = "What are the latest trends in AI?"
    response = await model.generate_content_async(prompt)
    print(response.text)

asyncio.run(main())
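Async calls also make it easy to fan several prompts out concurrently with asyncio.gather. In the sketch below, fake_generate is a hypothetical stand-in for the model's async call so the example runs without an API key; swap in the real call in your own code:

```python
import asyncio

async def fake_generate(prompt):
    """Hypothetical stand-in for model.generate_content_async."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"Response to: {prompt}"

async def run_all(prompts):
    # gather() runs the coroutines concurrently and preserves input order
    return await asyncio.gather(*(fake_generate(p) for p in prompts))

results = asyncio.run(run_all(["Summarize AI trends.", "Explain transformers."]))
for r in results:
    print(r)
```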
Always handle errors gracefully:
try:
    response = model.generate_content(
        "Generate a motivational quote.",
    )
    print(response.text)
except Exception as e:
    print(f"An error occurred: {e}")
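Transient network or rate-limit errors are common with hosted APIs, so retrying with exponential backoff is often worthwhile. This is a generic sketch; flaky_call is a hypothetical stand-in for an API call that fails twice before succeeding:

```python
import time

def with_retries(fn, max_attempts=3, base_delay=0.01):
    """Call fn(), retrying with exponential backoff on any exception."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: re-raise the last error
            time.sleep(base_delay * (2 ** attempt))

calls = {"count": 0}

def flaky_call():
    """Hypothetical API call that fails twice, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("transient error")
    return "Generated text"

print(with_retries(flaky_call))
```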
By following this guide, you can effectively set up and use the Google Generative AI library in your Python applications. This enables you to generate content, engage in automated conversations, analyze images, and more. Experiment with different models, prompts, and configurations to unlock the full potential of Google Generative AI. Happy coding!