Working with Google Gemini¶

DCAF supports Google's Gemini models through Vertex AI, providing zero-configuration deployment on Google Cloud Platform.

Overview¶

Google Gemini offers: - Gemini 3: Latest generation with advanced reasoning - Gemini 2.x: High-performance models with thinking budgets - Gemini 1.5: Large context windows and efficient inference - Vertex AI: Enterprise integration through Google Cloud Platform

Installation¶

Install the required Google AI dependencies:

pip install google-generativeai google-auth

# Or install DCAF with Gemini support
pip install dcaf[gemini]

Configuration¶

Zero Configuration on GCP (Recommended)¶

When running on GCP (GKE, GCE, Cloud Run), DCAF automatically detects your project and location:

from dcaf.core import Agent

# That's it! Project/location auto-detected on GCP
agent = Agent(
    provider="google",
    model="gemini-2.5-pro",
    system_prompt="You are a helpful assistant."
)

Environment Variables (Optional)¶

Override auto-detected values if needed:

export GOOGLE_CLOUD_PROJECT="your-project-id"
export DCAF_GOOGLE_MODEL_LOCATION="us-central1"  # Optional, defaults to us-central1

Quick Start¶

Basic Gemini Agent¶

from dcaf.core import Agent

agent = Agent(
    provider="google",
    model="gemini-2.5-pro",
    system_prompt="You are a helpful assistant."
)

response = agent.run([
    {"role": "user", "content": "What's the capital of France?"}
])

print(response.text)

Available Gemini Models¶

Gemini 3 (Latest)¶

gemini-3-pro-preview - Most capable model with advanced reasoning

agent = Agent(provider="google", model="gemini-3-pro-preview")

gemini-3-flash - Fast inference with strong reasoning

agent = Agent(provider="google", model="gemini-3-flash")

Gemini 2.x¶

gemini-2.5-flash - Fast model with thinking support

agent = Agent(provider="google", model="gemini-2.5-flash")

gemini-2.5-pro - More capable, supports thinking budget

agent = Agent(provider="google", model="gemini-2.5-pro")

gemini-2.0-flash - Previous generation flash

agent = Agent(provider="google", model="gemini-2.0-flash")

Gemini 1.5¶

gemini-1.5-flash - Lightweight, fast responses

agent = Agent(provider="google", model="gemini-1.5-flash")

gemini-1.5-pro - Large context window (2M tokens)

agent = Agent(provider="google", model="gemini-1.5-pro")

Model Configuration¶

Temperature and Max Tokens¶

Control generation behavior:

agent = Agent(
    provider="google",
    model="gemini-3-flash",
    model_config={
        "temperature": 0.7,      # 0.0 to 1.0 (default: 0.1)
        "max_tokens": 8192,      # Maximum output tokens
    }
)

Advanced Model Configuration¶

Pass additional Gemini-specific parameters:

agent = Agent(
    provider="google",
    model="gemini-3-pro-preview",
    model_config={
        "thinking_level": "high",     # "low" or "high" (Gemini 3 only)
        "top_p": 0.9,
        "top_k": 40,
    }
)

Note: Gemini 3 models use thinking_level, while Gemini 2.5 models use thinking_budget. See Agno's documentation for model-specific parameters.

Using Tools with Gemini¶

Gemini excels at tool use and function calling:

from dcaf.core import Agent
from dcaf.tools import tool
import os

@tool(description="Search for current information")
def search(query: str) -> str:
    """Search the web for information."""
    # Your search implementation
    return f"Results for: {query}"

@tool(description="Get weather information")
def get_weather(city: str) -> str:
    """Get current weather for a city."""
    return f"Weather in {city}: Sunny, 72°F"

agent = Agent(
    provider="google",
    model="gemini-2.5-flash",
    tools=[search, get_weather],
    system_prompt="You are a helpful assistant with access to search and weather tools."
)

response = agent.run([
    {"role": "user", "content": "What's the weather in Paris and any recent news?"}
])

print(response.text)

Streaming Responses¶

Use streaming for real-time token generation:

from dcaf.core import Agent

agent = Agent(
    provider="google",
    model="gemini-3-flash",
)

for event in agent.stream([
    {"role": "user", "content": "Write a short story about AI."}
]):
    if event.type == "text_delta":
        print(event.data.text, end="", flush=True)
    elif event.type == "complete":
        print("\n\nDone!")

REST Server with Gemini¶

Expose a Gemini agent as a REST API:

from dcaf.core import Agent, serve
from dcaf.tools import tool
import os

@tool(description="Analyze code for issues")
def analyze_code(code: str, language: str) -> str:
    """Analyze code and return suggestions."""
    return f"Analyzing {language} code..."

agent = Agent(
    name="code-reviewer",
    description="AI code review assistant",
    provider="google",
    model="gemini-3-flash",
    tools=[analyze_code]
)

# Start server with A2A support
serve(agent, port=8000, a2a=True)

Access via: - HTTP: POST http://localhost:8000/api/chat - A2A: GET http://localhost:8000/.well-known/agent.json

Multi-Agent Systems with Gemini¶

Use Gemini in multi-agent architectures:

from dcaf.core import Agent
from dcaf.core.a2a import RemoteAgent
import os

# Specialist agent using Gemini
research_agent = Agent(
    name="researcher",
    provider="google",
    model="gemini-2.5-flash",
    tools=[web_search],
    system_prompt="You are a research specialist. Gather information from the web."
)

# Orchestrator using Claude on Bedrock
orchestrator = Agent(
    name="orchestrator",
    provider="bedrock",
    model="anthropic.claude-3-sonnet-20240229-v1:0",
    aws_profile="my-profile",
    tools=[research_agent.as_tool()],  # Gemini agent as a tool
    system_prompt="Route research tasks to the specialist."
)

response = orchestrator.run([
    {"role": "user", "content": "Research the latest AI developments"}
])

Vertex AI (Default)¶

The Google provider always uses Vertex AI. Project and location are auto-detected on GCP:

from dcaf.core import Agent

# On GCP - this is all you need!
agent = Agent(
    provider="google",
    model="gemini-2.5-pro",
)

How Auto-Detection Works¶

DCAF automatically detects your GCP environment:

google.auth.default(): Gets project ID from ADC (works with Workload Identity)
Metadata service: Falls back to http://metadata.google.internal/ for project/zone
Default location: Uses us-central1 if location can't be detected

Explicit Configuration¶

Override auto-detected values if needed:

agent = Agent(
    provider="google",
    model="gemini-2.5-pro",
    google_project_id="my-project",      # Explicit project
    google_location="europe-west1",       # Explicit region
)

Or via environment variables:

export GOOGLE_CLOUD_PROJECT="my-project"
export GOOGLE_CLOUD_LOCATION="us-central1"

Requirements¶

GCP project with Vertex AI API enabled
Application Default Credentials configured:
Local dev: gcloud auth application-default login
GKE: Workload Identity with appropriate IAM bindings
GCE/Cloud Run: Attached service account
IAM role: roles/aiplatform.user on the service account

Model Selection Guide¶

Model	Best For	Context	Speed	Cost
gemini-3-pro-preview	Complex reasoning, multi-step tasks	Large	Slow	High
gemini-3-flash	General-purpose, balanced performance	Large	Fast	Low
gemini-2.5-pro	Advanced capabilities, thinking	Large	Medium	Medium
gemini-2.5-flash	Fast inference, good reasoning	Large	Very Fast	Low
gemini-1.5-pro	Huge context (2M tokens)	Massive	Medium	Medium
gemini-1.5-flash	Quick tasks, simple queries	Large	Very Fast	Very Low

Error Handling¶

Handle Gemini-specific errors:

from dcaf.core import Agent

agent = Agent(
    provider="google",
    model="gemini-3-flash",
)

try:
    response = agent.run([
        {"role": "user", "content": "Hello!"}
    ])
    print(response.text)
except ImportError as e:
    print("Google AI package not installed:")
    print("  pip install google-generativeai google-auth")
except ValueError as e:
    print(f"Configuration error: {e}")
    print("Ensure GOOGLE_CLOUD_PROJECT and GOOGLE_CLOUD_LOCATION are set")
except Exception as e:
    print(f"Error: {e}")

Best Practices¶

1. Choose the Right Model¶

# For production - use flash models for speed and cost
production_agent = Agent(
    provider="google",
    model="gemini-3-flash",  # Fast, cost-effective
)

# For complex reasoning - use pro models
research_agent = Agent(
    provider="google",
    model="gemini-3-pro-preview",  # Advanced reasoning
)

2. Monitor Token Usage¶

Gemini models have different context windows and pricing:

agent = Agent(
    provider="google",
    model="gemini-3-flash",
    model_config={
        "max_tokens": 2048,  # Limit output to control costs
    }
)

3. Test with Flash, Deploy with Pro¶

import os

# Development/testing
if os.getenv("ENV") == "development":
    model = "gemini-3-flash"
else:
    model = "gemini-3-pro-preview"

agent = Agent(
    provider="google",
    model=model,
)

Comparison: Gemini vs Claude vs GPT¶

Feature	Gemini	Claude (Bedrock)	GPT-4
Tool Use	Excellent	Excellent	Good
Reasoning	Strong (G3)	Excellent	Strong
Speed	Very Fast (Flash)	Fast	Medium
Context	2M (1.5 Pro)	200K	128K
Cost	Low (Flash)	Medium	High
Deployment	Direct API or Vertex	AWS Bedrock	OpenAI or Azure

Troubleshooting¶

Project or Location Not Found¶

# Check if environment variables are set
echo $GOOGLE_CLOUD_PROJECT
echo $GOOGLE_CLOUD_LOCATION

# Set them if not on GCP (for local development)
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"

# Or use gcloud to set up ADC
gcloud auth application-default login

Import Error¶

# Install the Google AI packages
pip install google-generativeai google-auth

# Or upgrade if already installed
pip install --upgrade google-generativeai

Rate Limiting¶

Gemini API has rate limits. Handle gracefully:

import time
from dcaf.core import Agent

agent = Agent(provider="google", model="gemini-3-flash")

for i in range(10):
    try:
        response = agent.run([{"role": "user", "content": f"Request {i}"}])
        print(response.text)
    except Exception as e:
        if "429" in str(e) or "rate limit" in str(e).lower():
            print("Rate limited, waiting...")
            time.sleep(60)
        else:
            raise

Examples¶

Complete examples available in the repository:

examples/gemini_basic.py - Basic Gemini usage
examples/gemini_tools.py - Tool use with Gemini
examples/gemini_multi_agent.py - Multi-agent with Gemini

Resources¶

Support¶

For issues with Gemini in DCAF:

Check this guide
Review Agno's Gemini docs
Verify ADC is configured: gcloud auth application-default login
Check Vertex AI quotas in GCP Console
Open an issue on GitHub with logs

Next Steps: - Building Tools - Multi-Agent Systems - Working with Bedrock