BedrockLLM API Reference (Legacy)¶

Legacy API

This documents the v1 API. For new projects, use the Core API which handles LLM configuration internally.

See Migration Guide to upgrade existing code.

The BedrockLLM class provides a unified interface for interacting with AWS Bedrock models using the Converse API. It handles message formatting, tool configuration, and response processing.

Overview¶

BedrockLLM wraps the AWS Bedrock Converse API, providing:

Consistent interface across all Bedrock-supported models
Automatic message normalization (role alternation)
Tool schema formatting
Streaming support
Configurable timeouts and retries

Import¶

from dcaf.llm import BedrockLLM

# Or from the base module
from dcaf import BedrockLLM

Class: BedrockLLM¶

class BedrockLLM(LLM):
    """
    A class for interacting with AWS Bedrock LLMs using the Converse API.
    Provides consistent interface across all Bedrock models.
    """

Constructor¶

def __init__(
    self,
    region_name: str = 'us-east-1',
    boto3_config: Optional[Config] = None
)

Parameters¶

Parameter	Type	Default	Description
`region_name`	`str`	`'us-east-1'`	AWS region for Bedrock service
`boto3_config`	`Optional[Config]`	`None`	Custom boto3 configuration

Configuration Priority¶

Explicit boto3_config - Full control, overrides everything
Environment Variables - Deployment-time configuration
Defaults - Sensible out-of-the-box settings

Environment Variables¶

When boto3_config=None, these environment variables are used:

Variable	Default	Description
`BOTO3_READ_TIMEOUT`	`20`	Read timeout in seconds
`BOTO3_CONNECT_TIMEOUT`	`10`	Connect timeout in seconds
`BOTO3_MAX_ATTEMPTS`	`3`	Maximum retry attempts
`BOTO3_RETRY_MODE`	`standard`	Retry mode (`standard`, `adaptive`, `legacy`)

Examples¶

# 1. Using defaults
llm = BedrockLLM()

# 2. Specify region
llm = BedrockLLM(region_name="us-west-2")

# 3. Using environment variables
# export BOTO3_READ_TIMEOUT=30
# export BOTO3_MAX_ATTEMPTS=5
llm = BedrockLLM()  # Picks up env vars

# 4. Custom boto3 config (takes precedence)
from botocore.config import Config

custom_config = Config(
    read_timeout=60,
    connect_timeout=15,
    retries={
        'max_attempts': 5,
        'mode': 'adaptive'
    }
)
llm = BedrockLLM(boto3_config=custom_config)

Methods¶

invoke()¶

Invoke the LLM and get a complete response.

def invoke(
    self,
    messages: List[Dict[str, Any]],
    model_id: str,
    max_tokens: int = 1000,
    temperature: float = 0.0,
    top_p: float = 0.9,
    system_prompt: Optional[str] = None,
    tools: Optional[List[Dict[str, Any]]] = None,
    tool_choice: Optional[Union[str, Dict[str, Any]]] = None,
    additional_model_request_fields: Optional[Dict[str, Any]] = None,
    performance_config: Optional[Dict[str, str]] = None,
    **kwargs
) -> Dict[str, Any]

Parameters¶

Parameter	Type	Required	Description
`messages`	`List[Dict]`	Yes	Conversation messages
`model_id`	`str`	Yes	Bedrock model ID
`max_tokens`	`int`	No	Maximum tokens to generate (default: 1000)
`temperature`	`float`	No	Sampling temperature 0-1 (default: 0.0)
`top_p`	`float`	No	Nucleus sampling parameter (default: 0.9)
`system_prompt`	`str`	No	System prompt for model behavior
`tools`	`List[Dict]`	No	Tool specifications
`tool_choice`	`str\|Dict`	No	Tool choice strategy
`additional_model_request_fields`	`Dict`	No	Model-specific parameters
`performance_config`	`Dict`	No	Performance settings

Returns¶

Returns the raw Converse API response:

{
    "output": {
        "message": {
            "role": "assistant",
            "content": [
                {"text": "Response text..."},
                # Or tool use blocks
                {
                    "toolUse": {
                        "toolUseId": "unique-id",
                        "name": "tool_name",
                        "input": {...}
                    }
                }
            ]
        }
    },
    "stopReason": "end_turn|tool_use|max_tokens",
    "usage": {
        "inputTokens": 100,
        "outputTokens": 50
    }
}

Example¶

# Simple invocation
response = llm.invoke(
    messages=[
        {"role": "user", "content": "Explain quantum computing in simple terms"}
    ],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=500,
    temperature=0.7
)

# Extract text response
text = response['output']['message']['content'][0]['text']
print(text)

invoke_stream()¶

Stream responses from the LLM.

def invoke_stream(
    self,
    messages: list,
    model_id: str,
    system_prompt: Optional[str] = None,
    tools: Optional[list] = None,
    max_tokens: int = 1000,
    temperature: float = 0.0,
    additional_params: Optional[Dict[str, Any]] = None
) -> Generator[Dict, None, None]

Parameters¶

Parameter	Type	Required	Description
`messages`	`list`	Yes	Messages in Converse format
`model_id`	`str`	Yes	Bedrock model ID
`system_prompt`	`str`	No	System prompt
`tools`	`list`	No	Tool configurations
`max_tokens`	`int`	No	Maximum tokens (default: 1000)
`temperature`	`float`	No	Temperature (default: 0.0)
`additional_params`	`Dict`	No	Additional inference config

Yields¶

Raw event dictionaries from the Bedrock stream:

# Text delta event
{"contentBlockDelta": {"delta": {"text": "Hello"}}}

# Content block start
{"contentBlockStart": {...}}

# Message complete
{"messageStop": {"stopReason": "end_turn"}}

Example¶

# Streaming response
for event in llm.invoke_stream(
    messages=[{"role": "user", "content": "Tell me a story"}],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=500
):
    if "contentBlockDelta" in event:
        delta = event["contentBlockDelta"].get("delta", {})
        if "text" in delta:
            print(delta["text"], end="", flush=True)

normalize_message_roles()¶

Normalize messages to ensure proper role alternation.

def normalize_message_roles(
    self,
    messages: List[Dict[str, Any]]
) -> List[Dict[str, Any]]

Purpose¶

Bedrock's Converse API requires strict role alternation between 'user' and 'assistant'. This method:

Merges consecutive messages with the same role
Removes empty messages
Handles both string and list content formats

Example¶

# Input: Consecutive user messages
messages = [
    {"role": "user", "content": "Hello"},
    {"role": "user", "content": "How are you?"},  # Same role!
    {"role": "assistant", "content": "Hi there!"},
    {"role": "user", "content": "Great!"}
]

# After normalization
normalized = llm.normalize_message_roles(messages)
# [
#     {"role": "user", "content": "Hello\nHow are you?"},  # Merged
#     {"role": "assistant", "content": "Hi there!"},
#     {"role": "user", "content": "Great!"}
# ]

Message Formats¶

Simple String Content¶

messages = [
    {"role": "user", "content": "Hello, how are you?"},
    {"role": "assistant", "content": "I'm doing well, thanks!"},
    {"role": "user", "content": "Great to hear!"}
]

List Content (Multiple Blocks)¶

messages = [
    {
        "role": "user",
        "content": [
            {"text": "Look at this image:"},
            {"image": {"format": "png", "source": {"bytes": b"..."}}}
        ]
    }
]

Tool Results¶

messages = [
    {"role": "user", "content": "What's the weather in NYC?"},
    {
        "role": "assistant",
        "content": [
            {
                "toolUse": {
                    "toolUseId": "tool123",
                    "name": "get_weather",
                    "input": {"location": "New York, NY"}
                }
            }
        ]
    },
    {
        "role": "user",
        "content": [
            {
                "toolResult": {
                    "toolUseId": "tool123",
                    "content": [{"text": "72°F, sunny"}]
                }
            }
        ]
    }
]

Tool Configuration¶

Tool Schema Format¶

tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a location",
        "input_schema": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "City and state, e.g., San Francisco, CA"
                },
                "unit": {
                    "type": "string",
                    "enum": ["celsius", "fahrenheit"],
                    "description": "Temperature unit"
                }
            },
            "required": ["location"]
        }
    }
]

Tool Choice Options¶

# Auto (default) - Model decides whether to use tools
response = llm.invoke(messages=..., tools=tools, tool_choice="auto")

# Any - Model must use at least one tool
response = llm.invoke(messages=..., tools=tools, tool_choice="any")

# Specific tool - Force use of a specific tool
response = llm.invoke(
    messages=..., 
    tools=tools, 
    tool_choice={"type": "tool", "name": "get_weather"}
)

Complete Tool Example¶

from dcaf.llm import BedrockLLM

llm = BedrockLLM()

# Define tools
tools = [
    {
        "name": "get_stock_price",
        "description": "Get the current stock price for a ticker symbol",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Stock ticker symbol (e.g., AAPL)"
                }
            },
            "required": ["ticker"]
        }
    }
]

# Invoke with tools
response = llm.invoke(
    messages=[{"role": "user", "content": "What's Apple's stock price?"}],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    system_prompt="You are a financial assistant.",
    tools=tools
)

# Check if model wants to use a tool
content = response['output']['message']['content']
for block in content:
    if 'toolUse' in block:
        tool_use = block['toolUse']
        print(f"Tool: {tool_use['name']}")
        print(f"Input: {tool_use['input']}")

Configuration Options¶

Performance Configuration¶

# Optimize for latency
response = llm.invoke(
    messages=...,
    model_id=...,
    performance_config={"latency": "optimized"}
)

Additional Model Fields¶

For model-specific parameters:

# Anthropic-specific parameters
response = llm.invoke(
    messages=...,
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    additional_model_request_fields={
        "anthropic_version": "bedrock-2023-05-31"
    }
)

Examples¶

Basic Chat¶

from dcaf.llm import BedrockLLM

llm = BedrockLLM(region_name="us-east-1")

response = llm.invoke(
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=200,
    temperature=0
)

# Extract response text
text = response['output']['message']['content'][0]['text']
print(text)  # "The capital of France is Paris."

Multi-Turn Conversation¶

conversation = [
    {"role": "user", "content": "My name is Alice."},
    {"role": "assistant", "content": "Nice to meet you, Alice!"},
    {"role": "user", "content": "What is my name?"}
]

response = llm.invoke(
    messages=conversation,
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    system_prompt="You are a helpful assistant with a good memory."
)

# "Your name is Alice!"

With System Prompt¶

response = llm.invoke(
    messages=[{"role": "user", "content": "Explain photosynthesis"}],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    system_prompt="""You are a biology teacher for 5th graders. 
    Explain concepts using simple language and fun analogies.""",
    max_tokens=500
)

Streaming Chat¶

import sys

# Stream a long response
for event in llm.invoke_stream(
    messages=[{"role": "user", "content": "Write a haiku about coding"}],
    model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    max_tokens=100
):
    if "contentBlockDelta" in event:
        text = event["contentBlockDelta"].get("delta", {}).get("text", "")
        sys.stdout.write(text)
        sys.stdout.flush()

Error Handling¶

Common Exceptions¶

from botocore.exceptions import ClientError

try:
    response = llm.invoke(
        messages=[{"role": "user", "content": "Hello"}],
        model_id="us.anthropic.claude-3-5-sonnet-20240620-v1:0"
    )
except ClientError as e:
    error_code = e.response['Error']['Code']

    if error_code == 'ExpiredTokenException':
        print("AWS credentials expired. Refresh them.")
    elif error_code == 'ResourceNotFoundException':
        print("Model not found. Check model ID and region.")
    elif error_code == 'ThrottlingException':
        print("Rate limited. Implement backoff.")
    elif error_code == 'ValidationException':
        print("Invalid request. Check message format.")
    else:
        print(f"Error: {e}")

Retry Configuration¶

from botocore.config import Config

# Aggressive retry for production
config = Config(
    retries={
        'max_attempts': 10,
        'mode': 'adaptive'  # Smart backoff
    }
)

llm = BedrockLLM(boto3_config=config)

Model IDs¶

Common Bedrock Model IDs¶

Model	ID
Claude 3.5 Sonnet	`us.anthropic.claude-3-5-sonnet-20240620-v1:0`
Claude 3 Sonnet	`us.anthropic.claude-3-sonnet-20240229-v1:0`
Claude 3 Haiku	`us.anthropic.claude-3-5-haiku-20241022-v1:0`
Claude 3 Opus	`us.anthropic.claude-opus-4-20250514-v1:0`

Cross-Region Inference¶

Use the us. prefix for cross-region inference profiles:

# Cross-region (recommended)
model_id = "us.anthropic.claude-3-5-sonnet-20240620-v1:0"

# Single region
model_id = "anthropic.claude-3-5-sonnet-20240620-v1:0"

BedrockLLM API Reference (Legacy)¶

Table of Contents¶

Overview¶

Import¶

Class: BedrockLLM¶

Constructor¶

Parameters¶

Configuration Priority¶

Environment Variables¶

Examples¶

Methods¶

invoke()¶

Parameters¶

Returns¶

Example¶

invoke_stream()¶

Parameters¶

Yields¶

Example¶

normalize_message_roles()¶

Purpose¶

Example¶

Message Formats¶

Simple String Content¶

List Content (Multiple Blocks)¶

Tool Results¶

Tool Configuration¶

Tool Schema Format¶

Tool Choice Options¶

Complete Tool Example¶

Configuration Options¶

Performance Configuration¶

Additional Model Fields¶

Examples¶

Basic Chat¶

Multi-Turn Conversation¶

With System Prompt¶

Streaming Chat¶

Error Handling¶

Common Exceptions¶

Retry Configuration¶

Model IDs¶

Common Bedrock Model IDs¶

Cross-Region Inference¶

See Also¶