Skip to main content
Version: latest

Connect AWS Bedrock

This guide will help you configure Envoy AI Gateway to work with AWS Bedrock's foundation models, including Llama, Anthropic Claude, and other models available on AWS Bedrock.

Prerequisites

Authentication Methods

Envoy AI Gateway supports the AWS SDK default credential chain, which includes:

  1. EKS Pod Identity - Recommended for production on EKS (v1.24+)
  2. IRSA (IAM Roles for Service Accounts) - Recommended for production on EKS (all versions)
  3. Static Credentials - For development, testing, or non-EKS environments

Setup Instructions

EKS Pod Identity is the simplest way to grant AWS permissions to your pods without managing static credentials.

Step 1: Set up AWS IAM

Follow the official AWS documentation to configure EKS Pod Identity:

Your IAM policy needs these permissions:

{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": [
"bedrock:InvokeModel",
"bedrock:InvokeModelWithResponseStream",
"bedrock:ListFoundationModels",
"aws-marketplace:ViewSubscriptions"
],
"Resource": "*"
}
]
}

The Pod Identity association should reference:

  • Namespace: envoy-gateway-system
  • ServiceAccount: ai-gateway-dataplane-aws

Step 2: Apply AI Gateway configuration

kubectl apply -f https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/aws-pod-identity.yaml

kubectl wait pods --timeout=2m \
-l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
-n envoy-gateway-system --for=condition=Ready

Step 3: Test

export GATEWAY_URL=$(kubectl get gateway envoy-ai-gateway-basic -n default -o jsonpath='{.status.addresses[0].value}')

curl -H "Content-Type: application/json" -d '{
"model": "us.meta.llama3-2-1b-instruct-v1:0",
"messages": [{"role": "user", "content": "Hello!"}]
}' http://$GATEWAY_URL/v1/chat/completions

Option 2: IRSA (IAM Roles for Service Accounts)

IRSA works on all EKS versions and uses OIDC federation for authentication.

Step 1: Set up AWS IAM

Follow the official AWS documentation to configure IRSA:

Your IAM policy needs the same Bedrock permissions as Pod Identity above.

The trust policy should allow the ServiceAccount system:serviceaccount:envoy-gateway-system:ai-gateway-dataplane-aws.

Step 2: Download and configure

curl -O https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/aws-irsa.yaml

Edit aws-irsa.yaml and replace ACCOUNT_ID with your AWS account ID in the ServiceAccount annotation:

eks.amazonaws.com/role-arn: arn:aws:iam::YOUR_ACCOUNT_ID:role/AIGatewayBedrockRole

Step 3: Apply and test

kubectl apply -f aws-irsa.yaml

kubectl wait pods --timeout=2m \
-l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
-n envoy-gateway-system --for=condition=Ready

# Test (use same curl command as Pod Identity above)
export GATEWAY_URL=$(kubectl get gateway envoy-ai-gateway-basic -n default -o jsonpath='{.status.addresses[0].value}')
curl -H "Content-Type: application/json" -d '{
"model": "us.meta.llama3-2-1b-instruct-v1:0",
"messages": [{"role": "user", "content": "Hello!"}]
}' http://$GATEWAY_URL/v1/chat/completions

Option 3: Static Credentials

Use AWS access key ID and secret for authentication. Suitable for development, testing, or non-EKS environments.

Production Warning

Static credentials are not recommended for production. Use EKS Pod Identity or IRSA instead.

Step 1: Download and configure

curl -O https://raw.githubusercontent.com/envoyproxy/ai-gateway/{vars.aigwGitRef}/examples/basic/aws.yaml

Edit aws.yaml and replace the credential placeholders:

  • AWS_ACCESS_KEY_ID: Your AWS access key ID
  • AWS_SECRET_ACCESS_KEY: Your AWS secret access key

Step 2: Apply and test

kubectl apply -f aws.yaml

kubectl wait pods --timeout=2m \
-l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
-n envoy-gateway-system --for=condition=Ready

# Test (use same curl command as above)
export GATEWAY_URL=$(kubectl get gateway envoy-ai-gateway-basic -n default -o jsonpath='{.status.addresses[0].value}')
curl -H "Content-Type: application/json" -d '{
"model": "us.meta.llama3-2-1b-instruct-v1:0",
"messages": [{"role": "user", "content": "Hello!"}]
}' http://$GATEWAY_URL/v1/chat/completions

You can also access an Anthropic model with native Anthropic messages endpoint:

curl -H "Content-Type: application/json" \
-d '{
"model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": "What is capital of France?"
}
],
"max_tokens": 100
}' \
$GATEWAY_URL/anthropic/v1/messages

Expected output:

{
"id": "msg_01XFDUDYJgAACzvnptvVoYEL",
"type": "message",
"role": "assistant",
"content": [
{
"type": "text",
"text": "The capital of France is Paris."
}
],
"model": "claude-3-5-sonnet-20241022",
"stop_reason": "end_turn",
"usage": {
"input_tokens": 13,
"output_tokens": 8
}
}

Troubleshooting

If you encounter issues:

  1. Verify authentication is configured correctly

    • For EKS Pod Identity: Check Pod Identity association exists
      aws eks list-pod-identity-associations --cluster-name YOUR_CLUSTER_NAME
    • For IRSA: Check ServiceAccount annotation
      kubectl get sa ai-gateway-dataplane-aws -n envoy-gateway-system -o yaml
    • For Static Credentials: Verify secret exists
      kubectl get secret -n default
  2. Check pod logs

    POD_NAME=$(kubectl get pod -n envoy-gateway-system \
    -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
    -o jsonpath='{.items[0].metadata.name}')
    kubectl logs -n envoy-gateway-system $POD_NAME -c extproc
  3. Common errors

    • 401/403: Invalid credentials or insufficient IAM permissions
    • 404: Model not found or not available in the specified region
    • AssumeRole errors: Check IAM role trust policy

For more details on AWS authentication, see:

Configuring More Models

To use additional models, add more AIGatewayRouteRules to your configuration with the model ID in the value field. For example, to use Claude 3 Sonnet:

apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
name: envoy-ai-gateway-basic-aws
namespace: default
spec:
parentRefs:
- name: envoy-ai-gateway-basic
kind: Gateway
group: gateway.networking.k8s.io
rules:
- matches:
- headers:
- type: Exact
name: x-ai-eg-model
value: anthropic.claude-3-sonnet-20240229-v1:0
backendRefs:
- name: envoy-ai-gateway-basic-aws

Using Anthropic Native API

When using Anthropic models on AWS Bedrock, you have two options:

  1. OpenAI-compatible format (/v1/chat/completions) - Works with most models but may not support all Anthropic-specific features
  2. Native Anthropic API (/anthropic/v1/messages) - Provides full access to Anthropic-specific features (only for Anthropic models)

Streaming with Native Anthropic API

The native Anthropic API also supports streaming responses:

curl -H "Content-Type: application/json" \
-d '{
"model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": "Count from 1 to 5."
}
],
"max_tokens": 100,
"stream": true
}' \
$GATEWAY_URL/anthropic/v1/messages

Advanced Features with Anthropic Models

Since the gateway supports the native Anthropic API, you have full access to Anthropic-specific features:

Extended Thinking

curl -H "Content-Type: application/json" \
-d '{
"model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": "Solve this puzzle: A farmer needs to cross a river with a fox, chicken, and bag of grain. The boat can only hold the farmer and one item. How does the farmer get everything across safely?"
}
],
"max_tokens": 1000,
"thinking": {
"type": "enabled",
"budget_tokens": 5000
}
}' \
$GATEWAY_URL/anthropic/v1/messages

Prompt Caching

curl -H "Content-Type: application/json" \
-d '{
"model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
"system": [
{
"type": "text",
"text": "You are an AI assistant specialized in Python programming. You help users write clean, efficient Python code.",
"cache_control": {"type": "ephemeral"}
}
],
"messages": [
{
"role": "user",
"content": "Write a function to calculate fibonacci numbers."
}
],
"max_tokens": 500
}' \
$GATEWAY_URL/anthropic/v1/messages

Tool Use (Function Calling)

curl -H "Content-Type: application/json" \
-d '{
"model": "anthropic.claude-3-5-sonnet-20241022-v2:0",
"messages": [
{
"role": "user",
"content": "What is the weather in San Francisco?"
}
],
"max_tokens": 500,
"tools": [
{
"name": "get_weather",
"description": "Get the current weather in a given location",
"input_schema": {
"type": "object",
"properties": {
"location": {
"type": "string",
"description": "The city and state, e.g. San Francisco, CA"
}
},
"required": ["location"]
}
}
]
}' \
$GATEWAY_URL/anthropic/v1/messages