Connect GCP VertexAI

This guide shows how to configure Envoy AI Gateway to work with the Gemini and Anthropic models available through GCP VertexAI.

Prerequisites

Before you begin, you'll need:

GCP credentials with access to GCP VertexAI
Basic setup completed from the Basic Usage guide
Basic configuration removed as described in the Advanced Configuration overview

GCP Credentials Setup

Ensure you have:

Your GCP project ID and name.
VertexAI API access enabled in your GCP project.
A GCP service account with a generated JSON key file.
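
The checklist above can be sketched with the gcloud CLI. This is a minimal sketch under stated assumptions, not the project's official procedure: the project ID `my-gcp-project`, the service account name `ai-gateway-vertex`, the output file `key.json`, and the choice of the `roles/aiplatform.user` role are all placeholders to adapt. The script only prints the commands by default and executes them when `DRY_RUN=0` is set.

```shell
# Assumed placeholder values; override them through the environment.
PROJECT_ID="${PROJECT_ID:-my-gcp-project}"
SA_NAME="${SA_NAME:-ai-gateway-vertex}"
SA_EMAIL="${SA_NAME}@${PROJECT_ID}.iam.gserviceaccount.com"

# Print each command by default; run it for real only when DRY_RUN=0.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_vertex_access() {
  # Enable the Vertex AI API in the project.
  run gcloud services enable aiplatform.googleapis.com --project "$PROJECT_ID"
  # Create the service account.
  run gcloud iam service-accounts create "$SA_NAME" --project "$PROJECT_ID"
  # Grant it the Vertex AI User role (an assumed choice of role).
  run gcloud projects add-iam-policy-binding "$PROJECT_ID" \
    --member "serviceAccount:${SA_EMAIL}" \
    --role roles/aiplatform.user
  # Generate the JSON key file.
  run gcloud iam service-accounts keys create key.json --iam-account "$SA_EMAIL"
}

setup_vertex_access
```

Review the printed commands first, then rerun with `DRY_RUN=0` once the values match your project.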

GCP Best Practices

For production environments, consider using GCP Workload Identity Federation with IAM roles and limited-scope credentials instead of long-lived service account keys.

Configuration Steps

1. Download configuration template

curl -O https://raw.githubusercontent.com/envoyproxy/ai-gateway/main/examples/basic/gcp_vertex.yaml

2. Configure GCP Credentials

Edit the gcp_vertex.yaml file to replace these placeholder values:

GCP_PROJECT_NAME: Your GCP project name
GCP_REGION: The GCP region to use
The service account key: replace the key JSON string in the secret with the one you generated
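
The two named placeholders can also be filled in from the command line. The helper below is a hypothetical sketch, not part of the project: it assumes the placeholders appear literally as the strings `GCP_PROJECT_NAME` and `GCP_REGION` in the file, and that the replacement values contain no `/` characters. It does not touch the service account key in the secret, which still has to be pasted in by hand.

```shell
# Hypothetical helper: replace the placeholders in a template file.
# Arguments: <file> <project-name> <region>
fill_placeholders() {
  file="$1"
  project="$2"
  region="$3"
  # Write to a temporary file first so this works with both GNU and BSD sed.
  sed -e "s/GCP_PROJECT_NAME/${project}/g" \
      -e "s/GCP_REGION/${region}/g" \
      "$file" > "${file}.tmp" && mv "${file}.tmp" "$file"
}

# Example call (the values are placeholders):
# fill_placeholders gcp_vertex.yaml my-gcp-project us-central1
```

Writing to a temporary file and moving it into place avoids the differences between the in-place editing flags of GNU and BSD sed.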

Security Note

Make sure to keep your GCP service account credentials secure and never commit them to version control. The credentials will be stored in Kubernetes secrets.

3. Apply Configuration

Apply the updated configuration and wait for the Gateway pod to be ready. If you already have a Gateway running, then the secret credential update will be picked up automatically in a few seconds.

kubectl apply -f gcp_vertex.yaml

kubectl wait pods --timeout=2m \
  -l gateway.envoyproxy.io/owning-gateway-name=envoy-ai-gateway-basic \
  -n envoy-gateway-system \
  --for=condition=Ready

4. Test the Configuration

You should have set $GATEWAY_URL as part of the basic setup before connecting to providers. See the Basic Usage page for instructions.
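
Because every request in this step depends on `$GATEWAY_URL`, it can help to fail early when the variable is missing. The function name below is an assumption made for illustration; it is not part of the gateway tooling.

```shell
# Hypothetical helper: stop with a clear message if GATEWAY_URL is unset or empty.
check_gateway_url() {
  if [ -z "${GATEWAY_URL:-}" ]; then
    echo "GATEWAY_URL is not set; see the Basic Usage page" >&2
    return 1
  fi
  echo "Using gateway at ${GATEWAY_URL}"
}

# Report the result without aborting the current shell session.
check_gateway_url || echo "Set GATEWAY_URL before sending test requests." >&2
```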

To access a Gemini model through the chat completions endpoint:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-flash",
    "messages": [
      {
        "role": "user",
        "content": "Hi."
      }
    ]
  }' \
  $GATEWAY_URL/v1/chat/completions

To access an Anthropic model through the chat completions endpoint:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet@20250219",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "max_completion_tokens": 100
  }' \
  $GATEWAY_URL/v1/chat/completions

Expected output (the exact wording and token counts will vary):

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "The capital of France is Paris. Paris is not only the capital city but also the largest city in France, known for its cultural significance, historic landmarks like the Eiffel Tower and the Louvre Museum, and its influence in fashion, art, and cuisine.",
        "role": "assistant"
      }
    }
  ],
  "object": "chat.completion",
  "usage": { "completion_tokens": 58, "prompt_tokens": 13, "total_tokens": 71 }
}

You can also access an Anthropic model through the native Anthropic messages endpoint:

curl -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet@20250219",
    "messages": [
      {
        "role": "user",
        "content": "What is the capital of France?"
      }
    ],
    "max_tokens": 100
  }' \
  $GATEWAY_URL/anthropic/v1/messages

Troubleshooting

If you encounter issues:

Verify your GCP credentials are correct and active

Check pod status:

kubectl get pods -n envoy-gateway-system

View controller logs:

kubectl logs -n envoy-ai-gateway-system deployment/ai-gateway-controller

Common errors:
- 401/403: Invalid credentials or insufficient permissions
- 404: Model not found or not available in the configured region
- 429: Rate limit exceeded
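
To see which of these errors a request produces, one option is to capture only the HTTP status code with curl's `-w '%{http_code}'` option and map it to the hints above. This is a minimal sketch; the `explain_status` function is a hypothetical helper, and the request is only sent when `GATEWAY_URL` is set.

```shell
# Hypothetical helper: translate an HTTP status code into a troubleshooting hint.
explain_status() {
  case "$1" in
    200)     echo "OK" ;;
    401|403) echo "Invalid credentials or insufficient permissions" ;;
    404)     echo "Model not found or not available in the region" ;;
    429)     echo "Rate limit exceeded" ;;
    *)       echo "Unexpected status: $1" ;;
  esac
}

# Send a small test request and explain the resulting status code.
if [ -n "${GATEWAY_URL:-}" ]; then
  status="$(curl -s -o /dev/null -w '%{http_code}' \
    -H "Content-Type: application/json" \
    -d '{"model": "gemini-2.5-flash", "messages": [{"role": "user", "content": "Hi."}]}' \
    "$GATEWAY_URL/v1/chat/completions")"
  explain_status "$status"
fi
```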

Configuring More Models

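To use more models, add more AIGatewayRouteRules to the gcp_vertex.yaml file, with the model ID in the value field of the x-ai-eg-model header match. For example, the following route matches the model ID given in its value field and sends matching requests to the GCP backend; replace the value with the ID of the model you want to use: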

apiVersion: aigateway.envoyproxy.io/v1alpha1
kind: AIGatewayRoute
metadata:
  name: envoy-ai-gateway-basic-gcp-gemini
  namespace: default
spec:
  schema:
    name: OpenAI
  parentRefs:
    - name: envoy-ai-gateway-basic
      kind: Gateway
      group: gateway.networking.k8s.io
  rules:
    - matches:
        - headers:
            - type: Exact
              name: x-ai-eg-model
              value: gemini-2.5-flash-pro
      backendRefs:
        - name: envoy-ai-gateway-basic-gcp
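
After applying a new rule, each routed model can be checked with the same chat completion request used earlier. The sketch below assumes a hypothetical `chat_payload` helper that builds the request body; the model IDs in the loop are the ones that appear in this guide and should be replaced with the IDs in your own rules. The requests are only sent when `GATEWAY_URL` is set.

```shell
# Hypothetical helper: build a chat completion request body for a model ID.
# The arguments are inserted as-is, so they must not contain double quotes
# or backslashes.
chat_payload() {
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}]}\n' "$1" "$2"
}

# Send the same prompt to each model that has a routing rule.
if [ -n "${GATEWAY_URL:-}" ]; then
  for model in gemini-2.5-flash gemini-2.5-flash-pro; do
    curl -H "Content-Type: application/json" \
      -d "$(chat_payload "$model" "Hi.")" \
      "$GATEWAY_URL/v1/chat/completions"
  done
fi
```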
