GCP Containerized (Cloud Run)

Deploy Agent Kernel agents as always-on containerized services on GCP Cloud Run with min_instance_count ≥ 1. This deployment uses the ak-deployment/ak-gcp/containerized Terraform module.

Overview

| Component | GCP Service |
| --- | --- |
| Compute | Cloud Run (always-on, min_instance_count ≥ 1) |
| API Gateway | GCP API Gateway (OpenAPI-based) |
| Container Registry | Artifact Registry |
| Session Store (Redis) | Memorystore Redis |
| Session Store (Firestore) | Firestore (Native Mode) |
| Networking | VPC + VPC Access Connector + Cloud NAT |
| Observability | Cloud Logging (built-in) |

Key difference from GCP Serverless: This module defaults to min_instance_count = 1, so at least one instance is always running. That avoids scale-from-zero cold starts and gives consistent low-latency responses; additional instances started during scale-out can still take time to warm up.

Prerequisites

  • Google Cloud CLI (gcloud) installed and authenticated: gcloud auth application-default login
  • Terraform >= 1.9.5
  • Docker installed
  • An existing GCP project with billing enabled
  • Required APIs enabled:
    • run.googleapis.com
    • apigateway.googleapis.com
    • artifactregistry.googleapis.com
    • vpcaccess.googleapis.com
    • redis.googleapis.com (if using Redis)
    • firestore.googleapis.com (if using Firestore)
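
The APIs listed above can be enabled in a single `gcloud services enable` call. The sketch below only assembles and prints that command so it can be reviewed before it is run. The helper name `enable_apis_command`, its `session_store` argument, and the project ID `my-project` are illustrative and not part of Agent Kernel.

```python
import shlex

# APIs this guide lists as always required.
BASE_APIS = [
    "run.googleapis.com",
    "apigateway.googleapis.com",
    "artifactregistry.googleapis.com",
    "vpcaccess.googleapis.com",
]

# APIs that depend on the chosen session store.
SESSION_APIS = {
    "redis": "redis.googleapis.com",
    "firestore": "firestore.googleapis.com",
}


def enable_apis_command(project_id: str, session_store: str) -> list[str]:
    """Return the argv for `gcloud services enable` (hypothetical helper)."""
    if session_store not in SESSION_APIS:
        raise ValueError(f"unknown session store: {session_store!r}")
    apis = BASE_APIS + [SESSION_APIS[session_store]]
    return ["gcloud", "services", "enable", *apis, f"--project={project_id}"]


# Print the command for review; "my-project" is a placeholder project ID.
print(shlex.join(enable_apis_command("my-project", "redis")))
```

Copy the printed command into a shell once the project ID is correct, or pass the returned list to `subprocess.run(..., check=True)`.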

Agent Code

Use agentkernel.gcp.CloudRun as the entry point, exactly as in the serverless variant:

from agentkernel.gcp import CloudRun
from agentkernel.openai import OpenAIModule

OpenAIModule([...])


@CloudRun.register("/version", method="GET")
def version_handler() -> dict:
    return {"version": "1.0.0"}


def main() -> None:
    CloudRun.run()


if __name__ == "__main__":
    main()

Dependencies

# pyproject.toml
dependencies = [
    "agentkernel[openai,api,redis]>=0.4.0",  # for Redis sessions
    # or: "agentkernel[openai,api,gcp]>=0.4.0"  # for Firestore sessions
]

Terraform Configuration

Basic Deployment (with Redis, always-on)

module "containerized_agent" {
  source = "../../ak-deployment/ak-gcp/containerized"

  project_id           = var.project_id
  region               = var.region
  product_alias        = var.product_alias
  env_alias            = var.env_alias
  module_name          = var.module_name
  product_display_name = "AK GCP Containerized"

  package_path       = "${path.module}/../dist"
  min_instance_count = 1 # always-on
  container_port     = 8000

  create_redis_cluster = true

  environment_variables = {
    OPENAI_API_KEY = var.openai_api_key
  }

  gateway_endpoints = [
    { path = "app", method = "GET", overwrite_path = "/app" },
    { path = "app_info", method = "POST", overwrite_path = "/app_info" }
  ]
}

With JWT Authentication

module "containerized_agent" {
  source = "../../ak-deployment/ak-gcp/containerized"

  project_id           = var.project_id
  region               = var.region
  product_alias        = var.product_alias
  env_alias            = var.env_alias
  module_name          = var.module_name
  product_display_name = "AK GCP Containerized Auth"

  package_path       = "${path.module}/../dist"
  min_instance_count = 1
  container_port     = 8000

  enable_jwt_auth = true
  jwt_audiences   = ["https://your-api-gateway-url"]

  create_redis_cluster = true

  environment_variables = {
    OPENAI_API_KEY = var.openai_api_key
  }
}
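
With enable_jwt_auth = true, clients are expected to present a valid token on each call. A minimal sketch of how a client might attach one, assuming the gateway reads the token from a standard `Authorization: Bearer` header (how this module configures the token location is not stated here). The helper name, the URL, the `/app` path, and the `<jwt>` value are placeholders.

```python
import json
import urllib.request


def build_gateway_request(base_url, path, token, payload=None):
    """Build (but do not send) a request carrying a bearer JWT."""
    url = base_url.rstrip("/") + "/" + path.lstrip("/")
    data = None
    headers = {"Authorization": f"Bearer {token}"}
    if payload is not None:
        # A JSON body turns the call into a POST.
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    method = "POST" if data is not None else "GET"
    return urllib.request.Request(url, data=data, headers=headers, method=method)


# Placeholder URL and token; nothing is sent over the network here.
req = build_gateway_request("https://your-api-gateway-url", "/app", token="<jwt>")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen(req)` sends it. The token's audience would need to match one of the values in jwt_audiences.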

Session Configuration

Redis Sessions

# config.yaml
session:
  type: redis
  redis:
    url: ${REDIS_URL}
    prefix: "ak:myapp:"
    ttl: 604800
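
For orientation, here is a sketch of what these values could mean at runtime, under two assumptions: that `${REDIS_URL}` is filled in from an environment variable of the same name, and that `ttl` is expressed in seconds. The substitution uses Python's `string.Template` purely as an illustration; it is not Agent Kernel's actual config loader, and the Redis address is made up.

```python
from datetime import timedelta
from string import Template


def expand_placeholders(value: str, env: dict) -> str:
    """Replace ${NAME} placeholders with values from env (illustrative only)."""
    return Template(value).substitute(env)


# Hypothetical environment; a Memorystore instance is reached on a private IP.
env = {"REDIS_URL": "redis://10.0.0.3:6379"}

url = expand_placeholders("${REDIS_URL}", env)
ttl = timedelta(seconds=604800)

print(url)       # redis://10.0.0.3:6379
print(ttl.days)  # 7
```

If the TTL is indeed in seconds, 604800 corresponds to seven days of session retention.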

Deployment

# 1. Build example package
./build.sh

# 2. Deploy infrastructure
cd deploy
terraform init
terraform plan
terraform apply

Key Variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| project_id | Yes | | GCP project ID |
| region | Yes | | GCP region (e.g. us-central1) |
| product_alias | Yes | | Short name for resource naming |
| env_alias | Yes | | Environment label (e.g. dev, prod) |
| module_name | Yes | | Module identifier |
| package_path | Yes | | Path to Docker build context |
| container_port | No | 8000 | Port the container listens on |
| create_redis_cluster | No | false | Create Memorystore Redis |
| create_firestore_db | No | false | Create Firestore database |
| enable_jwt_auth | No | false | Enable JWT auth on API Gateway |
| min_instance_count | No | 1 | Min Cloud Run instances (always-on) |
| max_instance_count | No | 10 | Max Cloud Run instances |

Outputs

| Output | Description |
| --- | --- |
| agent_invoke_url | Full URL to the API Gateway endpoint |
| service_url | Cloud Run service URL (direct, pre-gateway) |
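
After `terraform apply`, these outputs can be read with `terraform output -json`, which prints a JSON object keyed by output name with the actual value under a `value` key. A small sketch for pulling them into a script follows; the helper name and the sample URLs are made up for illustration.

```python
import json


def parse_terraform_outputs(raw_json: str) -> dict:
    """Map each output name to its value from `terraform output -json` text."""
    return {name: entry["value"] for name, entry in json.loads(raw_json).items()}


# Sample shaped like `terraform output -json`; the URLs are placeholders.
sample = """
{
  "agent_invoke_url": {"sensitive": false, "type": "string",
                       "value": "https://example-gateway.invalid"},
  "service_url": {"sensitive": false, "type": "string",
                  "value": "https://example-service.invalid"}
}
"""

outputs = parse_terraform_outputs(sample)
print(outputs["agent_invoke_url"])  # https://example-gateway.invalid
```

In a real run, the input text could come from `subprocess.run(["terraform", "output", "-json"], capture_output=True, text=True, check=True).stdout`, executed in the deploy directory.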

When to Choose Containerized vs Serverless

| | Containerized | Serverless |
| --- | --- | --- |
| Cold starts | None (min ≥ 1, always warm) | Yes (scale-to-zero) |
| Cost | Higher (always-on billing) | Lower (pay per request) |
| Latency | Consistent | Variable (first request) |
| Best for | Consistent traffic, latency-sensitive | Sporadic workloads |

Examples

See the working examples in the repository.

Teardown

cd deploy
terraform destroy