Version: Next

Walled AI Guardrails

Walled AI guardrails provide safety validation and PII redaction for Agent Kernel interactions. This integration validates user inputs, masks sensitive values before agent execution, and replaces placeholders with the original values in output responses.

What Walled AI Guardrails Provide

Input safety validation using Walled AI Protect
Input PII redaction using Walled AI Redact
Output unmasking using session-stored placeholder mappings
Provider-level integration without JSON guardrail rule files

Prerequisites

Install Agent Kernel with Walled AI extras:

pip install agentkernel[walledai]

Set required environment variables:

export WALLED_API_KEY=your-walledai-api-key
export AK_LOGGING__AK__LEVEL=DEBUG

Configuration

Configure Walled AI in your config.yaml:

guardrail:
  input:
    enabled: true
    type: walledai
    pii: true
  output:
    enabled: true
    type: walledai
    pii: true

Equivalent environment-variable configuration:

export AK_GUARDRAIL__INPUT__ENABLED=true
export AK_GUARDRAIL__INPUT__TYPE=walledai
export AK_GUARDRAIL__INPUT__PII=true
export AK_GUARDRAIL__OUTPUT__ENABLED=true
export AK_GUARDRAIL__OUTPUT__TYPE=walledai
export AK_GUARDRAIL__OUTPUT__PII=true

# Optional: disable Walled AI PII redaction/unmasking while keeping safety checks
# export AK_GUARDRAIL__INPUT__PII=false
# export AK_GUARDRAIL__OUTPUT__PII=false
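
The variable names above mirror the config.yaml structure: each double underscore appears to mark one level of nesting. The following is a minimal sketch of that mapping, not Agent Kernel's actual loader. The helper `env_to_nested` and its boolean handling are assumptions made for illustration.

```python
def env_to_nested(env, prefix="AK_", delimiter="__"):
    """Fold prefixed environment variables into a nested dict (illustrative only)."""
    nested = {}
    for name, raw in env.items():
        if not name.startswith(prefix):
            continue  # unrelated variables such as WALLED_API_KEY are skipped
        path = name[len(prefix):].lower().split(delimiter)
        node = nested
        for key in path[:-1]:
            node = node.setdefault(key, {})
        value = raw.strip()
        # Treat the literal strings "true"/"false" as booleans; keep the rest as text.
        node[path[-1]] = {"true": True, "false": False}.get(value.lower(), value)
    return nested


example_env = {
    "AK_GUARDRAIL__INPUT__ENABLED": "true",
    "AK_GUARDRAIL__INPUT__TYPE": "walledai",
    "AK_GUARDRAIL__INPUT__PII": "false",
    "WALLED_API_KEY": "your-walledai-api-key",
}
config = env_to_nested(example_env)
print(config)
# {'guardrail': {'input': {'enabled': True, 'type': 'walledai', 'pii': False}}}
```

Under this reading, AK_GUARDRAIL__INPUT__PII=false corresponds to setting pii: false under guardrail.input in config.yaml.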

How It Works

Input Guardrails

Iterate incoming request objects
For each text request, validate text with Walled AI Protect (safety)
For each text request, redact sensitive entities with Walled AI Redact
Preserve non-text requests unchanged (for example file/image inputs)
Store placeholder mapping in session cache
Forward masked text requests to the agent

If safety validation fails, Agent Kernel returns a safe refusal response.
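
The input steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the integration's source code: `Session`, `is_safe`, `redact`, `REFUSAL`, and `apply_input_guardrail` are hypothetical names, and the two stub functions stand in for the Walled AI Protect and Redact calls with toy local rules so the example runs offline.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Session:
    """Hypothetical stand-in for the session cache."""
    pii_mapping: dict = field(default_factory=dict)


def is_safe(text):
    # Stub for the Walled AI Protect check; this toy rule is an assumption.
    return "ignore all previous instructions" not in text.lower()


def redact(text, mapping):
    # Stub for the Walled AI Redact call; it only masks "my name is <word>".
    def _mask(match):
        original = match.group(2)
        for placeholder, value in mapping.items():
            if value == original:
                return match.group(1) + placeholder  # reuse an existing placeholder
        placeholder = f"[Person_{len(mapping) + 1}]"
        mapping[placeholder] = original  # store the mapping for later unmasking
        return match.group(1) + placeholder

    return re.sub(r"(my name is )(\w+)", _mask, text, flags=re.IGNORECASE)


REFUSAL = "Sorry, I can't help with that request."


def apply_input_guardrail(requests, session):
    """Return (masked_requests, refusal); refusal is None when every text passes."""
    masked = []
    for request in requests:
        if not isinstance(request, str):
            masked.append(request)  # non-text requests pass through unchanged
            continue
        if not is_safe(request):
            return [], REFUSAL
        masked.append(redact(request, session.pii_mapping))
    return masked, None


session = Session()
image = {"type": "image", "data": b"\x89PNG"}
masked, refusal = apply_input_guardrail(["my name is john", image], session)
print(masked[0])            # my name is [Person_1]
print(session.pii_mapping)  # {'[Person_1]': 'john'}
```

The sketch returns early on the first unsafe text request so that nothing is forwarded to the agent; the real integration may order or batch these checks differently.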

Output Guardrails

Extract outgoing agent text
Load stored placeholder mapping from session state
Replace placeholders with original values
Return unmasked response
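
The output steps amount to a string substitution driven by the stored mapping. A minimal sketch, assuming the mapping is a dict from placeholder to original value; the function name `unmask` and the `[Email_1]` placeholder are hypothetical and chosen only for illustration.

```python
def unmask(text, mapping):
    """Replace each stored placeholder with its original value."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text


stored_mapping = {"[Person_1]": "john", "[Email_1]": "john@example.com"}
reply = "Nice to meet you, [Person_1]. I will write to [Email_1]."
print(unmask(reply, stored_mapping))
# Nice to meet you, john. I will write to john@example.com.
```

A placeholder with no entry in the mapping is left as-is, which is one way a masked value can surface in a response when the mapping for the session is missing.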

Session Mapping Behavior

Walled AI redaction placeholders are persisted in the session cache so that follow-up turns can be unmasked. When durable session storage is enabled, the mapping also survives restarts.

Recommended controls for production:

Minimize retained mapping scope
Apply retention/TTL policies
Restrict storage access
Encrypt data at rest
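
As one way to picture a retention/TTL policy, the sketch below keeps placeholder mappings in memory and drops them once they expire. `ExpiringMappingStore` is a hypothetical class, not an Agent Kernel API; in practice the TTL would more likely be configured on whatever session storage backend is in use.

```python
import time


class ExpiringMappingStore:
    """In-memory placeholder store with a per-session TTL (illustrative only)."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # session_id -> (expires_at, mapping)

    def put(self, session_id, mapping):
        self._entries[session_id] = (self._clock() + self._ttl, dict(mapping))

    def get(self, session_id):
        entry = self._entries.get(session_id)
        if entry is None:
            return {}
        expires_at, mapping = entry
        if self._clock() >= expires_at:
            del self._entries[session_id]  # expired: forget the original values
            return {}
        return dict(mapping)


# A controllable clock makes the expiry visible without sleeping.
now = [0.0]
store = ExpiringMappingStore(ttl_seconds=900, clock=lambda: now[0])
store.put("session-1", {"[Person_1]": "john"})
print(store.get("session-1"))  # {'[Person_1]': 'john'}
now[0] = 901.0
print(store.get("session-1"))  # {}
```

Once a mapping has expired, placeholders from that session can no longer be restored, so the TTL should be at least as long as the longest conversation that needs unmasking.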

Optional: Local WalledGuard-Edge Moderation

If you want to run local moderation experiments, you can use walledai/walledguard-edge from Hugging Face.

According to Walled AI's announcement, WalledGuard-Edge is a 0.6B open-source model (Apache-2.0) positioned as stronger than LlamaGuard3 (1B) in multilingual settings and across multiple jailbreak categories.

API access and product updates: www.walled.ai

Manual setup

For local inference steps and the latest runnable example code, follow the model card:

Hugging Face model page: walledai/walledguard-edge

Typical local dependencies include torch and transformers.

This local flow is optional and separate from the default Agent Kernel Walled AI API integration.

Example

Input:

my name is john

Masked request sent to agent:

my name is [Person_1]

When the agent's reply contains [Person_1], the output guardrail restores it to john before returning the response.

Troubleshooting

Guardrails not triggering

Ensure input/output guardrails are enabled in config
Verify type: walledai for both input and output
Confirm WALLED_API_KEY is set in the runtime environment
Enable debug logs with AK_LOGGING__AK__LEVEL=DEBUG
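
When the guardrails are configured through environment variables, the checks above can be scripted. A minimal sketch; `find_guardrail_problems` is a hypothetical helper, and it does not apply if the same settings live in config.yaml instead.

```python
import os


def find_guardrail_problems(env):
    """List settings from the checklist that are missing or wrong in env."""
    problems = []
    if not env.get("WALLED_API_KEY"):
        problems.append("WALLED_API_KEY is not set")
    for direction in ("INPUT", "OUTPUT"):
        enabled = env.get(f"AK_GUARDRAIL__{direction}__ENABLED", "")
        if enabled.lower() != "true":
            problems.append(f"{direction.lower()} guardrail is not enabled")
        if env.get(f"AK_GUARDRAIL__{direction}__TYPE") != "walledai":
            problems.append(f"{direction.lower()} guardrail type is not walledai")
    return problems


# Run the check against the current process environment.
for problem in find_guardrail_problems(dict(os.environ)):
    print(problem)
```

Run it from the same shell that starts the runtime, since variables exported elsewhere will not be visible.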

Missing API key or provider errors

Verify the shell environment used to start the runtime includes:

export WALLED_API_KEY=your-walledai-api-key

Unexpected masked placeholders in response

Ensure output guardrails are enabled
Ensure session ID is stable across turns
Verify session storage mode and persistence expectations
