
What Are Custom Guardrails?

Custom guardrails allow you to integrate your own security and content moderation systems with Unbound Security AI Gateway. This “Bring Your Own Guardrails” feature enables you to leverage your existing security infrastructure, custom policies, and specialized content analysis tools while maintaining the benefits of Unbound’s unified AI gateway.

How Custom Guardrails Work

Custom guardrails operate through webhook-based integration:
  1. Registration: You register your custom guardrail endpoint with API credentials
  2. Verification: Unbound sends a test request to verify your endpoint
  3. Integration: Once verified, your guardrail becomes available for applications
  4. Execution: Requests are sent to your endpoint for analysis during AI interactions
  5. Response Processing: Your endpoint’s response determines the action (allow, block, or redact)
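
The exchange in steps 4 and 5 is a single HTTP round trip: the gateway POSTs the chat payload to your endpoint and acts on the verdict it returns. The sketch below illustrates that exchange from the caller’s side; the endpoint URL, API key header name, and payload are illustrative assumptions based on the request and response formats described later in this guide.

# Illustrative sketch of the webhook round trip; URL, header name, and key are placeholders.
import requests

GUARDRAIL_URL = "https://guardrails.example.com/webhook"  # hypothetical endpoint
API_KEY = "your-api-key"                                  # hypothetical credential

def evaluate_request(chat_payload: dict) -> dict:
    """POST the chat payload to the custom guardrail and return its verdict."""
    response = requests.post(
        GUARDRAIL_URL,
        json=chat_payload,
        headers={"Authorization": f"Bearer {API_KEY}"},  # header name is an assumption
        timeout=5,  # guardrails are expected to answer quickly
    )
    response.raise_for_status()
    return response.json()

verdict = evaluate_request({
    "messages": [{"role": "user", "content": "Hello, world"}],
    "model": "gpt-4",
})

if verdict["status"] == "blocked":
    print("Blocked:", verdict.get("reason"))
elif verdict["status"] == "redacted":
    print("Forwarding redacted content:", verdict.get("redacted_content"))
else:
    print("Allowed")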

Step-by-Step Setup Guide

Step 1: Access Settings and Register Custom Guardrail

Navigate to your Unbound Security settings to register your custom guardrail:
  1. Go to Settings: From the main dashboard, click on the “Settings” section
  2. Select Bring Your Own Guardrails: Click on “Bring Your Own Guardrails”
  3. Add New Guardrail: Click “Add Custom Guardrail” to start the registration process

Step 2: Configure Your Custom Guardrail Endpoint

Fill out the custom guardrail registration form with the following information:

Required Configuration:

  • Guardrail Name: Enter a descriptive name for your custom guardrail
  • Endpoint URL: Provide the full URL of your guardrail service
  • API Key: Enter the API key for authenticating requests to your endpoint

Optional Configuration:

  • Custom Headers: Add any additional headers required by your endpoint
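
For illustration, the sketch below shows how an endpoint might enforce the configured API key and a custom header before processing a request. The header names (X-API-Key, X-Guardrail-Tenant) are assumptions for this example; use whichever names and values you enter in the registration form.

# Illustrative Flask check for the configured API key and custom headers.
# Header names and values here are assumptions; match them to your registration settings.
from flask import Flask, request, jsonify

app = Flask(__name__)

EXPECTED_API_KEY = "your-api-key"                         # hypothetical key
REQUIRED_CUSTOM_HEADERS = {"X-Guardrail-Tenant": "acme"}  # hypothetical custom header

@app.before_request
def authenticate():
    """Reject requests that lack the expected credentials."""
    if request.headers.get("X-API-Key") != EXPECTED_API_KEY:
        return jsonify({"error": "invalid API key"}), 401
    for name, value in REQUIRED_CUSTOM_HEADERS.items():
        if request.headers.get(name) != value:
            return jsonify({"error": f"missing or invalid header: {name}"}), 403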

Step 3: Verification Process

After submitting your configuration, Unbound will automatically verify your endpoint:
  1. Test Request: Unbound sends a verification request to your endpoint
  2. Response Validation: Your endpoint must respond with the correct format
  3. Authentication Check: API key and headers are validated
  4. Success Confirmation: Once verified, your guardrail is registered

Verification Request Format:

{
  "messages": [
    {
      "role": "user",
      "content": "This is a test message for guardrail verification"
    }
  ],
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 100
}

Required Verification Response:

Your endpoint must respond with one of these formats:

Allow Response:
{
  "status": "allowed"
}
Block Response:
{
  "status": "blocked",
  "block_reason": "Content violates policy: contains sensitive information",
  "reason": "Content violates policy: contains sensitive information"
}
Redact Response:
{
  "status": "redacted",
  "redacted_content": "This content has been [REDACTED] for security reasons",
  "redact_reason": "Sensitive information detected and redacted",
  "reason": "Sensitive information detected and redacted"
}

Step 4: Enable Custom Guardrails in Applications

Once your custom guardrail is verified and registered, you can enable it for any application:
  1. Access Guardrails Tab: Click on the “Guardrails” tab
  2. Find Custom Guardrail: Your custom guardrail will appear in the list
  3. Enable Guardrail: Toggle the switch to enable your custom guardrail

Custom Guardrail Endpoint Requirements

Request Format

Your custom guardrail endpoint will receive requests in the following format:
{
  "messages": [
    {
      "role": "user",
      "content": "The user's message content here"
    }
  ],
  "model": "gpt-4",
  "temperature": 0.7,
  "max_tokens": 100
}

Response Format Options

Your endpoint must respond with one of these three response formats:

1. Allow Response

{
  "status": "allowed"
}

2. Block Response

{
  "status": "blocked",
  "block_reason": "Content violates policy: contains sensitive information",
  "reason": "Content violates policy: contains sensitive information"
}

3. Redact Response

{
  "status": "redacted",
  "redacted_content": "This content has been [REDACTED] for security reasons",
  "redact_reason": "Sensitive information detected and redacted",
  "reason": "Sensitive information detected and redacted"
}
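
All three formats share the status field and echo the human-readable explanation in both a format-specific field and a generic reason field, so it can be convenient to build them in one place. The helper below is a small sketch of that idea; it simply mirrors the shapes shown above.

# Small helpers that produce the three response shapes shown above.
def allow() -> dict:
    return {"status": "allowed"}

def block(reason: str) -> dict:
    # The explanation is echoed in both block_reason and reason.
    return {"status": "blocked", "block_reason": reason, "reason": reason}

def redact(redacted_content: str, reason: str) -> dict:
    # redacted_content carries the sanitized text to forward in place of the original.
    return {
        "status": "redacted",
        "redacted_content": redacted_content,
        "redact_reason": reason,
        "reason": reason,
    }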

Response Requirements

  • Status Field: Must be one of: "allowed", "blocked", or "redacted"
  • Response Time: Should respond within 5 seconds
  • HTTP Status: Must return HTTP 200 for successful processing
  • Content Type: Response must be application/json
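
Before registering, you can confirm your endpoint meets these requirements with a quick client-side check such as the sketch below; the URL and authentication header are placeholders for your own deployment.

# Quick self-check against the requirements above (URL and header are placeholders).
import requests

TEST_PAYLOAD = {
    "messages": [
        {"role": "user", "content": "This is a test message for guardrail verification"}
    ],
    "model": "gpt-4",
    "temperature": 0.7,
    "max_tokens": 100,
}

resp = requests.post(
    "https://guardrails.example.com/webhook",  # hypothetical endpoint URL
    json=TEST_PAYLOAD,
    headers={"X-API-Key": "your-api-key"},     # hypothetical auth header
    timeout=5,                                 # must answer within 5 seconds
)

assert resp.status_code == 200, f"expected HTTP 200, got {resp.status_code}"
assert resp.headers.get("Content-Type", "").startswith("application/json"), "response must be JSON"
assert resp.json().get("status") in {"allowed", "blocked", "redacted"}, "unexpected status value"
print("Endpoint satisfies the basic response requirements:", resp.json())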

Example Implementation

Here’s a simple example of a custom guardrail endpoint implementation:

Python Flask Example

from flask import Flask, request, jsonify
import re

app = Flask(__name__)

# Define patterns for content analysis
BLOCKED_PATTERNS = [
    r"password[s]?\b",
    r"credit.?card",
    r"ssn\b"
]

SENSITIVE_PATTERNS = [
    r"phone.?number",
    r"address",
    r"private"
]

def check_content(message):
    """Check content against patterns"""
    if not message:
        return False, None, False, None
    
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, message, re.IGNORECASE):
            return True, f"Contains blocked content: {pattern}", False, None

    for pattern in SENSITIVE_PATTERNS:
        if re.search(pattern, message, re.IGNORECASE):
            redacted = re.sub(pattern, '[REDACTED]', message, flags=re.IGNORECASE)
            return False, None, True, redacted

    return False, None, False, None

@app.route('/webhook', methods=['POST'])
def guardrail_webhook():
    """Handle incoming webhook requests"""
    payload = request.get_json(silent=True) or {}  # tolerate empty or non-JSON bodies
    messages = payload.get('messages', [])
    
    user_content = ""
    for message in messages:
        if message.get('role') == 'user':
            user_content = message.get('content', '')
            break
    
    is_blocked, block_reason, is_sensitive, redacted_content = check_content(user_content)

    if is_blocked:
        return jsonify({
            "status": "blocked",
            "block_reason": block_reason,
            "reason": block_reason
        })

    if is_sensitive:
        return jsonify({
            "status": "redacted",
            "redacted_content": redacted_content,
            "redact_reason": "Contains sensitive information",
            "reason": "Contains sensitive information"
        })

    return jsonify({"status": "allowed"})

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)
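
With the example running locally (for instance as python app.py, assuming it is saved as app.py), you can exercise each path with small test requests; the expected verdicts follow from the patterns defined above.

# Exercise the example endpoint locally; expected verdicts follow from the patterns above.
import requests

def verdict(text: str) -> dict:
    return requests.post(
        "http://localhost:5000/webhook",
        json={"messages": [{"role": "user", "content": text}]},
        timeout=5,
    ).json()

print(verdict("What is the capital of France?"))     # allowed
print(verdict("My password is hunter2"))             # blocked by the password pattern
print(verdict("Here is my phone number: 555-0100"))  # redacted by the phone number pattern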

Troubleshooting

Common Issues

Verification Failed
  • Check that your endpoint is accessible
  • Verify API key authentication is working
  • Ensure response format matches requirements exactly
Authentication Issues
  • Verify API key is correct and active
  • Ensure your endpoint accepts the provided authentication
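
If verification keeps failing and the cause is unclear, logging exactly what the endpoint receives and returns usually narrows it down quickly. A minimal way to add that to the Flask example above is sketched below (app is the Flask application from the example).

# Minimal request/response logging for debugging verification failures.
# Add to the Flask example above; `app` is the existing Flask application.
import logging
from flask import request

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("guardrail")

@app.before_request
def log_incoming():
    logger.info("incoming headers=%s body=%s", dict(request.headers), request.get_data(as_text=True))

@app.after_request
def log_outgoing(response):
    logger.info("outgoing status=%s body=%s", response.status_code, response.get_data(as_text=True))
    return response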