# ai_recipe_guardrails_prompt_safety
A Drupal recipe that installs a set of AI guardrails to protect public-facing AI interactions from two distinct categories of risk: structurally malicious input (injection attacks) and semantically harmful requests (topics with legal or reputational exposure). Apply this recipe as a baseline safety layer on any site where the AI module processes untrusted user input.
## What This Recipe Does
This recipe installs ten individual guardrails and two guardrail sets into a Drupal site running the AI module.
### Guardrail Set: Prompt Safety — Security
Contains seven guardrails applied to the pre-generate phase (user input), covering two layers of protection:

- Regex-based (six guardrails): fast, zero-cost checks that detect structurally malicious strings such as `<script>` tags, inline event handlers, `javascript:` URLs, dangerous HTML tags, CSS expression injection, and JavaScript execution function calls.
- AI-based (one guardrail): topic classification that detects semantic prompt manipulation such as jailbreak attempts, system prompt overrides, and role hijacking, where rigid patterns are insufficient because attack phrasing constantly evolves.
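The regex layer can be illustrated with a short sketch. The patterns below are illustrative approximations, not the exact expressions shipped in the recipe's guardrail configuration:

```python
import re

# Illustrative patterns only -- the actual regexes live in the recipe's
# exported guardrail config and may differ in detail and coverage.
PATTERNS = {
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "event_handler": re.compile(r"\bon\w+\s*=", re.IGNORECASE),
    "javascript_protocol": re.compile(r"javascript\s*:", re.IGNORECASE),
    "css_expression": re.compile(r"\bexpression\s*\(", re.IGNORECASE),
    "js_execution": re.compile(r"\b(eval|Function|setTimeout|setInterval)\s*\(", re.IGNORECASE),
}

def flag_input(prompt: str) -> list[str]:
    """Return the names of all patterns that match the untrusted prompt."""
    return [name for name, rx in PATTERNS.items() if rx.search(prompt)]
```

Because these checks run before any provider call, a match costs no tokens: the request is rejected without ever reaching the AI backend.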
### Guardrail Set: Prompt Safety — Liability
Contains three guardrails applied to the pre-generate phase (user input). All three use AI topic classification to detect requests covering domains where an automated response creates legal or reputational risk for the site operator:
- Legal Advice: contract interpretation, litigation strategy, regulatory compliance.
- Medical Advice: diagnosis, treatment recommendations, medication guidance.
- Sensitive Topics: politically and socially divisive subjects (elections, religion, war, etc.).
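Conceptually, each liability guardrail asks a classifier to score the prompt against a restricted topic and blocks when the score clears the set's stop threshold. The sketch below is a hypothetical model of that flow; `classify_topics` stands in for the provider call the real `restrict_to_topic` plugin makes and is not part of the AI module's API:

```python
from typing import Callable

STOP_THRESHOLD = 0.8  # matches the threshold this recipe configures

# The three restricted domains covered by the liability set.
BLOCKED_TOPICS = ("legal advice", "medical advice", "sensitive topics")

def should_block(prompt: str,
                 classify_topics: Callable[[str], dict[str, float]]) -> bool:
    """Block the request if any restricted topic scores at or above the threshold.

    `classify_topics` is a placeholder for the AI provider's topic
    classification call; it returns per-topic confidence scores in [0, 1].
    """
    scores = classify_topics(prompt)
    return any(scores.get(topic, 0.0) >= STOP_THRESHOLD for topic in BLOCKED_TOPICS)
```

Unlike the regex layer, each of these checks requires a round trip to the configured provider, which is why the recipe limits them to the three highest-risk domains.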
When to use this recipe:
- Any public-facing AI interaction (chatbots, AI assistants, content generation tools)
- Sites where users can submit free-text prompts that reach an AI provider
- Environments that need a documented, auditable safety baseline before deploying AI features
## Requirements
- Drupal 11.2 or later
- `drupal/ai` `^1.3`
- A configured AI provider that supports topic classification (required by the four `restrict_to_topic` guardrails)
## How to Apply
Run the following Drush command from your Drupal root:

`drush recipe ../recipes/ai_recipe_guardrails_prompt_safety`

The recipe does not configure a specific AI provider or model. The `restrict_to_topic` guardrails will use whichever provider and model your site has set as the default for the AI module.
## Configuration Installed
### Guardrail Sets
| Machine name | Label | Guardrails included | Phase |
| --- | --- | --- | --- |
| `prompt_safety_security` | Prompt Safety: Security | 7 (see below) | Pre-generate |
| `prompt_safety_liability` | Prompt Safety: Liability | 3 (see below) | Pre-generate |
Stop threshold for both sets: 0.8
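The stop-threshold semantics can be sketched as follows. This is an illustrative model, not the module's implementation: each guardrail produces a confidence score in [0, 1], and the set halts the pre-generate phase as soon as any guardrail's score reaches 0.8:

```python
from typing import Callable

STOP_THRESHOLD = 0.8  # the value both sets in this recipe configure

def run_guardrail_set(prompt: str,
                      guardrails: list[tuple[str, Callable[[str], float]]]):
    """Evaluate (name, check_fn) pairs in order.

    Returns (blocked, triggered_name): stops at the first guardrail whose
    score meets the threshold, so later (potentially costlier) checks
    never run once the request is already rejected.
    """
    for name, check in guardrails:
        if check(prompt) >= STOP_THRESHOLD:
            return True, name
    return False, None
```

In this model, a binary regex match maps naturally to a score of 1.0, while the `restrict_to_topic` guardrails return the classifier's graded confidence.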
### Guardrails
| Machine name | Label | Plugin |
| --- | --- | --- |
| `security_script_tag_injection` | Security: Script Tag Injection | `regexp_guardrail` |
| `security_dangerous_html_tags` | Security: Dangerous HTML Tags | `regexp_guardrail` |
| `security_html_event_handler_injection` | Security: HTML Event Handler Injection | `regexp_guardrail` |
| `security_javascript_protocol` | Security: JavaScript Protocol | `regexp_guardrail` |
| `security_javascript_execution_functions` | Security: JavaScript Execution Functions | `regexp_guardrail` |
| `security_css_expression_injection` | Security: CSS Expression Injection | `regexp_guardrail` |
| `security_prompt_manipulation` | Security: Prompt Manipulation | `restrict_to_topic` |
| `liability_legal_advice` | Liability: Legal Advice | `restrict_to_topic` |
| `liability_medical_advice` | Liability: Medical Advice | `restrict_to_topic` |
| `liability_sensitive_topics` | Liability: Sensitive Topics | `restrict_to_topic` |