# ai_recipe_guardrails_prompt_safety
A Drupal recipe that installs a set of AI guardrails to protect public-facing AI interactions from two distinct categories of risk: structurally malicious input (injection attacks) and semantically harmful requests (topics with legal or reputational exposure). Apply this recipe as a baseline safety layer on any site where the AI module processes untrusted user input.
## What This Recipe Does
This recipe installs ten individual guardrails and two guardrail sets into a Drupal site running the AI module.
### Guardrail Set: Prompt Safety — Security
Contains seven guardrails applied to the pre-generate phase (user input), covering two layers of protection:

- Regex-based (six guardrails): fast, zero-cost checks that detect structurally malicious strings such as `<script>` tags, inline event handlers, `javascript:` URLs, dangerous HTML tags, CSS expression injection, and JavaScript execution function calls.
- AI-based (one guardrail): topic classification that detects semantic prompt manipulation such as jailbreak attempts, system prompt overrides, and role hijacking, where rigid patterns are insufficient because attack phrasing constantly evolves.
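The regex layer can be illustrated with a short sketch. The patterns below are illustrative approximations, not the exact expressions shipped in the recipe's guardrail configuration:

```python
import re

# Illustrative patterns only -- the actual regexes live in the recipe's
# exported guardrail config and may differ in detail and coverage.
PATTERNS = {
    "script_tag": re.compile(r"<\s*script\b", re.IGNORECASE),
    "event_handler": re.compile(r"\bon\w+\s*=", re.IGNORECASE),
    "javascript_protocol": re.compile(r"javascript\s*:", re.IGNORECASE),
    "css_expression": re.compile(r"\bexpression\s*\(", re.IGNORECASE),
    "js_execution": re.compile(r"\b(eval|Function|setTimeout|setInterval)\s*\(", re.IGNORECASE),
}

def flag_input(prompt: str) -> list[str]:
    """Return the names of all patterns that match the untrusted prompt."""
    return [name for name, rx in PATTERNS.items() if rx.search(prompt)]
```

Because these checks run before any provider call, a match costs no tokens: the request is rejected without ever reaching the AI backend.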
### Guardrail Set: Prompt Safety — Liability
Contains three guardrails applied to the pre-generate phase (user input). All three use AI topic classification to detect requests covering domains where an automated response creates legal or reputational risk for the site operator:
- Legal Advice: contract interpretation, litigation strategy, regulatory compliance.
- Medical Advice: diagnosis, treatment recommendations, medication guidance.
- Sensitive Topics: politically and socially divisive subjects (elections, religion, war, etc.).
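Conceptually, each liability guardrail asks a classifier to score the prompt against a restricted topic and blocks when the score clears the set's stop threshold. The sketch below is a hypothetical model of that flow; `classify_topics` stands in for the provider call the real `restrict_to_topic` plugin makes and is not part of the AI module's API:

```python
from typing import Callable

STOP_THRESHOLD = 0.8  # matches the threshold this recipe configures

# The three restricted domains covered by the liability set.
BLOCKED_TOPICS = ("legal advice", "medical advice", "sensitive topics")

def should_block(prompt: str,
                 classify_topics: Callable[[str], dict[str, float]]) -> bool:
    """Block the request if any restricted topic scores at or above the threshold.

    `classify_topics` is a placeholder for the AI provider's topic
    classification call; it returns per-topic confidence scores in [0, 1].
    """
    scores = classify_topics(prompt)
    return any(scores.get(topic, 0.0) >= STOP_THRESHOLD for topic in BLOCKED_TOPICS)
```

Unlike the regex layer, each of these checks requires a round trip to the configured provider, which is why the recipe limits them to the three highest-risk domains.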
When to use this recipe:
- Any public-facing AI interaction (chatbots, AI assistants, content generation tools)
- Sites where users can submit free-text prompts that reach an AI provider
- Environments that need a documented, auditable safety baseline before deploying AI features
## Requirements
- Drupal 11.2 or later
- `drupal/ai` `^1.3`
- A configured AI provider that supports topic classification (required by the four `restrict_to_topic` guardrails)
## How to Apply
Run the following Drush command from your Drupal root:

`drush recipe ../recipes/ai_recipe_guardrails_prompt_safety`

The recipe does not configure a specific AI provider or model. The `restrict_to_topic` guardrails will use whichever provider and model your site has set as the default for the AI module.
## Configuration Installed
### Guardrail Sets
| Machine name | Label | Guardrails included | Phase |
| --- | --- | --- | --- |
| `prompt_safety_security` | Prompt Safety: Security | 7 (see below) | Pre-generate |
| `prompt_safety_liability` | Prompt Safety: Liability | 3 (see below) | Pre-generate |
Stop threshold for both sets: 0.8
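The stop-threshold semantics can be sketched as follows. This is an illustrative model, not the module's implementation: each guardrail produces a confidence score in [0, 1], and the set halts the pre-generate phase as soon as any guardrail's score reaches 0.8:

```python
from typing import Callable

STOP_THRESHOLD = 0.8  # the value both sets in this recipe configure

def run_guardrail_set(prompt: str,
                      guardrails: list[tuple[str, Callable[[str], float]]]):
    """Evaluate (name, check_fn) pairs in order.

    Returns (blocked, triggered_name): stops at the first guardrail whose
    score meets the threshold, so later (potentially costlier) checks
    never run once the request is already rejected.
    """
    for name, check in guardrails:
        if check(prompt) >= STOP_THRESHOLD:
            return True, name
    return False, None
```

In this model, a binary regex match maps naturally to a score of 1.0, while the `restrict_to_topic` guardrails return the classifier's graded confidence.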
### Guardrails
| Machine name | Label | Plugin |
| --- | --- | --- |
| `security_script_tag_injection` | Security: Script Tag Injection | `regexp_guardrail` |
| `security_dangerous_html_tags` | Security: Dangerous HTML Tags | `regexp_guardrail` |
| `security_html_event_handler_injection` | Security: HTML Event Handler Injection | `regexp_guardrail` |
| `security_javascript_protocol` | Security: JavaScript Protocol | `regexp_guardrail` |
| `security_javascript_execution_functions` | Security: JavaScript Execution Functions | `regexp_guardrail` |
| `security_css_expression_injection` | Security: CSS Expression Injection | `regexp_guardrail` |
| `security_prompt_manipulation` | Security: Prompt Manipulation | `restrict_to_topic` |
| `liability_legal_advice` | Liability: Legal Advice | `restrict_to_topic` |
| `liability_medical_advice` | Liability: Medical Advice | `restrict_to_topic` |
| `liability_sensitive_topics` | Liability: Sensitive Topics | `restrict_to_topic` |