

No security coverage

This module is included in DXPR CMS.

The Reinforcement Learning (RL) module implements A/B testing in the most efficient and effective way possible, minimizing lost conversions using machine learning.

Thompson Sampling is a learning-while-doing method. Each time a visitor lands on your site, the algorithm "rolls the dice" based on what it has learned so far. Variants that have performed well roll larger numbers, so they are shown more often, while weaker variants still get a small chance to prove themselves. This simple trick lets the system discover winners very quickly without interrupting normal traffic.

Traditional A/B tests run for a fixed horizon—say two weeks—during which half your visitors keep seeing the weaker version. Thompson Sampling avoids this waste. As soon as the algorithm has even a little evidence it quietly shifts most traffic to the better variant, saving conversions and shortening the wait for useful insights.

For full details of what goes on behind the curtain, check the source code:
ThompsonCalculator.php.
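The dice-rolling described above can be sketched in a few lines. This is an illustration only, not the module's PHP implementation in ThompsonCalculator.php: each variant's turns and rewards define a Beta distribution, one sample is drawn per variant, and the highest draw wins.

```javascript
// Minimal Thompson Sampling sketch (illustration only; the module's real
// implementation lives in ThompsonCalculator.php).

// Draw from Gamma(shape) for a positive integer shape, using the fact
// that a sum of `shape` independent Exponential(1) draws is Gamma(shape).
function sampleGamma(shape) {
  let sum = 0;
  for (let i = 0; i < shape; i++) {
    sum += -Math.log(1 - Math.random()); // Exponential(1) draw
  }
  return sum;
}

// Draw from Beta(alpha, beta) as Ga / (Ga + Gb).
function sampleBeta(alpha, beta) {
  const a = sampleGamma(alpha);
  const b = sampleGamma(beta);
  return a / (a + b);
}

// Pick an arm: one Beta draw per arm, highest draw wins.
// Each arm records `turns` (times shown) and `rewards` (successes).
function thompsonPick(arms) {
  let bestId = null;
  let bestScore = -1;
  for (const [id, { turns, rewards }] of Object.entries(arms)) {
    // Beta(rewards + 1, failures + 1): a uniform prior plus observed counts.
    const score = sampleBeta(rewards + 1, turns - rewards + 1);
    if (score > bestScore) {
      bestScore = score;
      bestId = id;
    }
  }
  return bestId;
}

// Example data: variant b converts three times as often as variant a.
const arms = {
  a: { turns: 1000, rewards: 50 },
  b: { turns: 1000, rewards: 150 },
};
```

A weak arm still wins the occasional draw, which is exactly the "small chance to prove themselves" described above.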

RL Modules

  • RL Sorting - Intelligent content ordering/switching for Drupal Views

Features

  • Thompson Sampling Algorithm - Pure PHP implementation
  • Fast HTTP REST API - Optimized JSON endpoints for tracking and decisions
  • Administrative Reports - Experiment analysis interface
  • Service-based Architecture - Extensible design
  • Data Sovereignty, Privacy First - No cloud, just Drupal

You need RL if

  • A/B Testing - Test content variations
  • Content Optimization - Track content engagement
  • Feature Selection - Choose features to show users
  • Recommendations - Optimize content recommendations
  • Resource Allocation - Distribute resources across options

Thompson Sampling

Unlike traditional A/B testing, Thompson Sampling:

  • Adapts automatically - Shifts traffic to better options
  • Handles multiple options - Works with 2+ variations; even thousands of arms are handled well
  • Continuous learning - No fixed test duration
  • Bayesian approach - Incorporates uncertainty

Prefer a turnkey demo site?

Spin up DXPR CMS—Drupal pre-configured with DXPR Builder, DXPR Theme, RL (Reinforcement Learning) module, and security best practices.

Get DXPR CMS »

Installation

composer require drupal/rl
drush en rl

Verify rl.php Access

The RL module includes a .htaccess file that allows direct access to rl.php (following the same pattern as Drupal 11's contrib statistics module). Test that it's working:

curl -X POST -d "action=ping" http://example.com/modules/contrib/rl/rl.php

If the test fails:

  • Apache: Ensure .htaccess files are processed (AllowOverride All)
  • Nginx: Copy the rewrite rules from .htaccess to your server config
  • Security modules: Whitelist /modules/contrib/rl/rl.php

If server policies prevent direct access to rl.php, use the Drupal Routes API instead.

Drush Command Reference

  • Discovery: rl:list, rl:status, rl:performance, rl:trends - List experiments, check phase/confidence, arm-level stats, historical trends
  • Analysis: rl:analyze, rl:export - Full analysis with recommendations, export experiment data
  • Experiment CRUD: rl:experiment:create, rl:experiment:update, rl:experiment:delete - Create, update, and delete experiments with --dry-run support
  • Configuration: rl:config:get, rl:config:set, rl:config:list, rl:config:reset - Get/set module settings, list all with current values, reset to defaults
  • Setup: rl:setup-ai - Install AI skill files for Claude Code, Codex, Gemini, Copilot, Cursor

AI Coding Assistant Integration

The RL module ships with a built-in Agent Skills file that teaches AI coding assistants how to manage experiments through natural language. Compatible with Claude Code, Codex CLI, Gemini CLI, GitHub Copilot, Cursor, and other tools supporting the standard.

After installing the module, run drush rl:setup-ai to enable AI assistant support. Your AI will then respond to natural language like:

  • "List all running experiments"
  • "Analyze the hero_cta_test experiment"
  • "Create a new A/B test for the homepage banner"
  • "What's the conversion rate for variant B?"

API

// Get the experiment manager
$experiment_manager = \Drupal::service('rl.experiment_manager');

// Record a trial (content shown)
$experiment_manager->recordTurn('my-experiment', 'variant-a');

// Record a success (user clicked)
$experiment_manager->recordReward('my-experiment', 'variant-a');

// Get Thompson Sampling scores
$scores = $experiment_manager->getThompsonScores('my-experiment');

// Select the best option
$ts_calculator = \Drupal::service('rl.ts_calculator');
$best_option = $ts_calculator->selectBestArm($scores);

// Override page cache for web components (optional)
$cache_manager = \Drupal::service('rl.cache_manager');
$cache_manager->overridePageCacheIfShorter(60); // 60 seconds

JavaScript API

Attach the rl/api library to get Drupal.rl on the page:

Drupal.rl.turn('hero_cta', 'v0');
Drupal.rl.reward('hero_cta', 'v0');

Drupal.rl.decide('hero_cta', ['v0', 'v1', 'v2']).then(function (armId) {
  showVariant(armId);
});

All three methods feed a shared 500 ms batch window, so every experiment on the page rides one POST to rl.php. See the README for the HTTP wire format and server-side integration patterns.
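The batch window described above can be sketched generically. The names below are hypothetical, not the rl/api source: calls arriving within the window are queued, and the whole queue is shipped in a single request.

```javascript
// Sketch of a 500 ms batch window (illustration only; hypothetical names,
// not the actual rl/api client). Calls made within one window are queued
// and sent together as a single request.
class BatchQueue {
  constructor(send, windowMs = 500) {
    this.send = send;       // e.g. (items) => POST them to rl.php
    this.windowMs = windowMs;
    this.queue = [];
    this.timer = null;
  }

  add(item) {
    this.queue.push(item);
    // The first call in a window starts the timer; later calls just queue up.
    if (this.timer === null) {
      this.timer = setTimeout(() => this.flush(), this.windowMs);
    }
  }

  flush() {
    if (this.timer !== null) {
      clearTimeout(this.timer);
      this.timer = null;
    }
    if (this.queue.length > 0) {
      this.send(this.queue); // One request for every call queued this window.
      this.queue = [];
    }
  }
}
```

Three turn/reward/decide calls landing in the same window would therefore ride out as one request.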

Cache Management

RL provides optional cache management for web components:

// Override page cache if experiment cache is shorter than site cache
\Drupal::service('rl.cache_manager')->overridePageCacheIfShorter(30);

How it works:

  • If site cache is 300s and experiment needs 30s → overrides to 30s
  • If site cache is 60s and experiment needs 300s → leaves at 60s
  • If site cache is disabled → no override
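The three rules above amount to a guarded minimum. A sketch of the decision, using a hypothetical helper rather than the module's actual cache manager:

```javascript
// Sketch of the "override only if shorter" rule (hypothetical helper,
// not the module's cache manager code).
// siteTtl: the site's page cache max-age in seconds (0 = caching disabled).
// experimentTtl: how fresh the experiment needs its scores to be.
function effectivePageTtl(siteTtl, experimentTtl) {
  if (siteTtl <= 0) {
    return siteTtl; // Caching disabled: never override.
  }
  // Only ever shorten the cache lifetime; never extend it.
  return Math.min(siteTtl, experimentTtl);
}
```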

Use cases:

  • Views plugins using RL for content sorting
  • Blocks displaying A/B tested content
  • Components needing frequent RL score updates

Ready to get started? Install the module and begin implementing intelligent, adaptive decision-making in your Drupal applications today!

FAQ

Does RL store my experiment's variants?

It stores their performance data, not the authoritative list. Every variant that has received traffic has its own row in rl_arm_data with turn and reward counts. But "which variants are currently in play for experiment X" is owned by your module, not RL.

Different modules keep the live variant list in different places:

  • rl_sorting — the content returned by a View
  • rl_page_title — fields on a content entity
  • rl_menu_link — labels in a menu link
  • DXPR Builder — slots inside a block component

On each call, your module passes its current list — getThompsonScores($id, NULL, $arms) in PHP or Drupal.rl.decide(id, arms) in JS — and RL matches it against the stored stats to pick a winner. A newly added variant is automatically in play on the next render, and a removed one stops appearing. Because there is no second copy of the list to save, nothing can fall out of sync with your module's UI.
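That matching step can be sketched like this (hypothetical data shapes, not the module's internals): stored rows are looked up only for the arms the caller names, new arms start from zero counts, and stale rows simply never become candidates.

```javascript
// Sketch of matching a caller-supplied arm list against stored stats
// (hypothetical shapes, not the module's internals).
// stored: { armId: { turns, rewards } } rows persisted per experiment.
// currentArms: the list your module passes in on this render.
function statsForCurrentArms(stored, currentArms) {
  const result = {};
  for (const armId of currentArms) {
    // A brand-new arm has no row yet; it starts from zero counts
    // (the uniform prior) and is in play immediately.
    result[armId] = stored[armId] || { turns: 0, rewards: 0 };
  }
  // Arms missing from currentArms are simply never candidates,
  // even if their old rows still exist in the database.
  return result;
}
```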

When do I pick a winner and end an experiment?

Only when you want to. Thompson Sampling has no fixed horizon and no significance gate to wait out. It just shifts traffic to whatever variant is winning right now, and keeps adapting as evidence changes.

Two patterns, depending on what you are testing:

  • Converging tests — a better page title, a clearer checkout button, a stronger hero image. Once the report shows a confident winner, lock it in and move on to the next test.
  • Evergreen experiments — blog post lists where reader interest drifts week to week, banner ads that fade as returning visitors tune them out, seasonal calls to action. Leave them running. Thompson Sampling will follow the winner as it shifts.

In both cases the loser of a pair just stops receiving traffic on its own, so there is no urgency to declare a winner by hand. If you are used to fixed-horizon A/B tools, this is the biggest mental shift: there is no "test complete" flag to chase.


Releases

  • 1.0.0-beta1 (Pre-release) - Apr 29, 2026