code_benchmarker
Code benchmarker is a dev-only dashboard for comparing the runtime cost of two (or more) implementations of the same operation — legacy vs optimized, version A vs version B, before/after a refactor. Other modules contribute scenarios by registering tagged services; the dashboard discovers and runs them all.
Not for production. The module exposes an admin route that runs each implementation repeatedly (10 to 1000 iterations per run) against the live database.
Requirements
This module requires no contributed modules outside Drupal core. It runs on Drupal 10.3+ or Drupal 11 and uses the core User module (for the bundled example scenario).
Installation
Install as you would any contributed Drupal module. With DDEV and Drush: ddev drush en code_benchmarker -y.
Configuration
- Grant the Administer site configuration permission to the roles that should reach the dashboard.
- Open the benchmarker dashboard (Administration » Configuration » Development » Code benchmarker). The page lists every registered scenario; nothing runs yet.
- Click a scenario title to run its iterations and see the timing table.
URL options
- The index page lists scenarios and runs nothing — cheap to load.
- The per-scenario page runs one scenario and shows per-implementation timings plus the speedup summary.
- Add ?iter=N to either URL to override the iteration count (clamped to 10..1000, default 10). On the index page this propagates to the scenario links.
- Quick-pick iter=10/100/500/1000 links appear next to each scenario and on the result page so you can re-run without typing.
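The clamping rule for ?iter=N can be pictured with a few lines of plain PHP. This is a minimal sketch, not the module's code: the function name clamp_iterations() is hypothetical, and the fallback to the default for missing or non-numeric input is an assumption.

```php
<?php

/**
 * Hypothetical helper: normalise a raw ?iter= value to the documented range.
 */
function clamp_iterations(mixed $raw, int $min = 10, int $max = 1000, int $default = 10): int {
  // Assumption: missing or non-numeric input falls back to the default.
  if ($raw === NULL || !is_numeric($raw)) {
    return $default;
  }
  // Pull out-of-range values back to the nearest bound.
  return max($min, min($max, (int) $raw));
}
```

Under these assumptions, ?iter=5 runs 10 iterations, ?iter=500 runs 500, and ?iter=5000 runs 1000.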
Adding a scenario
A scenario implements \Drupal\code_benchmarker\Benchmark\BenchmarkScenarioInterface and is registered as a service tagged code_benchmarker.scenario — no plugin manager, no config, no UI registration.
- Write the scenario class under your_module/src/Benchmark/. Implement id(), label(), description(), prepare() (throw BenchmarkSkipException to skip), reset(), and implementations() (label → callable, baseline first).
- Register it as a tagged service: tags: [{ name: code_benchmarker.scenario }].
- Rebuild caches: ddev drush cr, then reload the dashboard.
See the bundled example and the README for a full, copy-pasteable scenario class.
How the runner works
For each scenario, prepare() runs once (or throws BenchmarkSkipException to skip). Then for each implementation, in declaration order:
- Warm-up: reset() then one untimed call, priming OPcache, class loading, and the lazy DI graph.
- Pre-iter-1 reset: reset() runs once more so iteration 1 starts cold (the real cache-miss path).
- Timed loop (N iterations, no reset between them): each call is timed with hrtime(TRUE), the monotonic nanosecond clock.
No reset runs between iterations 2..N on purpose: if an optimized implementation adds a caching layer, those later iterations hit the cache and show the cumulative speedup, while a cacheless implementation pays full cost every time. The avg/median/total gap is the point.
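The sequence above can be sketched in plain PHP. This is an illustration of the described order of operations, not the module's runner: run_implementation() and the toy memoised callable are hypothetical, and only hrtime() is a real PHP function.

```php
<?php

/**
 * Hypothetical stand-in for the runner loop; not the module's actual code.
 *
 * @param callable $reset Clears whatever state the implementation touches.
 * @param callable $impl The implementation under test.
 * @param int $iterations Number of timed calls (N).
 */
function run_implementation(callable $reset, callable $impl, int $iterations): array {
  // Warm-up: reset, then one untimed call.
  $reset();
  $impl();

  // Pre-iteration-1 reset so the first timed call starts cold.
  $reset();

  // Timed loop: no reset between iterations.
  $samples = [];
  $last = NULL;
  for ($i = 0; $i < $iterations; $i++) {
    $start = hrtime(TRUE);
    $last = $impl();
    $samples[] = hrtime(TRUE) - $start;
  }
  return ['samples_ns' => $samples, 'last_result' => $last];
}

// Toy memoised implementation: only a cold call does the real work.
$memo = NULL;
$misses = 0;
$reset = function () use (&$memo): void {
  $memo = NULL;
};
$cached = function () use (&$memo, &$misses): int {
  if ($memo === NULL) {
    $misses++;
    $memo = array_sum(range(1, 50000));
  }
  return $memo;
};

$run = run_implementation($reset, $cached, 10);
```

In this toy run the memo is filled twice, once during warm-up and once in timed iteration 1. Iterations 2 to 10 reuse it, which is the cumulative effect the runner is designed to expose.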
What goes in reset()
Iteration 1 must look like a real cold-cache request, so reset everything the work touches:
- Every entity storage the implementation reads or writes (->resetCache()).
- Drupal statics the implementation populates, reset by named key (avoid the bare, expensive drupal_static_reset()).
- Specific cache-backend entries it reads/writes (delete the exact cids; avoid deleteAll()).
- Internal memos on injected services.
Reading the results
- Avg vs median: a large gap means noisy samples; re-run with ?iter=500 for tighter numbers.
- Max vs min on a caching impl: the max is usually iteration 1 (cold); min/median come from warm iterations. A large ratio is expected when caching is in play.
- Std-dev: small relative to mean on a cacheless impl; looks high on a caching impl because iteration 1 is much slower — that is the cache filling, not noise.
- Last result: should match across implementations. Mismatches are flagged — investigate before trusting the speedup.
- Speedup line: shown only when a scenario declares exactly two implementations.
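To see why a caching implementation produces a high std-dev and a wide avg/median gap, it helps to compute the statistics by hand on a small sample set. The sketch below is illustrative only: summarize() is a hypothetical helper, the samples are made up, and the use of the population standard deviation (dividing by N) is an assumption about how the dashboard computes it.

```php
<?php

/**
 * Hypothetical helper: summarise timing samples given in any single unit.
 */
function summarize(array $samples): array {
  $n = count($samples);
  if ($n === 0) {
    throw new InvalidArgumentException('At least one sample is required.');
  }
  $sorted = $samples;
  sort($sorted);
  $avg = array_sum($samples) / $n;
  $mid = intdiv($n, 2);
  // Median: middle value, or the mean of the two middle values.
  $median = $n % 2 === 1 ? $sorted[$mid] : ($sorted[$mid - 1] + $sorted[$mid]) / 2;
  // Assumption: population standard deviation (divide by N).
  $variance = array_sum(array_map(fn($x) => ($x - $avg) ** 2, $samples)) / $n;
  return [
    'avg' => $avg,
    'median' => $median,
    'min' => $sorted[0],
    'max' => $sorted[$n - 1],
    'stddev' => sqrt($variance),
  ];
}

// Made-up samples in milliseconds: one cold iteration, nine warm ones.
$caching = summarize([100, 10, 10, 10, 10, 10, 10, 10, 10, 10]);
```

For these samples the average is 19, the median is 10, and the standard deviation is 27. The single cold iteration accounts for the whole spread, so the median is the better guide to warm-path cost.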
When not to use this dashboard
- Production-class microbenchmarks — this is a request-time profiler; use PHPBench for sub-millisecond timings.
- True cold-backend simulation — the runner cannot touch PHP OPcache, the MySQL query cache, or the InnoDB buffer pool.
- Per-iteration cold-cache comparison — by design iterations 2..N share state with iteration 1.
- Implementations that mutate data: each iteration commits, so wrap the work in a transaction and roll back, or clean up in reset().
Troubleshooting
- "No benchmark scenarios are registered": the service is not tagged code_benchmarker.scenario, or you forgot ddev drush cr.
- "Same return value? NO": the two implementations diverge (a real catch) or reset() is incomplete and the second impl sees state left by the first.
- Maximum execution time / memory exceeded: the controller already lifts PHP limits, but a fronting proxy may not. Run one scenario at a time, lower ?iter=N, or raise the proxy timeout.
Examples
UserRoleFilterExampleBenchmark (src/Benchmark/Examples/) counts the user accounts holding a role two ways: the legacy path loads every user with loadMultiple() and filters in PHP; the optimized path pushes the roles condition into the entity query and asks the database for a COUNT. It depends only on the entity type manager, so it runs on any site.