Drupal is a registered trademark of Dries Buytaert
drupal 11.4.0 Update released for Drupal core (11.4.0)! drupal 10.6.12 Update released for Drupal core (10.6.12)! drupal 11.3.13 Update released for Drupal core (11.3.13)! drupal 10.6.11 Update released for Drupal core (10.6.11)! drupal 11.3.12 Update released for Drupal core (11.3.12)! drupal 11.2.14 Update released for Drupal core (11.2.14)! drupal 10.5.12 Update released for Drupal core (10.5.12)! cms 2.1.3 Update released for Drupal core (2.1.3)! drupal 10.5.11 Update released for Drupal core (10.5.11)! drupal 11.3.11 Update released for Drupal core (11.3.11)! drupal 11.2.13 Update released for Drupal core (11.2.13)! drupal 10.6.10 Update released for Drupal core (10.6.10)! cms 2.1.2 Update released for Drupal core (2.1.2)! drupal 11.1.10 Update released for Drupal core (11.1.10)! drupal 10.5.10 Update released for Drupal core (10.5.10)! drupal 10.4.10 Update released for Drupal core (10.4.10)! drupal 11.2.12 Update released for Drupal core (11.2.12)! drupal 11.3.10 Update released for Drupal core (11.3.10)! drupal 10.6.9 Update released for Drupal core (10.6.9)! drupal 10.6.8 Update released for Drupal core (10.6.8)!

Universal AI Provider is a multi-instance, protocol-agnostic provider for the AI module. Instead of hardcoding one endpoint per provider module, it models your whole AI infrastructure as Drupal configuration: every server is a config entity, every model is a config entity — exportable, deployable, and overridable like any other Drupal config.

Run a local llama.cpp box, an Ollama instance, a vLLM moderation server, and a remote Fireworks or OpenAI account side by side, under a single provider.

AI Provider: Universal connects the AI module to any number of AI servers at once. Instead of one provider per vendor, it models your AI infrastructure as config entities:

-

Servers

— each with its own host, port, API key (Key entity), timeout and model filter. A local llama.cpp box, an Ollama instance, a vLLM moderation server and a remote Fireworks/OpenAI account can coexist under one provider.
-

Models

— discovered automatically from each server, exportable and deployable. Operation types (chat, embeddings, moderation, rerank, speech-to-text, text-to-image) are detected dynamically and can be overridden per model in the UI.
-

Backends

— protocol logic is pluggable. Ships with openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, OpenAI…) and fireworks (prefilled pricing and context lengths). Other modules can add native backends by implementing ServerBackendInterface.

Features

- Multiple server instances

— each with its own host, API key, timeout, and model filter. No plugin derivatives, no provider-dropdown pollution.

- Automatic model discovery

— models are discovered from each server and persisted as config entities. Re-discovery never overwrites your manual edits.

- Dynamic operation-type detection

— chat, embeddings, moderation, rerank, speech-to-text, and text-to-image are detected per model (llama.cpp server flags, HuggingFace pipeline tags, name heuristics) and can be overridden per model in the UI.

- Routing metadata per model

— cost per million tokens (input/output), quality tier (1–5), and context length, auto-detected where the server exposes them (llama.cpp --ctx-size, vLLM max_model_len).

- Backend plugin architecture

— protocol-specific logic lives in ServerBackend plugins. The module ships openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, Fireworks, OpenAI…) and fireworks (default endpoint, published pricing and context lengths prefilled at discovery). Other modules can contribute native backends by implementing a single interface.

Supported Operations

  • Chat completions (/v1/chat/completions)
  • Embeddings (/v1/embeddings)
  • Speech to Text (Whisper and similar)
  • Rerank
  • Moderation (native LlamaGuard3 and ShieldGemma support)
  • Text to Image (/v1/images/generations)

Included submodules

- Smart Router

— cost-aware routing as virtual models. Define routes with candidate models and quality thresholds for simple vs. complex prompts; the router picks the cheapest capable model per request — local first, remote only when needed — and logs every decision to a savings dashboard showing estimated spend and savings.

- Fact Check

— verifies answers claim by claim using a configurable checker model, optionally grounded in your own content through an AI Search vector index (RAG). Routes can require verification and automatically escalate to a stronger model when an answer fails the support-score threshold.

Together they implement a verify-and-escalate cascade: generate cheaply, check the answer against your content, and only pay for a frontier model when the cheap answer doesn't hold up.

Requirements

- Drupal 10.2+, 11, or 12
- AI ^1.2 and Key
- At least one OpenAI-compatible endpoint (local or remote)

Getting started

composer require drupal/ai_provider_universal:1.0.x-dev
drush pm:enable ai_provider_universal

# optional submodules:

drush pm:enable ai_provider_universal_router ai_provider_universal_factcheck

Recommended AI module patches

Two small bugs in the AI module affect this provider. Fixes ship in this module's patches/ directory and are declared in its composer.json:

- ai-support-optgrouped-model-options.patch — the AI settings form rejects models presented in optgroups (this provider groups models by server).
- ai-search-embeddings-engine-explode-limit.patch — ai_search breaks model ids containing double underscores (used here for server__model ids).

Composer does not apply patches from dependencies by default. To apply them:

composer require cweagans/composer-patches
composer config extra.enable-patching true
composer update drupal/ai

Without the patches the provider works, but model selects in the AI settings form may not validate, and ai_search cannot use this provider's embedding models. Upstream issues are being filed against the AI module.

Setup

1. Enable the module.
2. Go to Configuration → AI → AI Servers (/admin/config/ai/providers/universal) and add a server (host, port, optional API key).
3. Saving the server runs model discovery; review the detected models and adjust per-model operation types if needed.
4. Select provider/models per operation type in the AI module settings.

Discovery can be re-run any time with drush aip:discover-models [server_id] (alias aipdm) or by re-saving the server.

Relation to AI Provider: Llama.cpp

This module is the successor to AI Provider: Llama.cpp 2.x, which is no longer maintained. There is no migration path: configuration is not converted automatically. Both modules can be installed side by side — re-create your servers and model selections here, verify everything works, then uninstall the old provider.

Activity

Total releases
1
First release
Jul 2026
Latest release
1 day ago
Release cadence
Stability
0% stable

Releases

Version Type Release date
1.0.x-dev Dev Jul 2, 2026