ai_provider_universal
Universal AI Provider is a multi-instance, protocol-agnostic provider for the AI module. Instead of hardcoding one endpoint per provider module, it models your whole AI infrastructure as Drupal configuration: every server is a config entity, every model is a config entity — exportable, deployable, and overridable like any other Drupal config.
Run a local llama.cpp box, an Ollama instance, a vLLM moderation server, and a remote Fireworks or OpenAI account side by side, under a single provider.
AI Provider: Universal connects the AI module to any number of AI servers at once. Instead of one provider per vendor, it models your AI infrastructure as config entities:
-
Servers
— each with its own host, port, API key (Key entity), timeout and model filter. A local llama.cpp box, an Ollama instance, a vLLM moderation server and a remote Fireworks/OpenAI account can coexist under one provider.
-
Models
— discovered automatically from each server, exportable and deployable. Operation types (chat, embeddings, moderation, rerank, speech-to-text, text-to-image) are detected dynamically and can be overridden per model in the UI.
-
Backends
— protocol logic is pluggable. Ships with openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, OpenAI…) and fireworks (prefilled pricing and context lengths). Other modules can add native backends by implementing ServerBackendInterface.
Features
- Multiple server instances
— each with its own host, API key, timeout, and model filter. No plugin derivatives, no provider-dropdown pollution.
- Automatic model discovery
— models are discovered from each server and persisted as config entities. Re-discovery never overwrites your manual edits.
- Dynamic operation-type detection
— chat, embeddings, moderation, rerank, speech-to-text, and text-to-image are detected per model (llama.cpp server flags, HuggingFace pipeline tags, name heuristics) and can be overridden per model in the UI.
- Routing metadata per model
— cost per million tokens (input/output), quality tier (1–5), and context length, auto-detected where the server exposes them (llama.cpp --ctx-size, vLLM max_model_len).
- Backend plugin architecture
— protocol-specific logic lives in ServerBackend plugins. The module ships openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, Fireworks, OpenAI…) and fireworks (default endpoint, published pricing and context lengths prefilled at discovery). Other modules can contribute native backends by implementing a single interface.
Supported Operations
- Chat completions (
/v1/chat/completions) - Embeddings (
/v1/embeddings) - Speech to Text (Whisper and similar)
- Rerank
- Moderation (native LlamaGuard3 and ShieldGemma support)
- Text to Image (
/v1/images/generations)
Included submodules
- Smart Router
— cost-aware routing as virtual models. Define routes with candidate models and quality thresholds for simple vs. complex prompts; the router picks the cheapest capable model per request — local first, remote only when needed — and logs every decision to a savings dashboard showing estimated spend and savings.
- Fact Check
— verifies answers claim by claim using a configurable checker model, optionally grounded in your own content through an AI Search vector index (RAG). Routes can require verification and automatically escalate to a stronger model when an answer fails the support-score threshold.
Together they implement a verify-and-escalate cascade: generate cheaply, check the answer against your content, and only pay for a frontier model when the cheap answer doesn't hold up.
Requirements
- Drupal 10.2+, 11, or 12
- AI ^1.2 and Key
- At least one OpenAI-compatible endpoint (local or remote)
Getting started
composer require drupal/ai_provider_universal:1.0.x-dev drush pm:enable ai_provider_universal# optional submodules:
drush pm:enable ai_provider_universal_router ai_provider_universal_factcheck
Recommended AI module patches
Two small bugs in the AI module affect this provider. Fixes ship in this module's patches/ directory and are declared in its composer.json:
- ai-support-optgrouped-model-options.patch — the AI settings form rejects models presented in optgroups (this provider groups models by server).
- ai-search-embeddings-engine-explode-limit.patch — ai_search breaks model ids containing double underscores (used here for server__model ids).
Composer does not apply patches from dependencies by default. To apply them:
composer require cweagans/composer-patches composer config extra.enable-patching true composer update drupal/ai
Without the patches the provider works, but model selects in the AI settings form may not validate, and ai_search cannot use this provider's embedding models. Upstream issues are being filed against the AI module.
Setup
1. Enable the module.
2. Go to Configuration → AI → AI Servers (/admin/config/ai/providers/universal) and add a server (host, port, optional API key).
3. Saving the server runs model discovery; review the detected models and adjust per-model operation types if needed.
4. Select provider/models per operation type in the AI module settings.
Discovery can be re-run any time with drush aip:discover-models [server_id] (alias aipdm) or by re-saving the server.
Relation to AI Provider: Llama.cpp
This module is the successor to AI Provider: Llama.cpp 2.x, which is no longer maintained. There is no migration path: configuration is not converted automatically. Both modules can be installed side by side — re-create your servers and model selections here, verify everything works, then uninstall the old provider.