ai_provider_universal

Universal AI Provider is a multi-instance, protocol-agnostic provider for the AI module. Instead of hardcoding one endpoint per provider module, it models your whole AI infrastructure as Drupal configuration: every server is a config entity, every model is a config entity — exportable, deployable, and overridable like any other Drupal config.

Run a local llama.cpp box, an Ollama instance, a vLLM moderation server, and a remote Fireworks or OpenAI account side by side, under a single provider.

AI Provider: Universal connects the AI module to any number of AI servers at once. Instead of one provider per vendor, it is built from three parts:

- Servers: config entities, each with its own host, port, API key (Key entity), timeout, and model filter.
- Models: config entities discovered automatically from each server, exportable and deployable. Operation types (chat, embeddings, moderation, rerank, speech-to-text, text-to-image) are detected dynamically and can be overridden per model in the UI.
- Backends: pluggable protocol logic. Ships with openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, OpenAI…) and fireworks (prefilled pricing and context lengths). Other modules can add native backends by implementing ServerBackendInterface.

Features

- Multiple server instances: each with its own host, API key, timeout, and model filter. No plugin derivatives, no provider-dropdown pollution.
- Automatic model discovery: models are discovered from each server and persisted as config entities. Re-discovery never overwrites your manual edits.
- Dynamic operation-type detection: chat, embeddings, moderation, rerank, speech-to-text, and text-to-image are detected per model (llama.cpp server flags, HuggingFace pipeline tags, name heuristics) and can be overridden per model in the UI.
- Routing metadata per model: cost per million tokens (input/output), quality tier (1–5), and context length, auto-detected where the server exposes them (llama.cpp --ctx-size, vLLM max_model_len).
- Backend plugin architecture: protocol-specific logic lives in ServerBackend plugins. The module ships openai_compatible (llama.cpp, Ollama, vLLM, LM Studio, LiteLLM, Fireworks, OpenAI…) and fireworks (default endpoint, published pricing and context lengths prefilled at discovery). Other modules can contribute native backends by implementing a single interface.
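
To make the operation-type detection from the feature list more concrete, here is a minimal Python sketch of one possible precedence: a manual override wins, then a pipeline tag reported for the model, then a name heuristic, with chat as the fallback. The module itself is PHP; the function name, the tag mapping, the name patterns, and the precedence order are all illustrative assumptions rather than the module's actual rules.

```python
import re

# Assumed mapping from Hugging Face pipeline tags to operation types.
PIPELINE_TAG_TO_OPERATION = {
    "text-generation": "chat",
    "feature-extraction": "embeddings",
    "sentence-similarity": "embeddings",
    "automatic-speech-recognition": "speech_to_text",
    "text-to-image": "text_to_image",
}

# Hypothetical name patterns, checked in order; the first match wins.
# "rerank" comes before "embed" so that reranker models built on an
# embedding family are not misclassified.
NAME_PATTERNS = [
    (re.compile(r"rerank", re.IGNORECASE), "rerank"),
    (re.compile(r"embed", re.IGNORECASE), "embeddings"),
    (re.compile(r"llama-?guard|shieldgemma", re.IGNORECASE), "moderation"),
    (re.compile(r"whisper", re.IGNORECASE), "speech_to_text"),
]


def detect_operation_type(model_id, pipeline_tag=None, override=None):
    """Return an operation type for a model id (illustrative only)."""
    if override:
        # A manual per-model override always takes precedence.
        return override
    if pipeline_tag in PIPELINE_TAG_TO_OPERATION:
        return PIPELINE_TAG_TO_OPERATION[pipeline_tag]
    for pattern, operation in NAME_PATTERNS:
        if pattern.search(model_id):
            return operation
    # Nothing matched: assume a chat model.
    return "chat"
```

Keeping the override as the first check is what lets a per-model edit in the UI survive any later change to the heuristics.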

Supported Operations

- Chat completions (/v1/chat/completions)
- Embeddings (/v1/embeddings)
- Speech to Text (Whisper and similar)
- Rerank
- Moderation (native LlamaGuard3 and ShieldGemma support)
- Text to Image (/v1/images/generations)

Included submodules

- Smart Router: cost-aware routing as virtual models. Define routes with candidate models and quality thresholds for simple vs. complex prompts; the router picks the cheapest capable model per request (local first, remote only when needed) and logs every decision to a savings dashboard showing estimated spend and savings.
- Fact Check: verifies answers claim by claim using a configurable checker model, optionally grounded in your own content through an AI Search vector index (RAG). Routes can require verification and automatically escalate to a stronger model when an answer fails the support-score threshold.

Together they implement a verify-and-escalate cascade: generate cheaply, check the answer against your content, and only pay for a frontier model when the cheap answer doesn't hold up.
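
The cascade can be pictured with a short Python sketch. It is not the submodules' code (those are PHP): the Candidate fields, the function names, and the rule "on failure, move to the cheapest model of a strictly higher quality tier" are assumptions chosen to illustrate the idea. The generate and support_score callables stand in for a real model call and a real fact check.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    model_id: str
    cost_per_million: float  # assumed blended cost per million tokens
    quality_tier: int        # 1 (weakest) to 5 (strongest)


def capable_candidates(candidates, min_tier):
    """Candidates at or above min_tier, cheapest first."""
    return sorted(
        (c for c in candidates if c.quality_tier >= min_tier),
        key=lambda c: c.cost_per_million,
    )


def verify_and_escalate(prompt, candidates, min_tier,
                        generate, support_score, threshold):
    """Generate cheaply, verify, and escalate only on a failed check.

    Returns (model_id, answer, score) for the first answer whose score
    reaches the threshold. If no model passes, the last attempt is
    returned; if no candidate is capable at all, None is returned.
    """
    tier = min_tier
    result = None
    while True:
        pool = capable_candidates(candidates, tier)
        if not pool:
            return result
        chosen = pool[0]  # cheapest model that meets the current tier
        answer = generate(chosen.model_id, prompt)
        score = support_score(answer)
        result = (chosen.model_id, answer, score)
        if score >= threshold:
            return result
        # Failed verification: require a strictly stronger model next.
        tier = chosen.quality_tier + 1
```

Because the required tier rises after every failed check, the loop always terminates, and a frontier model is only called once every cheaper capable model has failed verification.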

Requirements

- Drupal 10.2+, 11, or 12
- AI ^1.2 and Key
- At least one OpenAI-compatible endpoint (local or remote)

Getting started

composer require drupal/ai_provider_universal:1.0.x-dev
drush pm:enable ai_provider_universal

# Optional submodules:
drush pm:enable ai_provider_universal_router ai_provider_universal_factcheck

Recommended AI module patches

Two small bugs in the AI module affect this provider. Fixes ship in this module's patches/ directory and are declared in its composer.json:

- ai-support-optgrouped-model-options.patch — the AI settings form rejects models presented in optgroups (this provider groups models by server).
- ai-search-embeddings-engine-explode-limit.patch — ai_search breaks model ids containing double underscores (used here for server__model ids).

Composer does not apply patches from dependencies by default. To apply them:

composer require cweagans/composer-patches
composer config extra.enable-patching true
composer update drupal/ai

Without the patches the provider works, but model selects in the AI settings form may not validate, and ai_search cannot use this provider's embedding models. Upstream issues are being filed against the AI module.
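
The name of the second patch suggests the problem is a split without a limit. The Python sketch below illustrates that class of bug under a loudly stated assumption: that a combined identifier of the form provider__model is split on the double underscore. The identifier layout, the provider name "universal", and both function names are hypothetical; only the server__model id format comes from this page.

```python
def split_unbounded(value):
    """Split on every '__' and keep the first two parts (lossy)."""
    parts = value.split("__")
    return parts[0], parts[1]


def split_limited(value):
    """Split on the first '__' only, keeping the rest of the id intact."""
    provider, model = value.split("__", 1)
    return provider, model


# Hypothetical combined id: provider "universal", model id "local__qwen2.5".
combined = "universal__local__qwen2.5"
print(split_unbounded(combined))
print(split_limited(combined))
```

With the unbounded split the model part is cut off at the server name, while the limited split preserves the full server__model id.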

Setup

1. Enable the module.
2. Go to Configuration → AI → AI Servers (/admin/config/ai/providers/universal) and add a server (host, port, optional API key).
3. Saving the server runs model discovery; review the detected models and adjust per-model operation types if needed.
4. Select the provider and models for each operation type in the AI module settings.

Discovery can be re-run at any time with drush aip:discover-models [server_id] (alias aipdm) or by re-saving the server.
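
What a discovery pass does can be sketched in Python, assuming the server answers GET /v1/models in the usual OpenAI-compatible shape (a "data" list of objects with an "id"). The function name, the record fields, and the default operation type are hypothetical, and the sketch skips details such as sanitizing model ids into valid config entity ids. It only shows how new models can be added while existing records, including manual edits, are left alone.

```python
import json


def merge_discovered_models(server_id, models_response, existing):
    """Add newly discovered models without touching existing records."""
    merged = dict(existing)
    for entry in models_response.get("data", []):
        entity_id = f"{server_id}__{entry['id']}"
        if entity_id in merged:
            # Already known: keep the stored record, edits included.
            continue
        merged[entity_id] = {
            "server": server_id,
            "model": entry["id"],
            "operation_type": "chat",  # assumed default before detection
        }
    return merged


# Sample payload in the OpenAI-compatible /v1/models shape.
response = json.loads(
    '{"object": "list", "data": ['
    '{"id": "qwen2.5-7b-instruct", "object": "model"},'
    '{"id": "nomic-embed-text", "object": "model"}]}'
)

# One model was discovered earlier and manually set to embeddings.
existing = {
    "local__nomic-embed-text": {
        "server": "local",
        "model": "nomic-embed-text",
        "operation_type": "embeddings",
    }
}

merged = merge_discovered_models("local", response, existing)
```

After the merge, the manually edited embeddings record is unchanged and the chat model appears as a new record under the id local__qwen2.5-7b-instruct.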

Relation to AI Provider: Llama.cpp

This module is the successor to AI Provider: Llama.cpp 2.x, which is no longer maintained. There is no migration path: configuration is not converted automatically. Both modules can be installed side by side — re-create your servers and model selections here, verify everything works, then uninstall the old provider.