ai_rag_service
The AI RAG Service turns the Drupal AI module into a headless, authenticated, rate-limited HTTP API for Retrieval-Augmented Generation (RAG). What JSON:API is to content, this module is to AI: it exposes the AI engine as JSON endpoints for grounded Q&A, multi-turn chat and semantic search, with citations, streaming and cost controls built in.
Features
- Three ready-made endpoints: grounded Q&A, multi-turn chat, and LLM-free semantic search, plus a status/diagnostics endpoint.
- Real token-by-token streaming over Server-Sent Events, with citations and token usage on the final event.
- Grounded answers with source documents, URLs, similarity scores and snippets on every response.
- Swappable RAG backend: run in-process via the AI module or proxy to an external service. Backends are plugins.
- No-code endpoint builder in the admin UI, plus attribute-plugin resources for code-defined endpoints, each with auto-generated per-HTTP-method permissions.
- Security first: Basic Auth/cookie auth, per-method permissions, CORS allow-list, prompt-injection hardening, and a relevance-score floor.
- Cost control: per-consumer rate limiting and token budgets with reliable metering, plus answer caching.
- Per-consumer isolation, opt-in audit logging, and an optional submodule for throttled failure/flagged-question email alerts.
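As an illustration of the streaming feature above, here is a minimal client-side sketch of reading a Server-Sent Events stream and separating the streamed tokens from the final event. The SSE line format (`event:`, `data:`, blank-line terminator) is standard, but the event names `token` and `done` and the payload keys `text`, `citations` and `usage` are assumptions made for this example, not the module's documented schema; check the README for the real field names.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_sse_events(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from raw Server-Sent Events lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; skip events with no data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # SSE comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # strip the single optional leading space
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


def collect_answer(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Join streamed tokens and return them with the final event payload.

    ASSUMPTION: 'token' and 'done' event names and the 'text' key are
    hypothetical placeholders for whatever the service actually emits.
    """
    parts, final = [], None
    for event, data in iter_sse_events(lines):
        payload = json.loads(data)
        if event == "token":
            parts.append(payload["text"])
        elif event == "done":
            final = payload  # expected to carry citations and token usage
    return "".join(parts), final


# Hand-written sample stream (not captured from a real server).
sample = [
    "event: token\n", 'data: {"text": "Drupal "}\n', "\n",
    ": keep-alive\n",
    "event: token\n", 'data: {"text": "is a CMS."}\n', "\n",
    "event: done\n",
    'data: {"citations": [{"url": "https://example.com/node/1", "score": 0.82}],'
    ' "usage": {"total_tokens": 42}}\n',
    "\n",
]
answer, final = collect_answer(sample)
```

In a real client the `lines` iterable would come from a streaming HTTP response rather than a list; the parsing logic stays the same.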
Post-Installation
Configure at least one chat-capable AI provider via the AI module, then visit Configuration → AI → RAG Service to choose your provider/model, retrieval index and limits. Grant the "use ai rag service" permission to roles that may call the API. See the README below for full setup, including optional semantic (vector) retrieval via Search API + a vector database.
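Once a role has the permission, a consumer could call the API with Basic Auth roughly as follows. This is a sketch only: the path `/api/rag/ask`, the `question` body key, and the account name are hypothetical placeholders, since the actual routes and request schema are defined in the README rather than here.

```python
import base64
import json
import urllib.request


def build_ask_request(base_url: str, username: str, password: str,
                      question: str) -> urllib.request.Request:
    """Build (but do not send) a Basic Auth JSON POST request.

    ASSUMPTION: the '/api/rag/ask' path and the 'question' key are
    illustrative placeholders, not the module's documented API.
    """
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/rag/ask",  # hypothetical path
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


req = build_ask_request("https://example.com/", "api_user", "secret",
                        "What is our refund policy?")
# To send it: urllib.request.urlopen(req) (requires a reachable site).
```

The Drupal account used here would need a role that has been granted the "use ai rag service" permission; always send Basic Auth credentials over HTTPS.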
Additional Requirements
Requires the AI module, the Key module, core Basic Auth, and at least one configured AI chat provider. Semantic retrieval additionally requires Search API, the AI module's AI Search submodule, and a vector database provider such as Milvus.
Similar projects
Unlike provider-specific chatbot modules, this module is provider-agnostic: it depends only on the AI module, not on any specific LLM vendor. It exposes RAG as an authenticated, guardrailed HTTP API rather than a site-facing UI, which makes it closer to "JSON:API for AI" than to a chat widget.