Drupal is a registered trademark of Dries Buytaert
cms 2.1.2 Update released for Drupal core (2.1.2)! drupal 11.1.10 Update released for Drupal core (11.1.10)! drupal 10.5.10 Update released for Drupal core (10.5.10)! drupal 10.4.10 Update released for Drupal core (10.4.10)! drupal 11.2.12 Update released for Drupal core (11.2.12)! drupal 11.3.10 Update released for Drupal core (11.3.10)! drupal 10.6.9 Update released for Drupal core (10.6.9)! drupal 10.6.8 Update released for Drupal core (10.6.8)! drupal 11.3.9 Update released for Drupal core (11.3.9)! drupal 11.3.8 Update released for Drupal core (11.3.8)! drupal 11.3.7 Update released for Drupal core (11.3.7)! drupal 11.2.11 Update released for Drupal core (11.2.11)! drupal 10.6.7 Update released for Drupal core (10.6.7)! drupal 10.5.9 Update released for Drupal core (10.5.9)! cms 2.1.1 Update released for Drupal core (2.1.1)! drupal 11.3.6 Update released for Drupal core (11.3.6)! drupal 10.6.6 Update released for Drupal core (10.6.6)! cms 2.1.0 Update released for Drupal core (2.1.0)! linkit 7.0.15 Minor update available for module linkit (7.0.15). views_data_export 8.x-1.10 Minor update available for module views_data_export (8.x-1.10).

drupal_rag

No security coverage
View on drupal.org

Drupal RAG

Transforms your Drupal site into a Retrieval-Augmented Generation (RAG) system. Content entities are indexed as vector embeddings and retrieved at query time to provide relevant context to large language models - all without sending your data to third-party services.

Requirements

  • Drupal 11
  • PostgreSQL with the pgvector extension
  • Ollama running locally or on your network

How it works

Indexing pipeline

When content is created, updated, or deleted, the module intercepts the entity hook events and pushes them into a processing queue:

  1. Entity event - entity_insert, entity_update, or entity_delete fires (node, media, file, etc.)
  2. Filter - only entity types selected in the configuration form are queued; unpublished entities are skipped
  3. Extract - the queue worker extracts plain text from the entity. File entities (PDF, DOCX, TXT, etc.) are parsed by the FileTextExtractor. All other entities are rendered using the configured view mode and stripped of HTML.
  4. Chunk - text is split into overlapping chunks (configurable size and overlap). The chunker respects sentence boundaries to keep logical units intact.
  5. Embed - each chunk is sanitized, prepended with a search_document: prefix, and sent to Ollama's /api/embed endpoint using the configured model (e.g. nomic-embed-text) to generate a vector embedding.
  6. Store - chunks and their embeddings are stored in a pgvector-enabled PostgreSQL table with a native vector column and HNSW index for fast similarity search. Old embeddings for the same entity are deleted before insertion (upsert pattern).

Query pipeline

Three API endpoints are available:

POST /api/rag/query - accepts a query string and returns the most semantically similar chunks with similarity scores.

  1. The query text is sanitized, prepended with a search_query: prefix, and embedded using the same Ollama model
  2. A cosine similarity search is executed against the vector store using pgvector's <=> operator
  3. Results are filtered by min_score and limited by the limit parameter
  4. Each result includes entity_type, entity_id, entity_label, bundle, chunk text, embedding model, similarity score, and language code

POST /api/rag/prompt - returns an assembled prompt ready to be sent to any LLM.

  1. Retrieves relevant chunks (same as /api/rag/query)
  2. Formats each chunk as [Document: entity_type:entity_id `label`] followed by the text
  3. Loads the configurable prompt template from the settings form
  4. Replaces the {{context}} and {{query}} placeholders
  5. Returns the assembled prompt string plus source metadata

POST /api/rag/augment - assembles the prompt, sends it to Ollama for generation, and returns the LLM response along with source metadata and the assembled prompt.

  1. Builds the prompt (same as /api/rag/prompt)
  2. Sends the prompt to Ollama's /api/chat endpoint using the configured chat model (or embedding model as fallback)
  3. Returns the generated response plus source metadata and the full prompt

Configuration

The admin can configure everything via the settings form at /admin/config/search/drupal-rag:

  • Enabled entity types - select which content types should be indexed
  • Chunk size - maximum characters per chunk (100–10000)
  • Chunk overlap - characters shared between consecutive chunks (0–5000)
  • Ollama base URL - the Ollama server address
  • Embedding model - the model used for generating embeddings (fetched live from Ollama)
  • Chat model - the model used for response generation via /api/rag/augment (optional, defaults to embedding model)
  • View mode - how entities are rendered for text extraction
  • RAG prompt template - custom template for the augmented prompt with {{context}} and {{query}} placeholders

Status page

A read-only status page at /admin/reports/drupal-rag shows indexed entities grouped by type with chunk counts.

Drush commands

  • drupal-rag:queue-all (alias rag:qa) - queue all published entities of enabled types for indexing

File text extraction

Supported file formats for text extraction from file/media entities:

  • Plain text: txt, csv, json, xml, md, markdown, log, yml, yaml
  • Office documents: docx, xlsx, pptx, odt, ods, odp
  • PDF: pdf (via prinsfrank/pdf-parser)

Database

The module requires a PostgreSQL connection named 'pgvector' in settings.php. It uses two tables:

  • drupal_rag_embeddings - stores entity metadata, chunk text, and native pgvector embeddings with an HNSW index for cosine similarity search
  • drupal_rag_queue - dedicated queue table for entity processing, isolated from the default Drupal queue

Deduplication

  • Entity types are filtered by the enabled_entity_types configuration before queuing
  • Only published entities are indexed (entities implementing EntityPublishedInterface)
  • The entity_presave hook is disabled to prevent double queuing on updates
  • storeEmbedding() deletes existing chunks for an entity before inserting new ones, so reprocessing is idempotent

Permissions

  • access rag query - allows access to the API endpoints
  • administer drupal rag - allows configuration of the module (restricted)

Services

The module is fully object-oriented with registered Drupal services:

  • OllamaClient - HTTP client for the Ollama API (embed, chat, model listing)
  • FileTextExtractor - parses files into plain text (TXT, DOCX, XLSX, PPTX, ODF, PDF, and more)
  • EntityExtractor - converts Drupal entities into text (file entities via FileTextExtractor, others via view mode)
  • Chunker - splits text into overlapping chunks with sentence boundary awareness
  • EmbeddingService - sanitizes text and generates embeddings via Ollama
  • VectorStorage - pgvector database operations (table management, store, similarity search)
  • RagQueryService - retrieval logic: query → embed → similarity search → results
  • AugmentService - prompt assembly with configurable template + Ollama chat generation
  • EntityHooks - event-to-queue bridge with entity type and published status filtering

Activity

Total releases
5
First release
May 2026
Latest release
1 day ago
Release cadence
0 days
Stability
0% stable

Release Timeline

Releases

Version Type Release date
1.0.0-alpha5 Pre-release May 21, 2026
1.0.0-alpha4 Pre-release May 21, 2026
1.0.0-alpha3 Pre-release May 21, 2026
1.0.0-alpha2 Pre-release May 21, 2026
1.0.0-alpha1 Pre-release May 21, 2026