ai_vdb_provider_elasticsearch
Integrates Elasticsearch as a native Vector Database (VDB) for Drupal AI. Enable high-performance semantic search and RAG using your existing Elastic Stack infrastructure and native kNN.
Features
Basic Functionality
Connects Drupal’s AI Search to Elasticsearch, managing `dense_vector` indices and performing high-speed approximate kNN searches using the HNSW algorithm.
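To make the mechanism concrete, here is a sketch of the kind of approximate kNN request body that ends up at Elasticsearch 8.x's `_search` API. The field names (`vector`, `content`) and the query vector are illustrative; in practice the vector comes from your configured embedding provider.

```python
import json

query_vector = [0.12, -0.53, 0.88]  # real embeddings have hundreds of dimensions

# Approximate kNN clause for the Elasticsearch 8.x _search API.
search_body = {
    "knn": {
        "field": "vector",            # the dense_vector field
        "query_vector": query_vector,
        "k": 10,                      # nearest neighbours to return
        "num_candidates": 100,        # HNSW candidates per shard (recall vs. speed)
    },
    "_source": ["content"],           # return only the stored chunk text
}

print(json.dumps(search_body, indent=2))
```

Raising `num_candidates` improves recall at the cost of latency; it must be at least `k`.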
Unique Features
Enables native hybrid search with Reciprocal Rank Fusion (RRF), which merges lexical and semantic rankings in a single query; few other Drupal VDB providers offer this. It also supports scalar quantization, compressing 32-bit float vectors to 8-bit integers to cut vector memory usage by up to 75%.
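As a sketch of what "native hybrid search" means at the API level: recent Elasticsearch 8.x releases accept an `rrf` retriever that fuses a lexical retriever and a kNN retriever server-side. The field names below are illustrative, and the small `rrf_score` function shows the fusion formula itself, assuming the standard reciprocal-rank form.

```python
# Hybrid request: a lexical match query plus a kNN retriever, fused with RRF.
hybrid_body = {
    "retriever": {
        "rrf": {
            "retrievers": [
                {"standard": {"query": {"match": {"content": "reset my password"}}}},
                {"knn": {"field": "vector", "query_vector": [0.1, 0.2, 0.3],
                         "k": 10, "num_candidates": 100}},
            ],
            "rank_constant": 60,      # dampens the influence of top ranks
            "rank_window_size": 100,  # hits each retriever contributes to fusion
        }
    }
}

def rrf_score(ranks, k=60):
    """RRF: a document's fused score is the sum of 1/(k + rank) over the
    retrievers that returned it, so docs ranked well by either list rise."""
    return sum(1.0 / (k + r) for r in ranks)
```

A document ranked 1st by both retrievers beats one ranked 1st lexically but 50th semantically, which is the behaviour that makes hybrid ranking robust.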
When and Why to Use
Ideal for organizations already invested in the Elastic Stack. It eliminates the operational overhead and cost of maintaining secondary vector databases like Milvus or Pinecone while leveraging Elasticsearch’s proven enterprise security and scalability.
Use Cases
Grounding Retrieval-Augmented Generation (RAG) for accurate AI assistants, enabling natural language discovery in documentation, and providing intelligent content recommendations based on semantic meaning.
Post-Installation
How the Elasticsearch VDB Provider works
This module does not create content types or text formats. It is a backend infrastructure plugin — it gives the Drupal AI Search module a place to store and query vector embeddings. Think of it as a database driver, not a user-facing feature.
Step-by-step configuration
1. Prerequisites (before enabling the module)
You need these in place first:
- A running Elasticsearch 8.x cluster reachable from Drupal (local DDEV: http://elasticsearch:9200, cloud: https://your-cluster.es.io:9200)
- The Key module enabled — used to store your Elasticsearch API token securely
- The AI module + AI Search module enabled
- An AI Provider configured (e.g., OpenAI) that can generate embeddings — this is separate from this module
2. Store your Elasticsearch API key securely
Go to: Admin → Configuration → System → Keys (/admin/config/system/keys)
Create a new Key entity with your Elasticsearch API token. This keeps the credential out of the config database and allows it to live in environment variables or a secrets manager.
If your Elasticsearch cluster has no authentication (local dev), you can skip this and leave the API Key field set to "None" on the config form.
3. Configure this module
Go to: Admin → Configuration → AI → VDB Providers → Elasticsearch Configuration
(/admin/config/ai/vdb_providers/elasticsearch)
Fill in four fields:
Field → What to enter
Elasticsearch Host URL → Full URL with port, e.g. http://localhost:9200 or https://my-cluster.es.io:9200
API Key → Select the Key entity you created above (or "None")
Index Prefix → A namespace prefix for all indices, e.g. drupal_ — prevents collisions if the cluster is shared
Similarity Metric → Cosine for OpenAI/commercial embeddings (recommended), L2 Norm for unnormalized vectors, Dot Product for unit-normalized vectors
Important: The similarity metric is baked into the Elasticsearch index at creation time. If you change it after indexing, you must delete and rebuild the index.
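The guidance in the Similarity Metric row follows from how the three metrics relate. A small sketch, using plain Python rather than anything from this module: for unit-normalized vectors, cosine and dot product give identical scores, which is why dot product is the cheaper choice when your embedding provider normalizes its output (OpenAI embeddings are unit-normalized, so either works there).

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def dot_product(a, b):
    return sum(x * y for x, y in zip(a, b))

def l2_distance(a, b):
    # "L2 Norm" on the config form corresponds to Euclidean distance.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a = [0.6, 0.8]  # |a| = 1
b = [0.8, 0.6]  # |b| = 1
assert abs(cosine(a, b) - dot_product(a, b)) < 1e-9  # identical when normalized

# For unnormalized vectors the two diverge: cosine ignores magnitude.
u, v = [2.0, 0.0], [4.0, 0.0]
assert abs(cosine(u, v) - 1.0) < 1e-9
assert dot_product(u, v) == 8.0
```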
4. Configure Search API to use this provider
This is where you wire it all together:
- Go to Admin → Configuration → Search and Metadata → Search API (/admin/config/search/search-api)
- Add a new Server, choose AI Search (VDB) as the backend
- In the server settings, select Elasticsearch as the VDB Provider and choose your embedding AI Provider (e.g., OpenAI)
- Add a new Index attached to that server, select the entity types to index (e.g., Content)
- Map the fields you want to embed (typically the body field)
- Index the content — this triggers chunking, embedding generation, and storage in Elasticsearch
- The module creates Elasticsearch indices automatically on first use via ensureIndex(). You do not need to create indices manually in Kibana.
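For reference, the index the module creates on first use looks roughly like the mapping below. This is a sketch, not the module's exact schema: the field names, the `1536` dimension (OpenAI's text-embedding-3-small), and the `int8_hnsw` quantization option (available on Elasticsearch 8.12+) are illustrative assumptions.

```python
index_prefix = "drupal_"  # from the module's config form
index_name = index_prefix + "my_ai_search_index"

# Illustrative creation body for PUT /<index_name>; the similarity value
# is baked in here, which is why changing it later requires a rebuild.
create_body = {
    "mappings": {
        "properties": {
            "vector": {
                "type": "dense_vector",
                "dims": 1536,                          # must match the embedding model
                "similarity": "cosine",                # from the module's config form
                "index": True,
                "index_options": {"type": "int8_hnsw"},  # scalar quantization, ES 8.12+
            },
            "content": {"type": "text"},               # chunk text, for lexical search
            "drupal_entity_id": {"type": "keyword"},   # hypothetical back-reference field
        }
    }
}
```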
5. Query
Once indexed, semantic search works through Search API Views or programmatically via the AI module's APIs. End users won't see anything different unless you've built a search interface — the VDB layer is invisible to them.
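The retrieval flow described above can be sketched end to end. Both `embed()` and `search_knn()` below are stand-ins: a real deployment calls the embedding provider and Elasticsearch's `_search` API, and in Drupal all of this happens inside the AI Search backend rather than in user code.

```python
def embed(text):
    # Stand-in embedder: a real provider returns a high-dimensional vector.
    return [float(len(w)) for w in text.split()][:3]

def search_knn(index, vector, k=3):
    # Stand-in corpus; a real call issues a knn query against Elasticsearch.
    corpus = {
        "How to reset a password": [3.0, 2.0, 5.0],
        "Configuring cron jobs": [11.0, 4.0, 4.0],
    }
    def dist(doc):
        return sum((x - y) ** 2 for x, y in zip(vector, corpus[doc]))
    return sorted(corpus, key=dist)[:k]

question = "reset my password please"
hits = search_knn("drupal_my_ai_search_index", embed(question))
# In a RAG setup, these hits become grounding context for the LLM prompt.
```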
Additional Requirements
Requires the AI module, its AI Search submodule, and the Key module. Composer dependency: elasticsearch/elasticsearch ^8.0 (the official PHP client). You also need a running Elasticsearch 8.x cluster and an AI provider (e.g. OpenAI) capable of generating embeddings.
Recommended modules/libraries
None required. Optionally pair with any Drupal AI Provider module (OpenAI, Anthropic, Ollama) for embedding generation, and Kibana for cluster monitoring and index management.
Similar projects
Similar VDB providers exist for Qdrant, Milvus, Pinecone, pgvector, and SQLite. This module targets teams already running Elasticsearch who want semantic search without provisioning a separate vector store.
Supporting this Module
No formal funding channel. Contributions via patches, issue queue reports, and code reviews on drupal.org are welcome.
Community Documentation
No external documentation yet. See Architecture.md in the repository for implementation details and the inline PHPDoc for method-level guidance.