ai_vdb_provider_typesense

AI VDB Provider Typesense connects Drupal's AI module ecosystem to Typesense, an open-source search engine with native vector search. Use it as the vector database backend for AI Search to index content as embedding vectors and run semantic similarity queries. It suits AI chatbots, retrieval-augmented generation (RAG), related content, and natural-language site search.

Features

AI Search integration — Works as a Vector Database (VDB) provider for the AI Search backend. Content is chunked, embedded via the AI module's embedding providers, and stored in Typesense collections.
Automatic schema management — Collection schemas are created and kept in sync when Search API index fields change. Filterable attributes become faceted fields in Typesense.
Search API–driven collection naming — The Typesense collection name follows the Search API index machine name. An event (TypesenseCollectionEvents::ALTER_NAME) lets you prefix or transform names — useful when sharing one Typesense cluster across multiple Drupal sites.
Type-safe document insertion — Values are validated and cast to match the Typesense collection schema before insertion, reducing type mismatch errors during indexing.
Programmatic similarity search — The TypesenseVectorStoreManager service provides a simple API for vector similarity search in custom modules (chatbots, related content, RAG pipelines).
Dry run mode — Test the full indexing pipeline (content extraction, chunking, metadata) without calling the embedding API, using zero vectors instead.
Search API processor — Includes a metadata fields processor for enriching indexed documents with entity metadata.
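
To make the schema management and type-safe insertion features concrete, here is a minimal standalone sketch in plain PHP (8.0 or later). It is not the module's actual code: the field names (content, node_type, created, embedding), the 4-dimension vector, and the helpers cast_value() and cast_to_schema() are all hypothetical. Only the shape of the schema array follows Typesense's collection format, where a vector field is a float[] with num_dim and a filterable field sets facet to true.

```php
<?php

// Hypothetical schema for an index named "my_index". Real embedding models
// use far more dimensions; 4 keeps the example readable.
$schema = [
  'name' => 'my_index',
  'fields' => [
    ['name' => 'content', 'type' => 'string'],
    ['name' => 'node_type', 'type' => 'string', 'facet' => true],
    ['name' => 'created', 'type' => 'int64', 'facet' => true],
    ['name' => 'embedding', 'type' => 'float[]', 'num_dim' => 4],
  ],
];

/**
 * Casts one value to a Typesense field type (illustrative helper).
 */
function cast_value(mixed $value, string $type): mixed {
  // Array types end in "[]": cast each element to the base type.
  if (str_ends_with($type, '[]')) {
    $base = substr($type, 0, -2);
    return array_map(fn($item) => cast_value($item, $base), array_values((array) $value));
  }
  return match ($type) {
    'int32', 'int64' => (int) $value,
    'float' => (float) $value,
    'bool' => filter_var($value, FILTER_VALIDATE_BOOLEAN),
    default => (string) $value,
  };
}

/**
 * Casts every schema field present in a document; other keys pass through.
 */
function cast_to_schema(array $document, array $schema): array {
  foreach ($schema['fields'] as $field) {
    $name = $field['name'];
    if (array_key_exists($name, $document)) {
      $document[$name] = cast_value($document[$name], $field['type']);
    }
  }
  return $document;
}

// Values often arrive from Drupal fields as strings; the schema decides the type.
$doc = cast_to_schema([
  'content' => 'Opening hours: 9 to 5',
  'node_type' => 'page',
  'created' => '1700000000',
  'embedding' => ['0.1', '0.2', '0.3', '0.4'],
], $schema);

var_dump($doc['created']);      // int(1700000000)
var_dump($doc['embedding'][0]); // float(0.1)
```

Casting before insertion matters because Typesense rejects or coerces documents whose values do not match the declared field types, depending on server settings. Doing it on the Drupal side keeps indexing errors visible and predictable.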

Typical use cases:

Semantic site search that understands meaning, not just keywords
Powering AI chatbots with site-specific knowledge (RAG)
“Related content” blocks based on vector similarity
Self-hosted or managed vector search with Typesense (local install, Docker, or Typesense Cloud)

Post-Installation

Install with Composer (recommended):

composer require 'drupal/ai_vdb_provider_typesense'
drush en ai_vdb_provider_typesense

Composer downloads the module and its PHP dependencies (typesense/typesense-php, php-http/curl-client); drush en then enables the module.

Configure the Typesense connection at Configuration → AI → Vector Database Providers → Typesense (/admin/config/ai/vdb_providers/typesense):
- Admin API key (read-write)
- Host, port (default 8108), and protocol (http or https)
- Embedding model for programmatic similarity search (defaults to the AI module's default embedding provider)
Create a Search API server with the AI Search backend and select Typesense as the vector database provider.
Create a Search API index, add fields, and configure indexing options (Main Content, Contextual Content, Filterable Attributes).
Index content via Search API (cron, Drush, or the Search API UI).
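
After indexing, it can help to confirm that Drupal's settings match what the server sees. The sketch below is one possible check, not part of the module. It assumes the typesense/typesense-php library is autoloaded (for example when run through drush php:script), and the helper typesense_client_config(), the placeholder API key, and the index name my_index are hypothetical. The health and collections calls are the ones the typesense-php client provides.

```php
<?php

/**
 * Builds the configuration array for typesense-php's Client from the same
 * values entered on the settings page (hypothetical helper).
 */
function typesense_client_config(string $apiKey, string $host, int $port = 8108, string $protocol = 'http'): array {
  return [
    'api_key' => $apiKey,
    'nodes' => [[
      'host' => $host,
      'port' => (string) $port,
      'protocol' => $protocol,
    ]],
    'connection_timeout_seconds' => 2,
  ];
}

// Only contact the server when the library is available; run standalone,
// this script does nothing beyond defining the helper.
if (class_exists(\Typesense\Client::class)) {
  $client = new \Typesense\Client(typesense_client_config('YOUR_ADMIN_API_KEY', 'localhost'));

  // Server health status.
  print_r($client->health->retrieve());

  // Schema and document count of the collection named after the index.
  // Throws an exception if the collection does not exist yet.
  print_r($client->collections['my_index']->retrieve());
}
```

If an ALTER_NAME subscriber changes collection names on your site, look up the altered name rather than the raw index machine name.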

Collection naming: There is no separate collection name field on the server config. The Search API index machine name is used as the Typesense collection name. To add a site prefix or otherwise alter names, subscribe to TypesenseCollectionEvents::ALTER_NAME.
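
The naming logic an ALTER_NAME subscriber might apply can be sketched in plain PHP. The function prefixed_collection_name() and its character whitelist are assumptions made for illustration; they are not a documented rule of Typesense or of this module. The event object's methods are not described on this page, so check the module README before wiring this into a real subscriber.

```php
<?php

/**
 * Builds a site-prefixed collection name (hypothetical helper).
 *
 * Lowercases both parts and replaces anything outside a-z, 0-9 and "_"
 * with an underscore. The whitelist is a conservative assumption.
 */
function prefixed_collection_name(string $indexId, string $sitePrefix): string {
  $clean = fn(string $part): string => trim(preg_replace('/[^a-z0-9_]+/', '_', strtolower($part)), '_');
  return $clean($sitePrefix) . '__' . $clean($indexId);
}

// Two sites sharing one cluster keep separate collections for the same index ID.
echo prefixed_collection_name('content_index', 'Site A'), PHP_EOL; // site_a__content_index
echo prefixed_collection_name('content_index', 'site-b'), PHP_EOL; // site_b__content_index
```

Deriving the prefix from a stable value, such as a setting in settings.php, keeps the name constant across deployments. A changed prefix would point the index at a new, empty collection.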

Programmatic search:

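// Arguments: the query text, the Search API index machine name, and what
// appears to be the maximum number of results (inferred from this example).
$manager = \Drupal::service('ai_vdb_provider_typesense.vector_store_manager');
$results = $manager->similaritySearch('What are your opening hours?', 'my_index', 10);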

Debugging: Enable dry run mode on the settings page to exercise the indexing pipeline with zero vectors in place of embedding API calls.
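
One consequence of dry run mode is worth spelling out: a zero vector has no direction, so similarity scores against it carry no meaning. The standalone sketch below illustrates this with cosine similarity, which is assumed here as the metric. The functions zero_vector() and cosine_similarity() are hypothetical helpers, not part of the module or of Typesense.

```php
<?php

/** Returns a placeholder embedding of the given length, as in a dry run. */
function zero_vector(int $dimensions): array {
  return array_fill(0, $dimensions, 0.0);
}

/** Cosine similarity; returns NULL when either vector has zero length. */
function cosine_similarity(array $a, array $b): ?float {
  $dot = $normA = $normB = 0.0;
  foreach ($a as $i => $value) {
    $dot += $value * $b[$i];
    $normA += $value ** 2;
    $normB += $b[$i] ** 2;
  }
  // Dividing by a zero norm is undefined, so report "no score" instead.
  if ($normA == 0.0 || $normB == 0.0) {
    return NULL;
  }
  return $dot / (sqrt($normA) * sqrt($normB));
}

$query = [0.6, 0.8, 0.0];
var_dump(cosine_similarity($query, [0.6, 0.8, 0.0])); // close to 1.0: same direction
var_dump(cosine_similarity($query, zero_vector(3)));  // NULL
```

In practice this means dry run mode is suited to checking extraction, chunking, and metadata. Reindex with dry run disabled before judging search quality.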

Additional Requirements

Drupal 10.4+ or Drupal 11
AI module (with an embedding provider configured, e.g. OpenAI)
AI Search module
Key module (for secure storage of the Typesense API key)
typesense/typesense-php (^4.5) — installed via Composer when using composer require drupal/ai_vdb_provider_typesense
php-http/curl-client (^2.2) — HTTP client required by the Typesense PHP library
A running Typesense server (self-hosted, Docker, or Typesense Cloud)

Recommended modules/libraries

AI — Core AI framework; configure embedding providers (OpenAI, Ollama, etc.) before indexing.
AI Search — Required backend that orchestrates chunking, embedding, and vector storage.
Search API — Underlying search framework used by AI Search.
Key — Secure credential storage for the Typesense admin API key.
Search API Typesense — Optional; provides typesense_* Search API data types (e.g. typesense_string, typesense_int32) for filterable index fields. Not required for basic vector search.
AI Agents or custom chatbot modules — consume similarity search results for RAG workflows.

Similar projects

Other Vector Database providers exist for Drupal's AI module, including:

AI VDB Provider Pinecone — Managed cloud vector database (Pinecone).
AI VDB Provider Milvus — Milvus / Zilliz Cloud vector database.

Why Typesense? Typesense combines full-text and vector search in one engine, can be self-hosted or used via Typesense Cloud, and is often simpler to operate than dedicated vector-only databases. This module is a good fit when you already use Typesense for search, want to avoid vendor lock-in, or prefer an open-source stack you control.

For traditional (non-vector) Typesense integration with Search API, see the community Search API Typesense module. AI VDB Provider Typesense is specifically for the AI module's embedding-based vector search workflow.

Supporting this Module

This module is maintained as open source. Report bugs and feature requests in the project issue queue. Contributions via merge requests are welcome.

Community Documentation

Drupal AI module documentation — Overview of AI providers, embeddings, and VDB providers.
AI Search documentation — Setting up Search API indexes with AI Search.
VDB Providers documentation — How vector database providers work in the AI ecosystem.
Typesense documentation — Server setup, vector search, and API reference.
Module README — Installation, configuration, collection naming events, and programmatic similarity search examples.