ai_more_like_this
This module uses semantic proximity measures to find nodes whose content is semantically close to the currently displayed node. It is equivalent to Solr's More Like This (MLT) functionality, but it queries the AI RAG Vector DB for the nodes whose vectors have the smallest distance from the current node's vector.
Features
The module provides a Views Contextual Filter that sends a single query to the Vector DB to get a list of the nodes semantically closest to the currently displayed content. No LLM is called during the vector search, so finding the More Like This nodes incurs no LLM usage cost.
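To make "closest" concrete, here is a minimal Python sketch of two common vector distance measures. It is only an illustration: the function names and the toy two-dimensional vectors are hypothetical, and the module itself leaves this calculation to the Vector DB.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity; 0 means the vectors point the same way."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Return the straight-line (L2) distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Hypothetical embeddings: real ones have hundreds of dimensions.
current = [1.0, 0.0]
same_topic = [2.0, 0.0]   # same direction, different length
other_topic = [0.0, 1.0]  # orthogonal direction

print(cosine_distance(current, same_topic))
print(cosine_distance(current, other_topic))
print(euclidean_distance(current, same_topic))
```

Cosine distance ignores vector length, so `same_topic` is at distance 0 from `current`, while Euclidean distance still separates them. This is one reason the choice of metric can change which nodes are ranked as most similar.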
Post-Installation
1. Add a Node ID field (property path nid) to your AI Index as a Filterable attribute, if the index does not already have one.
2. Add the "AI More Like This" Contextual Filter to a Views Block of your choice (any Content-based View can be used). The block will show a list of semantically related nodes.
The following configuration parameters must be specified:
- AI Search Index that populates Vector DB
- Similarity Metric used for similarity calculations. By default, the metric configured in the AI Index is selected, but it can be changed if another metric gives better MLT results
- Maximum 'distance' between the current node's vector and the vectors of MLT nodes. The range is 0 to 1; the default of 0.35 is a reasonable starting point.
3. Configure the maximum number of results to return using the Block Pager setting "Display a specified number of items".
Note: nodes whose similarity distance is greater than the "Distance Threshold" are filtered out, so the block may show fewer items than the configured maximum.
4. Add the Views Block to a region of your choice on a node page layout.
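The interaction between the distance threshold (step 2) and the pager limit (step 3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the module's code: `filter_results`, its parameters, and the sample node IDs and distances are all hypothetical.

```python
def filter_results(scored, max_distance=0.35, limit=5):
    """Keep nodes within max_distance, closest first, at most `limit` items.

    `scored` is a list of (nid, distance) pairs, assumed to come from
    the Vector DB query.
    """
    # Drop nodes whose distance is greater than the threshold.
    kept = [(nid, dist) for nid, dist in scored if dist <= max_distance]
    # Order by ascending distance so the most similar nodes come first.
    kept.sort(key=lambda item: item[1])
    # Apply the pager-style limit last.
    return kept[:limit]


# Hypothetical (nid, distance) pairs.
scored = [(11, 0.12), (12, 0.41), (13, 0.30), (14, 0.05)]

print(filter_results(scored, max_distance=0.35, limit=2))
print(filter_results(scored, max_distance=0.10, limit=5))
```

With a threshold of 0.35 and a limit of 2, nodes 14 and 11 are returned. With a stricter threshold of 0.10, only node 14 survives even though the limit allows five items.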
Additional Requirements
The module currently supports only the open-source PostgreSQL Vector DB, so the Postgres VDB Provider module is required.
Note: PostgreSQL is available for all major operating systems and Linux distributions. The functions pg_query() and pg_fetch_all() are provided by PHP's PostgreSQL (pgsql) extension, and even on a small AWS EC2 instance or an average laptop they return the More Like This list of nodes very quickly.
Similar projects
AI Related Content:
This project also builds a list of nodes contextually similar to the currently displayed node. However, its approach differs substantially:
- It uses LLMs to perform the vector search, which may incur significant costs if a site has a lot of content and many users viewing it. The AI More Like This module uses only the Vector DB to get the list of similar nodes.
- It has no RAG distance threshold setting, so for genuinely unique content AI Related Content may return a list of completely unrelated nodes.
- For an Index with the Enriched Embedding Strategy (the default), AI Related Content looks for the N chunks closest to the current node's vector. If all N chunks belong to the same node, the other nodes are ignored, and after grouping you may end up with only one related node. The AI More Like This module instead uses an SQL query that finds the N distinct nodes whose chunks are closest to the current node.
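The difference described in the last point can be sketched in Python. Both functions below are hypothetical illustrations of the two strategies, not code from either module, and the sample chunk data is invented; the AI More Like This module performs its version of this step in SQL.

```python
def top_n_chunks_then_group(chunks, n):
    """Take the N closest chunks first, then collapse them to node IDs."""
    closest = sorted(chunks, key=lambda chunk: chunk[1])[:n]
    nodes = []
    for nid, _dist in closest:
        if nid not in nodes:
            nodes.append(nid)
    return nodes


def top_n_distinct_nodes(chunks, n):
    """Keep each node's best chunk distance, then take the N closest nodes."""
    best = {}
    for nid, dist in chunks:
        if nid not in best or dist < best[nid]:
            best[nid] = dist
    ranked = sorted(best.items(), key=lambda item: item[1])
    return [nid for nid, _dist in ranked[:n]]


# Hypothetical (nid, distance) pairs: node 7 is split into three close chunks.
chunks = [(7, 0.10), (7, 0.11), (7, 0.12), (8, 0.20), (9, 0.25)]

print(top_n_chunks_then_group(chunks, 3))
print(top_n_distinct_nodes(chunks, 3))
```

With N = 3, the first strategy returns only node 7, because its three chunks fill all the slots. The second returns nodes 7, 8 and 9, because it ranks nodes rather than chunks.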