content_cannibalization_detector
About
The Content Cannibalization Detector module identifies SEO keyword cannibalization issues on your Drupal site. Keyword cannibalization
occurs when multiple pages target the same search keywords, causing them to compete against each other in search engine results. This splits
ranking authority and can lower the ranking of all competing pages.
How It Works
The module extracts keywords from node titles, body content, URL path aliases, and meta tag fields, and scores them with TF-IDF. It then
compares every pair of published nodes of the configured content types using cosine similarity to detect overlapping keyword profiles.
Each overlapping pair is classified by severity and paired with an actionable recommendation.
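The comparison step can be illustrated with a minimal Python sketch. It is not the module's PHP implementation: the function names, the
tokenizer, and the smoothed IDF formula are assumptions chosen for illustration, and only the general technique (TF-IDF vectors compared
pairwise by cosine similarity against a threshold) comes from the description above.

```python
import math
import re
from collections import Counter
from itertools import combinations


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tf_idf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1. The module
    may use a different variant.
    """
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))  # count each term once per document

    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid division by zero for empty text
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine_similarity(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_overlaps(docs, threshold=0.40):
    """Compare every pair of documents; keep pairs at or above threshold."""
    vectors = tf_idf_vectors(docs)
    hits = []
    for i, j in combinations(range(len(docs)), 2):
        score = cosine_similarity(vectors[i], vectors[j])
        if score >= threshold:
            hits.append((i, j, score))
    return sorted(hits, key=lambda hit: hit[2], reverse=True)
```

Comparing every pair grows quadratically with the number of nodes, which is one reason to limit the analysis to selected content types.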
Features
- TF-IDF keyword extraction with bigram (two-word phrase) detection
- Cosine similarity-based content comparison
- Source-weighted scoring: title (3x), meta tags (2.5x), URL paths (2x), body (1x)
- Severity classification: Critical, High, Medium, Low
- Actionable recommendations: Merge, Redirect, Set Canonical, Differentiate
- Visual admin dashboard with summary cards and color-coded severity table
- Per-node detail pages with keyword breakdown and competitor list
- Drush CLI commands for analysis and reporting
- Configurable content types, similarity thresholds, and custom stop words
- 130+ built-in English stop words
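To make the bigram, stop-word, and source-weighting features concrete, here is a small Python sketch. The weights are the ones listed
above; everything else (the dictionary keys, the function names, the tiny stop-word set, and the choice to build bigrams after removing
stop words) is a hypothetical illustration rather than the module's actual code.

```python
import re
from collections import Counter

# Weights from the feature list; the key names are hypothetical.
SOURCE_WEIGHTS = {"title": 3.0, "meta": 2.5, "path": 2.0, "body": 1.0}

# Tiny illustrative subset; the module ships 130+ English stop words.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}


def terms_with_bigrams(text, stop_words=STOP_WORDS):
    """Return single words plus two-word phrases, stop words removed.

    Assumption: bigrams are built from the words that remain after
    stop-word filtering.
    """
    words = [
        word
        for word in re.findall(r"[a-z0-9]+", text.lower())
        if word not in stop_words
    ]
    bigrams = [f"{first} {second}" for first, second in zip(words, words[1:])]
    return words + bigrams


def weighted_term_counts(sources):
    """Sum term occurrences across sources, scaled by each source's weight."""
    totals = Counter()
    for source, text in sources.items():
        weight = SOURCE_WEIGHTS.get(source, 1.0)  # unknown sources count 1x
        for term in terms_with_bigrams(text):
            totals[term] += weight
    return totals


node = {
    "title": "Running Shoes Guide",
    "path": "/blog/running-shoes",
    "body": "A guide to the best shoes",
}
counts = weighted_term_counts(node)
# "shoes" appears in the title (3x), the path (2x), and the body (1x).
print(counts["shoes"])
```

Under this scheme a term that appears in a title contributes three times as much as the same term in the body, so two pages with
near-identical titles score as stronger competitors than two pages that merely share body vocabulary.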
Requirements
- Drupal 10.x or 11.x
- PHP 8.1 or higher
- Node module (core)
- Path Alias module (core)
Recommended Modules
- Metatag - Enables keyword extraction from meta description and keywords fields
- Drush (12+) - Enables CLI commands for analysis and reporting
Installation
Install as you would normally install a contributed Drupal module, for example with Composer and Drush:
composer require drupal/content_cannibalization_detector
drush en content_cannibalization_detector
Configuration
1. Navigate to Administration > Configuration > Search and metadata > Content Cannibalization Detector
(/admin/config/search/cannibalization)
2. Select the content types to include in analysis
3. Adjust the similarity threshold (default 40%)
4. Configure keyword sources (title, body, path alias, meta tags)
5. Optionally add custom stop words to exclude site-specific common terms
Permissions
- Administer Content Cannibalization Detector - Configure module settings and trigger analysis
- View cannibalization reports - Access the dashboard and node detail pages
Usage
1. Navigate to Administration > Reports > Content Cannibalization (/admin/reports/cannibalization)
2. Click Run Analysis to scan all configured content
3. Review the results table sorted by severity
4. Click View details on any row to see keyword breakdown and competing pages
Severity Levels
- Critical (80-100%) - Pages are nearly identical. Merge into one page.
- High (60-79%) - Significant overlap. Redirect the weaker page.
- Medium (40-59%) - Moderate overlap. Set canonical URL on secondary page.
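- Low (below 40%) - Minor overlap. Differentiate keyword targeting. Scores in this range fall below the default 40% similarity threshold,
  so they likely appear only when the threshold is lowered.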
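The severity bands above amount to a simple lookup, sketched here in Python. The function name and the handling of fractional scores
(for example, treating 79.5% as High) are assumptions; the band labels and recommendations are taken from the list.

```python
def classify(similarity_percent):
    """Map a similarity percentage to (severity, recommendation).

    Assumption: bands are half-open, so any score below 80 and at or
    above 60 is High, and so on.
    """
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity_percent must be between 0 and 100")
    if similarity_percent >= 80:
        return ("Critical", "Merge")
    if similarity_percent >= 60:
        return ("High", "Redirect")
    if similarity_percent >= 40:
        return ("Medium", "Set Canonical")
    return ("Low", "Differentiate")


for score in (92, 65, 40, 12):
    severity, action = classify(score)
    print(f"{score}% -> {severity}: {action}")
```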
Drush Commands
- drush ccd:analyze - Run a full cannibalization analysis
- drush ccd:report - Display summary report with all detected issues
- drush ccd:node - Show keywords and competing pages for a specific node