content_archive
Content Archive lets you bulk-remove nodes by date range in a way that looks like a permanent delete but is fully recoverable.
Unlike unpublishing (which still shows nodes to admin users at /admin/content), archived nodes are completely removed from all Drupal tables — invisible on the front end, in /admin/content, in sitemaps, in REST APIs, and to any search index. They are stored as JSON blobs in a custom shadow table and can be restored at any time with one click, or permanently deleted when no longer needed.
This module was built for large news and media sites that need to retire years of old content without permanently losing it. It has been tested with hundreds of thousands of nodes on Drupal 10 / MariaDB.
The problem it solves
- Unpublishing is not enough — unpublished nodes are still visible to editors and administrators at /admin/content, appear in admin searches, and can be accidentally restored. Content Archive removes nodes from Drupal entirely.
- Permanent delete is too risky — once deleted there is no recovery. Content Archive gives you a safety net: the node is gone from the site but you can bring it back in seconds.
- Bulk operations on large sites time out — Content Archive uses Drupal's Batch API so every operation is timeout-safe, even on sites with hundreds of thousands of nodes.
Features
- True soft delete: nodes completely removed from Drupal, not just unpublished. Invisible to all users including administrators.
- Date-range bulk archive: select a From/To date range and archive thousands of nodes at once via the admin UI.
- Batch API processing: timeout-safe handling of thousands of nodes without hitting PHP or web server limits.
- Full one-click restore: nodes restored with original node IDs, path aliases, and all field values, and associated media re-published.
- Permanent delete: after review, removes nodes, their media entities, and physical files from disk.
- Cross-referenced file protection: files shared with newer content are automatically detected and moved to a preserved/ directory before any deletion happens.
- CSV node export: optional spreadsheet of the nodes (nid, title, type, URL alias, author, dates).
- ZIP file backup: optional archive of physical files before permanent delete.
- Audit log table: history of every run with node count, download links for backups, and Restore / Permanent Delete action buttons.
- Resume interrupted runs: if a batch is interrupted, a button in the history table picks up where it left off.
- Drush commands: every operation is available for automation and scripting.
- Configurable media reference fields: tell the module which fields on your node types reference media entities, via a simple settings form.
How it works
Archive (soft delete)
- Admin selects a date range and clicks Archive Content
- For each node:
  - $node->toArray() captures the node's field data in Drupal's native format
  - JSON blob + original node ID + path alias + referenced media IDs stored in the content_archive_items table
  - Referenced media entities are unpublished (inaccessible, but records preserved for restore)
- Node deleted via entity API — completely gone from all Drupal tables
Restore
- Admin clicks ↺ Restore in the history tab
- JSON decoded back to Drupal field array
- Node::create($data) + enforceIsNew(): original node ID preserved
- Path alias restored
- Media entities re-published
Permanent Delete
- Files shared with newer content moved to preserved/ directory (cross-reference protection)
- Optionally creates a ZIP backup of all physical files
- Media entities deleted via entity API
- Orphaned file records and physical files deleted
- Archive table rows removed, log status updated
Admin UI
Main page: admin/content/content-archive
- Date range picker with live AJAX report showing node, media, and file counts before committing
- Optional CSV backup checkbox
- Archive history table with status, download links, and per-run action buttons
Settings page: admin/config/content/content-archive
- Media reference field names (one per line — configure for your node types)
- Nodes per batch pass (tune for your server)
- Backup directory (stream wrapper path)
- Preserved files directory (stream wrapper path)
Drush commands
    # Preview counts for a date range (dry run, no changes)
    drush content-archive:report 2020-01-01 2022-12-31

    # Move files shared with newer content to preserved/ folder
    drush content-archive:preserve-files 2020-01-01 2022-12-31

    # Export a CSV record of all nodes in range
    drush content-archive:backup-nodes 2020-01-01 2022-12-31

    # Create a ZIP backup of physical files
    drush content-archive:backup-files 2020-01-01 2022-12-31

    # Archive (soft-delete) all nodes: stores in DB, removes from Drupal
    drush content-archive:archive 2020-01-01 2022-12-31

    # After deciding on permanent delete:
    drush content-archive:delete-media 2020-01-01 2022-12-31
    drush content-archive:delete-files 2020-01-01 2022-12-31
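The commands above can be chained for a single date range. The following is a minimal sketch, not part of the module: the function name `archive_range` and the `DRUSH` override variable are illustrative assumptions, and it assumes each command takes the two positional dates exactly as shown above and exits non-zero on failure.

```shell
# Sketch: run the documented pre-archive commands for one date range,
# in the order they are listed above, stopping at the first failure.
# DRUSH is a hypothetical override, e.g. DRUSH="vendor/bin/drush".
archive_range() {
  # $1 = start date, $2 = end date, both YYYY-MM-DD.
  for d in "$1" "$2"; do
    case "$d" in
      [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
      *) echo "usage: archive_range YYYY-MM-DD YYYY-MM-DD" >&2; return 2 ;;
    esac
  done

  # With the dashes removed, ISO dates compare correctly as integers.
  if [ "$(printf '%s' "$1" | tr -d '-')" -gt "$(printf '%s' "$2" | tr -d '-')" ]; then
    echo "error: start date is after end date" >&2
    return 2
  fi

  # Dry run first, then file preservation and backups, then the soft delete.
  ${DRUSH:-drush} content-archive:report "$1" "$2" || return 1
  ${DRUSH:-drush} content-archive:preserve-files "$1" "$2" || return 1
  ${DRUSH:-drush} content-archive:backup-nodes "$1" "$2" || return 1
  ${DRUSH:-drush} content-archive:backup-files "$1" "$2" || return 1
  ${DRUSH:-drush} content-archive:archive "$1" "$2" || return 1
}
```

Calling `archive_range 2020-01-01 2022-12-31` would run the five steps in sequence. The permanent-delete commands are deliberately left out so that the irreversible step stays a separate, manual decision.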
Database tables
| Table | Purpose |
| --- | --- |
| content_archive_log | One row per archive run; tracks status, node count, CSV/ZIP paths, timestamps |
| content_archive_items | One row per archived node; stores full JSON blob of node data |
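A read-only row count is a simple way to see how much data the two tables hold. This sketch uses Drush's `sql:query` command and assumes the tables carry no database prefix. The function name `archive_table_counts` and the `DRUSH` override are illustrative, and no column names are used because the schema is not documented here.

```shell
# Sketch: print the number of rows in each Content Archive table.
# DRUSH is a hypothetical override, e.g. DRUSH="vendor/bin/drush".
archive_table_counts() {
  for t in content_archive_log content_archive_items; do
    printf '%s: ' "$t"
    # COUNT(*) only reads; nothing in the tables is modified.
    ${DRUSH:-drush} sql:query "SELECT COUNT(*) FROM $t" || return 1
  done
}
```

Running `archive_table_counts` from the Drupal project root prints one line per table.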
Compared to similar modules
| Module | Difference |
| --- | --- |
| Trash / Recycle Bin | Those modules soft-delete one node at a time, and archived nodes are still visible to admins. Content Archive removes nodes in bulk by date range and hides them from all users, including administrators. |
| Unpublish | Unpublished nodes remain visible at /admin/content. Content Archive nodes are completely gone from Drupal. |
| Node Delete / VBO | Permanent deletion only, with no restore. Content Archive is always recoverable until you choose permanent delete. |
Installation
    composer require drupal/content_archive
    drush en content_archive
    drush cr
Requirements
- Drupal 10 or 11
- PHP 8.1+
- Core modules: node, media, file, path_alias
- Drush 12+ (optional, for CLI commands)
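The version requirements above can be checked from a shell. This is a minimal sketch under stated assumptions: `php` is on the PATH, `drush version --format=string` prints a bare version number, and only the major and minor numbers matter. The helper names (`major_of`, `minor_of`, `version_ge`, `check_requirements`) are hypothetical.

```shell
# Print the major number of a dotted version, e.g. "8.1.27" -> 8.
major_of() { printf '%s' "${1%%.*}"; }

# Print the minor number of a dotted version; a missing minor counts as 0.
minor_of() {
  case "$1" in
    *.*) m=${1#*.}; m=${m%%[!0-9]*}; printf '%s' "${m:-0}" ;;
    *)   printf '0' ;;
  esac
}

# Succeed when version $1 is at least version $2 (major.minor only).
version_ge() {
  [ "$(major_of "$1")" -gt "$(major_of "$2")" ] && return 0
  [ "$(major_of "$1")" -eq "$(major_of "$2")" ] &&
    [ "$(minor_of "$1")" -ge "$(minor_of "$2")" ]
}

check_requirements() {
  php_version=$(php -r 'echo PHP_VERSION;') || return 1
  if ! version_ge "$php_version" 8.1; then
    echo "PHP 8.1+ required, found $php_version" >&2
    return 1
  fi
  # Drush is optional, so it is only checked when it is installed.
  if command -v drush >/dev/null 2>&1; then
    drush_version=$(drush version --format=string) || return 1
    if ! version_ge "$drush_version" 12; then
      echo "Drush 12+ required for the CLI commands, found $drush_version" >&2
      return 1
    fi
  fi
  echo "requirements met"
}
```

`check_requirements` returns non-zero and prints the failing requirement to stderr, so it can gate an install script.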
Configuration
After enabling the module, visit admin/config/content/content-archive and set the media reference field names used by your node types. The defaults (field_image, fie...) target standard installations.