metadata_sanitizer
Every file uploaded to your Drupal site may silently carry GPS coordinates, camera serial numbers, author names, and other hidden metadata that can expose your users and your organisation without anyone realising it.
Metadata Sanitizer strips that data automatically, at upload time and in bulk, using the battle-tested exiftool binary. Once removed, the metadata cannot be recovered from the sanitized file.
Why Metadata Sanitizer?
- Privacy by default — sanitize on every upload, not as an afterthought.
- GDPR / data minimisation — supports Article 5(1)(c) obligations by removing personal data embedded in files before it is ever stored or served.
- Any file type: not limited to images. Works on PDFs, Office documents, and any format exiftool supports.
- Bulk remediation: clean an existing library of thousands of files with a single Drush command.
- Drupal AI ecosystem — optional submodules integrate with the Drupal AI module (AI Agents + Tool API), enabling AI-assisted configuration, file profiling, and autonomous bulk-clean operations.
Features
- Automatic sanitization on upload (configurable, toggleable).
- Re-sanitization on file replacement when the underlying URI changes.
- Bulk-clean command: drush metadata_sanitizer:clean, with filters for extensions, regex patterns, MIME types, and entity field references.
- Configurable file extensions and optional timestamp preservation (exiftool -P).
- Admin UI at Configuration → Media → Metadata Sanitizer.
- Dedicated "administer metadata sanitizer" permission.
- Runtime verification of exiftool availability.
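To make the sanitization and timestamp options concrete, here is a minimal shell sketch of the kind of exiftool call involved. It is not the module's actual command line: the function name strip_metadata and the EXIFTOOL and PRESERVE_TIMESTAMPS variables are hypothetical. The flags are standard exiftool options: -all= deletes all writable tags, -overwrite_original skips the _original backup copy, and -P keeps the file's modification date and time.

```shell
#!/bin/sh
# Hypothetical helper (not part of the module): strip all writable
# metadata from a single file with exiftool.
# EXIFTOOL may be overridden, e.g. with "echo", to preview the command.
# PRESERVE_TIMESTAMPS=1 adds -P so the modification time is kept.
strip_metadata() {
  file="$1"
  if [ "${PRESERVE_TIMESTAMPS:-0}" = "1" ]; then
    ${EXIFTOOL:-exiftool} -P -all= -overwrite_original "$file"
  else
    ${EXIFTOOL:-exiftool} -all= -overwrite_original "$file"
  fi
}
```

Calling it with EXIFTOOL=echo prints the arguments that would be passed instead of modifying the file, which is a cheap way to check the flags before touching real uploads.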
Requirements
- Drupal 10 or 11.
- exiftool installed on the system path.
Debian/Ubuntu: sudo apt-get install -y libimage-exiftool-perl
RHEL/Rocky/Alma: sudo dnf install -y perl-Image-ExifTool
macOS: brew install exiftool
DDEV: add libimage-exiftool-perl to webimage_extra_packages in .ddev/config.yaml.
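After installing, you can confirm that the binary is reachable from the shell the web server uses. The sketch below is one way to do that; the helper name binary_available is hypothetical, while command -v and exiftool -ver (which prints the installed version) are standard.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the named binary is on PATH.
binary_available() {
  command -v "$1" >/dev/null 2>&1
}

if binary_available exiftool; then
  # -ver prints the installed ExifTool version number.
  echo "exiftool found, version $(exiftool -ver)"
else
  echo "exiftool not found on PATH" >&2
fi
```

Under DDEV the check has to run inside the web container, for example with ddev exec exiftool -ver after a ddev restart.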
Bulk Clean — Drush Examples
drush metadata_sanitizer:clean --extensions='jpg,jpeg,png,pdf'
drush metadata_sanitizer:clean --mime='image/jpeg,application/pdf'
drush metadata_sanitizer:clean --pattern='/^invoice_/'
drush metadata_sanitizer:clean --field=field_document
Note: Bulk cleaning is irreversible; removed metadata cannot be restored. For large libraries, run it from the CLI, because requests made through the UI may time out.
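Because removed metadata cannot be restored, it may be worth archiving the files directory before a bulk run. The following is a minimal sketch under stated assumptions: the helper name backup_files_dir and the example paths are hypothetical, and your public or private files may live elsewhere.

```shell
#!/bin/sh
# Hypothetical helper: archive a directory before an irreversible bulk clean.
# Usage: backup_files_dir <source-dir> <archive-path>
backup_files_dir() {
  src="$1"
  archive="$2"
  [ -d "$src" ] || { echo "not a directory: $src" >&2; return 1; }
  # -C switches to the parent directory so the archive stores relative paths.
  tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")"
}

# Example (paths are assumptions; adjust to your site layout):
# backup_files_dir web/sites/default/files /tmp/files-before-clean.tar.gz \
#   && drush metadata_sanitizer:clean --extensions='jpg,jpeg,png,pdf'
```

Chaining the Drush command with && means the clean only starts if the archive was written successfully.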
Optional: Drupal AI Integration
Metadata Sanitizer AI Agents (metadata_sanitizer_ai_agents)
Adds an AI advisor tab under the module settings page. The bundled AI Agent can check your environment, profile your managed files, recommend settings, estimate the scope of bulk cleaning, preview metadata for verification, generate a conservative Drush command, and perform confirmed bulk cleaning via natural language.
Metadata Sanitizer Tool API (metadata_sanitizer_tool_api)
Exposes module operations as Tool-API-aligned plugins, making them discoverable and invocable by Tool-API connectors such as tool_ai_connector. Compatible with the Drupal AI ecosystem on both Drupal 10 and 11.
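Both submodules are optional and are enabled like any other Drupal module. A minimal sketch using Drush's pm:install command follows; the wrapper name enable_sanitizer_submodules and the DRUSH variable are hypothetical, and it assumes the submodules' own dependencies (such as the Drupal AI module) are already present in the codebase.

```shell
#!/bin/sh
# Hypothetical helper: enable one or more of the optional submodules.
# DRUSH may be overridden, e.g. with "echo", to preview the command.
enable_sanitizer_submodules() {
  ${DRUSH:-drush} pm:install --yes "$@"
}

# Example:
# enable_sanitizer_submodules metadata_sanitizer_ai_agents metadata_sanitizer_tool_api
```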
Permissions
administer metadata sanitizer — controls access to the admin UI, Tool API wrappers, and AI advisor. Must be explicitly granted; not assigned by default.
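Since the permission is not assigned by default, it has to be granted to a role either in the Drupal permissions UI or from the command line. The sketch below uses Drush's role:perm:add command; the wrapper name grant_sanitizer_permission, the DRUSH variable, and the site_admin role are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: grant the module's permission to a role.
# $1 is the role's machine name. DRUSH may be overridden, e.g. with
# "echo", to preview the command without touching the site.
grant_sanitizer_permission() {
  ${DRUSH:-drush} role:perm:add "$1" 'administer metadata sanitizer'
}

# Example (role name is an assumption):
# grant_sanitizer_permission site_admin
```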