ai_automator_pandoc

An AI Automators plugin that converts uploaded Word (.docx), PDF, ODT, and RTF files to clean HTML using the pandoc command-line tool.

Once configured, the automator runs automatically when a node is saved, writing the converted HTML into any text_long target field — no custom code required.

Features

  • Converts Word (.docx / .doc), PDF, ODT, RTF, HTML, and plain text files to HTML5 or HTML4
  • Input format is detected automatically from the file MIME type and extension
  • Configurable output options: text wrapping, standalone document wrapper, embedded resources (base64 images), section numbering, table of contents, and freeform extra pandoc arguments
  • Uses proc_open() with an argument array — no shell injection risk
  • Settings form at /admin/config/content/pandoc with live pandoc version display and validation
  • Appears in the Drupal status report with OK / Warning / Error indicators
  • Post-install warning banner with a direct link to the configuration page
  • Drush command drush ai-automator-pandoc:test for command-line testing and diagnosis
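The argument-array approach mentioned in the list above can be illustrated from the shell. This is a minimal sketch, not the module's code: the function names pandoc_input_format and convert_to_html, the PANDOC_BIN variable, and the extension-to-reader mapping are assumptions made for illustration. Each argument is handed to pandoc as a separate word via "$@", so a file name containing spaces or shell metacharacters is never re-parsed. --from, --to, and --wrap are standard pandoc options; the module's other output options presumably correspond to pandoc's --standalone, --embed-resources, --number-sections, and --toc, though the source does not confirm the exact mapping.

```shell
# Map a file extension to a pandoc reader name.
# Illustrative subset only; the module's real detection also uses the MIME type.
pandoc_input_format() {
  ext="$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')"
  case "$ext" in
    docx)     echo docx ;;
    odt)      echo odt ;;
    rtf)      echo rtf ;;
    html|htm) echo html ;;
    *)        return 1 ;;   # unknown extension
  esac
}

# Convert one file to HTML5 on stdout.
# PANDOC_BIN is a hypothetical override for the binary path.
convert_to_html() {
  src="$1"
  from="$(pandoc_input_format "$src")" || {
    echo "unsupported file type: $src" >&2
    return 1
  }
  # Rebuild the positional parameters as the argument list: one word per argument.
  set -- --from "$from" --to html5 --wrap=none "$src"
  "${PANDOC_BIN:-pandoc}" "$@"
}
```

For example, convert_to_html "Annual report (final).docx" > report.html converts a file whose name would need careful quoting if the command were assembled as a single string.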

Requirements

  • Drupal 10.4, 11, or 12
  • AI module with the AI Automators sub-module enabled
  • pandoc installed on the server and the binary path configured in the module settings

Installation

drush en ai_automator_pandoc

After enabling, visit Administration → Content authoring → Pandoc settings and enter the full path to the pandoc binary (e.g. /usr/local/bin/pandoc). Run which pandoc on the server to find the correct path.
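As a quick check before filling in the settings form, the path lookup can be scripted. The sketch below uses the POSIX command -v in place of which; find_pandoc is a hypothetical helper name, not something the module provides.

```shell
# Print the absolute path of a binary (default: pandoc) if it is on PATH
# and executable; return non-zero otherwise.
find_pandoc() {
  bin="$(command -v "${1:-pandoc}")" || return 1
  [ -x "$bin" ] || return 1
  printf '%s\n' "$bin"
}

# Show the path to paste into the settings form, plus the first version line.
if path="$(find_pandoc)"; then
  echo "pandoc found at: $path"
  "$path" --version | head -n 1
else
  echo "pandoc not found on PATH" >&2
fi
```

Run it as the same user the web server runs as, since that user's PATH may differ from an interactive login.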

Installing pandoc

  • macOS (Homebrew): brew install pandoc
  • Ubuntu / Debian: sudo apt-get install pandoc
  • Alpine Linux / Docker: apk add --no-cache pandoc
  • RHEL / CentOS: sudo yum install pandoc
  • Windows: download the installer from pandoc.org/installing.html

Docker note: images based on Alpine Linux use apk, not apt-get. Use apk add --no-cache pandoc or add it to your Dockerfile.
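When the base image is not known in advance, a build script can pick the matching command from the list above. This is a sketch under assumptions: the helper names are hypothetical, and the -y flags are added here so that apt-get and yum run non-interactively.

```shell
# Return the install command for a given package manager name.
pandoc_install_cmd() {
  case "$1" in
    apk)     echo "apk add --no-cache pandoc" ;;
    apt-get) echo "apt-get install -y pandoc" ;;
    yum)     echo "yum install -y pandoc" ;;
    brew)    echo "brew install pandoc" ;;
    *)       return 1 ;;   # unknown package manager
  esac
}

# Print the first package manager found on PATH, if any.
detect_pkg_manager() {
  for pm in apk apt-get yum brew; do
    if command -v "$pm" >/dev/null 2>&1; then
      echo "$pm"
      return 0
    fi
  done
  return 1
}

# Only prints a suggestion; it does not install anything.
if pm="$(detect_pkg_manager)"; then
  echo "Suggested command: $(pandoc_install_cmd "$pm")"
fi
```

Outside a container build, the apt-get and yum commands usually need sudo, as shown in the list above.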

Drush test command

The module ships with a Drush command for end-to-end testing of the conversion without going through the UI:

drush ai-automator-pandoc:test /path/to/document.docx

The command prints the resolved pandoc path, version, exit code, conversion time, and an HTML preview. Use --save to write the full HTML output next to the source file.
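To check a whole folder of sample documents, the command can be wrapped in a loop. This is a sketch only: test_all_docs and the DRUSH variable are hypothetical names, and it assumes, without confirmation from the module documentation, that the command exits non-zero when a conversion fails.

```shell
# Run the test command with --save on every .docx in a directory
# and print the number of files that failed.
test_all_docs() {
  dir="$1"
  failed=0
  for f in "$dir"/*.docx; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal
    if ! "${DRUSH:-drush}" ai-automator-pandoc:test "$f" --save >/dev/null; then
      echo "FAILED: $f" >&2
      failed=$((failed + 1))
    fi
  done
  echo "$failed"
}
```

For example, test_all_docs /path/to/samples prints 0 when every document converts, and with --save the HTML files appear next to their sources.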

Related modules

  • AI — required base module and automator framework
  • Doc HTML Chunker — splits the converted HTML into JSON-encoded chunks for AI processing
  • AI Document Proofreader — full node-based AI proofreading workflow using this module for document conversion

Activity

  • Total releases: 2
  • First release: Mar 2026
  • Latest release: 3 months ago
  • Release cadence: 0 days
  • Stability: 0% stable

Releases

Version      Type         Release date
1.0.0-rc1    Pre-release  Mar 9, 2026
1.0.x-dev    Dev          Mar 9, 2026