
ai_automator_pandoc

3 sites. No security coverage.

An AI Automators plugin that converts uploaded Word (.docx), PDF, ODT, and RTF files to clean HTML using the pandoc command-line tool.

Once configured, the automator runs automatically when a node is saved, writing the converted HTML into any text_long target field — no custom code required.

Features

  • Converts Word (.docx / .doc), PDF, ODT, RTF, HTML, and plain text files to HTML5 or HTML4
  • Input format is detected automatically from the file MIME type and extension
  • Configurable output options: text wrapping, standalone document wrapper, embedded resources (base64 images), section numbering, table of contents, and freeform extra pandoc arguments
  • Runs pandoc through proc_open() with an argument array, so the command never passes through a shell and file names cannot inject shell syntax
  • Settings form at /admin/config/content/pandoc with live pandoc version display and validation
  • Appears in the Drupal status report with OK / Warning / Error indicators
  • Post-install warning banner with a direct link to the configuration page
  • Drush command drush ai-automator-pandoc:test for command-line testing and diagnosis
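
The format detection and shell-free invocation listed above can be approximated in a few lines of POSIX shell. This is only a sketch under stated assumptions: the module itself is PHP, the helper names pandoc_reader_for and convert_to_html are hypothetical, and the extension-to-reader table is illustrative rather than the module's own. PDF and legacy .doc are left out because pandoc ships no reader under those names, and the source does not say how the module handles them.

```shell
#!/bin/sh
# Hypothetical helper: map a file extension to a pandoc reader name.
# The table is an assumption for illustration, not the module's mapping.
pandoc_reader_for() {
  # Take everything after the last dot and lowercase it.
  ext=$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')
  case "$ext" in
    docx)     echo docx ;;
    odt)      echo odt ;;
    rtf)      echo rtf ;;   # needs a pandoc release that includes the RTF reader
    html|htm) echo html ;;
    *)        return 1 ;;   # unknown extension: let the caller decide
  esac
}

# Hypothetical helper: convert one file to HTML5 on stdout.
# Every argument is passed as its own quoted word, never as one
# concatenated command string, so the file name is not re-parsed.
convert_to_html() {
  reader=$(pandoc_reader_for "$1") || {
    echo "unsupported input: $1" >&2
    return 1
  }
  pandoc --from="$reader" --to=html5 --wrap=none "$1"
}
```

Calling convert_to_html report.docx > report.html would write the converted markup to a file. The other output options presumably correspond to pandoc flags such as --standalone, --toc, --number-sections, and --embed-resources, though the exact flags the module passes are not stated in the source.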

Requirements

  • Drupal 10.4, 11, or 12
  • AI module with the AI Automators sub-module enabled
  • pandoc installed on the server and the binary path configured in the module settings

Installation

drush en ai_automator_pandoc

After enabling, visit Administration → Content authoring → Pandoc settings and enter the full path to the pandoc binary (e.g. /usr/local/bin/pandoc). Run which pandoc on the server to find the correct path.
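
Before entering the path in the settings form, it can help to confirm from a shell that the binary exists and runs. A minimal sketch, assuming a POSIX shell on the server; the helper name is_usable_binary is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the path is an executable regular file.
is_usable_binary() {
  [ -n "$1" ] && [ -f "$1" ] && [ -x "$1" ]
}

# Resolve pandoc from PATH; the variable stays empty if it is not installed.
pandoc_path=$(command -v pandoc || true)

if is_usable_binary "$pandoc_path"; then
  echo "pandoc found at: $pandoc_path"
  # The first line of --version output names the installed release.
  "$pandoc_path" --version | head -n 1
else
  echo "pandoc not found on PATH" >&2
fi
```

The printed path is the value to paste into the module settings. Note that the web server user may have a different PATH from an interactive login, which is why the settings form asks for a full path.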

Installing pandoc

  • macOS (Homebrew): brew install pandoc
  • Ubuntu / Debian: sudo apt-get install pandoc
  • Alpine Linux / Docker: apk add --no-cache pandoc
  • RHEL / CentOS: sudo yum install pandoc
  • Windows: download the installer from pandoc.org/installing.html

Docker note: images based on Alpine Linux use apk, not apt-get. Use apk add --no-cache pandoc or add it to your Dockerfile.
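
For provisioning scripts that run on more than one platform, the install commands above can be selected by detecting the available package manager. A sketch only: the function names are hypothetical, and the script prints the suggested command rather than running it.

```shell
#!/bin/sh
# Hypothetical helper: print the install command for a package manager name.
# The commands are the ones listed in this section.
pandoc_install_cmd() {
  case "$1" in
    brew)    echo "brew install pandoc" ;;
    apt-get) echo "sudo apt-get install pandoc" ;;
    apk)     echo "apk add --no-cache pandoc" ;;
    yum)     echo "sudo yum install pandoc" ;;
    *)       return 1 ;;
  esac
}

# Hypothetical helper: print the first supported package manager on PATH.
detect_package_manager() {
  for candidate in apk apt-get yum brew; do
    if command -v "$candidate" >/dev/null 2>&1; then
      echo "$candidate"
      return 0
    fi
  done
  return 1
}

if pm=$(detect_package_manager); then
  echo "Suggested: $(pandoc_install_cmd "$pm")"
else
  echo "No supported package manager found; see pandoc.org/installing.html"
fi
```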

Drush test command

The module ships with a Drush command for end-to-end testing of the conversion without going through the UI:

drush ai-automator-pandoc:test /path/to/document.docx

The command prints the resolved pandoc path, version, exit code, conversion time, and an HTML preview. Use --save to write the full HTML output next to the source file.
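
To check several sample documents in one pass, the test command can be wrapped in a small loop. This sketch assumes the command exits with a non-zero status when a conversion fails, which the source does not confirm; the function name test_documents and the DRUSH override variable are hypothetical.

```shell
#!/bin/sh
# Allow the drush executable to be overridden, e.g. DRUSH=vendor/bin/drush.
DRUSH="${DRUSH:-drush}"

# Hypothetical helper: run the module's test command on each given file
# and succeed only if every run succeeded.
# Assumption: the command returns a non-zero exit status on failure.
test_documents() {
  failed=0
  for doc in "$@"; do
    if ! $DRUSH ai-automator-pandoc:test "$doc"; then
      echo "conversion failed: $doc" >&2
      failed=$((failed + 1))
    fi
  done
  [ "$failed" -eq 0 ]
}
```

For example, test_documents samples/*.docx samples/*.odt would run the command once per file and report any that fail.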

Related modules

  • AI — required base module and automator framework
  • Doc HTML Chunker — splits the converted HTML into JSON-encoded chunks for AI processing
  • AI Document Proofreader — full node-based AI proofreading workflow using this module for document conversion

Activity

  • Total releases: 2
  • First release: Mar 2026
  • Latest release: 1 month ago
  • Release cadence: 0 days
  • Stability: 0% stable

Releases

Version      Type         Release date
1.0.0-rc1    Pre-release  Mar 9, 2026
1.0.x-dev    Dev          Mar 9, 2026