ai_doc_proofread
AI Document Proofreader lets editors upload Word (.docx) documents and have them automatically reviewed by an AI model. Results are presented as an annotated document where every suggested correction is highlighted inline. Hovering over a suggestion opens a small tooltip showing the original text, the proposed replacement, and the AI's reasoning. Suggestions can be accepted or rejected one by one or in bulk per section.
No external proofreading service account is needed. The module works with any AI provider already configured through the AI module.
Features
- Works with any content type — enable proofreading per content type and map your own field names in the settings UI
- Node-based workflow with a status progress page and a local task tab on every configured document node
- AI proofreading runs as a background Drupal Queue job — the browser can be closed after starting; processing continues on cron
- Ten configurable proofreading criteria: spelling, grammar, consistency, clarity, style, formatting, facts, repetition, flow, and special elements
- Each criterion's AI prompt is fully editable — tailor instructions to your house style or industry
- Inline <del>/<ins> suggestion markup with colour-coded hover tooltips (original in red, suggestion in green, reasoning in italic)
- The status page shows the number of AI suggestions before the reviewer opens the document
- Accept or reject suggestions individually via the tooltip, or in bulk per document section
- Accepted HTML is automatically exported to DOCX via pandoc after saving the review and optionally attached to a configured file field on the node
- Notification emails when proofreading is ready for review and when the export is complete
- Automatic retry logic for transient AI API errors (5xx, rate limits, timeouts)
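The retry behaviour listed above can be pictured with a short sketch. It is written in Python for illustration only and is not the module's own PHP code: the names TransientAIError and call_with_retry, the attempt limit, and the exponential backoff schedule are all assumptions, since the feature list only states that transient errors (5xx, rate limits, timeouts) are retried.

```python
import time


class TransientAIError(Exception):
    """Hypothetical error type standing in for 5xx responses, rate limits, and timeouts."""


def call_with_retry(request, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call request() and retry with exponential backoff on transient errors.

    max_attempts and base_delay are illustrative defaults, not module settings.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except TransientAIError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the error to the caller
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing sleep as a parameter keeps the sketch easy to exercise without real waiting; a queue worker would more likely release the item and let a later run pick it up.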
Requirements
- Drupal 10.4, 11, or 12
- AI module with at least one configured AI provider (OpenAI, Anthropic, Ollama, …) and the AI Automators sub-module enabled
- ai_automator_pandoc — converts uploaded Word files to HTML
- doc_html_chunker — splits the HTML into chunks for AI processing
- pandoc installed on the server
Installation
drush en ai_automator_pandoc doc_html_chunker ai_doc_proofread
After enabling, complete the configuration steps below.
Step 1 — Configure pandoc
Visit Administration → Content authoring → Pandoc settings (/admin/config/content/pandoc) and enter the full path to the pandoc binary on your server. Run which pandoc in a terminal to find the correct path.
Step 2 — Configure the AI provider and criteria
Visit Administration → Configuration → AI → AI Document Proofreader (/admin/config/ai/proofread) to:
- Select the AI provider and model to use for proofreading
- Enable and customise the proofreading criteria
- Enable proofreading for one or more content types and map each required field
Step 3 — Set up a content type and its fields
Create or use an existing content type and add the following fields (field names are freely chosen — you map them in the settings UI):
- A File field — the uploaded Word document
- A Text (long) field — HTML produced by the Pandoc automator
- A Text (long) field — JSON chunk array produced by the Chunker automator
- A Text (long) field for the annotated HTML with <del>/<ins> markup
- A Text (plain) or List (text) field for the workflow status tracker
- Optionally, a File field — the exported DOCX will be attached here after review
Go back to /admin/config/ai/proofread, enable your content type in the Content types section, and map each field using the dropdowns.
Step 4 — Configure two AI Automators on the content type
Open the field automator settings for your content type and add two automators:
Automator 1 — Pandoc: Word to HTML
- Automator type: Pandoc: Word to HTML (provided by ai_automator_pandoc)
- Source field: the File field (input)
- Target field: the HTML content field
- Recommended options: Output format: html5; Text wrapping: none
When this automator runs it populates the HTML field and the workflow status advances to word_to_html.
Automator 2 — Doc HTML Chunker
- Automator type: Doc HTML Chunker (provided by doc_html_chunker)
- Source field: the HTML content field
- Target field: the chunked content field
- Recommended options: Algorithm: By heading, then size; Max chunk size: 3000
When this automator runs it populates the chunk field, the workflow status advances to chunking, and the node is automatically queued for AI proofreading — no manual trigger is needed.
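To give a feel for what a "By heading, then size" strategy might do, here is a rough Python sketch. It is not the doc_html_chunker implementation: the function name chunk_html, the regular expressions, and the choice to split oversized sections at paragraph ends are assumptions made for illustration. Only the heading-first order, the size limit, and the JSON array output come from the description above.

```python
import json
import re

# Zero-width split points: just before a heading tag, just after a closing </p>.
HEADING = re.compile(r"(?=<h[1-6][\s>])", re.IGNORECASE)
BLOCK_END = re.compile(r"(?<=</p>)", re.IGNORECASE)


def chunk_html(html, max_size=3000):
    """Split at headings first, then pack paragraphs of oversized sections."""
    chunks = []
    for section in HEADING.split(html):
        if not section.strip():
            continue
        if len(section) <= max_size:
            chunks.append(section)
            continue
        # Section too large: add whole paragraphs until the limit would be exceeded.
        # A single paragraph longer than max_size is kept whole in this sketch.
        current = ""
        for block in BLOCK_END.split(section):
            if current and len(current) + len(block) > max_size:
                chunks.append(current)
                current = ""
            current += block
        if current.strip():
            chunks.append(current)
    return chunks


sample = "<h1>A</h1><p>one</p><h2>B</h2><p>two</p><p>three</p>"
chunk_json = json.dumps(chunk_html(sample, max_size=20))  # JSON chunk array
```

Splitting at headings first keeps each chunk aligned with a document section, which is presumably why the size limit is applied only as a second step.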
Workflow
- Upload: create a node with a Word file; saving triggers the Pandoc automator
- Convert: the Pandoc automator converts the file to HTML; status → word_to_html
- Chunk: the HTML Chunker splits the HTML into sections; status → chunking
- Proofread: the node is queued automatically; run drush queue:run ai_doc_proofread_nodes or wait for cron; status → ready
- Review: visit /node/{nid}/ai_proofread, see how many suggestions were found, open the review page and accept or reject each one; click Save & finish; status → exported
- Download: the accepted HTML is exported to DOCX automatically and attached to the output file field (if configured); download from /node/{nid}/ai_proofread/download
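The effect of accepting or rejecting suggestions during the Review step can be illustrated with a small Python sketch. It assumes each suggestion is stored as an adjacent <del>original</del><ins>replacement</ins> pair, which is a guess at the markup rather than something the module documents. The function name resolve and the rule that undecided suggestions count as rejected are likewise hypothetical.

```python
import re

# One suggestion: the original text in <del>, the proposed text in <ins>.
PAIR = re.compile(r"<del[^>]*>(.*?)</del>\s*<ins[^>]*>(.*?)</ins>", re.DOTALL)


def resolve(html, decisions):
    """Apply one decision per suggestion, in document order.

    True keeps the <ins> text (accept); False restores the <del> text (reject).
    Suggestions without a decision are treated as rejected in this sketch.
    """
    remaining = iter(decisions)

    def pick(match):
        original, suggestion = match.group(1), match.group(2)
        return suggestion if next(remaining, False) else original

    return PAIR.sub(pick, html)


annotated = "<p><del>Teh</del><ins>The</ins> cat <del>sat</del><ins>sits</ins>.</p>"
clean = resolve(annotated, [True, False])  # accept the first, reject the second
```

Once every pair is resolved, the result is plain HTML with no suggestion markup left, which is the form that would then be handed to pandoc for the DOCX export.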
Drush
drush queue:run ai_doc_proofread_nodes
Processes all queued proofreading jobs immediately without waiting for cron.
Related modules
- AI — required base module and automator framework
- AI Automator: Pandoc — Word to HTML conversion automator
- Doc HTML Chunker — HTML chunking automator