
AI Audio Translator translates audio and video media files into selected languages using AI-powered speech recognition, translation, and voice synthesis. Upload an audio file or video, select a target language, and get a fully translated version.

AI Disclosure

Note: this module was created with the help of AI agents.

Translated videos are not perfect; for example, the translated audio might not sync properly with the video.

Features

  • Audio Translation Pipeline: Transcribes audio → translates text → generates speech in target language
  • Video Translation with Timing: Keeps the original video and replaces its audio with a translated track, processing speech segment by segment to stay close to the original timing
  • Flexible AI Provider Support: Works with any AI module provider (currently tested only with Gemini)
  • One-Click Translation: "Translate" button appears directly on media entity operations
  • Format Support: Audio (MP3, WAV, OGG, FLAC, M4A) and Video (MP4, WebM, MOV, AVI)
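
The three-stage audio pipeline listed above can be sketched as plain function composition. This is only an illustration in Python, not the module's code: `translate_audio`, `TranslationResult`, and the three `fake_*` providers are hypothetical names, and the stand-ins return dummy values so the sketch runs without any AI service.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TranslationResult:
    transcript: str   # text recognized in the source audio
    translation: str  # transcript rendered in the target language
    audio: bytes      # synthesized speech for the translation


def translate_audio(
    audio: bytes,
    target_language: str,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> TranslationResult:
    """Run the three stages in order, feeding each output to the next stage."""
    transcript = transcribe(audio)
    translation = translate(transcript, target_language)
    speech = synthesize(translation, target_language)
    return TranslationResult(transcript, translation, speech)


# Hypothetical stand-in providers; a real setup would call an AI service here.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_translate(text: str, language: str) -> str:
    return f"[{language}] {text}"


def fake_synthesize(text: str, language: str) -> bytes:
    return text.encode("utf-8")


result = translate_audio(
    b"hello", "Spanish", fake_transcribe, fake_translate, fake_synthesize
)
```

Passing the stages in as callables mirrors the idea that each operation can be served by a different provider.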

Post-Installation

  1. Configure AI Providers: Go to /admin/config/ai/providers and set up at least one provider supporting Speech-to-Text, Chat (for translation), and Text-to-Speech
  2. Create Language Vocabulary: Create a taxonomy vocabulary (e.g., "Languages") with terms like "Spanish", "Hindi", "Sanskrit"
  3. Module Settings: Visit /admin/config/ai/audio-translator and:
    • Select your language vocabulary
    • Optionally customize the translation prompt
    • Optionally override AI providers for specific operations
    • Configure FFmpeg path if not auto-detected (video translation only)
  4. Translate Media: Go to any audio/video media entity, click "Translate" in operations, select a language, and submit

Configuration Tips:

  • For video translation, ensure FFmpeg is installed on your server (see Status Report)
  • Test with short audio files first to verify AI provider configuration
  • Monitor the queue at /admin/config/ai/audio-translator
  • Check watchdog logs for translation errors: /admin/reports/dblog

Additional Requirements

Required Drupal Modules:

  • AI (^1.0): Provides AI provider management and operation types
  • Media (core): For media entity management
  • Taxonomy (core): For language vocabulary

Required AI Provider Capabilities:

  • Speech-to-Text (transcription)
  • Chat or Text Generation (translation)
  • Text-to-Speech (voice synthesis)

System Requirements:

  • PHP 8.1 or higher
  • Drupal 10.4+ or 11.x
  • FFmpeg (required for video translation only; audio works without it)
    • Ubuntu/Debian: sudo apt-get install ffmpeg
    • DDEV: Add webimage_extra_packages: [ffmpeg] to .ddev/config.yaml
    • Verify: ffmpeg -version
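
The `ffmpeg -version` check above can also be scripted. The following is a minimal Python sketch of that check, not how the module itself detects FFmpeg; the helper name `find_ffmpeg` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def find_ffmpeg(binary: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<binary> -version`, or None if unavailable."""
    path = shutil.which(binary)  # look the binary up on PATH
    if path is None:
        return None
    try:
        completed = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a missing or broken binary, a non-zero exit, and a timeout.
        return None
    lines = completed.stdout.splitlines()
    return lines[0] if lines else None
```

Calling `find_ffmpeg()` returns the version banner line when FFmpeg is installed and `None` otherwise, which is enough to decide whether video translation can be offered.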

Technical Details

Video Translation Pipeline:

  1. Extract audio track from video using FFmpeg
  2. Detect speech segments via silence analysis
  3. Transcribe each segment with AI speech-to-text
  4. Translate segments with duration constraints to maintain timing
  5. Generate TTS audio for each translated segment
  6. Time-stretch segments to match original duration (prevents desync)
  7. Rebuild audio timeline with proper timing
  8. Mux translated audio back into original video
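
Steps 2 and 6 above can be illustrated with two small helpers. This Python sketch assumes the silence analysis uses FFmpeg's `silencedetect` filter and the time-stretch uses its `atempo` filter; the source does not say which filters the module uses, and the function names are hypothetical. The `atempo` chain conservatively keeps each filter instance within 0.5 to 2.0.

```python
import re
from typing import List, Tuple

SILENCE_START = re.compile(r"silence_start:\s*(-?\d+(?:\.\d+)?)")
SILENCE_END = re.compile(r"silence_end:\s*(-?\d+(?:\.\d+)?)")


def speech_segments(log: str, total_duration: float) -> List[Tuple[float, float]]:
    """Invert the silent intervals of a silencedetect log into speech intervals."""
    starts = [float(value) for value in SILENCE_START.findall(log)]
    ends = [float(value) for value in SILENCE_END.findall(log)]
    segments = []
    cursor = 0.0  # end of the most recent silence
    for index, start in enumerate(starts):
        start = max(start, 0.0)  # the filter can report slightly negative starts
        if start > cursor:
            segments.append((cursor, start))
        # A silence running to the end of the file has no matching end line.
        cursor = ends[index] if index < len(ends) else total_duration
    if cursor < total_duration:
        segments.append((cursor, total_duration))
    return segments


def atempo_chain(generated: float, target: float) -> str:
    """Build an atempo filter string that fits `generated` seconds into `target`."""
    if generated <= 0 or target <= 0:
        raise ValueError("durations must be positive")
    factor = generated / target  # above 1 speeds up, below 1 slows down
    steps = []
    while factor > 2.0:
        steps.append(2.0)
        factor /= 2.0
    while factor < 0.5:
        steps.append(0.5)
        factor /= 0.5
    steps.append(factor)
    return ",".join(f"atempo={step:.4f}" for step in steps)


sample_log = """
[silencedetect @ 0x1] silence_start: 2.5
[silencedetect @ 0x1] silence_end: 3.5 | silence_duration: 1
[silencedetect @ 0x1] silence_start: 8
"""
segments = speech_segments(sample_log, total_duration=10.0)
stretch = atempo_chain(generated=6.0, target=2.0)
```

Here a 6-second synthesized clip that must fit a 2-second slot needs a 3x speed-up, which the helper splits into two chained filters. Large factors like this are also why translated speech can sound rushed or drift out of sync.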

File Size Limits:

  • Audio: 25 MB maximum
  • Video: 500 MB maximum
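
A pre-upload check against the supported formats and these limits might look like the Python sketch below. It is illustrative only: `classify_media` is a hypothetical helper, and treating 1 MB as 1024 × 1024 bytes is an assumption, since the source does not define the unit.

```python
from pathlib import Path

AUDIO_EXTENSIONS = {"mp3", "wav", "ogg", "flac", "m4a"}
VIDEO_EXTENSIONS = {"mp4", "webm", "mov", "avi"}

# Assumption: 1 MB = 1024 * 1024 bytes.
MAX_BYTES = {"audio": 25 * 1024 * 1024, "video": 500 * 1024 * 1024}


def classify_media(filename: str, size_bytes: int) -> str:
    """Return "audio" or "video", or raise ValueError if the file is not accepted."""
    extension = Path(filename).suffix.lower().lstrip(".")
    if extension in AUDIO_EXTENSIONS:
        kind = "audio"
    elif extension in VIDEO_EXTENSIONS:
        kind = "video"
    else:
        raise ValueError(f"unsupported format: {extension or 'none'}")
    if size_bytes > MAX_BYTES[kind]:
        raise ValueError(f"{kind} file exceeds {MAX_BYTES[kind]} bytes")
    return kind
```

For example, `classify_media("talk.MP3", 1_000_000)` returns `"audio"`, while a 30 MB MP3 or a `.txt` file raises `ValueError`.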

Activity

  • Total releases: 2
  • First release: Apr 2026
  • Latest release: 1 month ago
  • Release cadence: 1 day
  • Stability: 0% stable

Releases

Version        Type          Release date
1.0.0-alpha1   Pre-release   Apr 15, 2026
0.1.0-rc1      Pre-release   Apr 14, 2026