AI Audio Translator translates audio and video media files into a selected language using AI-powered speech recognition, translation, and voice synthesis. Upload an audio or video file, select a target language, and get a translated version.

AI Disclosure

This module was created with the help of AI agents.

Translated videos are not perfect. For example, the translated audio might not sync properly with the video.

Features

Audio Translation Pipeline: Transcribes audio → translates text → generates speech in target language
Video Translation with Timing: Keeps the original video and replaces its audio track with the translated one, processing speech segment by segment so the translated audio follows the original timing as closely as possible
Flexible AI Provider Support: Works with any AI module provider (currently tested only with Gemini)
One-Click Translation: "Translate" button appears directly on media entity operations
Format Support: Audio (MP3, WAV, OGG, FLAC, M4A) and Video (MP4, WebM, MOV, AVI)
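
The Audio Translation Pipeline feature above is three stages run in order. The following is a minimal Python sketch of that flow, not the module's actual code (the module itself runs inside Drupal). The names translate_audio, transcribe, translate, and synthesize are hypothetical stand-ins for whatever speech-to-text, chat, and text-to-speech calls the configured AI provider exposes.

```python
from typing import Callable


def translate_audio(
    audio: bytes,
    target_language: str,
    transcribe: Callable[[bytes], str],
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
) -> bytes:
    """Run the three stages in order.

    All three callables are hypothetical placeholders for provider calls:
    transcribe(audio) -> text, translate(text, language) -> text,
    synthesize(text, language) -> audio bytes.
    """
    text = transcribe(audio)                          # speech-to-text
    translated = translate(text, target_language)     # text translation
    return synthesize(translated, target_language)    # text-to-speech
```

Passing the stages in as callables keeps the sketch independent of any particular provider, which mirrors the module's option to override the provider per operation.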

Post-Installation

Configure AI Providers: Go to /admin/config/ai/providers and set up at least one provider supporting Speech-to-Text, Chat (for translation), and Text-to-Speech
Create Language Vocabulary: Create a taxonomy vocabulary (e.g., "Languages") with terms like "Spanish", "Hindi", "Sanskrit"
Module Settings: Visit /admin/config/ai/audio-translator and:
- Select your language vocabulary
- Optionally customize the translation prompt
- Optionally override AI providers for specific operations
- Configure FFmpeg path if not auto-detected (video translation only)
Translate Media: Go to any audio or video media entity, click "Translate" in its operations, select a language, and submit

Configuration Tips:

For video translation, ensure FFmpeg is installed on your server (see Status Report)
Test with short audio files first to verify AI provider configuration
Monitor the queue at /admin/config/ai/audio-translator
Check watchdog logs for translation errors: /admin/reports/dblog

Additional Requirements

Required Drupal Modules:

AI (^1.0): Provides AI provider management and operation types
Media (core): For media entity management
Taxonomy (core): For language vocabulary

Required AI Provider Capabilities:

Speech-to-Text (transcription)
Chat or Text Generation (translation)
Text-to-Speech (voice synthesis)

System Requirements:

PHP 8.1 or higher
Drupal 10.4+ or 11.x
FFmpeg (required for video translation only; audio works without it)
- Ubuntu/Debian: sudo apt-get install ffmpeg
- DDEV: Add webimage_extra_packages: [ffmpeg] to .ddev/config.yaml
- Verify: ffmpeg -version
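
The verification step above can also be scripted, for example as part of a deployment check. This is a small Python sketch under the assumption that the binary is named ffmpeg and is on the PATH. The function name ffmpeg_version is hypothetical and is not part of the module.

```python
import shutil
import subprocess
from typing import Optional


def ffmpeg_version(binary: str = "ffmpeg") -> Optional[str]:
    """Return the first line of `<binary> -version`, or None if unavailable."""
    path = shutil.which(binary)  # look the binary up on PATH
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            timeout=10,
            check=True,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a non-zero exit status, a timeout, or a failure to launch.
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


if __name__ == "__main__":
    print(ffmpeg_version() or "FFmpeg not found; video translation will not work")
```

A return value of None means only that this script could not run FFmpeg. The module's own detection is reported in the Status Report.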

Technical Details

Video Translation Pipeline:

Extract audio track from video using FFmpeg
Detect speech segments via silence analysis
Transcribe each segment with AI speech-to-text
Translate segments with duration constraints to maintain timing
Generate TTS audio for each translated segment
Time-stretch segments to match original duration (prevents desync)
Rebuild audio timeline with proper timing
Mux translated audio back into original video
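
Two of the steps above are simple arithmetic, shown in the Python sketch below: turning detected silences into speech segments, and working out how much to speed up or slow down a generated clip so it fits its original slot. The source does not say how the module does either, so this is an illustration under assumptions. It assumes the silence intervals come from something like FFmpeg's silencedetect filter and that stretching uses FFmpeg's atempo filter, chained in steps between 0.5 and 2.0. The names speech_segments, atempo_chain, and min_len are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_seconds, end_seconds)


def speech_segments(
    silences: List[Interval], total: float, min_len: float = 0.2
) -> List[Interval]:
    """Invert sorted, non-overlapping silence intervals into speech intervals.

    Gaps shorter than min_len seconds are dropped (an assumed threshold).
    """
    segments: List[Interval] = []
    cursor = 0.0
    for start, end in silences:
        if start - cursor >= min_len:
            segments.append((cursor, start))
        cursor = max(cursor, end)
    if total - cursor >= min_len:
        segments.append((cursor, total))
    return segments


def atempo_chain(tts_duration: float, target_duration: float) -> List[float]:
    """Split the required tempo factor into steps within [0.5, 2.0].

    A factor above 1 speeds the clip up; below 1 slows it down.
    The product of the returned steps equals tts_duration / target_duration.
    """
    if tts_duration <= 0 or target_duration <= 0:
        raise ValueError("durations must be positive")
    factor = tts_duration / target_duration
    chain: List[float] = []
    while factor > 2.0:
        chain.append(2.0)
        factor /= 2.0
    while factor < 0.5:
        chain.append(0.5)
        factor /= 0.5
    chain.append(factor)
    return chain


if __name__ == "__main__":
    print(speech_segments([(2.0, 3.0), (5.0, 5.5)], total=8.0))
    # A 6-second TTS clip that must fit a 2-second slot needs a 3x speed-up.
    steps = atempo_chain(6.0, 2.0)
    print(",".join(f"atempo={s:.4f}" for s in steps))
```

Large factors audibly distort speech, which is one reason the translation step constrains the length of the translated text before any stretching is applied.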

File Size Limits:

Audio: 25 MB maximum
Video: 500 MB maximum
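
When preparing files in bulk, the limits above can be checked before upload. This is a minimal Python sketch, assuming that "MB" means 1024 × 1024 bytes (the source does not say whether binary or decimal megabytes are meant). The names MAX_BYTES and within_limit are hypothetical.

```python
# Assumed interpretation: 1 MB = 1024 * 1024 bytes.
MAX_BYTES = {
    "audio": 25 * 1024 * 1024,
    "video": 500 * 1024 * 1024,
}


def within_limit(kind: str, size_bytes: int) -> bool:
    """Return True if a file of the given kind and size fits the stated limit."""
    if kind not in MAX_BYTES:
        raise ValueError(f"unknown media kind: {kind!r}")
    return 0 <= size_bytes <= MAX_BYTES[kind]
```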

Version	Type	Release date
1.0.0-alpha1	Pre-release	Apr 15, 2026
0.1.0-rc1	Pre-release	Apr 14, 2026
