# MarkItDown
MarkItDown is a Drupal 10 module that converts PDF, DOCX, XLSX, and other file formats to Markdown. This module uses the open-source tool [markitdown](https://github.com/microsoft/markitdown) developed by Microsoft for file conversion.
## Features
- Supports converting PDF, DOCX, XLSX, PPT, and other file formats to Markdown.
- Provides two output modes: text mode and file mode.
- Offers a "Convert to Markdown" operation on the file details page.
- Configurable markitdown tool path.
## Dependencies
This module depends on the following tool:
- **markitdown**: The file conversion tool developed by Microsoft.
### Installing markitdown
To install markitdown using pip:

```shell
pip install 'markitdown[all]~=0.1.0a1'
```

Alternatively, you can install from source:

```shell
git clone git@github.com:microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'
```

Ensure that the `markitdown` command is available in your system's PATH, or specify the full path in the module's settings.
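A quick way to confirm the requirement above is to ask the shell where the command resolves. This is a minimal sketch; the function name `check_markitdown` is hypothetical and is not part of this module or of markitdown.

```shell
# Print where a command resolves on PATH, or report that it is missing.
# The function name check_markitdown is illustrative only.
check_markitdown() {
  tool="${1:-markitdown}"   # optional argument: command name or full path
  if resolved="$(command -v "$tool" 2>/dev/null)"; then
    echo "found: $resolved"
  else
    echo "not found: $tool"
    return 1
  fi
}

# If the lookup fails, install markitdown or set the full path in the module settings.
check_markitdown || echo "markitdown is not on PATH for this user"
```

The PATH seen by your login shell may differ from the one your web server uses, so if this check succeeds but the module still cannot find the tool, specifying the full path in the settings is the safer option.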
## Installation
1. Place the module in the `/web/modules/` directory.
2. Enable the module in the Drupal administration interface.
3. Visit `/admin/config/content/markitdown` to configure module settings.
## Usage
### Basic Usage
1. Upload a file to Drupal.
2. Visit the file details page.
3. Click the "Convert to Markdown" button.
4. By default, the conversion result will be displayed directly in the browser.
### Output Modes
The module supports two output modes:
1. **Text Mode**: Returns the converted Markdown content directly.
   - Use the URL parameter: `?output_mode=text` (default).
2. **File Mode**: Saves the conversion result to a specified path.
   - Use the URL parameters: `?output_mode=file&output_path=/path/to/save/file.md`.
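For orientation, the two modes correspond roughly to the two ways the markitdown command-line tool itself can emit output: to standard output, or to a file with `-o`. The wrapper below is only a sketch, under the assumption that the module invokes the CLI in a similar way; `convert_to_markdown` and `MARKITDOWN_BIN` are hypothetical names and are not part of the module.

```shell
# Hypothetical wrapper mirroring the module's two output modes on the command line.
# MARKITDOWN_BIN is an assumed variable holding the tool path; it defaults to "markitdown".
convert_to_markdown() {
  input="$1"
  mode="${2:-text}"      # "text" is the default, as in the module
  output="${3:-}"        # only used in file mode
  bin="${MARKITDOWN_BIN:-markitdown}"

  case "$mode" in
    text)
      # Text mode: the Markdown goes to standard output.
      "$bin" "$input"
      ;;
    file)
      if [ -z "$output" ]; then
        echo "file mode requires an output path" >&2
        return 2
      fi
      # File mode: the Markdown is written to the given path.
      "$bin" "$input" -o "$output"
      ;;
    *)
      echo "unknown output mode: $mode" >&2
      return 2
      ;;
  esac
}

# Example calls (paths are placeholders):
#   convert_to_markdown /path/to/file.pdf
#   convert_to_markdown /path/to/file.pdf file /path/to/save/file.md
```

Running the same conversion by hand like this can help separate problems in the tool itself from problems in the Drupal integration.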
### API Usage
You can also use the MarkdownConverter service in your own code:
```php
// Get the service.
$converter = \Drupal::service('markitdown.converter');

// Text mode: returns the converted content.
$markdown = $converter->convertToMarkdown('/path/to/file.pdf', [
  'output_mode' => 'text',
]);

// File mode: saves the result to a file.
$result = $converter->convertToMarkdown('/path/to/file.pdf', [
  'output_mode' => 'file',
  'output_path' => '/path/to/save/output.md',
]);
```

## Permissions
- **administer markitdown**: Allows users to configure module settings.
- **convert files to markdown**: Allows users to convert files to Markdown.
## Troubleshooting
If the conversion fails, check:
1. Whether the markitdown tool is correctly installed.
2. Whether the markitdown path is correctly configured in the module settings.
3. Whether the web server user that runs Drupal has permission to execute the markitdown command.
4. Whether the Drupal logs contain related error messages.
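The first three checks can be partly automated from a shell. The sketch below classifies a given tool path; `diagnose_tool` and the example path are hypothetical. The executable test applies to the user running the script, so for a meaningful result run it as the account your web server uses.

```shell
# Classify a markitdown path as missing, not-executable, or ok.
# diagnose_tool is an illustrative helper, not part of the module.
diagnose_tool() {
  tool_path="$1"
  if [ ! -f "$tool_path" ]; then
    echo "missing"
  elif [ ! -x "$tool_path" ]; then
    echo "not-executable"
  else
    echo "ok"
  fi
}

# Placeholder path; substitute the value configured in the module settings.
diagnose_tool "/usr/local/bin/markitdown"
```

For the fourth check, if Drush is available and the Database Logging module is enabled, `drush watchdog:show` lists recent log entries.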