llm_runtime_monitor
LLM Runtime Monitor provides a lightweight Drupal 11 dashboard for monitoring external and self-hosted LLM runtimes.
It helps developers and administrators check runtime availability, latency, model information, slot activity, throughput, and context usage from a configured LLM server.
Features
- Monitor external LLM runtime endpoints.
- View runtime health, latency, model name, slot usage, throughput, and context usage.
- Collect background samples at configurable intervals.
- Display live dashboard updates while the dashboard is open.
- Use runtime metrics when supported by the provider.
- Protect administration pages with permissions.
- Adapter-based architecture for runtime providers.
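The adapter idea in the last item can be illustrated with a minimal sketch. This is not the module's code (the module targets Drupal and would be written in PHP); it is a Python illustration of the pattern only. All class, field, and key names are hypothetical. The sketch also assumes that llama-server answers on `/health` with HTTP 200 when ready and that Ollama's `/api/tags` returns installed models under a `models` key.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class RuntimeStatus:
    """Normalized status a dashboard could display (hypothetical shape)."""
    healthy: bool
    model: Optional[str] = None


class RuntimeAdapter(Protocol):
    """Hypothetical adapter interface: one implementation per runtime provider."""
    health_path: str

    def parse_health(self, http_status: int, body: dict) -> RuntimeStatus: ...


class LlamaCppAdapter:
    # Assumption: llama-server's /health endpoint returns HTTP 200 when ready.
    health_path = "/health"

    def parse_health(self, http_status: int, body: dict) -> RuntimeStatus:
        return RuntimeStatus(healthy=(http_status == 200))


class OllamaAdapter:
    # Assumption: Ollama's /api/tags lists installed models under "models".
    health_path = "/api/tags"

    def parse_health(self, http_status: int, body: dict) -> RuntimeStatus:
        models = body.get("models", [])
        first = models[0].get("name") if models else None
        return RuntimeStatus(healthy=(http_status == 200), model=first)


# Hypothetical registry keyed by provider name, so the dashboard code
# never needs to know which runtime it is talking to.
ADAPTERS = {"llama_cpp": LlamaCppAdapter(), "ollama": OllamaAdapter()}
```

With this shape, supporting another HTTP-based runtime would mean adding one class and one registry entry, without touching the code that renders the status.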
Supported runtimes
- llama.cpp / llama-server
- Ollama
Additional HTTP-based runtime adapters may be added in future releases.
How to use
- Install and enable the module.
- Configure the runtime provider and base URL.
- Set the health endpoint, metrics endpoint, timeout, polling interval, and retention period.
- Save the configuration.
- Open the dashboard to monitor current runtime status and recent samples.
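To make the settings in the steps above concrete, here is a rough Python sketch of how a polling interval and a retention period might interact. It is an illustration only, not the module's implementation. Every field name, default value, and the sample format (a dict with a `timestamp` key, in seconds) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class MonitorSettings:
    # All field names and defaults below are hypothetical.
    provider: str = "llama_cpp"
    base_url: str = "http://localhost:8080"
    health_endpoint: str = "/health"
    metrics_endpoint: str = "/metrics"
    timeout_seconds: float = 5.0
    polling_interval_seconds: int = 60
    retention_seconds: int = 86400


def is_sample_due(last_sample_at: float, now: float, settings: MonitorSettings) -> bool:
    """Return True once at least one polling interval has elapsed."""
    return now - last_sample_at >= settings.polling_interval_seconds


def prune_samples(samples: list, now: float, settings: MonitorSettings) -> list:
    """Keep only samples whose timestamp falls inside the retention window."""
    cutoff = now - settings.retention_seconds
    return [s for s in samples if s["timestamp"] >= cutoff]
```

A shorter polling interval gives a finer-grained history but stores more samples for the same retention period, so the two values are best chosen together.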
Example base URL:
http://localhost:8080
For containerized environments, use a hostname that is reachable from the Drupal application.
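Before saving a base URL, it can help to confirm that the runtime answers from the machine or container that runs Drupal. The following Python sketch measures reachability and latency for a single request. It is a standalone illustration, not part of the module. The `probe` function, its parameters, and the returned keys are hypothetical, and the default `/health` path assumes a llama-server style endpoint.

```python
import time
import urllib.error
import urllib.request


def probe(base_url, path="/health", timeout=5.0, opener=urllib.request.urlopen):
    """Send one GET request and report HTTP status and latency.

    `opener` is injectable so the function can be exercised without a network.
    """
    url = base_url.rstrip("/") + path
    started = time.monotonic()
    try:
        with opener(url, timeout=timeout) as response:
            status = response.status
    except urllib.error.HTTPError as error:
        # The server answered, but with an error status such as 503.
        status = error.code
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, or timeout: nothing answered.
        status = None
    latency_ms = (time.monotonic() - started) * 1000.0
    return {
        "url": url,
        "reachable": status is not None,
        "healthy": status == 200,
        "http_status": status,
        "latency_ms": latency_ms,
    }
```

Running `probe("http://localhost:8080")` from inside the Drupal container rather than from the host shows whether the hostname resolves where it matters. If `reachable` is false there, the base URL likely needs a container or service hostname instead of `localhost`.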
Requirements
- Drupal 11
- A reachable LLM runtime endpoint
- Permission to access the monitoring dashboard
Scope
This module is focused on runtime monitoring only.
It does not provide chat completions, prompt storage, response logging, AI provider execution, cost tracking, or a full observability stack.