ai_empathy
AI Empathy Evaluation
The AI Empathy Evaluation module provides a framework for evaluating and improving how AI models handle ethical dilemmas with empathetic reasoning. It integrates with Drupal's AI module to benchmark LLM responses across scenarios using four research-backed metrics.
Summary:
Evaluate AI empathetic decision-making using ethical dilemma scenarios, scoring responses on accuracy, empathy alignment, explanation quality, and consistency.
Features
- Scenario Management — Ships with 20 default ethical dilemma scenarios across four categories (Military, Medical, Emotion-based, Cultural), each with reference decisions and difficulty levels (1–5). Create your own scenarios through the admin UI.
- AI-Powered Evaluation — Send scenarios to any configured AI provider (OpenAI, Anthropic, etc.) via the Drupal AI module and receive structured responses analysed for empathetic reasoning.
- Four-Metric Scoring — Responses are scored by a second AI model on:
  - Decision Accuracy (0–100%)
  - Empathy Alignment (1–5)
  - Explanation Quality (1–5)
  - Consistency Index (0–100%, computed across multiple runs)
- Human Rating — Allow human raters to independently score AI responses on empathy and explanation quality, providing a corrective layer alongside AI-generated scores.
- Training Mode — A multi-step wizard that presents scenarios in order of increasing difficulty and feeds cumulative feedback from earlier responses into later prompts, so that responses can improve over the course of a session.
- Batch Evaluation — Run multiple scenarios with multiple repetitions via Drupal's Batch API, with clear cost estimates (API calls) before execution.
- Dashboard & Reporting — Overview of aggregate metrics, provider/model comparisons, category breakdowns, and recent results at a glance.
- Configurable Thresholds — Set minimum acceptable scores for each metric to define pass/fail criteria.
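The module's exact formulas are not given on this page, so the following Python sketch is only one plausible reading of the two percentage metrics and of the pre-run cost estimate. It assumes that Decision Accuracy is the share of runs matching the scenario's reference decision, that the Consistency Index is the share of runs agreeing with the most frequent decision, and that each run costs two API calls (one for evaluation, one for scoring). All function names are hypothetical and are not part of the module.

```python
from collections import Counter


def decision_accuracy(decisions, reference):
    """Percentage of runs whose decision equals the reference decision (assumed definition)."""
    if not decisions:
        raise ValueError("at least one run is required")
    matches = sum(1 for decision in decisions if decision == reference)
    return 100.0 * matches / len(decisions)


def consistency_index(decisions):
    """Percentage of runs that agree with the most frequent decision (assumed definition)."""
    if not decisions:
        raise ValueError("at least one run is required")
    most_common_count = Counter(decisions).most_common(1)[0][1]
    return 100.0 * most_common_count / len(decisions)


def estimated_api_calls(scenario_count, repetitions, calls_per_run=2):
    """Rough call count for a batch; calls_per_run=2 assumes one evaluation and one scoring call."""
    return scenario_count * repetitions * calls_per_run


if __name__ == "__main__":
    runs = ["treat", "treat", "wait", "treat"]  # decisions from four repeated runs
    print(decision_accuracy(runs, "treat"))     # 75.0
    print(consistency_index(runs))              # 75.0
    print(estimated_api_calls(20, 3))           # 120
```

Under these assumptions, running all 20 default scenarios three times each would cost about 120 API calls, which is the kind of figure worth checking before starting a batch.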
Use cases:
- Researchers benchmarking LLM empathy capabilities across providers
- Organisations evaluating AI models before deploying them in sensitive contexts (healthcare, counselling, support)
- Educators using scenarios to teach ethical decision-making with AI
- Developers testing prompt engineering strategies for empathetic responses
Post-Installation:
- Enable the module: drush en ai_empathy
- Navigate to Administration > Configuration > AI > AI Empathy Evaluation (/admin/config/ai/empathy)
- Go to the Settings tab and select your AI provider/model for both evaluation and scoring (these can be different models)
- Set your scoring thresholds (defaults: 70% accuracy, 3.0/5 empathy, 3.0/5 quality, 75% consistency)
- Visit the Scenarios tab to review the 20 pre-installed scenarios or add your own
- Use Run Evaluation to batch-evaluate scenarios, or Training Mode to step through them interactively
- View results on the Dashboard and Results tabs
- Use the Rate action on any result to submit human ratings
All configuration lives under /admin/config/ai/empathy with local task tabs for each section.
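As an illustration of how the default thresholds could translate into a pass/fail decision, here is a minimal Python sketch. It assumes that a result passes a metric when its score meets or exceeds the configured minimum; the class, function, and key names are hypothetical and do not reflect the module's internal code or configuration schema.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Thresholds:
    """Minimum acceptable scores, using the defaults listed in the steps above."""
    accuracy: float = 70.0     # percent
    empathy: float = 3.0       # out of 5
    quality: float = 3.0       # out of 5
    consistency: float = 75.0  # percent


def failed_metrics(scores, thresholds=Thresholds()):
    """Return the names of metrics whose score is below its minimum (assumed rule: score >= minimum passes)."""
    return [
        name
        for name, minimum in asdict(thresholds).items()
        if scores[name] < minimum
    ]


if __name__ == "__main__":
    result = {"accuracy": 80.0, "empathy": 2.5, "quality": 4.0, "consistency": 75.0}
    failures = failed_metrics(result)
    print("PASS" if not failures else f"FAIL: {failures}")  # FAIL: ['empathy']
```

Returning the list of failing metrics, rather than a single boolean, makes it easier to see which threshold a model missed.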
Additional Requirements:
- AI module (v1.3+) — Provides the provider abstraction layer. This is the only hard dependency.
- At least one AI provider module — e.g., OpenAI Provider, or any other provider compatible with the AI module's chat operation type.
- Key module — Required by the AI module for secure API key storage.
Supporting this Module
This module is based on the research paper "Evaluating Empathetic Decision-Making in AI". If you find it useful, consider:
- Contributing scenarios, translations, or code improvements via the issue queue
- Citing the research paper in academic work that uses this module
- Sharing your evaluation results to help build community benchmarks
Community Documentation
- Research Paper: Evaluating Empathetic Decision-Making in AI (IJFMR)
- Drupal AI Module Documentation: AI module docs