Problem or motivation
Currently, agentevals supports Jaeger JSON and OTLP trace files as inputs. Users must export traces to files before evaluating. This adds friction — especially for teams whose traces already live in an observability backend (Dynatrace, Grafana, Datadog, etc.).
Proposed solution
Add an MCP-based trace loader that pulls traces directly from any observability backend that exposes an MCP server.
Usage
```shell
# Pull traces from Dynatrace via MCP
agentevals run --source mcp \
  --mcp-url "http://localhost:3000" \
  --mcp-tool "execute_dql" \
  --query 'fetch spans | filter service.name == "my-agent" | sort start_time desc | limit 100' \
  --eval-set my-eval.json

# Pull from any MCP server
agentevals run --source mcp \
  --mcp-url "http://grafana-mcp:8080" \
  --mcp-tool "query_traces" \
  --query '...' \
  --eval-set my-eval.json
```
Why MCP?
MCP (Model Context Protocol) is becoming the standard for AI tool integration, and multiple observability vendors already ship MCP servers.
By supporting MCP as a trace source, agentevals becomes instantly compatible with any backend that has an MCP server — without writing vendor-specific integrations.
Implementation
New file: `src/agentevals/loader/mcp.py`
The loader would:
- Connect to the MCP server via HTTP
- Call the specified tool with the query
- Parse the response as OTLP spans
- Convert to agentevals internal trace format
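The steps above could be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the JSON-RPC `tools/call` request shape follows the MCP spec, but the exact HTTP endpoint behavior of a given MCP server, the argument name `query`, and the internal trace format shown here are assumptions.

```python
# Hypothetical sketch of src/agentevals/loader/mcp.py (stdlib only).
# The internal trace schema ({trace_id: [span_dict, ...]}) is an assumption.
import json
import urllib.request


def call_mcp_tool(mcp_url: str, tool: str, query: str) -> dict:
    """Call an MCP tool over HTTP with a JSON-RPC 2.0 'tools/call' request."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        # The argument key ("query") depends on the tool's input schema.
        "params": {"name": tool, "arguments": {"query": query}},
    }
    req = urllib.request.Request(
        mcp_url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def otlp_spans_to_traces(otlp: dict) -> dict:
    """Group spans from an OTLP/JSON payload by trace ID.

    Walks resourceSpans -> scopeSpans -> spans and emits a flat
    {trace_id: [span, ...]} map as a stand-in for the internal format.
    """
    traces: dict = {}
    for resource_spans in otlp.get("resourceSpans", []):
        for scope_spans in resource_spans.get("scopeSpans", []):
            for span in scope_spans.get("spans", []):
                traces.setdefault(span["traceId"], []).append(
                    {
                        "span_id": span["spanId"],
                        "parent_span_id": span.get("parentSpanId"),
                        "name": span["name"],
                        # OTLP/JSON encodes nanosecond timestamps as strings.
                        "start": int(span["startTimeUnixNano"]),
                        "end": int(span["endTimeUnixNano"]),
                    }
                )
    return traces
```

A loader entry point would then just chain the two: call the tool, pull the OTLP payload out of the tool result, and hand `otlp_spans_to_traces` output to the evaluator.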
Alternatives considered
No response
Additional context
Benefits
- No file export needed — evaluate directly from production traces
- Vendor-agnostic — works with any MCP-compatible backend
- CI/CD ready — run `agentevals run --source mcp ...` in pipelines
- Real-time — evaluate the latest traces, not stale exports
Human confirmation