Feature — MCP-based trace source for observability backends #67

@henrikrexed

Problem or motivation

Currently, agentevals supports Jaeger JSON and OTLP trace files as inputs. Users must export traces to files before evaluating. This adds friction — especially for teams whose traces already live in an observability backend (Dynatrace, Grafana, Datadog, etc.).

Proposed solution

Add an MCP-based trace loader that pulls traces directly from any observability backend that exposes an MCP server.

Usage

# Pull traces from Dynatrace via MCP
agentevals run --source mcp \
  --mcp-url "http://localhost:3000" \
  --mcp-tool "execute_dql" \
  --query 'fetch spans | filter service.name == "my-agent" | sort start_time desc | limit 100' \
  --eval-set my-eval.json

# Pull from any MCP server
agentevals run --source mcp \
  --mcp-url "http://grafana-mcp:8080" \
  --mcp-tool "query_traces" \
  --query '...' \
  --eval-set my-eval.json
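To make the proposed flags concrete, here is a minimal sketch of how they might be parsed and validated. It assumes an argparse-based CLI, which may not match how agentevals actually defines its commands. The `file` choice for `--source` and the helper name `parse_run_args` are hypothetical. Only `--source mcp`, `--mcp-url`, `--mcp-tool`, `--query`, and `--eval-set` come from the usage above.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the proposed `agentevals run` flags (illustrative only)."""
    parser = argparse.ArgumentParser(prog="agentevals")
    sub = parser.add_subparsers(dest="command", required=True)

    run = sub.add_parser("run", help="evaluate traces against an eval set")
    # "file" is an assumed name for the existing file-based source.
    run.add_argument("--source", choices=["file", "mcp"], default="file")
    run.add_argument("--mcp-url", help="base URL of the MCP server")
    run.add_argument("--mcp-tool", help="name of the MCP tool to call")
    run.add_argument("--query", help="backend-specific query passed to the tool")
    run.add_argument("--eval-set", required=True, help="path to the eval set JSON")
    return parser


def parse_run_args(argv: list[str]) -> argparse.Namespace:
    """Parse argv and require the MCP flags when --source mcp is selected."""
    parser = build_parser()
    args = parser.parse_args(argv)
    if args.source == "mcp":
        required = {
            "--mcp-url": args.mcp_url,
            "--mcp-tool": args.mcp_tool,
            "--query": args.query,
        }
        missing = [flag for flag, value in required.items() if not value]
        if missing:
            # parser.error prints usage to stderr and exits with status 2.
            parser.error("--source mcp requires " + ", ".join(missing))
    return args
```

Validating the three MCP flags together means a misconfigured pipeline fails before any network call is made.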

Why MCP?

MCP (Model Context Protocol) is becoming the standard for AI tool integration, and multiple observability vendors already have MCP servers.

By supporting MCP as a trace source, agentevals becomes instantly compatible with any backend that has an MCP server — without writing vendor-specific integrations.

Implementation

New file: src/agentevals/loader/mcp.py

The loader would:

  1. Connect to the MCP server via HTTP
  2. Call the specified tool with the query
  3. Parse the response as OTLP spans
  4. Convert to agentevals internal trace format
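Steps 2 to 4 could be sketched roughly as below. This is not the actual loader: the `Span` dataclass is a hypothetical stand-in for the agentevals internal trace format, the `query` argument key is an assumption (each MCP tool defines its own input schema), and the tool is assumed to return OTLP JSON as text content. The transport in step 1 is left out on purpose, since an MCP client library would normally handle the session handshake.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Span:
    """Hypothetical stand-in for the agentevals internal span type."""
    trace_id: str
    span_id: str
    parent_span_id: str
    name: str
    start_ns: int
    end_ns: int
    attributes: dict[str, Any] = field(default_factory=dict)


def build_tool_call(tool: str, query: str, request_id: int = 1) -> dict:
    """Step 2: build the JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        # The "query" key is an assumption; check the tool's input schema.
        "params": {"name": tool, "arguments": {"query": query}},
    }


def extract_text(response: dict) -> str:
    """Join the text content blocks of a tools/call response."""
    if "error" in response:
        raise RuntimeError(f"MCP error: {response['error']}")
    result = response["result"]
    if result.get("isError"):
        raise RuntimeError("MCP tool reported an error")
    blocks = result.get("content", [])
    return "".join(b["text"] for b in blocks if b.get("type") == "text")


def _attr_value(value: dict) -> Any:
    """Unwrap an OTLP AnyValue such as {"stringValue": "x"}."""
    if "intValue" in value:
        return int(value["intValue"])  # OTLP JSON encodes 64-bit ints as strings
    for key in ("stringValue", "boolValue", "doubleValue"):
        if key in value:
            return value[key]
    return value  # arrays, key-value lists, and bytes are left untouched


def parse_otlp_spans(payload: str) -> list[Span]:
    """Steps 3 and 4: parse OTLP JSON and flatten it into Span objects."""
    doc = json.loads(payload)
    spans = []
    for resource_spans in doc.get("resourceSpans", []):
        for scope_spans in resource_spans.get("scopeSpans", []):
            for s in scope_spans.get("spans", []):
                attrs = {a["key"]: _attr_value(a["value"])
                         for a in s.get("attributes", [])}
                spans.append(Span(
                    trace_id=s["traceId"],
                    span_id=s["spanId"],
                    parent_span_id=s.get("parentSpanId", ""),
                    name=s.get("name", ""),
                    start_ns=int(s.get("startTimeUnixNano", 0)),
                    end_ns=int(s.get("endTimeUnixNano", 0)),
                    attributes=attrs,
                ))
    return spans
```

A backend whose tool returns something other than OTLP JSON (for example, tabular query records) would need its own mapping in place of `parse_otlp_spans`, so the response format is probably the part of the design that needs the most discussion.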

Alternatives considered

No response

Additional context

Benefits

  • No file export needed — evaluate directly from production traces
  • Vendor-agnostic — works with any MCP-compatible backend
  • CI/CD ready: agentevals run --source mcp ... in pipelines
  • Real-time — evaluate the latest traces, not stale exports

Human confirmation

  • I am a human (not a bot, agent, or AI) filing this issue.
