
docs: add KQL query library section to telemetry.md#109

Closed
Dongbumlee wants to merge 3 commits into develop from feature/kql-query-library


@Dongbumlee
Collaborator

Closes #89

Summary

Adds a new "Querying Traces in Azure Monitor (KQL)" section to docs/telemetry.md with 5 ready-to-use KQL queries for users who send eval traces to Azure Monitor / Application Insights.

What's Added

New section inserted between "Sending Traces to Azure Monitor" and "Evaluation Tracing vs. Agent Execution Tracing" containing:

  1. Table mapping — explains which AgentOps spans land in requests vs dependencies tables
  2. Query 1: Slowest evaluation rows — top N eval_item spans by duration
  3. Query 2: Failed evaluators — filters by agentops.eval.evaluator.passed == False
  4. Query 3: Pass rate over time — trends agentops.eval.pass_rate from root spans with timechart
  5. Query 4: Token usage per run — sums gen_ai.usage.input_tokens + output_tokens per operation
  6. Query 5: Evaluator score distribution — statistical summary (avg, min, max, p50, p90) by evaluator name

Verification

  • All span attribute names verified against src/agentops/utils/telemetry.py source code
  • OTLP → App Insights attribute preservation confirmed (dotted names preserved as-is in customDimensions)
  • KQL syntax uses correct bracket notation: customDimensions["attribute.name"]
  • Correct table assignments based on SpanKind (SERVER → requests, CLIENT/INTERNAL → dependencies)
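
The table mapping and bracket notation listed above can be sketched as a small query builder. This is only an illustration: the helper names (`TABLE_BY_SPAN_KIND`, `dim`, `failed_evaluators_query`) are hypothetical, the query is not the exact text added to docs/telemetry.md, and the case-insensitive string comparison is an assumption about how the boolean attribute is serialized in customDimensions.

```python
# Illustrative helpers only; these names are hypothetical, not part of AgentOps.

# SpanKind -> Application Insights table, as described in the PR.
TABLE_BY_SPAN_KIND = {
    "SERVER": "requests",
    "CLIENT": "dependencies",
    "INTERNAL": "dependencies",
}


def dim(attribute: str) -> str:
    """Return the KQL bracket accessor for a dotted span attribute."""
    return f'customDimensions["{attribute}"]'


def failed_evaluators_query(span_kind: str = "INTERNAL", lookback: str = "24h") -> str:
    """Build a KQL query that lists evaluator spans which did not pass."""
    table = TABLE_BY_SPAN_KIND[span_kind]
    passed = dim("agentops.eval.evaluator.passed")
    return "\n".join([
        table,
        f"| where timestamp > ago({lookback})",
        # Assumption: the flag may arrive as "False" or "false", so compare
        # case-insensitively on its string form.
        f'| where tostring({passed}) =~ "false"',
        "| project timestamp, name, operation_Id",
    ])


print(failed_evaluators_query())
```

Keeping the table choice in one lookup makes it harder to query `requests` for spans that only ever land in `dependencies`.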

Adds a new 'Querying Traces in Azure Monitor (KQL)' section with 5 ready-to-use KQL queries for users who send eval traces to Azure Monitor / Application Insights:

1. Slowest evaluation rows (top N eval_item spans by duration)
2. Failed evaluators (filter by passed == false with scores and thresholds)
3. Pass rate over time (trend from root spans with timechart render)
4. Token usage per run (sum input + output tokens by operation_Id)
5. Evaluator score distribution (stats by evaluator name)

Includes a table mapping explaining which AgentOps spans land in which App Insights tables (requests vs dependencies). All attribute names verified against telemetry.py source code.

Closes #89

Generated with [Amplifier](https://github.com/microsoft/amplifier)

Co-Authored-By: Amplifier <240397093+microsoft-amplifier@users.noreply.github.com>
Dongbumlee force-pushed the feature/kql-query-library branch from 5a87299 to 2fafd68 on April 29, 2026 19:05
Dongbumlee and others added 2 commits April 29, 2026 13:44
Validated end-to-end against Jaeger and Azure Monitor (OTel Collector
proxy + App Insights KQL queries 1-5). Adjustments:

- Use 1-based eval_item indices and include the input snippet that the
  runner actually puts in the span name (eval_item N - '<input>').
- Add cicd.pipeline.task.run.id / .run.result to the eval_item example;
  these are emitted by telemetry.py but were missing from the doc.
- Remove agentops.eval.item.expected from the trace tree and the
  attribute table; the attribute is never populated because the runner
  does not pass expected_text into eval_item_span.
- Clarify that gen_ai.provider.name varies by backend
  (azure.ai.inference for Foundry, local.callable for local adapter).
- Note that item.index is 1-based.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Option B told users to set AGENTOPS_OTLP_ENDPOINT directly to a
'https://<region>.applicationinsights.azure.com' URL, but our exporter
sends plain OTLP/HTTP with no Authorization header. App Insights does
not accept that:

- The Azure Monitor OpenTelemetry distro
  (https://learn.microsoft.com/azure/azure-monitor/app/opentelemetry-configuration?tabs=python)
  requires a connection string and configure_azure_monitor(), not a
  raw OTLP endpoint.
- The preview 'Microsoft.Insights/OtlpApplicationInsights' direct OTLP
  ingestion requires Entra ID Bearer-token auth (scope
  https://monitor.azure.com/.default), which telemetry.py does not
  inject today.

Replace the two-option layout with a single recommended path (the
Collector proxy, validated end-to-end against App Insights) and an
explanatory subsection covering why direct export from AgentOps is
not supported.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@Dongbumlee
Collaborator Author

Follow-up: E2E validation + doc fixes

I validated everything in telemetry.md end-to-end against both Jaeger (local) and Azure Monitor / Application Insights (via the OTel Collector proxy) using a minimal local-adapter eval, then pushed two follow-up commits that fix the documentation discrepancies I found.

Commits pushed

  • fae2d41 docs(telemetry): correct trace tree and attribute table
  • a0d6098 docs(telemetry): drop misleading 'Option B' Azure Monitor path

What was validated end-to-end (all ✅)

| Doc claim | How it was tested |
| --- | --- |
| Quick Start (Jaeger Docker → install OTel → env var → agentops eval run) | Walked all 5 steps |
| Single env var AGENTOPS_OTLP_ENDPOINT controls everything | Toggled on/off; service appears in Jaeger only when set |
| /v1/traces appended automatically | Verified in telemetry.py:66 and confirmed working |
| Zero overhead when disabled | Service absent in Jaeger after off-run; eval exits 0 |
| Graceful degradation when OTel packages missing | Shadowed opentelemetry via PYTHONPATH; eval still ran with exit 0 |
| Span tree (SERVER root → INTERNAL item → CLIENT invoke + INTERNAL evaluator) | 16 spans observed in correct hierarchy |
| CICD / GenAI / AgentOps semantic-convention layers | All documented attribute keys present |
| OTel Collector → Azure Monitor exporter | Spans landed in App Insights |
| Table mapping (requests for SERVER, dependencies for CLIENT/INTERNAL) | Confirmed via KQL |
| KQL Query 1: slowest rows | 5 rows returned with correct shape |
| KQL Query 2: failed evaluators | 10 rows after intentional threshold-fail run |
| KQL Query 3: pass rate over time | passRate=1.0 returned |
| KQL Query 4: token usage per run | totalInputTokens=142, totalOutputTokens=87 (synthetic Foundry-style span) |
| KQL Query 5: score distribution | avg/min/max/p50/p90 per evaluator |
| Exit code contract (0 pass / 2 threshold-fail / 1 error) | EXIT=0 on pass, EXIT=2 on threshold fail |
| GenAI attrs (gen_ai.agent.id/name/version, gen_ai.request/response.model, gen_ai.usage.input/output_tokens) | Verified via synthetic span using agent_invoke_span + set_agent_invoke_result |
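
For readers wiring the exit code contract into CI, a minimal sketch of how a wrapper script might interpret it follows. The codes (0 pass / 2 threshold-fail / 1 error) come from the validation results above; the function names and the blocking policy are hypothetical and not part of AgentOps.

```python
# Exit codes from the contract validated above: 0 pass, 2 threshold-fail, 1 error.
EXIT_MEANINGS = {0: "pass", 2: "threshold-fail", 1: "error"}


def classify_exit(code: int) -> str:
    """Translate an eval run's exit code into a readable status."""
    return EXIT_MEANINGS.get(code, "unknown")


def should_block_merge(code: int) -> bool:
    """Hypothetical CI policy: anything other than a clean pass blocks the merge."""
    return classify_exit(code) != "pass"


for code in (0, 2, 1):
    print(code, classify_exit(code), should_block_merge(code))
```

Separating "threshold-fail" from "error" lets a pipeline report a quality regression differently from a broken run, even if both block the merge.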

Doc fixes applied (fae2d41)

  • eval_item indices are 1-based, not 0-based; span name actually includes the input snippet (eval_item N - '<input>').
  • Added cicd.pipeline.task.run.id and cicd.pipeline.task.run.result to the eval_item example — they're emitted by telemetry.py but were missing from the doc.
  • Removed agentops.eval.item.expected from the trace tree and attribute table — the attribute is never populated because runner.py does not pass expected_text into eval_item_span().
  • Clarified that gen_ai.provider.name varies by backend (azure.ai.inference for Foundry, local.callable for the local adapter).

Doc fixes applied (a0d6098) — Option B was misleading

The original "Option B: Use Azure Monitor's OTLP Endpoint Directly" told users to set:

export AGENTOPS_OTLP_ENDPOINT=https://<region>.applicationinsights.azure.com

…but our OTLPSpanExporter(endpoint=...) POSTs plain application/x-protobuf with no Authorization header. App Insights does not accept that:

  • The Azure Monitor OpenTelemetry distro for Python (the Microsoft Learn page the original Option B linked to) requires APPLICATIONINSIGHTS_CONNECTION_STRING and is invoked via configure_azure_monitor(), not a raw OTLP endpoint.
  • Application Insights also has a preview feature (Microsoft.Insights/OtlpApplicationInsights) that exposes per-resource OTLP ingestion URLs, but it requires Entra ID Bearer-token auth (scope https://monitor.azure.com/.default), which our exporter does not inject.

The Collector proxy (former Option A, now the only documented path) handles auth on AgentOps's behalf and was end-to-end validated against App Insights in this work. I added a short "Why not export from AgentOps directly?" subsection explaining the constraints with a link to the MS docs.
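
To make the constraint concrete, here is a rough sketch of the request shape involved. It is not the code in telemetry.py: `traces_url` and `otlp_headers` are hypothetical helpers, `http://localhost:4318` is assumed as a local Collector address (the conventional OTLP/HTTP port), and skipping the suffix when it is already present is an assumption about the behaviour.

```python
from typing import Optional


def traces_url(endpoint: str) -> str:
    """Append /v1/traces to a base OTLP/HTTP endpoint unless it is already there."""
    base = endpoint.rstrip("/")
    return base if base.endswith("/v1/traces") else base + "/v1/traces"


def otlp_headers(bearer_token: Optional[str] = None) -> dict:
    """Headers for an OTLP/HTTP protobuf POST.

    With no token this matches the plain request described above. Direct
    ingestion would additionally need an Entra ID bearer token, which the
    exporter does not inject; the Collector handles that side instead.
    """
    headers = {"Content-Type": "application/x-protobuf"}
    if bearer_token is not None:
        headers["Authorization"] = f"Bearer {bearer_token}"
    return headers


# What AgentOps sends to a local Collector: no Authorization header at all.
print(traces_url("http://localhost:4318"))
print(otlp_headers())
```

The Collector accepts the unauthenticated request locally and authenticates to Azure Monitor on its own, which is why only that path is documented.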

Tests

python3 -m pytest tests/ -x -q → 282 passed

Dongbumlee requested a review from placerda on April 29, 2026 20:59
Dongbumlee closed this on May 6, 2026