XML: Token-Saving Inspection Tools (High Priority) #61

@jwesleye

Overview

Add memory-efficient inspection tools for XML files to help agents avoid loading entire documents into context, similar to the CSV and JSON token-saving tools.

Motivation

XML files can be massive (thousands of elements, deeply nested). Agents often need to inspect structure and extract specific data without loading entire documents into context, since doing so wastes tokens and may exceed context windows.

Proposed Functions

High Priority - Inspection Tools

  • get_xml_structure - Get element hierarchy/schema without loading content
  • count_xml_elements - Count elements by tag name without loading content
  • get_xml_element_at_path - Extract specific element by XPath
  • get_xml_attributes - List attributes for element type without loading content
  • search_xml_tags - Find all paths containing tags matching pattern
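
As a rough illustration of the inspection tools listed above, count_xml_elements could be built on the standard library's streaming parser. This is a minimal sketch, not the proposed implementation: the single file_path parameter and the dict return shape are assumptions, and the @strands_tool decorator is left out.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def count_xml_elements(file_path: str) -> dict:
    """Count elements by tag name while streaming through the file."""
    counts = Counter()
    with open(file_path, "rb") as f:
        # iterparse yields each element once its closing tag has been read.
        for _event, elem in ET.iterparse(f, events=("end",)):
            counts[elem.tag] += 1
            # Drop the element's text, attributes, and children. The emptied
            # element stays attached to its parent until the parent is
            # cleared, so memory use is small but not strictly constant.
            elem.clear()
    # A plain dict is JSON-serializable; Counter is converted for clarity.
    return dict(counts)
```

Only the tag counts are returned to the agent, so the document content never enters the context window.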

High Priority - Selective Extraction

  • select_xml_elements - Get only specific elements by tag name
  • filter_xml_elements - Filter elements by attribute/text criteria (6 operators)
  • preview_xml_elements - Get first N elements of a specific type
  • slice_xml_elements - Get range of elements (pagination support)
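
The pagination idea behind slice_xml_elements might look like the sketch below. The parameter names (tag, start, end), the half-open 0-based range, and the shape of each returned record are assumptions made for illustration; the issue does not specify them.

```python
import xml.etree.ElementTree as ET


def slice_xml_elements(file_path: str, tag: str, start: int, end: int) -> list:
    """Return matching elements with 0-based index in [start, end)."""
    if start < 0 or end <= start:
        return []
    results = []
    seen = 0  # number of matching elements encountered so far
    with open(file_path, "rb") as f:
        for _event, elem in ET.iterparse(f, events=("end",)):
            if elem.tag == tag:
                if seen >= start:
                    results.append({
                        "tag": elem.tag,
                        "attributes": dict(elem.attrib),
                        "text": (elem.text or "").strip(),
                    })
                seen += 1
                if seen >= end:
                    # Stop reading as soon as the requested range is filled.
                    break
            # Release the element's content before moving on.
            elem.clear()
    return results
```

Under the same assumptions, preview_xml_elements would reduce to calling slice_xml_elements(file_path, tag, 0, n).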

Medium Priority - Analysis

  • get_xml_namespace_info - List all namespaces without loading content
  • get_xml_element_stats - Statistics for element type (count, attributes, depth)
  • validate_xml_structure_simple - Quick validation without full parse
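
For the analysis tools, get_xml_namespace_info could collect namespace declarations from the parser's "start-ns" events. The sketch below assumes a prefix-to-URI dict as the return value and keeps only the first URI seen for each prefix; both choices are illustrative, not part of the proposal.

```python
import xml.etree.ElementTree as ET


def get_xml_namespace_info(file_path: str) -> dict:
    """Map each namespace prefix to the first URI declared for it."""
    namespaces = {}
    with open(file_path, "rb") as f:
        for event, payload in ET.iterparse(f, events=("start-ns", "end")):
            if event == "start-ns":
                # For "start-ns" the payload is a (prefix, uri) tuple; the
                # default namespace is reported with an empty-string prefix.
                prefix, uri = payload
                namespaces.setdefault(prefix, uri)
            else:
                # For "end" the payload is the element; clear it so parsed
                # content does not accumulate in memory.
                payload.clear()
    return namespaces
```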

Design Principles

  • Google ADK compliant (JSON-serializable types, no default parameter values)
  • @strands_tool decorator
  • Memory-efficient (streaming/iterative parsing where possible)
  • XPath support for path notation
  • Consistent with CSV/JSON token-saving patterns
  • Process data without loading entire documents
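
Taken together, these principles suggest functions shaped roughly like the following get_xml_structure sketch: every parameter is required, the result is a plain dict that json.dumps accepts, and the file is streamed instead of loaded whole. The max_depth parameter and the slash-separated path keys are assumptions, and the @strands_tool decorator is omitted because its import path is not given here.

```python
import json
import xml.etree.ElementTree as ET


def get_xml_structure(file_path: str, max_depth: int) -> dict:
    """Map each distinct element path (e.g. "/catalog/book") to its count."""
    paths = {}
    stack = []  # tags of the currently open elements, root first
    with open(file_path, "rb") as f:
        for event, elem in ET.iterparse(f, events=("start", "end")):
            if event == "start":
                stack.append(elem.tag)
                if len(stack) <= max_depth:
                    path = "/" + "/".join(stack)
                    paths[path] = paths.get(path, 0) + 1
            else:
                stack.pop()
                # Release the element's content once it is closed.
                elem.clear()
    return paths


def structure_as_json(file_path: str, max_depth: int) -> str:
    """Serialize the structure summary, confirming it is JSON-compatible."""
    return json.dumps(get_xml_structure(file_path, max_depth), sort_keys=True)
```

Namespaced tags would appear in the paths in ElementTree's "{uri}local" form, so a real implementation may want to map them back to prefixes.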

Related

Module

xml/parsing.py or new xml/inspection.py

Metadata

Assignees

Labels

enhancement: New feature or request

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
