Excel: Token-Saving Inspection Tools (High Priority) #60

@jwesleye

Overview

Add memory-efficient inspection tools for Excel files so agents can avoid loading entire spreadsheets into context, mirroring the existing CSV and JSON token-saving tools.

Motivation

Excel files can contain thousands of rows and dozens of columns. Agents often need to inspect structure and metadata before deciding what to process, but loading entire sheets wastes tokens and can exceed the context window.

Proposed Functions

High Priority - Inspection Tools

  • get_sheet_info - Get metadata (row count, column count, file size) without loading data
  • get_sheet_schema - Get column names and inferred data types by sampling
  • preview_sheet_rows - Get first N rows without loading entire sheet
  • get_sheet_names - List all sheet names in workbook
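
To make the inspection tools above concrete, here is a minimal sketch of how preview and schema sampling could work over a lazy row iterator. It assumes rows come from openpyxl's read-only mode and that the first row is the header. The helper names (iter_sheet_rows, preview_rows, infer_schema) and the returned dict shapes are illustrative, not a settled API.

```python
from itertools import islice
from typing import Any, Iterable, Iterator


def iter_sheet_rows(file_path: str, sheet_name: str) -> Iterator[tuple]:
    """Yield rows lazily from one worksheet (assumes openpyxl is installed)."""
    from openpyxl import load_workbook  # third-party; imported lazily

    # read_only=True streams rows instead of building the whole sheet in memory.
    workbook = load_workbook(file_path, read_only=True, data_only=True)
    try:
        yield from workbook[sheet_name].iter_rows(values_only=True)
    finally:
        workbook.close()


def preview_rows(rows: Iterable[tuple], num_rows: int) -> dict[str, Any]:
    """Return the header and the first num_rows data rows, reading nothing further."""
    iterator = iter(rows)
    header = next(iterator, None)
    if header is None:
        return {"columns": [], "rows": []}
    columns = [str(name) for name in header]
    data = [dict(zip(columns, row)) for row in islice(iterator, num_rows)]
    return {"columns": columns, "rows": data}


def infer_schema(rows: Iterable[tuple], sample_size: int) -> dict[str, str]:
    """Infer one type name per column from the first sample_size data rows."""
    iterator = iter(rows)
    header = next(iterator, None)
    if header is None:
        return {}
    columns = [str(name) for name in header]
    seen: dict[str, set] = {name: set() for name in columns}
    for row in islice(iterator, sample_size):
        for name, value in zip(columns, row):
            if value is not None:  # empty cells carry no type information
                seen[name].add(type(value).__name__)
    schema = {}
    for name in columns:
        types = seen[name]
        if not types:
            schema[name] = "unknown"
        elif len(types) == 1:
            schema[name] = next(iter(types))
        else:
            schema[name] = "mixed"
    return schema
```

Because both functions stop pulling from the iterator once they have enough rows, the cost is bounded by num_rows or sample_size rather than by the sheet size.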

High Priority - Selective Reading

  • select_sheet_columns - Read only specific columns, discard others
  • filter_sheet_rows - Read only rows matching filter criteria (6 operators)
  • get_sheet_row_range - Get specific range of rows (pagination support)
  • sample_sheet_rows - Get representative sample (first/random/systematic)
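
A rough sketch of the selective-reading functions as chained generators, so that column selection, filtering, and pagination happen in one streaming pass. The issue does not name the six filter operators, so the set below (eq, ne, gt, ge, lt, le) is an assumption, as are the function names and signatures.

```python
import operator
from itertools import islice
from typing import Any, Iterable, Iterator

# Assumed operator set; the actual six operators are not specified in this issue.
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "ge": operator.ge,
    "lt": operator.lt,
    "le": operator.le,
}


def select_columns(rows: Iterable[tuple], columns: list[str]) -> Iterator[dict[str, Any]]:
    """Yield one dict per data row containing only the requested columns."""
    iterator = iter(rows)
    header = next(iterator, None)
    if header is None:
        return
    names = [str(name) for name in header]
    indexes = [names.index(column) for column in columns]  # ValueError if a column is missing
    for row in iterator:
        yield {names[i]: row[i] for i in indexes}


def filter_rows(
    records: Iterable[dict[str, Any]], column: str, op: str, value: Any
) -> Iterator[dict[str, Any]]:
    """Yield records whose cell satisfies the comparison; empty cells never match."""
    compare = OPERATORS[op]
    for record in records:
        cell = record.get(column)
        # Comparing mismatched types (e.g. str > int) raises TypeError in this sketch.
        if cell is not None and compare(cell, value):
            yield record


def row_range(records: Iterable[dict[str, Any]], offset: int, limit: int) -> list[dict[str, Any]]:
    """Return up to limit records after skipping offset records (pagination)."""
    return list(islice(records, offset, offset + limit))
```

Chaining them, for example row_range(filter_rows(select_columns(rows, ["id", "score"]), "score", "ge", 30), 0, 50), keeps only the current page of matching records in memory.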

Medium Priority - Statistics

  • get_sheet_column_stats - Get statistics for a column (unique count, null count, sample values, min/max)
  • count_sheet_rows - Count rows with optional filter
  • get_sheet_value_counts - Get frequency distribution for column values
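
The statistics functions could be computed in a single pass over streamed records, roughly as sketched below. The function names, parameters, and result keys are hypothetical. Note that the Counter holds one entry per distinct value, so memory grows with column cardinality rather than staying constant.

```python
from collections import Counter
from typing import Any, Iterable


def column_stats(records: Iterable[dict[str, Any]], column: str, max_samples: int) -> dict[str, Any]:
    """Compute row, null, and unique counts plus min/max for one column in one pass."""
    counts: Counter = Counter()
    row_count = 0
    null_count = 0
    for record in records:
        row_count += 1
        value = record.get(column)
        if value is None:
            null_count += 1
        else:
            counts[value] += 1
    values = list(counts)  # distinct values in first-seen order
    try:
        minimum, maximum = min(values), max(values)
    except (TypeError, ValueError):
        # ValueError: no non-empty cells; TypeError: mixed, non-comparable types.
        minimum = maximum = None
    return {
        "row_count": row_count,
        "null_count": null_count,
        "unique_count": len(values),
        "min": minimum,
        "max": maximum,
        "sample_values": values[:max_samples],
    }


def value_counts(records: Iterable[dict[str, Any]], column: str, top_n: int) -> list[dict[str, Any]]:
    """Return the top_n most frequent non-empty values with their counts."""
    counts = Counter(
        record[column] for record in records if record.get(column) is not None
    )
    return [{"value": value, "count": count} for value, count in counts.most_common(top_n)]
```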

Design Principles

  • Google ADK compliant (JSON-serializable types, no default parameter values)
  • @strands_tool decorator
  • Memory-efficient (stream processing where possible)
  • Consistent with CSV/JSON token-saving patterns
  • Process data without loading entire sheets
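
One practical consequence of the JSON-serializable requirement: Excel date cells typically come back from readers such as openpyxl as datetime objects, which json.dumps rejects. A small sketch of a conversion step follows. The helper names are hypothetical, and the @strands_tool decorator is omitted because its import path is not given in this issue.

```python
import json
from datetime import date, datetime, time
from typing import Any


def to_json_safe(value: Any) -> Any:
    """Convert date/time cell values to ISO 8601 strings; pass other values through."""
    if isinstance(value, (datetime, date, time)):
        return value.isoformat()
    return value


def serialize_records(records: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the records with every cell made JSON-serializable.

    Every parameter is required (no default values), following the design principle above.
    """
    return [{key: to_json_safe(cell) for key, cell in record.items()} for record in records]


if __name__ == "__main__":
    sample = [{"id": 1, "created": datetime(2024, 1, 2, 3, 4, 5)}]
    print(json.dumps(serialize_records(sample)))
```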

Related Module

excel/reading.py (or new excel/inspection.py)

Labels

enhancement (New feature or request)
