adk-scope is a tool designed to extract semantic metadata (features) from Agent Development Kit (ADK) repositories. It utilizes Tree-sitter to parse source code across multiple languages (Python, Java, Go, TypeScript) and generates structured metadata in Protocol Buffers format.
- Multi-Language Support: Extracts features from Python, Java, Go, and TypeScript.
- Semantic Extraction: Identifies Agents, Tools, and other high-level constructs.
- Flexible Input: Supports extraction from single files, directories, or full repositories.
- Structured Output: Generates `FeatureRegistry` objects serialized to JSON.
This project uses pyproject.toml for dependency management.
- Create and activate a virtual environment:

      python3 -m venv .venv
      source .venv/bin/activate

- Install dependencies:

      pip install .
      # OR for development
      pip install -e ".[dev]"

You can run the extractor using the provided shell script or directly via Python.
The extract.sh wrapper helps set up the PYTHONPATH correctly.
The script requires a --language argument to specify the target language (py or ts).

    # For Python
    ./extract.sh --language py --input-repo /path/to/adk-python output_dir

    # For TypeScript
    ./extract.sh --language ts --input-repo /path/to/adk-js output_dir

| Argument | Description |
|---|---|
| `--language <lang>` | Required. Language to extract (python or typescript). |
| `--input-file <path>` | Path to a single file to process. |
| `--input-dir <path>` | Path to a directory containing files. |
| `--input-repo <path>` | Path to the root of an ADK repository. Searches recursively in `src` (Python) or `core/src` (TS). |
| `output` | Required. Path to the output directory. |

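
The `--input-repo` behavior in the table can be pictured as a recursive file search. The sketch below is not adk-scope's implementation: the helper `find_source_files`, the `SEARCH_ROOTS` mapping, and the file suffixes are assumptions made for illustration. Only the `src` and `core/src` subdirectories come from the table.

```python
from pathlib import Path

# Subdirectories come from the argument table; the file suffixes are assumed.
SEARCH_ROOTS = {
    "python": ("src", ".py"),
    "typescript": ("core/src", ".ts"),
}


def find_source_files(repo_root, language):
    """Return sorted source files found recursively under the language's search root."""
    subdir, suffix = SEARCH_ROOTS[language]
    search_root = Path(repo_root) / subdir
    if not search_root.is_dir():
        # No matching layout in this repository: nothing to process.
        return []
    return sorted(p for p in search_root.rglob(f"*{suffix}") if p.is_file())
```

For example, `find_source_files("/path/to/adk-python", "python")` would list every `.py` file below `/path/to/adk-python/src`.
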
Examples:

    # Process a single file
    ./extract.sh --language python --input-file src/my_agent.py output_dir

    # Process a directory
    python3 -m google.adk.scope.extractors.python.extractor \
      --input-dir src/google/adk \
      output_dir

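
When several repositories need extracting in one go, the wrapper can be driven from Python. This is a minimal sketch, assuming `extract.sh` is in the current directory and accepts the flags documented above. The helpers `build_extract_command` and `run_extractions` are hypothetical names and are not part of adk-scope.

```python
import subprocess


def build_extract_command(language, input_repo, output_dir, script="./extract.sh"):
    """Build the argument list for one extract.sh run, using the documented flags."""
    return [script, "--language", language, "--input-repo", str(input_repo), str(output_dir)]


def run_extractions(repos, output_dir):
    """Run the extractor once per (language, repo path) pair; stop on the first failure."""
    for language, repo in repos.items():
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run(build_extract_command(language, repo, output_dir), check=True)
```

A call such as `run_extractions({"py": "/path/to/adk-python", "ts": "/path/to/adk-js"}, "output_dir")` would then run the two repository commands shown earlier, one after the other.
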
### Feature Matching & Reporting
Once you have extracted features from two languages (e.g., Python and TypeScript), you can compare them using the `report.sh` script.
```bash
./report.sh \
  --base output/py.txtpb \
  --target output/ts.txtpb \
  --output output/ \
  --report-type md
```

| Argument | Description |
|---|---|
| `--base <path>` | Required. Path to the "source of truth" feature registry (e.g., Python). |
| `--target <path>` | Required. Path to the comparison registry (e.g., TypeScript). |
| `--output <dir>` | Required. Path for the output directory. The report filename is auto-generated. |
| `--report-type <type>` | `md` (default) for Markdown Parity Report, or `raw` for CSV. |

TODO: This needs updating
adk-scope generates two types of reports to help you understand the feature overlap between two languages.
The Markdown Parity Report (`--report-type md`) is a human-readable Markdown file detailing the feature parity between two SDKs. It contains:
- Gap Analysis List: A summary table that breaks down features into "Common Shared", "Exclusive to [Base Language]", and "Exclusive to [Target Language]".
- Jaccard Score: An overall similarity score computed with the Jaccard Index (intersection over union), giving a global measure of feature parity.
- Module Breakdown: Score details and status links on a per-module basis, highlighting exact matches, potential near-matches, and missing features.
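
The gap analysis and Jaccard score can be illustrated with plain set operations. This is a simplified sketch, not the matching logic adk-scope uses: it treats features as exact string names, ignores near-matches, and the function `gap_analysis` and the sample feature names are hypothetical.

```python
def gap_analysis(base_features, target_features):
    """Split features into common and exclusive groups and compute the Jaccard index."""
    base, target = set(base_features), set(target_features)
    common = base & target
    union = base | target
    return {
        "common": sorted(common),
        "base_only": sorted(base - target),
        "target_only": sorted(target - base),
        # Jaccard index = |intersection| / |union|; two empty sets are treated
        # as identical here (a convention chosen for this sketch).
        "jaccard": len(common) / len(union) if union else 1.0,
    }


# Hypothetical feature names, for illustration only.
result = gap_analysis(
    ["Agent.run", "Tool.call", "Session.create"],
    ["Agent.run", "Tool.call", "Runner.start"],
)
print(result["jaccard"])  # 2 shared features out of 4 distinct ones -> 0.5
```

A score of 1.0 means both registries expose the same feature set, and 0.0 means they share nothing.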
The raw report (`--report-type raw`) is a simple CSV file listing all features, matched and unmatched, from both the base and target registries. It is useful for programmatic analysis or for importing the data into other tools.
We use pytest for testing.

    # Run all tests
    pytest

    # Run with coverage report
    pytest --cov=google.adk.scope --cov-report=term-missing

We use ruff for linting.

    ruff check .

If you modify `proto/features.proto`, you need to regenerate the Python code:

    ./proto2py.sh

- `src/google/adk/scope/`: Main source code.
  - `extractors/`: Language-specific extractors (currently Python).
  - `utils/`: Utility modules (strings, args).
- `proto/`: Protocol Buffer definitions (`features.proto`).
- `test/`: Unit tests.