Summary
tree-sitter-language-pack 1.8+ changed two APIs the threat-model pipeline depends on:
parser.parse() argument type: 1.5/1.6/1.7 accepted bytes; 1.8 requires str and exposes a separate parse_bytes(bytes) method for the old contract.
tree.root_node: 1.5–1.7 was a property returning a Node; 1.8 is a callable method (tree.root_node()).
darnit's threat_model/ code is built against the 1.5.x shape (parser.parse(bytes) + property-style root_node). Production code accesses tree.root_node as a property at 23 call sites, and each one needs a wrapper or shim to support both APIs.
Reproducer
uv venv .v && source .v/bin/activate
uv pip install tree-sitter-language-pack==1.8.1 tree-sitter==0.25
python -c "
from tree_sitter_language_pack import get_parser
p = get_parser('go')
p.parse(b'package main\nfunc main(){}\n') # TypeError: source ... 'bytes' not str
"
Real impact: parsing.py:121 raises immediately on the first file the pipeline tries to parse, so discover_all() bails before writing any output. The OSPS-SA-03.02 remediation handler surfaces this as "Executed 1 remediation handler(s)" without a useful traceback.
Stop-gap (current)
packages/darnit-baseline/pyproject.toml pins tree-sitter-language-pack>=1.5,<1.8. Anyone who picks up darnit via pip install/uv tool install will resolve to a working version. This will land via PR for fix/parse-source-language-pack-compat.
parse_source() also got a defensive fallback (try bytes → fall back to decoded str on TypeError) so the parse step survives the API split, but the downstream tree.root_node.* accesses still don't.
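The fallback described above might look roughly like the following. This is a sketch under stated assumptions, not darnit's actual parse_source(): the helper name parse_compat and the two stand-in parser classes are hypothetical and exist only so the example runs without tree-sitter installed, and decoding as UTF-8 with errors="replace" is an assumed choice.

```python
class BytesOnlyParser:
    """Hypothetical stand-in for the 1.5-1.7 contract: parse() takes bytes."""

    def parse(self, source):
        if not isinstance(source, bytes):
            raise TypeError("expected bytes")
        return ("tree", source.decode("utf-8"))


class StrOnlyParser:
    """Hypothetical stand-in for the 1.8 contract: parse() takes str."""

    def parse(self, source):
        if not isinstance(source, str):
            raise TypeError("expected str")
        return ("tree", source)


def parse_compat(parser, source: bytes):
    """Try the bytes contract first, then fall back to a decoded str."""
    try:
        return parser.parse(source)
    except TypeError:
        # Assumption: UTF-8 input; undecodable bytes are replaced, not fatal.
        return parser.parse(source.decode("utf-8", errors="replace"))


src = b"package main\nfunc main(){}\n"
old_result = parse_compat(BytesOnlyParser(), src)
new_result = parse_compat(StrOnlyParser(), src)
```

One trade-off of this approach is that a TypeError raised for an unrelated reason on the first call is also swallowed and retried, which is part of why it is only a stop-gap.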
Proper fix (this issue)
To support 1.8+, every tree.root_node access in threat_model/ needs to handle both forms. Sketch:
def get_root_node(tree):
    """Compat: tree-sitter-language-pack 1.8+ made root_node a method."""
    rn = tree.root_node
    return rn() if callable(rn) else rn
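To illustrate how a call site would change, the helper can be exercised against both shapes. FakeNode, PropertyTree and MethodTree below are hypothetical stand-ins, not tree-sitter classes; they only mimic the two root_node forms described in this issue.

```python
def get_root_node(tree):
    """Compat: tree-sitter-language-pack 1.8+ made root_node a method."""
    rn = tree.root_node
    return rn() if callable(rn) else rn


class FakeNode:
    """Hypothetical stand-in for a tree-sitter Node."""

    def __init__(self, node_type):
        self.type = node_type


class PropertyTree:
    """Stand-in for the 1.5-1.7 shape: root_node is a property."""

    @property
    def root_node(self):
        return FakeNode("source_file")


class MethodTree:
    """Stand-in for the 1.8 shape: root_node is a method."""

    def root_node(self):
        return FakeNode("source_file")


# A call site that used to read tree.root_node.type becomes:
root_types = [get_root_node(t).type for t in (PropertyTree(), MethodTree())]
```

The callable() check assumes that a real Node object is not itself callable; that seems likely but should be confirmed against both bindings before relying on it.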
Touchpoints (23 of them, all under packages/darnit-baseline/src/darnit_baseline/threat_model/):
parsing.py (the parse_source caller's check + queries)
ts_discovery.py (~20 call sites across all extractors)
grouping.py, ranking.py, ts_generators.py (a handful each)
The upgrade also makes parse_bytes() the canonical bytes entry point, so we should switch the parser call to parse_bytes when it is available and stop straddling the two APIs.
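A possible shape for the "parse_bytes when available" switch is feature detection rather than exception handling. The helper name parse_bytes_compat and both parser classes below are hypothetical stand-ins used only to make the sketch runnable; the only API facts assumed are the ones stated in this issue (1.8 has parse_bytes(bytes), older versions accept bytes in parse()).

```python
class OldParser:
    """Hypothetical stand-in for 1.5-1.7: parse() accepts bytes, no parse_bytes."""

    def parse(self, source):
        if not isinstance(source, bytes):
            raise TypeError("expected bytes")
        return ("old", len(source))


class NewParser:
    """Hypothetical stand-in for 1.8: parse() wants str, parse_bytes() wants bytes."""

    def parse(self, source):
        if not isinstance(source, str):
            raise TypeError("expected str")
        return ("new-str", len(source))

    def parse_bytes(self, source):
        if not isinstance(source, bytes):
            raise TypeError("expected bytes")
        return ("new-bytes", len(source))


def parse_bytes_compat(parser, source: bytes):
    """Prefer parse_bytes() when the parser has it, else use parse(bytes)."""
    parse_bytes = getattr(parser, "parse_bytes", None)
    if callable(parse_bytes):
        return parse_bytes(source)
    return parser.parse(source)


src = b"package main\n"
old_out = parse_bytes_compat(OldParser(), src)
new_out = parse_bytes_compat(NewParser(), src)
```

Compared with catching TypeError, this keeps the input as bytes on both versions and does not hide unrelated type errors.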
Acceptance
pyproject.toml removes the <1.8 cap.
darnit installs cleanly with tree-sitter-language-pack==1.8.1 (the latest at the time of this issue) and runs generate_threat_model successfully against the same fixtures + gittuf reference.
Existing tests pass against both 1.5.x and 1.8.x bindings, ideally via a tox-style or matrix test.
Related
fix/parse-source-language-pack-compat.