v0.1.0: DSN binary parser #47
Merged
…al path support
Move OLE/CFBF reader from parsers/altium/ to parsers/ole-reader/ so it can be shared between Altium and Cadence DSN parsers. Add hierarchical directory tree traversal (childId/leftSiblingId/rightSiblingId), readStreamByPath(), and listAllEntries() for nested CFBF containers like .DSN files.
Validate OleReader hierarchical path support against BeagleBone-Black DSN fixture. Container has 11 page streams under Views/BeagleBoneBlack/Pages/, a Library stream, and Packages Directory stream.
Port of DataStream.cpp from OpenOrCadParser. Little-endian integer reads, ASCII string reads (zero-terminated, length-prefixed, length+zero), position tracking, bounds checking, and data assertion.
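A minimal sketch of what such a reader could look like in TypeScript. The class and method names are illustrative assumptions, not the PR's actual API, and the uint16 length prefix is assumed rather than confirmed by the commit message.

```typescript
// Hypothetical little-endian byte reader with position tracking and bounds checks.
class DataStream {
  private readonly bytes: Uint8Array;
  private readonly view: DataView;
  private pos = 0;

  constructor(bytes: Uint8Array) {
    this.bytes = bytes;
    this.view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  }

  // Current read position in bytes.
  get offset(): number {
    return this.pos;
  }

  // Bounds check: throw before reading past the end of the buffer.
  private need(n: number): void {
    if (this.pos + n > this.bytes.length) {
      throw new RangeError(`read of ${n} bytes at offset ${this.pos} exceeds ${this.bytes.length}`);
    }
  }

  readUint16(): number {
    this.need(2);
    const v = this.view.getUint16(this.pos, true); // true = little-endian
    this.pos += 2;
    return v;
  }

  readUint32(): number {
    this.need(4);
    const v = this.view.getUint32(this.pos, true);
    this.pos += 4;
    return v;
  }

  // Data assertion: the next bytes must equal `expected`, then are consumed.
  assertBytes(expected: number[]): void {
    this.need(expected.length);
    for (let i = 0; i < expected.length; i++) {
      if (this.bytes[this.pos + i] !== expected[i]) {
        throw new Error(`unexpected byte at offset ${this.pos + i}`);
      }
    }
    this.pos += expected.length;
  }

  private ascii(start: number, end: number): string {
    let s = "";
    for (let i = start; i < end; i++) s += String.fromCharCode(this.bytes[i] ?? 0);
    return s;
  }

  // ASCII string terminated by a 0x00 byte.
  readStringZeroTerm(): string {
    let end = this.pos;
    while (end < this.bytes.length && this.bytes[end] !== 0) end++;
    if (end >= this.bytes.length) throw new RangeError("unterminated string");
    const s = this.ascii(this.pos, end);
    this.pos = end + 1; // skip the terminator
    return s;
  }

  // Length-prefixed ASCII string followed by a 0x00 byte (uint16 length is an assumption).
  readStringLenZeroTerm(): string {
    const len = this.readUint16();
    this.need(len + 1);
    const s = this.ascii(this.pos, this.pos + len);
    this.pos += len;
    this.assertBytes([0]);
    return s;
  }
}
```

Every read goes through the same bounds check, so a truncated stream fails with an offset in the error message instead of silently returning garbage.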
Port GenericParser.cpp and FutureData.hpp from OpenOrCadParser. Implements prefix chain detection (auto-detect count 10 down to 1), long/short prefix reading, preamble magic (0xff 0xe4 0x5c 0x39), and FutureDataList checkpoint system for structure boundary validation.
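The descending prefix-count detection can be sketched roughly as follows. This is an illustration only: the function names and the `endOfPrefixes` callback are hypothetical, and it assumes that a candidate count is accepted when the bytes right after the prefixes equal the preamble magic, which the commit message does not spell out.

```typescript
// Preamble magic bytes as listed in the commit message.
const PREAMBLE_MAGIC = [0xff, 0xe4, 0x5c, 0x39] as const;

// True if the four magic bytes start exactly at `offset`.
function hasPreambleAt(bytes: Uint8Array, offset: number): boolean {
  if (offset < 0 || offset + PREAMBLE_MAGIC.length > bytes.length) return false;
  return PREAMBLE_MAGIC.every((b, i) => bytes[offset + i] === b);
}

// Try prefix counts from 10 down to 1 and return the first plausible one.
// `endOfPrefixes` is a hypothetical callback: it returns the offset just past
// `count` prefixes, or null when that many prefixes cannot be read.
function detectPrefixCount(
  bytes: Uint8Array,
  endOfPrefixes: (count: number) => number | null,
): number | null {
  for (let count = 10; count >= 1; count--) {
    const end = endOfPrefixes(count);
    if (end !== null && hasPreambleAt(bytes, end)) return count;
  }
  return null;
}
```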
Port 11 structure parsers from OpenOrCadParser: SymbolDisplayProp, Alias, Wire (scalar/bus), T0x10 (pin instances), PlacedInstance, GraphicInst base, Global, Port, OffPageConnector, Device, and Package. Each follows the prefix-preamble-checkpoint pattern from the C++ reference.
Parses Page streams from DSN CFBF containers to extract nets and components using coordinate-based wire-to-pin matching. Handles primitive T0x34/T0x35 structures, net tables, globals, ports, and off-page connectors. Smoke test validates 331 nets and 414 components from BeagleBone-Black fixture.
cadenceHandler.parse() now falls back to direct DSN binary parsing when .dat export files are not available. Comparison tests against BeagleBone-Black show 94.9% net coverage and 99.7% component coverage versus the DAT parser output.
Cadence's Allegro netlist export uppercases all net names in .dat files. The DSN schematic preserves original mixed case (e.g. SYS_RESETn vs SYS_RESETN). Uppercasing at parse time ensures consistent net names across both parsing paths.
Add DSN Parser Coverage test suite that compares direct DSN binary parsing against DAT-derived golden output for all Cadence fixtures. Component coverage is 100% across all 10 designs. Net coverage ranges from 59.8% to 100%, with gaps from unnamed auto-generated nets and unresolved multi-segment wire junctions. Regenerate BEAGLEBONEBLK_C3 golden from freshly exported dat files.
Replace coordinate-only net matching with netId-based pin grouping
and page-scoped coordinate resolution. This fixes cross-page coordinate
collisions and enables unnamed net synthesis via N{netId}, dramatically
improving DSN parser coverage (e.g. BeagleBone-Black 95.5% -> 99.7%).
Add reusable debug scripts: dsn-coverage-report.ts for comparing DSN
vs DAT golden output, dsn-inspect.ts for low-level binary inspection.
- Add Union-Find wire graph to propagate net names through connected
wire segments, replacing per-coordinate lookups
- Collect ALL wire aliases (not just first) to resolve dual-name nets
like USBDM_1/USBDM2 on the same wire
- Use alphabetical-first resolution when a wire group has multiple
candidate names, matching Cadence CIS export behavior
- Discover wire segmentId field (previously skipped 4 bytes): this is
the per-segment database object ID that Cadence uses for auto-generated
net names (N{minSegmentId})
- Synthesize N{minSegmentId} for unnamed wire groups, matching DAT export
- Add sentinel netId handling: netId=0 maps to NC, 0xFFFFFFFF skipped
- Exclude global/port/OPC symbol names from net resolution (they contain
symbol type names like VCC_BAR, not net names like VDD_CORE)
- Refactor buildNetConnectivity into focused sub-functions:
buildPageCoordMap, collectPins, assembleNets
Coverage improvement (DSN vs DAT golden):
7/10 fixtures at 100% (was 0/10)
Aggregate: 97.2% (was 95.8%), missing 128 (was 193), extra 35 (was 1134)
Add DSN debug scripts: dsn-gap-analysis, dsn-check-ports, dsn-wire-trace,
dsn-find-wire, dsn-wire-fields
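A minimal sketch of the wire-graph naming described in this commit, assuming each wire segment carries a segmentId and a list of aliases. The types and function names are illustrative, not the PR's code, and the default string sort stands in for "alphabetical-first".

```typescript
// Hypothetical shape of a wire segment for this sketch.
interface WireSegment {
  segmentId: number;
  aliases: string[];
}

// Union-Find over segment indices, with path compression.
class UnionFind {
  private readonly parent: number[] = [];

  constructor(size: number) {
    for (let i = 0; i < size; i++) this.parent.push(i);
  }

  find(x: number): number {
    let root = x;
    while ((this.parent[root] ?? root) !== root) root = this.parent[root] ?? root;
    let cur = x;
    while (cur !== root) {
      const next = this.parent[cur] ?? root;
      this.parent[cur] = root;
      cur = next;
    }
    return root;
  }

  union(a: number, b: number): void {
    const ra = this.find(a);
    const rb = this.find(b);
    if (ra !== rb) this.parent[rb] = ra;
  }
}

// Returns one net name per segment. `joins` lists pairs of segment indices
// that touch each other (how they are detected is out of scope here).
function resolveNetNames(segments: WireSegment[], joins: Array<[number, number]>): string[] {
  const uf = new UnionFind(segments.length);
  for (const [a, b] of joins) uf.union(a, b);

  // Collect ALL aliases and the smallest segmentId per connected group.
  const groups = new Map<number, { names: string[]; minId: number }>();
  segments.forEach((seg, i) => {
    const root = uf.find(i);
    const group = groups.get(root) ?? { names: [], minId: Infinity };
    group.names.push(...seg.aliases);
    group.minId = Math.min(group.minId, seg.segmentId);
    groups.set(root, group);
  });

  return segments.map((_, i) => {
    const group = groups.get(uf.find(i));
    if (!group) throw new Error("segment without a group");
    const first = [...group.names].sort()[0];
    // Named group: alphabetical-first alias. Unnamed group: N{minSegmentId}.
    return first !== undefined ? first : `N${group.minId}`;
  });
}
```

Because names are resolved per connected group rather than per coordinate, a single alias anywhere on a multi-segment wire is enough to name every segment in that group.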
- OPC midpoint matching: connect OPCs to wires via 5 candidate edge midpoints (right, left, top, bottom, loc), avoiding false unions from overlapping bboxes on dense schematics
- OPC pairing: union OPCs sharing the same pairingId for intra-page and cross-page net equivalence
- Multi-name netId: store all net table names per wireId (not just the last one) so hierarchy preference can resolve ties (e.g., VOLUP vs GPIO2 on same netId, hierarchy picks VOLUP)
- Sentinel pin handling: process 0xFFFFFFFF pins that have a coordinate-resolved net name instead of dropping them
- Library strLst: fall back to uint32 count when uint16 fails
- PropPairs threading: capture short prefix name/value pairs in GraphicInst for OPC label extraction
- Add 20 DSN debug/inspection scripts
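The midpoint matching in the first bullet can be illustrated with a small sketch. The types, the string-keyed endpoint map, and the assumption that y grows downward are all hypothetical choices for this example.

```typescript
interface Point {
  x: number;
  y: number;
}

interface BBox {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// The 5 candidate attachment points for an off-page connector:
// right, left, top and bottom edge midpoints, plus its own location.
function opcCandidatePoints(bbox: BBox, loc: Point): Point[] {
  const midX = (bbox.x1 + bbox.x2) / 2;
  const midY = (bbox.y1 + bbox.y2) / 2;
  return [
    { x: Math.max(bbox.x1, bbox.x2), y: midY }, // right
    { x: Math.min(bbox.x1, bbox.x2), y: midY }, // left
    { x: midX, y: Math.min(bbox.y1, bbox.y2) }, // top (assumes y grows downward)
    { x: midX, y: Math.max(bbox.y1, bbox.y2) }, // bottom
    loc,
  ];
}

// Look up the first candidate that coincides with a wire endpoint.
// `wireEndpoints` maps "x,y" keys to wire ids (a hypothetical index).
function findWireAt(candidates: Point[], wireEndpoints: Map<string, number>): number | undefined {
  for (const p of candidates) {
    const wireId = wireEndpoints.get(`${p.x},${p.y}`);
    if (wireId !== undefined) return wireId;
  }
  return undefined;
}
```

Testing only these five points, instead of any overlap with the bounding box, is what keeps a wire that merely passes through a neighbouring connector's bbox from being unioned with it.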
The OPC label extraction via propPairs was a dead end: the labels are already in the net table (multiple names per netId), not in the short prefix property pairs. Remove the propPairs parameter from autoReadPrefixes, readPrefixes, readSinglePrefixShort, parseGraphicInstBase, and the GraphicInst interface.
Add disambiguateCrossPageNets() to resolve duplicate net names across pages using hierarchy-suffixed names matched by sort order (two-pointer). Brings reServer J401 v11 from 89.1% to 100% net coverage. Cleanup: remove unused parseLibraryStrLst, no-op sanitizeCheckpoints, dead symbolDisplayProps loop in buildComponents, stale comments. Fix in-place mutation of pageCoordMaps in buildNetConnectivity. Aggregate DSN coverage: 99.8% (4594/4601), 0 extra.
Handle direct pin-to-pin overlaps (no wire) by grouping sentinel pins
at shared coordinates and matching them to unmatched N{number} hierarchy
names. This closes the last 7 coverage gaps (LAUNCHXL 2, CutiePi 5),
bringing DSN-to-DAT parity to 100% (4601/4601 nets across 10 fixtures).
Refactor assembleNets into composed functions: classifyPins separates
pins into 4 categories, resolveNetIdName and resolveWirelessSentinelNets
handle naming independently, and assembleNets orchestrates.
Also remove stale sanitizeCheckpoints calls from debug scripts.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Delete 24 one-off debug scripts written during DSN parser development. All functionality absorbed into dsn-inspect.ts as 8 new commands: nettable, symbols, wire, wiretrace, conflicts, hierarchy, streams, stream. Three scripts remain: dsn-inspect.ts (13 commands covering full DSN visibility), dsn-coverage-report.ts, and dsn-gap-analysis.ts. Also adds tsconfig.check.json for type-checking scripts alongside src, fixes no-explicit-any lint errors in coverage/gap scripts, and updates AGENTS.md documentation.
Update package.json scripts to use tsconfig.check.json for type-checking and include scripts/**/*.ts in lint targets. Fix zod indentation. Remove obsolete cadence.zip test fixture archive.
Route parseCadenceDesign on file extension: .dsn goes to DSN binary parser, .dat goes to DAT parser. No internal fallback; the agent controls which path to use. Remove discovery error for DSN designs without DAT files since DSN parsing no longer requires them. Golden tests always parse via DAT (gold standard) by resolving .dsn paths to their pstxnet.dat.
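As a rough illustration of extension-based routing with no internal fallback, a sketch along these lines would do. The function name and return type are hypothetical, and the case-insensitive comparison is an assumption.

```typescript
type CadenceSource = "dsn" | "dat";

// Pick the parser purely from the file extension; anything else is an error
// rather than a silent fallback to the other parser.
function selectCadenceParser(filePath: string): CadenceSource {
  const lower = filePath.toLowerCase();
  if (lower.endsWith(".dsn")) return "dsn";
  if (lower.endsWith(".dat")) return "dat";
  throw new Error(`Unsupported Cadence file extension: ${filePath}`);
}
```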
…N output
Parse Packages/{name} OLE streams to extract Device.pinMap, resolving
physical pin numbers (e.g., BGA A1, QFN pad numbers) instead of using
sequential indices. Set component mpn to sourcePackage as a baseline.
Add explore-packages and verify-pin-numbers scripts.
… matching
Add findPinMap with three matching strategies: direct sourcePackage, multi-unit pkgName extraction (handles doubled unit refs like AA→A), and version suffix normalization (_0_ → _0.0_). Pin number match rate improves from 86% to 90% aggregate across all fixtures.
…t fields
Add dns?: boolean to ComponentDetails, set by parsers via isDnsComponent(). Strip DNS/DNI/DNP/DNF/NF marker tokens from mpn and value fields after detection. Altium parser now checks Assembly Info parameter for NF markers. Service and traversal layers read component.dns instead of recomputing. Add --all flag to gen-golden.ts for batch regeneration.
Add NC (Not Connected) and DNM (Do Not Mount) to DNS_PATTERN and stripDnsMarkers. CutiePi uses _NC suffix, LAUNCHXL-CC1310 uses _DNM. Strip marker and trailing content from mid-string positions too.
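A hedged sketch of marker detection and stripping under these rules. The regex, the delimiter set, and the function names are assumptions for illustration; the actual DNS_PATTERN in the PR may differ.

```typescript
// A marker counts only as a whole token: at the start of the string or after a
// delimiter, and followed by the end of the string or another delimiter.
const DNS_TOKEN = /(?:^|[\s_\-(])(?:DNS|DNI|DNP|DNF|DNM|NF|NC)(?=$|[\s_\-)])/i;

// True when the field carries a do-not-stuff style marker.
function hasDnsMarker(field: string): boolean {
  return DNS_TOKEN.test(field);
}

// Remove the marker and everything after it, including mid-string markers.
function stripDnsMarker(field: string): string {
  const match = DNS_TOKEN.exec(field);
  return match ? field.slice(0, match.index).trim() : field;
}
```

Requiring a delimiter before the token is what keeps a value such as 100nF from being read as an NF marker.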
Parse Library stream strLst string table and resolve PlacedInstance prefix property pairs to extract real MPN and Value fields. Parse LibraryPart SymbolPins from Package streams for functional pin names.
- Extract PrefixPropertyPair from short prefix in generic-parser
- Add library-parser.ts for Library stream strLst extraction
- Add parseSymbolPin/parseLibraryPart to structures.ts
- Use T0x10.pinIndex (1-based) for accurate pin map lookup
- Value: 3-source priority (prefix > partValueIdx > LibraryPart default)
- MPN: prefix properties with fallback to sourcePackage
- Pin names: LibraryPart SymbolPin lookup via pinIndex
- Add field-level coverage (Value, PinNum, PinName, MPN) to coverage report
The hook was scanning every word in a command, including arguments and string literals. Words like 'more' in a commit message triggered false positives. Now it breaks after finding the first command name per segment.
Three complementary fixes:
- Scan Cache stream for Package/Device structures as fallback pinMap source
- Merge multi-unit component pins (e.g. resistor packs with multiple PlacedInstances)
- Expand findPinMap matching with trailing _N suffix stripping and dual-key indexing
…Name to 85.9%
Replace brute-force byte scanning with sequential parsing based on OpenOrCadParser StreamCache.cpp reference. The Cache is now parsed entry-by-entry from header to EOF, extracting both Package (pin maps) and LibraryPart (pin names) structures. BB-Black PinNum jumps from 66.7% to 99.3%, PinName from 0% to 98.4%.
Cache recovery scanner, findCachedPart matching, pin name uppercasing/disambiguation, DNS value cleanup, and unit "A" fallback bring coverage to PinNum 99.1%, PinName 96.0%, Value 99.9%. Update docs/dsn-format.md with all findings. Add DSN parser reference section to CLAUDE.md pointing to C++ source and format spec.
DSN binary stores some values uppercased (e.g., 100PF) while pstchip.dat preserves original case (100pF). Count these as matches but flag them as "case-transformed" in the report output. Also add scripts/tsconfig.json to resolve IDE diagnostics for scripts.
…um 99.1% → 99.8%)
Multi-section components (resistor packs, transistor arrays) where all sections share the same pkgName now get the correct pinMap via positional dbId-order assignment instead of falling back to device "A" for all sections.
…nNum 99.8% → 100.0%)
Sentinel pins (netId=0xFFFFFFFF) that overlap power/ground global symbols were unresolvable because they connect via geometric overlap, not wires. Three fixes: (1) bbox containment matching connects globals to wire graph, (2) pairingId-based fallback resolves net names for pins inside global symbol bboxes across pages, (3) netId=0 pins with wire connections are no longer silently dropped. Also adds point-on-segment matching for pins on wire bodies (not just endpoints).
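The two geometric tests mentioned here can be sketched as below, assuming integer schematic coordinates so that exact comparisons are safe. The type and function names are illustrative, not taken from the PR.

```typescript
interface Pt {
  x: number;
  y: number;
}

interface Box {
  x1: number;
  y1: number;
  x2: number;
  y2: number;
}

// True if `p` lies on the segment a-b, endpoints included.
// The cross product checks collinearity; the range checks bound it to the segment.
function pointOnSegment(p: Pt, a: Pt, b: Pt): boolean {
  const cross = (b.x - a.x) * (p.y - a.y) - (b.y - a.y) * (p.x - a.x);
  if (cross !== 0) return false;
  return (
    p.x >= Math.min(a.x, b.x) &&
    p.x <= Math.max(a.x, b.x) &&
    p.y >= Math.min(a.y, b.y) &&
    p.y <= Math.max(a.y, b.y)
  );
}

// True if `p` lies inside or on the border of `box`, whatever the corner order.
function bboxContains(box: Box, p: Pt): boolean {
  return (
    p.x >= Math.min(box.x1, box.x2) &&
    p.x <= Math.max(box.x1, box.x2) &&
    p.y >= Math.min(box.y1, box.y2) &&
    p.y <= Math.max(box.y1, box.y2)
  );
}
```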
Pins connected directly to off-page connectors (no wire) were missing because the OPC had no wire endpoint to match. Resolves net names via OPC pairingId cross-page lookup: if the same OPC appears on another page with a wire connection, the net name propagates back.
…→ 100.0%)
GraphicInst first 8 bytes are strLst indices: name_str_idx (net name) and lib_str_idx (source library path), not unknown/pairing bytes. OPC net names are now resolved directly via strLst lookup, fixing pins connected to OPCs with no wire (e.g., U3.30/LOL, U3.32/LOR).
…inNum 100%)
When a physical package has more pads than the schematic symbol exposes (e.g., a 2-pin crystal in a 4-pad XTAL-CM200S package), the Packages/ stream pinMap contains all physical pads while the Cache stream pinMap contains only the schematic-level pins. The parser now stores Cache pinMaps separately and falls back to them when the Packages/ pinMap has more entries than the instance's T0x10 count. Also: coverage report now shows missing PinNum details in verbose mode, and CLAUDE.md updated with mandatory C++ reference workflow.
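The fallback rule can be sketched as a small selection function. Representing a pinMap as an array of pin-number strings is an assumption made for this example, as is the function name.

```typescript
// Choose which pinMap to use for one placed instance.
// `instancePinCount` is the number of T0x10 pin instances on that symbol.
function choosePinMap(
  packagesPinMap: readonly string[] | undefined,
  cachePinMap: readonly string[] | undefined,
  instancePinCount: number,
): readonly string[] | undefined {
  // Prefer the Packages/ stream when it does not list more pads than the symbol exposes.
  if (packagesPinMap && packagesPinMap.length <= instancePinCount) return packagesPinMap;
  // Otherwise fall back to the Cache pinMap, keeping Packages/ as a last resort.
  return cachePinMap ?? packagesPinMap;
}
```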
Extract 5 modules from the 1699-line monolith:
- cache-parser.ts: Cache stream parsing
- page-parser.ts: Page, Package, Hierarchy stream parsing
- pin-resolver.ts: pin number resolution (shared leaf module)
- net-builder.ts: wire graph connectivity and net assembly
- component-builder.ts: MPN, value, and pin name enrichment
Introduce PinMapData context type to replace threading 3 separate Maps through ~10 function signatures. Add CachedLibraryPart to structure-types.ts. dsn-parser.ts reduced to ~145-line orchestrator. Add OpenOrCadParser attribution to NOTICE and README.md.
Extract coverage analysis from dev script into src/coverage.ts so the compiled binary can compare DSN parser output against DAT netlist exports. Reports field-level parity (nets, comps, value, MPN, DNS, pinNum, pinName) with markdown file export and optional --verbose mode.
list_designs now returns pstxnet.dat as path (preferred) with source pointing to the .DSN schematic when exported .dat files exist. Extracts all server/tool description strings into src/descriptions.ts.
Extract the monolithic service.ts into focused modules under src/service/:
- service/index.ts (re-export hub)
- service/load-netlist.ts (design loading)
- service/component-grouping.ts (component aggregation)
- service/regex-helpers.ts (input validation)
- service/tools/ (one file per MCP tool handler)
Its only consumer is cadence-export.ts, so colocate it there.
…bar unescaping
Prefix matching used startsWith(), causing L to match LED, C to match CON, etc. Now uses getRefdesPrefix() with Set lookup for exact matches. Altium record parser now tries UTF-8 first and falls back to latin1 for Windows-1252 encoded files (fixes corrupted µ, ±, ° characters). Altium net names with overbar notation (\V\C\C) are now unescaped to plain text (VCC) so they can be queried normally.
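A small sketch of the exact-prefix lookup and the overbar unescaping. getRefdesPrefix is named in the commit, but its body here is an assumption (leading letters of the reference designator), and the other function names are hypothetical.

```typescript
// Leading alphabetic part of a reference designator, e.g. "LED3" -> "LED".
function getRefdesPrefix(refdes: string): string {
  const match = /^[A-Za-z]+/.exec(refdes);
  return (match?.[0] ?? "").toUpperCase();
}

// Exact prefix match via Set lookup, so "L" no longer matches "LED3".
function matchesRefdesPrefix(refdes: string, prefixes: ReadonlySet<string>): boolean {
  return prefixes.has(getRefdesPrefix(refdes));
}

// Drop the backslashes used for overbar notation so the net name is plain text.
function unescapeOverbar(netName: string): string {
  return netName.replace(/\\/g, "");
}
```

With startsWith("L"), both L12 and LED3 would be classified as inductors; comparing the whole extracted prefix against a Set removes that ambiguity.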
Summary
- … .DSN schematic files (CFBF/OLE container format), providing direct netlist extraction without Cadence's exported .dat files
- … --export-json, --coverage), DNS detection, Altium encoding/overbar fixes

Test plan
- npm run type-check passes
- npm run lint passes
- npm test passes (393 tests)