feat(parser): add Erlang/OTP parser for app.src, rebar.config, rebar.lock #762

Merged
mstykow merged 6 commits into main from feat/erlang-otp-parser
Apr 22, 2026

@mstykow (Owner) commented Apr 22, 2026

Summary

  • Add and harden the Erlang / OTP parser family for *.app.src, rebar.config, and rebar.lock, including map-aware metadata handling, alias-aware Rebar package identity, and bounded %PLACEHOLDER% template parsing for canonical OTP manifests such as diameter.app.src
  • Verify scorecard row 44 end to end with compare-outputs against processone/ejabberd, erlang/otp, and vernemq/vernemq, then record benchmark rows and mark the row 🟢 Verified
  • Refresh the generated parser-planning and supported-format docs so the shipped Erlang / OTP support is reflected in repo documentation
  • Keep weak bare-word GPL shorthand as clue-only evidence while confirming that stronger GPL and GPL-with-exception findings in the OTP compare lane still remain hard detections
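
As a rough illustration of the bounded %PLACEHOLDER% handling described in the first bullet, here is a minimal sketch. It is not the code from this PR: the names `strip_placeholders` and `placeholder_end`, the 64-character bound, and the `[A-Z0-9_]` placeholder alphabet are assumptions made for the example, and quoted atoms are ignored for brevity.

```rust
// Assumed upper bound on placeholder length; the real parser's bound is not
// stated in this PR description.
const MAX_PLACEHOLDER_LEN: usize = 64;

/// Hypothetical helper: drop `%NAME%` template placeholders that appear
/// outside double-quoted strings, leaving everything else untouched.
fn strip_placeholders(input: &str) -> String {
    let chars: Vec<char> = input.chars().collect();
    let mut out = String::with_capacity(input.len());
    let mut in_string = false;
    let mut i = 0;
    while i < chars.len() {
        let c = chars[i];
        if in_string {
            out.push(c);
            // Copy an escaped character verbatim so `\"` does not end the string.
            if c == '\\' && i + 1 < chars.len() {
                out.push(chars[i + 1]);
                i += 2;
                continue;
            }
            if c == '"' {
                in_string = false;
            }
            i += 1;
            continue;
        }
        if c == '"' {
            in_string = true;
        } else if c == '%' {
            if let Some(end) = placeholder_end(&chars, i) {
                // Skip the whole `%NAME%` run, including both delimiters.
                i = end + 1;
                continue;
            }
        }
        out.push(c);
        i += 1;
    }
    out
}

/// Returns the index of the closing `%` if `chars[start..]` begins with a
/// non-empty, bounded run of `[A-Z0-9_]` wrapped in `%` delimiters.
fn placeholder_end(chars: &[char], start: usize) -> Option<usize> {
    let mut j = start + 1;
    while j < chars.len() && j - start <= MAX_PLACEHOLDER_LEN {
        let c = chars[j];
        if c == '%' {
            return if j > start + 1 { Some(j) } else { None };
        }
        if !(c.is_ascii_uppercase() || c.is_ascii_digit() || c == '_') {
            return None;
        }
        j += 1;
    }
    None
}
```

Under these assumptions, `{modules, [%MODULES%]}` becomes `{modules, []}`, while a placeholder inside a string such as `"%VSN%"` and an ordinary `%% comment` are left as they are.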

Issues

Scope and exclusions

  • Included:
    • *.app.src parsing for OTP application metadata, stdlib-filtered dependencies, runtime dependency extraction, and template-placeholder resilience
    • rebar.config parsing for Hex, git, alias, and profile-scoped dependencies
    • rebar.lock parsing for resolved package identity, git refs, and hash coverage
    • sibling merge coverage for rebar.config + rebar.lock
    • generated docs updates in docs/SUPPORTED_FORMATS.md, docs/implementation-plans/package-detection/PARSER_PLAN.md, docs/BENCHMARKS.md, and docs/benchmarks/scan-duration-vs-files.svg
    • compare-output verification runs for:
      • .provenant/compare-runs/20260422T183347Z-ejabberd-26578
      • .provenant/compare-runs/20260422T182619Z-otp-15523
      • .provenant/compare-runs/20260422T182920Z-vernemq-20484
  • Explicit exclusions:
    • no Erlang expression evaluation, variable expansion, or plugin execution
    • conditional dependency wrappers like {if_var_true, ...} stay skipped instead of being guessed at
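
The dependency shapes listed above (plain Hex, aliased Hex, git, and skipped conditional wrappers) could be classified roughly as follows. This is a sketch under stated assumptions, not the parser shipped in this PR: the `Term` and `Dep` types and the `classify_dep` / `classify_source` functions are hypothetical, and profile-scoped dependencies and git refs are left out to keep it short.

```rust
// Hypothetical, simplified model of already-parsed Erlang terms.
#[derive(Debug, Clone, PartialEq)]
enum Term {
    Atom(String),
    Str(String),
    Tuple(Vec<Term>),
}

// Hypothetical output type: `name` is the local application name, `package`
// is the real Hex package (they differ only for aliased dependencies).
#[derive(Debug, Clone, PartialEq)]
enum Dep {
    Hex { name: String, package: String, requirement: Option<String> },
    Git { name: String, url: String },
}

/// Classify one entry of a `deps` list. Returns `None` for shapes that are
/// skipped rather than guessed at, such as `{if_var_true, ...}` wrappers.
fn classify_dep(term: &Term) -> Option<Dep> {
    match term {
        // `cowboy`
        Term::Atom(name) => Some(Dep::Hex {
            name: name.clone(),
            package: name.clone(),
            requirement: None,
        }),
        Term::Tuple(items) => match items.as_slice() {
            // Conditional wrapper: no expression evaluation, so skip it.
            [Term::Atom(tag), ..] if tag.as_str() == "if_var_true" => None,
            // `{cowboy, "2.9.0"}`
            [Term::Atom(name), Term::Str(req)] => Some(Dep::Hex {
                name: name.clone(),
                package: name.clone(),
                requirement: Some(req.clone()),
            }),
            // `{name, {pkg, real_name}}` or `{name, {git, Url, Ref}}`
            [Term::Atom(name), Term::Tuple(src)] => classify_source(name, None, src),
            // `{name, "1.0.0", {pkg, real_name}}`
            [Term::Atom(name), Term::Str(req), Term::Tuple(src)] => {
                classify_source(name, Some(req.clone()), src)
            }
            _ => None,
        },
        _ => None,
    }
}

fn classify_source(name: &str, requirement: Option<String>, src: &[Term]) -> Option<Dep> {
    match src {
        [Term::Atom(kind), Term::Atom(pkg)] if kind.as_str() == "pkg" => Some(Dep::Hex {
            name: name.to_string(),
            package: pkg.clone(),
            requirement,
        }),
        [Term::Atom(kind), Term::Str(url), ..] if kind.as_str() == "git" => Some(Dep::Git {
            name: name.to_string(),
            url: url.clone(),
        }),
        _ => None,
    }
}
```

Returning `None` for anything unrecognized mirrors the exclusion above: an entry that would need evaluation is dropped instead of being turned into a guessed dependency.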

Intentional differences from Python

  • This remains a net-new Provenant parser family with no Python ScanCode package-parser reference implementation for Erlang / OTP
  • Weak bare-word GPL shorthand such as the clue-only gpl_bare_word_only.RULE stays demoted to clue evidence; OTP triage confirmed that stronger GPL and GPL-with-exception findings still surface as detections rather than being demoted with it

Follow-up work

  • Created or intentionally deferred:
    • none

Expected-output fixture changes

  • Files changed: testdata/erlang-otp-golden/*.expected, testdata/assembly-golden/erlang-otp-basic/expected.json
  • Why the new expected output is correct:
    • the parser and assembly expectations were generated from current Rust output for representative Erlang fixtures, and rerunning the Erlang parser goldens plus scan/assembly contract tests after the %PLACEHOLDER% parser fix required no further golden updates

mstykow and others added 6 commits April 22, 2026 19:03

…lock

Add net-new Erlang/OTP package metadata support with three parsers
backed by a native Erlang term parser. No Python ScanCode reference
exists for this ecosystem.

Signed-off-by: Maxim Stykow <stykowmaxim@meta.com>

Keep map-bearing OTP metadata from falling back and preserve real Hex package identity for aliased rebar dependencies.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Lock in the rebar.config plus rebar.lock contract so dependency hoisting and assembly output stay stable across refactors.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Document the map, alias, and git_subdir behavior shipped with the parser fixes so the improvement notes stay aligned with the code.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Strip bounded %PLACEHOLDER% macro runs outside strings so canonical OTP app.src templates keep parsing while weak bare-word GPL hits remain clue-only evidence.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

Record the ejabberd, OTP, and VerneMQ compare runs, regenerate the benchmark chart, and mark scorecard row 44 verified after triaging the remaining deltas.

Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)

Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
Signed-off-by: Maxim Stykow <maxim.stykow@gmail.com>

mstykow force-pushed the feat/erlang-otp-parser branch from 94c3318 to da94654 on April 22, 2026 18:40
mstykow enabled auto-merge (rebase) on April 22, 2026 18:44
mstykow merged commit 2c08a71 into main on Apr 22, 2026
15 checks passed
mstykow deleted the feat/erlang-otp-parser branch on April 22, 2026 18:51