17 changes: 10 additions & 7 deletions docs/BENCHMARKS.md
@@ -11,7 +11,7 @@ The chart below uses a log-log scatter plot: file count on the x-axis, wall-cloc

![Scan duration vs. file count for Provenant and ScanCode](benchmarks/scan-duration-vs-files.svg)

> Provenant is faster on 136 of 138 recorded runs, with a **11.6× median speedup** and **10.1× geometric-mean speedup** overall; the median gap grows from **6.4×** on sub-100-file targets to **19.7×** on 10k+ file targets.
> Provenant is faster on 139 of 141 recorded runs, with a **11.7× median speedup** and **10.2× geometric-mean speedup** overall; the median gap grows from **6.4×** on sub-100-file targets to **20.1×** on 10k+ file targets.
> Generated from the benchmark timing rows in this document via `cargo run --manifest-path xtask/Cargo.toml --bin generate-benchmark-chart`.

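The summary line above reports a win count, a median speedup, and a geometric-mean speedup. As a rough illustration of how such figures could be derived from timing pairs, here is a minimal Python sketch. It is not the project's `generate-benchmark-chart` tool; the names `SAMPLE_RUNS` and `summarize` are hypothetical, and the sample uses only three rows visible in this diff, so it will not reproduce the headline numbers.

```python
from statistics import geometric_mean, median

# (provenant_seconds, scancode_seconds) pairs copied from three timing rows
# in this diff. The real summary covers every recorded run, so this sample
# is illustrative only.
SAMPLE_RUNS = [
    (9.28, 80.85),      # r-lib/devtools
    (14.46, 178.35),    # tidyverse/ggplot2
    (135.93, 3197.26),  # erlang/otp
]


def summarize(runs):
    """Return (faster_count, median_speedup, geometric_mean_speedup)."""
    # Speedup is ScanCode wall-clock time divided by Provenant wall-clock time.
    speedups = [scancode / provenant for provenant, scancode in runs]
    faster_count = sum(1 for s in speedups if s > 1.0)
    return faster_count, median(speedups), geometric_mean(speedups)


faster, med, geo = summarize(SAMPLE_RUNS)
print(
    f"faster on {faster} of {len(SAMPLE_RUNS)} runs, "
    f"{med:.1f}x median, {geo:.1f}x geometric mean"
)
```

The geometric mean is the usual choice for averaging ratios, since a single very large speedup pulls it up less than it would an arithmetic mean.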
## Current benchmark examples
@@ -53,13 +53,16 @@ The tables below provide the per-target detail behind the chart. Each row is one
| [r-lib/devtools @ a3447b9](https://github.com/r-lib/devtools/tree/a3447b9f3d59abb6cc8b63a54db3435819324c1e)<br>266 files | 2026-04-19 · devtools-24729 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 4 proc | Provenant: 9.28s<br>ScanCode: 80.85s<br>**8.71× faster (-88.5%)** | Far broader CRAN package and dependency extraction (`14` vs `1` packages, `45` vs `1` dependencies) from the root `DESCRIPTION` plus committed test-package fixtures, with correct filtering of fake `pkg:cran/R` dependency noise and cleaner maintainer or URL normalization |
| [tidyverse/ggplot2 @ 7d79c95](https://github.com/tidyverse/ggplot2/tree/7d79c956b5707cb7c762d834caf842dc6496b032)<br>1,154 files | 2026-04-19 · ggplot2-95481 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 4 proc | Provenant: 14.46s<br>ScanCode: 178.35s<br>**12.33× faster (-91.9%)** | Direct CRAN package visibility on the root `DESCRIPTION` plus declared dependency extraction (`41` vs `0`) across `Imports`, `Suggests`, and `Enhances`, with correct hyphenated CRAN version constraints such as `sf (>= 0.7-3)` and cleaner Rd or roxygen URL recovery |

#### Hex / Elixir
#### Hex / Elixir / Erlang / OTP

| Target snapshot | Run context | Timing snapshot | Advantages over ScanCode |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | -------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [elixir-ecto/ecto @ 28d9282](https://github.com/elixir-ecto/ecto/tree/28d928267388018d5b0bb1f83e04368b7e8cae50)<br>156 files | 2026-04-22 · ecto-26520 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 14.03s<br>ScanCode: 135.56s<br>**9.66× faster (-89.7%)** | Broader Hex dependency extraction (`16` vs `0`) from the repo-root `mix.lock` plus `examples/friends/mix.lock`, with direct locked package identities for entries such as `ecto_sql`, `postgrex`, and `telemetry` that ScanCode leaves dependency-blind |
| [elixir-plug/plug @ 47649aa](https://github.com/elixir-plug/plug/tree/47649aa7bb910f481b66cc3e98c14b2c3b761c3c)<br>104 files | 2026-04-22 · plug-22829 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 10.77s<br>ScanCode: 92.08s<br>**8.55× faster (-88.3%)** | Direct Hex package visibility on `mix.lock` (`1` vs `0`) plus locked dependency extraction (`9` vs `0`) for `plug_crypto`, `telemetry`, `ex_doc`, and sibling Hex pins that ScanCode leaves at zero, with Unicode-preserving `Loïc Hoguin` holder normalization |
| [phoenixframework/phoenix @ e7b8081](https://github.com/phoenixframework/phoenix/tree/e7b8081792fa51c9fede6d0fb9ddb610bac3f26f)<br>476 files | 2026-04-22 · phoenix-13265 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 12.80s<br>ScanCode: 149.17s<br>**11.66× faster (-91.4%)** | Direct Hex package visibility on the repo-root, `installer/mix.lock`, and `integration_test/mix.lock` surfaces (`3` vs `0` file-level package records), while keeping top-level package and dependency counts aligned elsewhere and preserving structured npm party metadata |
| Target snapshot | Run context | Timing snapshot | Advantages over ScanCode |
| -------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------- | ---------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| [elixir-ecto/ecto @ 28d9282](https://github.com/elixir-ecto/ecto/tree/28d928267388018d5b0bb1f83e04368b7e8cae50)<br>156 files | 2026-04-22 · ecto-26520 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 14.03s<br>ScanCode: 135.56s<br>**9.66× faster (-89.7%)** | Broader Hex dependency extraction (`16` vs `0`) from the repo-root `mix.lock` plus `examples/friends/mix.lock`, with direct locked package identities for entries such as `ecto_sql`, `postgrex`, and `telemetry` that ScanCode leaves dependency-blind |
| [elixir-plug/plug @ 47649aa](https://github.com/elixir-plug/plug/tree/47649aa7bb910f481b66cc3e98c14b2c3b761c3c)<br>104 files | 2026-04-22 · plug-22829 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 10.77s<br>ScanCode: 92.08s<br>**8.55× faster (-88.3%)** | Direct Hex package visibility on `mix.lock` (`1` vs `0`) plus locked dependency extraction (`9` vs `0`) for `plug_crypto`, `telemetry`, `ex_doc`, and sibling Hex pins that ScanCode leaves at zero, with Unicode-preserving `Loïc Hoguin` holder normalization |
| [erlang/otp @ 264def5](https://github.com/erlang/otp/tree/264def545b8214ea7100bfede1a4629c676ff1c0)<br>11,749 files | 2026-04-22 · otp-15523 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 4 proc | Provenant: 135.93s<br>ScanCode: 3197.26s<br>**23.52× faster (-95.7%)** | Direct OTP application package visibility (`11` vs `0`) across committed `lib/*/src/*.app.src` templates, with bounded `%PLACEHOLDER%` handling that keeps canonical manifests such as `diameter.app.src` scannable and preserves the same non-stdlib runtime dependency inventory ScanCode finds |
| [phoenixframework/phoenix @ e7b8081](https://github.com/phoenixframework/phoenix/tree/e7b8081792fa51c9fede6d0fb9ddb610bac3f26f)<br>476 files | 2026-04-22 · phoenix-13265 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 10 proc | Provenant: 12.80s<br>ScanCode: 149.17s<br>**11.66× faster (-91.4%)** | Direct Hex package visibility on the repo-root, `installer/mix.lock`, and `integration_test/mix.lock` surfaces (`3` vs `0` file-level package records), while keeping top-level package and dependency counts aligned elsewhere and preserving structured npm party metadata |
| [processone/ejabberd @ 87475d8](https://github.com/processone/ejabberd/tree/87475d813b974492f338720eab5c9c3d4646a4ce)<br>623 files | 2026-04-22 · ejabberd-26578 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 4 proc | Provenant: 16.74s<br>ScanCode: 214.30s<br>**12.80× faster (-92.2%)** | Broader Erlang/Rebar package and dependency extraction (`2` vs `1` packages, `43` vs `3` dependencies) from the root `rebar.config`, `rebar.lock`, nested `_checkouts/configure_deps` manifests, and committed Dockerfiles, with the bundled `priv/mod_invites/copyright` notice kept as clue-level license evidence instead of being overstated as Debian package metadata |
| [vernemq/vernemq @ 4681e54](https://github.com/vernemq/vernemq/tree/4681e5490cc42e6cc26a504bb4b3c5413315c21f)<br>441 files | 2026-04-22 · vernemq-20484 · macOS 26.3.1 · Apple M1 Max · 32 GB · arm64 · 4 proc | Provenant: 13.90s<br>ScanCode: 149.29s<br>**10.74× faster (-90.7%)** | Broader Erlang/Rebar dependency extraction (`119` vs `0`) from the repo-root and per-app `rebar.config` / `.app.src` manifests, plus direct `.gitmodules` package visibility and mixed Hex or git package identity across the VerneMQ app tree where ScanCode stays manifest-blind |

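Each "Timing snapshot" cell in the tables above pairs a speedup factor with a percentage change in wall-clock time. The sketch below shows arithmetic that is consistent with the published cells, assuming the speedup is ScanCode time divided by Provenant time and the percentage is Provenant's change relative to ScanCode. The function name `timing_summary` is hypothetical, and the sketch only covers the case where Provenant is the faster tool.

```python
def timing_summary(provenant_s: float, scancode_s: float) -> str:
    """Format a timing cell such as '8.71× faster (-88.5%)'.

    Assumes provenant_s < scancode_s; wording for slower runs is not handled.
    """
    # How many times longer ScanCode took than Provenant.
    speedup = scancode_s / provenant_s
    # Relative change in wall-clock time, negative when Provenant is faster.
    change_pct = (provenant_s - scancode_s) / scancode_s * 100.0
    return f"{speedup:.2f}× faster ({change_pct:.1f}%)"


# Rows taken from the tables in this diff.
print(timing_summary(9.28, 80.85))      # r-lib/devtools
print(timing_summary(135.93, 3197.26))  # erlang/otp
```

For the `r-lib/devtools` row this gives 80.85 / 9.28 ≈ 8.71 and (9.28 − 80.85) / 80.85 ≈ −88.5%, matching the cell in the table.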
#### JavaScript / TypeScript / web stacks

3 changes: 3 additions & 0 deletions docs/SUPPORTED_FORMATS.md
@@ -82,7 +82,10 @@ Provenant supports package manifests, installed-package metadata, recognizers, a
| Hackage cabal.project workspace file | `**/cabal.project` | hackage | Haskell | [Link](https://cabal.readthedocs.io/en/stable/cabal-project-description-file.html) |
| Haxe haxelib.json package manifest | `**/haxelib.json` | haxe | Haxe | [Link](https://lib.haxe.org/documentation/creating-a-haxelib-package/) |
| Helm chart metadata | `**/Chart.yaml, **/Chart.lock` | helm | YAML | [Link](https://helm.sh/docs/topics/charts/) |
| Erlang OTP application resource file | `**/*.app.src` | hex | Erlang | [Link](https://www.erlang.org/doc/apps/kernel/application) |
| Hex mix.lock lockfile | `**/mix.lock` | hex | Elixir | [Link](https://hexdocs.pm/mix/Mix.Tasks.Deps.html) |
| Rebar3 configuration | `**/rebar.config` | hex | Erlang | [Link](https://rebar3.org/docs/configuration/configuration/) |
| Rebar3 lockfile | `**/rebar.lock` | hex | Erlang | [Link](https://rebar3.org/docs/configuration/configuration/) |
| Julia Manifest.toml resolved dependencies | `**/Manifest.toml` | julia | Julia | [Link](https://pkgdocs.julialang.org/v1/toml-files/) |
| Julia Project.toml manifest | `**/Project.toml` | julia | Julia | [Link](https://pkgdocs.julialang.org/v1/toml-files/) |
| Linux OS release metadata file | `*etc/os-release, *usr/lib/os-release` | linux-distro | | [Link](https://www.freedesktop.org/software/systemd/man/os-release.html) |