docs: add mdBook documentation site with Pages deploy #37

Merged
Arkptz merged 19 commits into main from docs/mdbook
Apr 24, 2026

Arkptz (Owner) commented Apr 22, 2026

Summary

Add a complete mdBook-powered documentation site at https://arkptz.github.io/mitm2openapi/.
Migrates detailed content from README into 13 structured chapters covering installation,
traffic capture, the discover-curate-generate pipeline, CLI reference, resource limits,
strict mode, reports, formats, benchmarks, security model, and diagnostics.

Type of Change

  • Documentation update
  • CI / build / tooling

Checklist

  • cargo fmt --all clean
  • cargo clippy --all-targets --all-features -- -D warnings clean
  • cargo test passes locally
  • Added or updated tests for the change
  • Updated README / CLI help text if user-facing behavior changed
  • Added entry to CHANGELOG.md under [Unreleased] (if user-facing)
  • Used conventional commit style in commit messages (feat:, fix:, docs:, etc.)

What's included

  • book/ — mdBook scaffold with book.toml, SUMMARY.md, and 13 chapter source files
  • .github/workflows/docs.yml — build + Pages deploy workflow
  • README.md — trimmed, detailed content moved to book, docs badge added
  • CHANGELOG.md / CONTRIBUTING.md — H1 headings adjusted for book chapter inclusion

Review feedback addressed

Review found several doc-vs-code mismatches and two missing input validations. All corrected:

Code fixes:

  • Symlink-to-directory inputs now rejected (was bypassing validation via is_dir() following symlinks)
  • Symlinked directory entries are also filtered in both mitmproxy and HAR dir readers
  • HAR reader now enforces the same header size caps (8 KiB name, 64 KiB value) as the mitmproxy reader

Doc corrections:

  • Strict mode: documented that only parse_error counter is currently populated; cap_fired/rejected marked as reserved
  • Templates YAML examples: added missing x-path-templates: top-level key
  • Parameter placeholder: {parameter} → {id} (matches actual output)
  • Download URL: added version segment to prevent 404
  • TOCTOU section: removed fd-based recheck claim, documented path-based approach honestly
  • Report field: flow_rejected → rejected (matches struct field name)
  • Pipeline merging: corrected "request body schemas are unioned" to "request body from first observation"
  • TNetString error: clarified that parse error halts file processing (no resync)
  • Introduction: replaced "milliseconds" overclaim with benchmark-backed speedup ratio
  • CLI reference: auto-detect uses extension + content, not content-only
  • Benchmarks: removed duplicate H1 from included file

Testing

  • cargo test --lib passes (235 tests)
  • cargo test --test security passes (8 tests including new symlink coverage)
  • cargo build --release succeeds
  • cargo publish --dry-run clean
  • Author/content leak checks pass

Before merging

Enable GitHub Pages in Settings → Pages with Source = "GitHub Actions" (one-time manual step).

Arkptz added 6 commits April 23, 2026 02:10
Remove H1 headings from both files so that mdBook chapters can own the
top-level heading via include directives without creating duplicate H1s.
GitHub renders H2-first content correctly (uses filename as implicit title).

13 chapters covering installation, quick start, traffic capture,
the discover-curate-generate pipeline, filtering, resource limits,
strict mode, reports, CLI reference, mitmproxy/HAR formats,
benchmarks, security model, and diagnostics.

Includes CHANGELOG and CONTRIBUTING via include directives,
mdbook-admonish CSS assets, and Cargo.toml exclude for book/docs
to keep crate tarball lean.

Slim README from ~274 to ~92 lines. Detailed CLI reference tables,
resource limits, diagnostics, supported formats, and migration guide
now live in the book at arkptz.github.io/mitm2openapi/. Adds docs
badge pointing to the published site.

Arkptz added 12 commits April 25, 2026 01:13
The cap_fired and rejected counters exist in the report schema but are
not wired to reader pipelines yet. Update strict-mode.md and reports.md
to reflect actual runtime behavior and mark unused counters as reserved.

stream_input checked is_dir() before validate_input_path, but is_dir()
follows symlinks — so a symlink to a directory bypassed the symlink
rejection even with --allow-symlinks=false.

Fix: check symlink_metadata() before the is_dir() branch. Also add
_no_symlinks variants for directory iteration in both mitmproxy and HAR
readers so individual entries inside a directory are checked too.
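The ordering described above can be sketched as follows. This is a minimal illustration, not the project's actual code: `classify_input`, `entries_no_symlinks`, and `InputKind` are hypothetical names, and only `symlink_metadata()` and `is_dir()` come from the commit message. The point is that the link itself is inspected before anything follows it.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// What an accepted input path turned out to be (hypothetical type).
#[derive(Debug, PartialEq, Eq)]
pub enum InputKind {
    File,
    Directory,
}

/// Hypothetical helper: reject symlinks before asking whether the path is a directory.
pub fn classify_input(path: &Path, allow_symlinks: bool) -> io::Result<InputKind> {
    // symlink_metadata() does not follow symlinks, so it describes the link itself.
    let link_meta = fs::symlink_metadata(path)?;
    if link_meta.file_type().is_symlink() && !allow_symlinks {
        return Err(io::Error::new(
            io::ErrorKind::InvalidInput,
            format!("symlink input rejected: {}", path.display()),
        ));
    }
    // metadata() follows symlinks; by now that is either allowed or irrelevant.
    let meta = fs::metadata(path)?;
    if meta.is_dir() {
        Ok(InputKind::Directory)
    } else {
        Ok(InputKind::File)
    }
}

/// Hypothetical directory walk that skips symlinked entries.
pub fn entries_no_symlinks(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut out = Vec::new();
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        // DirEntry::file_type() reports the entry itself and does not traverse symlinks.
        if entry.file_type()?.is_symlink() {
            continue;
        }
        out.push(entry.path());
    }
    out.sort();
    Ok(out)
}
```

Both checks are path-based, so a small window remains between the check and the later open; the sketch does not attempt an fd-based recheck.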

Add tests for:
- symlink pointing to a directory is rejected by validate_input_path
- symlinked .flow entries inside a directory are skipped
- symlinked .har entries inside a directory are skipped

HAR reader passed headers through without size checks. Mitmproxy reader
already enforced MAX_HEADER_NAME_SIZE (8 KiB, drop) and
MAX_HEADER_VALUE_SIZE (64 KiB, truncate). Apply the same caps to HAR
entries so the documented per-field limits are enforced uniformly.
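A rough sketch of the drop/truncate policy described above. The constant names and sizes are taken from the commit message; `cap_headers`, the `(String, String)` header representation, and the choice to treat a value of exactly 64 KiB as within the limit are assumptions made for illustration.

```rust
/// Caps quoted in the commit message: 8 KiB for names, 64 KiB for values.
pub const MAX_HEADER_NAME_SIZE: usize = 8 * 1024;
pub const MAX_HEADER_VALUE_SIZE: usize = 64 * 1024;

/// Hypothetical helper: drop headers with oversized names, truncate oversized values.
pub fn cap_headers(headers: Vec<(String, String)>) -> Vec<(String, String)> {
    headers
        .into_iter()
        // Oversized name: the whole header is dropped.
        .filter(|(name, _)| name.len() <= MAX_HEADER_NAME_SIZE)
        .map(|(name, mut value)| {
            if value.len() > MAX_HEADER_VALUE_SIZE {
                // Back up to a char boundary so truncate() cannot panic on multi-byte UTF-8.
                let mut cut = MAX_HEADER_VALUE_SIZE;
                while !value.is_char_boundary(cut) {
                    cut -= 1;
                }
                value.truncate(cut);
            }
            (name, value)
        })
        .collect()
}
```

Running both readers through one shared function like this is one way to keep the per-field limits uniform; whether the project shares code between readers is not stated in the PR.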

Template YAML examples were missing the top-level x-path-templates key
that the tool actually writes. Also fix {parameter} -> {id} to match
the actual parameterization output (single param = {id}, multiple =
{id1}, {id2}, ...). Fix response merging description: request body
comes from first observation only.
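To make the naming rule concrete, here is a small sketch of the placeholder scheme (single parameter = {id}, several = {id1}, {id2}, ...). The function name `parameterize` and the rule that only all-digit path segments count as parameters are assumptions for illustration; the tool's real heuristics for choosing segments are not described in this PR.

```rust
/// Assumption for this sketch: a segment is a parameter if it is all ASCII digits.
fn is_param(segment: &str) -> bool {
    !segment.is_empty() && segment.chars().all(|c| c.is_ascii_digit())
}

/// Hypothetical helper: replace parameter segments with {id}, or {id1}, {id2}, ...
pub fn parameterize(path: &str) -> String {
    let total = path.split('/').filter(|s| is_param(s)).count();
    let mut seen = 0;
    path.split('/')
        .map(|segment| {
            if !is_param(segment) {
                return segment.to_string();
            }
            seen += 1;
            if total == 1 {
                // A single parameter gets the bare name.
                "{id}".to_string()
            } else {
                // Several parameters are numbered in order of appearance.
                format!("{{id{}}}", seen)
            }
        })
        .collect::<Vec<_>>()
        .join("/")
}
```

Under these assumptions, `parameterize("/users/42")` yields `/users/{id}` and `parameterize("/users/42/orders/7")` yields `/users/{id1}/orders/{id2}`.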

The release workflow packages archives as mitm2openapi-<tag>-<target>,
but the install docs used a URL without the version segment, which 404s.

The security docs claimed fd-based metadata checks prevent TOCTOU races.
In reality, validate_input_path uses path-based symlink_metadata() and
metadata() before opening the file. Document the small TOCTOU window
honestly and note fd-based recheck as a future enhancement.

The report struct field is named 'rejected', not 'flow_rejected'. Fix
the diagnostics reference table to match the actual JSON key.

The docs implied processing continues after a tnetstring parse error.
In reality, the iterator sets done=true and the rest of the file is
dropped. Clarify this in both resource-limits.md and diagnostics.md.
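The halt-on-error behavior can be illustrated with a toy frame iterator. This is a simplified sketch, not the project's reader: `FrameIter` and `ParseError` are hypothetical, and the parser only handles flat `<len>:<payload><type byte>` frames. The `done` flag mirrors the one mentioned in the commit message: after the first error the iterator yields nothing further, even if later bytes would parse.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct ParseError(pub &'static str);

/// Hypothetical iterator over top-level `<len>:<payload><type>` frames.
pub struct FrameIter<'a> {
    buf: &'a [u8],
    pos: usize,
    done: bool,
}

impl<'a> FrameIter<'a> {
    pub fn new(buf: &'a [u8]) -> Self {
        Self { buf, pos: 0, done: false }
    }

    fn parse_one(&mut self) -> Result<&'a [u8], ParseError> {
        let buf: &'a [u8] = self.buf;
        let rest = &buf[self.pos..];
        let colon = rest
            .iter()
            .position(|&b| b == b':')
            .ok_or(ParseError("missing ':'"))?;
        let len = std::str::from_utf8(&rest[..colon])
            .ok()
            .and_then(|s| s.parse::<usize>().ok())
            .ok_or(ParseError("bad length prefix"))?;
        let start = colon + 1;
        let end = start.checked_add(len).ok_or(ParseError("length overflow"))?;
        // The payload must be followed by one type-tag byte.
        if end >= rest.len() {
            return Err(ParseError("truncated frame"));
        }
        self.pos += end + 1;
        Ok(&rest[start..end])
    }
}

impl<'a> Iterator for FrameIter<'a> {
    type Item = Result<&'a [u8], ParseError>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done || self.pos >= self.buf.len() {
            return None;
        }
        let item = self.parse_one();
        if item.is_err() {
            // No resync: the rest of the input is dropped after the first error.
            self.done = true;
        }
        Some(item)
    }
}
```

For an input such as `5:hello,3:abc,oops4:late,` the iterator yields two payloads, then one error, then stops; the trailing `4:late,` frame is never reached.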

The 89 MB fixture takes ~2.7 s in Rust. 'Milliseconds' was misleading.
Reference the benchmarks page with the real 17x speedup figure instead.

detect_format_score checks file extension first (3 pts) then content
heuristics (2 pts). The docs said 'auto-detect from file content' which
omitted the extension check.
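A sketch of how such a two-signal score might look. Only the weights (3 points for extension, 2 for content) come from the commit message; the `Format` enum, the `score` and `detect` functions, and both content heuristics (a HAR file starts with `{`, a mitmproxy dump starts with a decimal length prefix) are assumptions and may differ from what detect_format_score actually checks.

```rust
use std::cmp::Ordering;
use std::path::Path;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Format {
    Mitmproxy,
    Har,
}

/// Hypothetical scoring: 3 points for a matching extension, 2 for matching content.
pub fn score(path: &Path, head: &[u8], format: Format) -> u32 {
    let mut score = 0;
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    let ext_matches = matches!(
        (format, ext.as_deref()),
        (Format::Har, Some("har")) | (Format::Mitmproxy, Some("flow"))
    );
    if ext_matches {
        score += 3;
    }
    let content_matches = match format {
        // Assumption: HAR is JSON, so the first non-whitespace byte is '{'.
        Format::Har => head.iter().find(|b| !b.is_ascii_whitespace()) == Some(&b'{'),
        // Assumption: a tnetstring dump begins with a decimal length prefix.
        Format::Mitmproxy => head.first().map_or(false, |b| b.is_ascii_digit()),
    };
    if content_matches {
        score += 2;
    }
    score
}

/// Pick the higher-scoring format; a tie (including 0 vs 0) is treated as unknown.
pub fn detect(path: &Path, head: &[u8]) -> Option<Format> {
    let har = score(path, head, Format::Har);
    let mitm = score(path, head, Format::Mitmproxy);
    match har.cmp(&mitm) {
        Ordering::Greater => Some(Format::Har),
        Ordering::Less => Some(Format::Mitmproxy),
        Ordering::Equal => None,
    }
}
```

With these weights the extension outranks content when the two disagree, which is why describing detection as content-only was inaccurate.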

The wrapper book/src/reference/benchmarks.md provides its own H1
(Performance & Benchmarks), so the included docs/benchmarks.md should
not add another. Demote to H2 and remove the redundant heading.

Note: bench.yml workflow may overwrite this file on the next run; the
workflow template should be updated to match if needed.

Arkptz merged commit c0eefc4 into main on Apr 24, 2026
14 checks passed