fix(gitlab): avoid /raw_diffs 404 on GitLab < 17.9#14
Merged
Conversation
fetch_mr_diff no longer shells out to 'glab mr diff --raw'. Recent glab implements --raw via GitLab's /raw_diffs endpoint, which was only added in GitLab 17.11, so on older self-hosted servers (the reporter was on CE 17.8.6) the whole diff fetch 404s. The replacement uses the long-standing paginated /diffs endpoint via 'glab api', synthesising the 'diff --git'/'---'/'+++' headers that unidiff needs around each change's raw hunks. Pins per_page=20 because GitLab 17.8.x 500s with NoMethodError on PaginatedMergeRequestDiff when per_page > 30 — verified end-to-end against a real GitLab CE 17.8.6 Docker container via scripts/gitlab_e2e.sh. Also collapses the duplicate glab-shelling block in src/infra/cli/diff.rs into a single call to fetch_mr_diff so the two code paths cannot drift. The mocked regression test asserts the new success behaviour: no --raw invocation, /diffs?per_page=20 pagination, multi-file DiffIndex parse.
f09be92 to
834b7c4
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
fetch_mr_diffused to shell out toglab mr diff --raw. Recentglabreleases implement--rawby calling GitLab's/projects/:id/merge_requests/:iid/raw_diffsendpoint, which first shipped in GitLab 17.9 (archived 17.8 docs show it absent; archived 17.9 docs show it present). On older self-hosted instances — the reporter in #12 was on GitLab CE 17.8.6 — that endpoint returns404 Not Foundand the diff fetch aborts:This PR replaces
glab mr diff --rawwith a singleglab api --paginate --output ndjsoncall to the long-standing/diffsendpoint and reconstructs thediff --gitframing that the downstreamunidiff-basedDiffIndexparser requires.Closes #12.
Why
--rawexisted in the first placeWithout
--raw,glab mr diffprints a unified diff without thediff --git a/... b/...separator headers between files.DiffIndex(wrappingunidiff) hard-requires those headers to split the stream into per-file views — without them, the parser sees the whole thing as one big diff and only the first file is recognised. Commit015ca0boriginally added--rawto fix exactly that.This PR takes the other road: drop
--rawentirely, use the structured JSON endpoint, and synthesise the framing in lareview. That keepsDiffIndexhappy and works on every GitLab version from at least 17.x onward (the/diffsendpoint predates both 17.8 and 17.9, though GitLab's docs don't cite the exact introduction version).The fix
src/infra/vcs/gitlab.rs::fetch_mr_diff— rewrittenOne
glab api --paginate --output ndjsoninvocation hitting:The endpoint returns one JSON object per changed file with
old_path,new_path,a_mode,b_mode,new_file,deleted_file,renamed_file, anddiff. A newsynthesize_unified_diffhelper wraps each entry in the framing the parser expects —diff --git,--- /dev/null/+++ b/...for adds,rename from/rename tofor renames, etc. — then appends the raw hunks from the JSON verbatim.src/infra/cli/diff.rs::DiffSource::GitLabMr— deduplicatedThere was a second, verbatim copy of the same
glab mr diff --rawshell-out in the CLI entry point — same bug, independent code path. It now callsgitlab::fetch_mr_diffvia the existingblock_onhelper. The two paths can't drift apart again.Why
--paginate --output ndjsonspecificallyglab api --paginatewalks the server'sLink: rel="next"header internally, collapsing what would otherwise be N sequential subprocess spawns (one per page) into a single invocation. This was explicitly flagged by review and is the right mitigation here.The catch: glab's default output mode doesn't transform multi-page responses — it concatenates the raw bodies back-to-back with no separator, so for an endpoint returning JSON arrays you get
[...p1...][...p2...]which is not a valid single JSON array (confirmed by glab's own REST pagination test:api_test.go#L556asserts the output is literally{"page":1}{"page":2}{"page":3}).--output ndjsonis the supported programmatic-consumption mode — it emits each array element as one JSON document per line, whichserde_json::Deserializer::into_iter::<GlabMrChange>()stream-parses cleanly.On
per_pagepreservation: glab'saddPerPageonly injects its default (100) when the URL has noper_page, and subsequent pages use the server'sLink: rel="next"URL verbatim — which echoes ourper_page=20back into every paged request. Verified inapi_test.gowhich assertsper_page=100appears only on the first request when the user supplied nothing.Why
per_page=20It's the documented default for this endpoint per the GitLab REST API reference. It also incidentally sidesteps a GitLab 17.8.x upstream bug:
per_page > 30triggersNoMethodError: undefined method 'page' for PaginatedMergeRequestDiffin the/diffscontroller (the high-per_pagepath swaps to a pre-paginated collection that doesn't respond to Kaminari's.pagewhile the controller calls it anyway). The bug is not called out in the docs, because bugs by definition aren't documented. 20 is both spec-compliant and safe on 17.8.x, so we pin it.Verification
1. Mock integration test —
tests/gitlab_diff_fetch_integration.rsinstalls a fakeglabonPATHthat exits non-zero on any--rawinvocation, asserts the Rust code passes both--paginateand--output ndjson(so dropping the perf optimization re-trips the review comment), answers with the canned multi-file ndjson payload, and checksfetch_mr_diff's output parses into the expected two files viaDiffIndex::new.2. Real GitLab 17.8.6 via Docker — during development I stood up a manual harness (not in this PR) that booted
gitlab/gitlab-ce:17.8.6-ce.0, seeded a multi-file MR via REST, and ranfetch_mr_metadata+fetch_mr_diffagainst it through a realglab. Red leg (pre-fix source) reportedCould not obtain raw diff: 404 Not Foundverbatim matching #12. Green leg (post-fix source, same container) reporteddiff --git headers: 2,diff bytes: 6553, both files parsed. The real run is also what surfaced theper_page > 30upstream bug and drove theper_page=20pin; the mock test alone would not have caught that. The harness code was stripped from the branch after verification since it's not shippable.3. Build + lint + full suite —
cargo test(201 tests),cargo fmt -- --check,cargo clippy --all-targets --all-features -- -D warnings, all clean.Docs grounding
Every version-specific claim in this PR and in the code comments is grounded in an authoritative source, not in E2E observation:
/raw_diffsfirst shipping in 17.9: 17.8 archived docs (absent) vs 17.9 archived docs (present)./diffsper_pagedefault of 20: current endpoint reference.glab api --paginatesemantics (walksLink: rel="next", stops when absent):pagination.go::findNextPage.glab api --output ndjsonemitting one element per line:api.gosource +api_test.goline 556.per_pagepreservation across--paginatepages:addPerPageinjects default only when absent.Test plan
cargo test --test gitlab_diff_fetch_integrationcargo testcargo fmt --check && cargo clippy --all-targets --all-features -- -D warnings--rawapproach from the reporter's perspective). If you don't have one handy, a GitLab ≥ 17.9 instance confirms the fix doesn't regress newer servers.Non-goals
push_review/push_feedbackagainst GitLab still shell out toglab apithe same way as before — not touched here.diffbodies (GitLab's own representation), matching pre-existing behaviour — not independently exercised by the test.reqwest) as a replacement forglabis not pursued;--paginate --output ndjsonalready collapses the subprocess-per-page cost to 1, and keepingglabas the single VCS auth surface avoids introducing a second "way of talking to GitLab" that could drift.