Security fixes: ZIP symlink traversal, absolute dst paths, SVN rev injection#1095
Security fixes: ZIP symlink traversal, absolute dst paths, SVN rev injection#1095
Conversation
…jection
Three vulnerabilities found via white-hat audit and fixed with failing-first tests:
1. ZIP symlink path traversal (HIGH)
- check_zip_members() only validated filenames but not Unix symlinks encoded
in external_attr. A crafted ZIP with a symlink pointing to ../../etc/passwd
would extract a real symlink, escaping the extraction directory.
- Fix: add _check_zip_member_type() (mirrors _check_tar_member_type) that
detects S_IFLNK in external_attr and validates the symlink target.
- Tests: test_check_zip_members_symlink_{absolute,dotdot,safe_relative}_target
2. validate_destination() missing absolute path check (MEDIUM)
- validate_destination() rejected ".." components but not absolute paths.
Path("/etc/passwd").parts has no ".." so "/etc/passwd" passed silently.
- Fix: add is_absolute() guard before the ".." check.
- Tests: test_validate_destination_rejects_absolute_paths
3. SVN rev.split(" ") argument injection API (LOW)
- SvnRepo.export() accepted a pre-formatted "–-revision N" string and
split it on spaces, creating an API that enables option injection if any
caller bypasses digit-only validation in svnsubproject.py.
- Fix: accept a bare digit-only revision string, pass ["--revision", rev]
directly; raise ValueError for non-digit input.
- Tests: test_svn_vcs.py (new file)
https://claude.ai/code/session_01CxKrTZekNRCoSD9PSMekQZ
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughManifest destination validation now rejects absolute POSIX/Windows paths; SVN export validates and conditionally includes Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Suggested labels
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dfetch/manifest/manifest.py`:
- Around line 398-403: Manifest.validate_destination() currently relies on
pathlib.Path.is_absolute() which misses Windows-rooted forms; update the
function to explicitly reject Windows- and POSIX-rooted paths by importing
PurePosixPath and PureWindowsPath and treating any input where
PurePosixPath(dst).is_absolute() or PureWindowsPath(dst).is_absolute()
(including a non-empty .drive) as invalid, in addition to the existing check for
".." parts. Make the change in Manifest.validate_destination(), add the imports
(PurePosixPath, PureWindowsPath) near the top of the file, and keep the existing
check_no_path_traversal() usage elsewhere as a secondary safeguard.
In `@dfetch/vcs/archive.py`:
- Around line 380-385: The code reads the entire ZIP member via zf.read(info)
before validating a symlink target, allowing huge allocations; change the
symlink handling in the block where unix_mode is derived and stat.S_ISLNK is
checked so that you open the member with zf.open(info) and read only a small
bounded number of bytes (e.g., a few KB) before decoding to get the target, then
validate with pathlib.PurePosixPath(target) and os.path.isabs(target); ensure
the read limit prevents creating a large bytes object and that the same
validation (no absolute paths, no ".." components) runs on the truncated/decoded
string before any further processing.
In `@tests/test_svn_vcs.py`:
- Around line 36-38: Change the test to assert a ValueError is raised by
SvnRepo.export for the invalid rev string and verify that
dfetch.vcs.svn.run_on_cmdline was never invoked: patch
dfetch.vcs.svn.run_on_cmdline as a mock, use pytest.raises(ValueError) around
SvnRepo.export("svn://example.com/repo", rev="--revision 12345"), and after the
call assert the patched run_on_cmdline mock .not_called(); reference
SvnRepo.export and dfetch.vcs.svn.run_on_cmdline to locate the code to change.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 01611ee6-2aaa-4893-8281-0800a8072328
📒 Files selected for processing (7)
dfetch/manifest/manifest.pydfetch/project/svnsubproject.pydfetch/vcs/archive.pydfetch/vcs/svn.pytests/test_archive.pytests/test_manifest.pytests/test_svn_vcs.py
…n SVN test - manifest.py: validate_destination() now rejects Windows-rooted paths by checking PurePosixPath.is_absolute(), PureWindowsPath.is_absolute(), and non-empty PureWindowsPath.drive in addition to the existing POSIX check. Added PurePosixPath/PureWindowsPath imports. Test covers C:/ and C:\ forms. - archive.py: _check_zip_member_type() now reads symlink target content via zf.open()+read(4096) instead of zf.read(), bounding memory allocation to 4 KB regardless of the stored member size. - test_svn_vcs.py: test_export_rejects_non_digit_revision narrowed to assert ValueError specifically (not ValueError|TypeError) and added mock_run.assert_not_called() to prove the subprocess is never reached. https://claude.ai/code/session_01CxKrTZekNRCoSD9PSMekQZ
There was a problem hiding this comment.
Actionable comments posted: 4
♻️ Duplicate comments (1)
dfetch/vcs/archive.py (1)
382-384:⚠️ Potential issue | 🟠 MajorReject oversized symlink targets instead of silently truncating validation.
On Line 383, reading exactly 4096 bytes validates only a prefix. A crafted target with unsafe components after byte 4096 can evade the current
isabs/..checks. Readmax+1bytes and fail closed if oversized.🔧 Proposed hardening
if stat.S_ISLNK(unix_mode): - with zf.open(info) as member_file: - target = member_file.read(4096).decode(errors="replace") + max_target_bytes = 4096 + with zf.open(info) as member_file: + target_bytes = member_file.read(max_target_bytes + 1) + if len(target_bytes) > max_target_bytes: + raise RuntimeError( + f"Archive contains a symlink with an oversized target: {info.filename!r}" + ) + target = target_bytes.decode(errors="surrogateescape") if os.path.isabs(target) or any( part == ".." for part in pathlib.PurePosixPath(target).parts ):🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dfetch/vcs/archive.py` around lines 382 - 384, The current check reads only 4096 bytes into target (target = member_file.read(4096).decode(...)) which can truncate a malicious symlink and let dangerous content after the prefix pass isabs/.. checks; change the read to max_len+1 (read 4097 bytes) and if the extra byte exists, reject/raise (fail closed) instead of proceeding, otherwise continue validating the fully-read target and keep the isabs/any("..") checks against the complete value.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dfetch/manifest/manifest.py`:
- Around line 398-406: The current validation uses PureWindowsPath(dst).drive
which misses root-relative Windows paths like "\temp\evil"; update the
absolute-path check in the same if-clause (where PurePosixPath, PureWindowsPath
and dst are referenced) to test PureWindowsPath(dst).anchor (or otherwise check
anchor truthiness) instead of .drive so root-relative paths are detected and the
ValueError is raised for them as well.
In `@tests/test_manifest.py`:
- Around line 226-232: Add a test case for Windows root-relative paths to the
parameterized inputs in test_validate_destination_rejects_absolute_paths:
include the string "\\\\temp\\\\evil" alongside the existing "/etc/passwd",
"/tmp/evil", "C:/Windows/System32", and "C:\\Windows\\System32" so
Manifest.validate_destination is exercised for Windows UNC/root-relative paths
(this will catch changes if implementation later checks pathlib.Path.anchor or
similar).
In `@tests/test_svn_vcs.py`:
- Around line 3-4: Remove the file-wide suppressions "# mypy: ignore-errors" and
"# flake8: noqa" at the top of the test module and instead address the specific
linter/type issues reported by running mypy/flake8; fix the underlying problems
in the tests (typos, missing imports, incorrect types) and only add
narrowly-scoped per-line or per-statement ignores (e.g., "# type: ignore[code]"
or "# noqa: E123") immediately adjacent to the offending lines if a true
false-positive cannot be resolved. Target the replacements around the exact test
functions or statements that trigger warnings so the rest of the file remains
checked by linters.
- Around line 16-18: The test in tests/test_svn_vcs.py currently only checks
adjacency of "--revision" and "12345" using cmd and idx; replace that with an
exact argv equality assertion to harden the regression test: construct the full
expected argument list and assert that mock_run.call_args[0][1] (or the cmd
variable) equals that list so the entire command shape and order are validated
rather than just adjacency.
---
Duplicate comments:
In `@dfetch/vcs/archive.py`:
- Around line 382-384: The current check reads only 4096 bytes into target
(target = member_file.read(4096).decode(...)) which can truncate a malicious
symlink and let dangerous content after the prefix pass isabs/.. checks; change
the read to max_len+1 (read 4097 bytes) and if the extra byte exists,
reject/raise (fail closed) instead of proceeding, otherwise continue validating
the fully-read target and keep the isabs/any("..") checks against the complete
value.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 247f8c13-d0f8-4436-9f16-b57016d60e0d
📒 Files selected for processing (4)
dfetch/manifest/manifest.pydfetch/vcs/archive.pytests/test_manifest.pytests/test_svn_vcs.py
…, exact SVN argv - manifest.py: replace PureWindowsPath(dst).drive with .anchor so root-relative Windows paths like \temp\evil (anchor="\", drive="") are rejected; .drive is empty for paths that have a root but no drive letter. - archive.py: read _MAX_SYMLINK_TARGET+1 (4097) bytes for ZIP symlink targets; if the extra byte is present the target exceeds the limit and is immediately rejected (fail-closed) instead of silently truncating a long target and potentially passing a dangerous suffix through the isabs/.. checks. - test_manifest.py: add "\\temp\\evil" (root-relative Windows path) to the parametrized absolute-path rejection cases. - test_svn_vcs.py: remove file-wide # mypy: ignore-errors / # flake8: noqa suppressions; replace adjacency checks with exact full argv equality assertions so the entire command shape and argument order are validated. https://claude.ai/code/session_01CxKrTZekNRCoSD9PSMekQZ
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@dfetch/vcs/archive.py`:
- Around line 380-385: The local constant _MAX_SYMLINK_TARGET inside the
function (used with unix_mode, stat.S_ISLNK, and zf.open(info)) violates pylint
C0103; rename it to a snake_case local variable such as max_symlink_target
(remove the leading underscore and use lower_case) and update all references
(e.g., the read() call and the length check) so the logic is unchanged but the
name conforms to function-local variable naming conventions.
- Around line 391-393: The current symlink-target validation uses
pathlib.PurePosixPath(target).parts which misses Windows-style backslash
separators (e.g., "..\..\evil"), so update the check that currently calls
os.path.isabs(target) or any(part == ".." for part in
pathlib.PurePosixPath(target).parts) to also validate Windows syntax: treat the
target as both a POSIX and a Windows path (use pathlib.PurePosixPath(target) and
pathlib.PureWindowsPath(target) or ntpath.isabs/posixpath.isabs equivalents) and
consider a path unsafe if either is_absolute() returns True or if either .parts
contains ".."; apply the same change at the identical TAR-member check (the two
locations using target and pathlib.PurePosixPath) so backslash-contained
traversals are detected.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 7caf3261-c36d-4e14-baca-7a42becec64e
📒 Files selected for processing (4)
dfetch/manifest/manifest.pydfetch/vcs/archive.pytests/test_manifest.pytests/test_svn_vcs.py
…onstant - archive.py: extract _is_unsafe_symlink_target() helper that checks both PurePosixPath and PureWindowsPath so targets like "..\\..\\evil" are caught on any host OS; apply it to both _check_zip_member_type and _check_tar_member_type, removing the duplicated inline check. Rename _MAX_SYMLINK_TARGET → max_symlink_target to satisfy pylint C0103 (local variable vs. module-level constant naming convention). - test_archive.py: add Windows-backslash traversal cases for both ZIP (test_check_zip_members_symlink_windows_dotdot_target) and TAR (test_check_tar_member_type_windows_dotdot_symlink). https://claude.ai/code/session_01CxKrTZekNRCoSD9PSMekQZ
There was a problem hiding this comment.
♻️ Duplicate comments (1)
dfetch/vcs/archive.py (1)
373-383:⚠️ Potential issue | 🟠 MajorReject rooted Windows targets, not just
is_absolute()ones.Line 381 relies on
PureWindowsPath.is_absolute(), butPureWindowsPath(r"\temp\evil").is_absolute()isFalse. That means_is_unsafe_symlink_target()still accepts drive-rooted Windows targets that escape the extraction directory, and both the ZIP and TAR symlink checks now inherit that bypass. Treat any non-empty Windows anchor/root as unsafe here and add ar"\temp\evil"regression case.Suggested fix
return ( posix.is_absolute() - or win.is_absolute() + or bool(win.anchor) or any(part == ".." for part in posix.parts) or any(part == ".." for part in win.parts) )This verification should show a non-empty Windows anchor while
is_absolute()staysFalsefor rooted paths liker"\temp\evil":#!/bin/bash python - <<'PY' from pathlib import PureWindowsPath for raw in [r"\temp\evil", r"C:\temp\evil", r"C:temp\evil", r"..\..\evil"]: p = PureWindowsPath(raw) print( raw, { "anchor": p.anchor, "root": p.root, "is_absolute": p.is_absolute(), "parts": p.parts, }, ) PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dfetch/vcs/archive.py` around lines 373 - 383, The `_is_unsafe_symlink_target()` check currently only rejects Windows targets using PureWindowsPath.is_absolute(), but paths like r"\temp\evil" have a non-empty anchor/root while is_absolute() is False; update the condition in the return of `_is_unsafe_symlink_target()` to treat any non-empty PureWindowsPath.anchor (or root) as unsafe (e.g., add `or win.anchor`/`or win.root` check) in addition to `win.is_absolute()`, and add a regression test that asserts r"\temp\evil" is considered unsafe.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@dfetch/vcs/archive.py`:
- Around line 373-383: The `_is_unsafe_symlink_target()` check currently only
rejects Windows targets using PureWindowsPath.is_absolute(), but paths like
r"\temp\evil" have a non-empty anchor/root while is_absolute() is False; update
the condition in the return of `_is_unsafe_symlink_target()` to treat any
non-empty PureWindowsPath.anchor (or root) as unsafe (e.g., add `or
win.anchor`/`or win.root` check) in addition to `win.is_absolute()`, and add a
regression test that asserts r"\temp\evil" is considered unsafe.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 644461e2-ff40-4471-9a0b-cc86de4668b6
📒 Files selected for processing (2)
dfetch/vcs/archive.pytests/test_archive.py
There was a problem hiding this comment.
♻️ Duplicate comments (1)
dfetch/vcs/archive.py (1)
377-383:⚠️ Potential issue | 🔴 CriticalTreat Windows-rooted and drive-qualified symlink targets as unsafe.
Line 381 only checks
PureWindowsPath.is_absolute(), which still returnsFalsefor targets liker"\Windows\Temp"andr"C:temp\file". On Windows those paths are not extraction-relative, so the shared helper can still let both ZIP and TAR symlinks escapedest_dir.🛡️ Proposed fix
return ( posix.is_absolute() - or win.is_absolute() + or bool(win.anchor) or any(part == ".." for part in posix.parts) or any(part == ".." for part in win.parts) )Run this to confirm the current
pathlibbehavior. Expected result:\Windows\TempandC:temp\filehave a non-emptyanchorwhileis_absolute()is stillFalse.#!/bin/bash python - <<'PY' from pathlib import PureWindowsPath for target in [ r"\Windows\Temp", r"C:temp\file", r"C:\Windows\Temp", r"..\..\evil", r"safe\rel", ]: p = PureWindowsPath(target) print(target, {"anchor": p.anchor, "is_absolute": p.is_absolute(), "parts": p.parts}) PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@dfetch/vcs/archive.py` around lines 377 - 383, The current safety check uses PureWindowsPath.is_absolute() which misses Windows-rooted or drive-qualified targets like r"\Windows\Temp" and r"C:temp\file"; update the return condition in the helper (the variables posix and win from pathlib.PurePosixPath/PureWindowsPath) to also treat any non-empty win.anchor (or otherwise anchored Windows paths) as unsafe—e.g., add a check like "or bool(win.anchor)" (alongside the existing posix.is_absolute(), win.is_absolute(), and the ".." part checks) so Windows-rooted and drive-qualified symlink targets are rejected.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@dfetch/vcs/archive.py`:
- Around line 377-383: The current safety check uses
PureWindowsPath.is_absolute() which misses Windows-rooted or drive-qualified
targets like r"\Windows\Temp" and r"C:temp\file"; update the return condition in
the helper (the variables posix and win from
pathlib.PurePosixPath/PureWindowsPath) to also treat any non-empty win.anchor
(or otherwise anchored Windows paths) as unsafe—e.g., add a check like "or
bool(win.anchor)" (alongside the existing posix.is_absolute(),
win.is_absolute(), and the ".." part checks) so Windows-rooted and
drive-qualified symlink targets are rejected.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: a3fc1de8-1c82-441a-897e-15ad413c3c84
📒 Files selected for processing (3)
dfetch/vcs/archive.pytests/test_manifest.pytests/test_svn_vcs.py
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/test_svn_vcs.py`:
- Around line 12-23: The tests patching dfetch.vcs.svn.run_on_cmdline should
assert the mock was called exactly once before inspecting call_args; update the
test block that patches run_on_cmdline (the one that calls SvnRepo.export(...))
to call mock_run.assert_called_once() immediately after the export call and do
the same change for the other command-shape test (the second block around lines
28-37) so you validate a single invocation before examining
mock_run.call_args[0][1].
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: f83fc556-c30b-4706-aa94-0b692f253937
📒 Files selected for processing (3)
dfetch/vcs/archive.pytests/test_manifest.pytests/test_svn_vcs.py
Three vulnerabilities found via white-hat audit and fixed with failing-first tests:
ZIP symlink path traversal (HIGH)
in external_attr. A crafted ZIP with a symlink pointing to ../../etc/passwd
would extract a real symlink, escaping the extraction directory.
detects S_IFLNK in external_attr and validates the symlink target.
validate_destination() missing absolute path check (MEDIUM)
Path("/etc/passwd").parts has no ".." so "/etc/passwd" passed silently.
SVN rev.split(" ") argument injection API (LOW)
split it on spaces, creating an API that enables option injection if any
caller bypasses digit-only validation in svnsubproject.py.
directly; raise ValueError for non-digit input.
https://claude.ai/code/session_01CxKrTZekNRCoSD9PSMekQZ
Summary by CodeRabbit
Bug Fixes
Tests