Skip to content

Add option to transfer to temporary filename#238

Open
pnuu wants to merge 32 commits intopytroll:mainfrom
pnuu:new-atomic-temp-transfer
Open

Add option to transfer to temporary filename#238
pnuu wants to merge 32 commits intopytroll:mainfrom
pnuu:new-atomic-temp-transfer

Conversation

@pnuu
Copy link
Copy Markdown
Member

@pnuu pnuu commented Apr 25, 2026

Add options for transferring files first to a temporary filename.

This was implemented using Github Copilot (GPT-5 mini) with a lot of iteration and some manual refactoring.

pnuu and others added 19 commits April 25, 2026 20:12
…ds and FileMover finalizer\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ble remote rename for atomic transfers\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…or atomic transfers; prefer multipart when boto3 available\n\nCo-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…spacing preserved)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…utility

- Extract ensure_remote_dirs and ensure_local_dir to new _mover_utils module
- Replace recursive cd_tree calls in FtpMover with iterative ensure_remote_dirs
- Reduce duplication: ~80 lines of code removed
- Maintains backward compatibility: all tests pass
- Improved performance with fast path for existing directories

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace duplicate final_dir handling in ScpMover and SftpMover
finalize_atomic_transfer methods with shared ensure_final_directory_for_rename
helper in _mover_utils.

Removes ~34 lines of duplicate code for stat/mkdir loop logic.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Split ensure_remote_dirs() into three focused functions:
- _ensure_remote_dirs_ftp(): Handles FTP-like API with fast path optimization
- _ensure_remote_dirs_sftp(): Handles SFTP-like API with simple loop
- ensure_remote_dirs(): Dispatcher for input validation and type routing

Benefits:
- Improved readability: Main function is now a clear dispatcher
- Better testability: Each API handler is independently testable
- Reduced complexity: Lower cyclomatic complexity in main dispatcher
- Easier maintenance: Changes to one API isolated from the other

No behavioral changes - all tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 26, 2026

Codecov Report

❌ Patch coverage is 89.31186% with 73 lines in your changes missing coverage. Please review.
✅ Project coverage is 85.94%. Comparing base (4e8e271) to head (b8f2439).
⚠️ Report is 1 commits behind head on main.

Files with missing lines Patch % Lines
trollmoves/_mover_utils.py 33.75% 53 Missing ⚠️
trollmoves/movers.py 90.86% 20 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #238      +/-   ##
==========================================
+ Coverage   85.61%   85.94%   +0.33%     
==========================================
  Files          37       40       +3     
  Lines        6714     7350     +636     
==========================================
+ Hits         5748     6317     +569     
- Misses        966     1033      +67     
Flag Coverage Δ
unittests 85.94% <89.31%> (+0.33%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

pnuu and others added 2 commits April 26, 2026 18:30
- Add test_movers_use_tmp.py with 19 tests covering:
  - Mover.tmp_destination_for static method (Groups A-B)
  - FileMover finalize and full move_it() round-trips with real files (Group C)
  - ScpMover and SftpMover finalize and move_it() using real localhost SSH (Groups D-E)
  - FtpMover finalize with mocked FTP connection (Group F)
  - S3Mover finalize for copy-mode, multipart-mode and unconfigured case (Group G)
- Fix S3Mover.finalize_atomic_transfer: bucket (netloc) was dropped from the
  tmp-destination path, causing s3.copy/rm to operate on wrong keys
- Fix pre-existing lint issues in S3Mover.copy: unused variable, long lines

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pnuu pnuu self-assigned this Apr 27, 2026
lahtinep and others added 6 commits April 27, 2026 10:59
- movers.py: pass tmp_dest (not mover.destination) to finalize_atomic_transfer
  in move_it(); broaden exception cleanup from NotImplementedError to all
  exceptions; add exist_ok=True and guard empty dir in Mover.finalize_atomic_transfer;
  remove duplicate FileMover.finalize_atomic_transfer (identical to base class);
  re-raise OSError errno 2 in ScpMover.copy() instead of silently swallowing it;
  add _S3_MOVER_INTERNAL_KEYS frozenset and filter internal keys before passing
  to S3FileSystem(); rewrite S3Mover.finalize_atomic_transfer to use
  final_destination directly instead of stripping tmp prefix; initialize
  upload_id=None before try block and guard abort on non-None; rename ambiguous
  'parts' variable to 'path_parts'/'upload_parts' in S3Mover.copy()
- _mover_utils.py: except IOError -> except OSError in ensure_final_directory_for_rename
- test_movers_use_tmp.py: fix lambda anti-pattern for raising in monkeypatch;
  add OSError propagation test; update FTP tests to assert cwd call order
  instead of patching ensure_remote_dirs away; update S3 copy test to use
  distinct final directory proving final_destination is used directly;
  add FTP end-to-end move_it() test with use_tmp

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When scp.put() raises OSError with errno 2 (ENOENT/FileNotFoundError),
ScpMover.copy() now logs the error and returns None instead of re-raising.
This matches the intended behaviour: source file gone means nothing to
transfer, so the operation is silently skipped.

Also add @patch('paramiko.SSHClient') to test_scp_copy_oserror_exception
and test_scp_copy_oserror_exception_errno_2 so they no longer rely on the
ScpMover.active_connections class-level cache being populated by a prior
test.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Narrow broad except Exception clauses to the specific exceptions that
each code path can actually raise:

movers.py:
- supports_atomic(): Exception -> AttributeError (scheme access on non-ParseResult)
- tmp_destination_for(): Exception -> AttributeError (path access on non-ParseResult)
- FtpMover.finalize_atomic_transfer rename: Exception -> all_errors
  (ftplib.Error, OSError, EOFError, ssl.SSLError; all_errors already imported)
- ScpMover.copy() SCPClient init: Exception -> (TypeError, SSHException, OSError)
- ScpMover.copy() scp.put fallback: Exception -> (SCPException, SSHException)
- ScpMover.finalize_atomic_transfer sftp.close(): Exception -> (SSHException, OSError)
- S3Mover.copy() multipart body: Exception -> (ClientError, BotoCoreError, OSError)
- S3Mover.copy() abort_multipart: Exception -> (ClientError, BotoCoreError)
- ScpMover.open_connection() connect fallback: keep Exception, add comment
  (SSHClient.connect() may raise unexpected exceptions from transport/agents)

_mover_utils.py:
- _ensure_remote_dirs_ftp() cwd/mkd: Exception -> (ftplib.Error, OSError) in all 6 spots
- _ensure_remote_dirs_sftp() stat/mkdir: Exception -> OSError in all 3 spots
- ensure_final_directory_for_rename() mkdir: Exception -> OSError

The two intentionally broad handlers in move_it() (finalize cleanup +
outer copy logger) are kept as Exception with clarifying comments.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Split the inline 50-line multipart upload block in copy() into three
focused private methods:

- _build_boto3_client(): centralises boto3 client construction from
  self.attrs. Handles explicit key/secret credentials, session token,
  and client_kwargs. Shared by _multipart_upload() and
  finalize_atomic_transfer() (which previously missed key/secret and
  token support on the boto3 copy+delete path).

- _do_multipart_upload(client, bucket, final_key): upload mechanics
  only — create, upload parts loop, complete; aborts best-effort on
  any error and re-raises.

- _multipart_upload(destination_file_path): orchestrator — parses path
  into bucket/key, strips tmp_prefix from basename if present, builds
  client, delegates to _do_multipart_upload, updates self.destination.

copy() is now a clean dispatcher (~10 lines): check imports, get
destination, call _multipart_upload() or fall through to s3fs put.

Tests added:
- test_build_boto3_client_default_credentials
- test_build_boto3_client_explicit_credentials
- test_do_multipart_upload_aborts_on_error
- test_do_multipart_upload_no_abort_when_create_fails

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Convert Mover.supports_atomic from an instance method to a
  @classmethod(cls, attrs=None) with a safe default of False.
- Add @classmethod supports_atomic overrides in every subclass:
    FileMover, FtpMover, ScpMover, SftpMover → always True
    S3Mover → True only when s3_use_multipart or s3_use_copy is set
- Use the new classmethod in _get_tmp_destination: when the user
  requests use_tmp_on_transfer but the mover does not support atomic
  transfers, log LOGGER.error and return None so the transfer falls
  back to a direct (non-tmp) copy.
- Add tests for supports_atomic on all mover classes and for the
  move_it fallback behaviour.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@pnuu pnuu requested a review from TAlonglong April 28, 2026 11:42
@pnuu pnuu marked this pull request as ready for review April 28, 2026 11:42
@pnuu pnuu requested a review from mraspaud as a code owner April 28, 2026 11:42
@pnuu
Copy link
Copy Markdown
Member Author

pnuu commented Apr 28, 2026

Ok, I think this is ready for initial review. I see there are still some refactoring that could be done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ScpMover to transfer to a temporary file and rename after completion

2 participants