Skip to content

batch: Use autobin in hybrid mode (via #302)#1044

Merged
etal merged 3 commits intomasterfrom
i302
Apr 6, 2026
Merged

batch: Use autobin in hybrid mode (via #302)#1044
etal merged 3 commits intomasterfrom
i302

Conversation

@etal
Copy link
Copy Markdown
Owner

@etal etal commented Apr 6, 2026

Fixes #302.

etal and others added 2 commits April 6, 2026 10:58
Previously, `batch -m hybrid` skipped autobin and used hard-coded default
bin sizes (~267 bp target, 150 kbp antitarget). Now autobin is called to
compute optimal sizes from BAM read depths, matching what the standalone
`autobin` command already supports. User-specified --target-avg-size and
--antitarget-avg-size are still respected when provided.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@etal
Copy link
Copy Markdown
Owner Author

etal commented Apr 6, 2026

@claude Review, please.

@claude
Copy link
Copy Markdown

claude bot commented Apr 6, 2026

Claude finished @etal's task —— View job


PR Review: Use autobin in hybrid mode

Todo List:

  • Review CLAUDE.md changes
  • Review cnvlib/batch.py implementation
  • Review conda-env.yml changes
  • Review test/test_commands.py additions
  • Provide overall assessment and recommendations

Review Findings

Changes Overview

This PR implements autobin functionality for hybrid mode in the batch command, addressing issue #302. The changes span 4 files with 72 additions and 2 deletions across 2 commits.

CLAUDE.md Changes ✅

Commit 5b7511c: Minor documentation cleanup removing a tracking line.

conda-env.yml Changes ✅

Commit 5b7511c:

  • Fixed comment typo (.yaml.yml)
  • Re-added hypothesis and mypy dependencies that were previously removed
  • Ensures dev environment completeness

cnvlib/batch.py Implementation ✅

Commit 04b3373: The main feature implementation looks solid:

Positive aspects:

  • Clean integration with existing WGS autobin logic
  • Proper variable initialization (access_arr = None moved to top)
  • Maintains backward compatibility - user-specified sizes still take precedence
  • Good logging for transparency of autobin decisions
  • Follows established patterns from WGS mode

Code quality:

  • Follows project style conventions
  • Uses appropriate type hints and modern Python patterns
  • Error handling is consistent with existing code
  • Variable naming follows codebase conventions (bam_fname, etc.)

Logic flow:

  1. Only runs autobin when normal_fnames exist and sizes not specified
  2. Uses midsize_file() to select representative BAM
  3. Calls autobin.do_autobin() with hybrid mode parameters
  4. Updates target_avg_size and antitarget_avg_size as needed
  5. Logs the depth→size decisions for user visibility

test/test_commands.py Additions ✅

Commit 04b3373: New test test_batch_hybrid_autobin() specifically validates:

  • target_avg_size=0 triggers autobin in hybrid mode
  • Reference and target/antitarget files are created successfully
  • Output structure matches expectations

The test uses existing test data and follows established testing patterns.

Assessment

Strengths

  • Addresses real user need: Eliminates hard-coded bin sizes that may be suboptimal
  • Maintains compatibility: User-provided sizes still override autobin
  • Good test coverage: Specific test for the new functionality
  • Clear documentation: Commit message explains the change well
  • Code quality: Follows project conventions and style

Potential Concerns

  • Clinical impact consideration: Per CLAUDE.md guidelines, this change could alter numerical output. The PR should explicitly flag this as it may affect downstream clinical pipelines.
  • Performance: Autobin adds computational overhead during reference creation, but this is reasonable given the benefit.

Recommendations

  1. Clinical Impact Flag: Add explicit note about potential output changes for clinical users
  2. Documentation: Consider updating doc/*.rst files to document the new autobin behavior
  3. Edge case testing: The implementation handles the common case well, but additional edge case testing could be valuable

Overall Assessment: ✅ APPROVED

This is a well-implemented enhancement that adds valuable functionality while maintaining backward compatibility. The code quality is high and follows project conventions. The main consideration is the potential clinical impact of changed bin sizes affecting numerical output.


@codecov
Copy link
Copy Markdown

codecov bot commented Apr 6, 2026

Codecov Report

❌ Patch coverage is 73.33333% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 66.23%. Comparing base (6d07828) to head (a3bd76b).
⚠️ Report is 4 commits behind head on master.

Files with missing lines Patch % Lines
cnvlib/batch.py 73.33% 0 Missing and 4 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1044      +/-   ##
==========================================
+ Coverage   66.07%   66.23%   +0.15%     
==========================================
  Files          72       72              
  Lines        7382     7396      +14     
  Branches     1296     1301       +5     
==========================================
+ Hits         4878     4899      +21     
+ Misses       2064     2057       -7     
  Partials      440      440              
Flag Coverage Δ
unittests 66.23% <73.33%> (+0.15%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@etal etal merged commit 3e5fb28 into master Apr 6, 2026
16 checks passed
@etal etal deleted the i302 branch April 6, 2026 18:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

batch hybrid: Use autobin for target and antitarget bin sizes

1 participant