Skip to content

Refactor sample loading code#104

Merged
standage merged 6 commits into
masterfrom
fix/loading
Aug 8, 2017
Merged

Refactor sample loading code#104
standage merged 6 commits into
masterfrom
fix/loading

Conversation

@standage

@standage standage commented Aug 4, 2017

Copy link
Copy Markdown
Collaborator

Most of these changes were lumped together with (and are orthogonal to the main thrust of) a PR that likely will not be merged (#102).

The idea is to make the "sequentially diluted loading" strategy optional, and to generally clean things up a bit. Parallel loading should be handled in a separate PR.

Something to consider: there are functions that autodetects whether input files are Fastq/Fasta or sketches and invokes the appropriate command to load them. So far this has only applied to single file names, but it will clean up the code quite a bit in some places if this can handle file lists appropriately.

@codecov-io

codecov-io commented Aug 8, 2017

Copy link
Copy Markdown

Codecov Report

Merging #104 into master will decrease coverage by 0.73%.
The diff coverage is 67.5%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #104      +/-   ##
==========================================
- Coverage   85.59%   84.86%   -0.74%     
==========================================
  Files          31       31              
  Lines        1534     1526       -8     
  Branches      242      239       -3     
==========================================
- Hits         1313     1295      -18     
- Misses        170      178       +8     
- Partials       51       53       +2
Impacted Files Coverage Δ
kevlar/novel.py 77.87% <ø> (ø) ⬆️
kevlar/cli/count.py 100% <ø> (ø) ⬆️
kevlar/count.py 94.11% <100%> (+1.52%) ⬆️
kevlar/counting.py 62.16% <58.06%> (-14.25%) ⬇️
kevlar/__init__.py 86.48% <0%> (-4.06%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 06ed8b3...e9c28af. Read the comment docs.

Comment thread kevlar/count.py
def main(args):
if (args.num_bands is None) is not (args.band is None):
raise ValueError('Must specify --num-bands and --band together')
myband = args.band - 1 if args.band else None

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The human interface (CLI) expects the band to be 1-based (band \in {1..numbands}), whereas the Python and C++ APIs expect band to be 0-based (band \in {0..numbands-1}).

Comment thread kevlar/counting.py
for seqfile in seqfiles:
if mask:
if numbands:
nr, nk = sketch.consume_seqfile_banding_with_mask(

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bulk loading with mask now available from khmer master.

Comment thread kevlar/counting.py
print('[kevlar::counting] ', message, file=logfile)

if outfile:
if not outfile.endswith(('.ct', '.counttable')):

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior (with respect to filename extensions for saved sketches) should be documented somewhere. A single function to handle reading/writing sketches would help. One already exists, but it isn't suited for all cases IIRC.

@standage standage merged commit 2764793 into master Aug 8, 2017
@standage standage deleted the fix/loading branch August 8, 2017 05:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants