epi.ct.bld_mtx_bed drops information from certain threads #138

@smithcathy

I am using episcanpy to build h5ad files from snATAC-seq data. I use the following code in episcanpy v0.4.0:

epi_peak = epi.ct.load_peaks( peaks_file )

ann = epi.ct.bld_mtx_bed( fragments_file,
                          feature_region = epi_peak,
                          chromosomes = list( epi_peak.keys() ),
                          thread = 6 )

ann.write_h5ad( outdir + samp + '_pruned.h5ad' )

This runs within a Snakemake pipeline, where peaks_file is a sample-specific MACS2 peak file and fragments_file is the BED-format output from cellranger-arc (chrom, start, end, cell_id/barcode).

During QC, I noticed a contiguous block of peaks/features with 0 counts across all cells/barcodes. This affected only a small fraction of reads (<1% of the total read count), but it left the peaks/features in those regions with no reads/fragments in the affected samples. However, I can see reads/fragments for those regions in the input fragments file, and the MACS2 peaks were called from the same fragments file, so this should not be happening. The result is not reproducible: subsequent runs do not skip the same region. I suspect that a thread is occasionally lost into the ether, resulting in 0 read/fragment counts for a small section of the genome.
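
For anyone wanting to check their own output for the same symptom, the QC check described above can be sketched as a scan for long runs of consecutive all-zero features. This is only a rough sketch, not part of episcanpy: the function name find_zero_runs and the min_length parameter are hypothetical, and it assumes the per-feature totals are already available as a plain list in genomic order.

```python
def find_zero_runs(feature_totals, min_length=2):
    """Find runs of consecutive features whose total count is zero.

    feature_totals: list of per-feature counts summed over all cells,
                    in the same order as the features in the matrix.
    min_length:     shortest run worth reporting (hypothetical threshold).

    Returns a list of (start, end) index pairs, with end exclusive.
    """
    runs = []
    run_start = None
    for i, total in enumerate(feature_totals):
        if total == 0:
            # Open a new run if we are not already inside one.
            if run_start is None:
                run_start = i
        else:
            # A non-zero feature closes any open run.
            if run_start is not None and i - run_start >= min_length:
                runs.append((run_start, i))
            run_start = None
    # Close a run that extends to the last feature.
    if run_start is not None and len(feature_totals) - run_start >= min_length:
        runs.append((run_start, len(feature_totals)))
    return runs


if __name__ == "__main__":
    totals = [3, 0, 0, 0, 5, 0, 2, 0, 0]
    print(find_zero_runs(totals, min_length=2))
```

With the AnnData object from the snippet above, the totals could presumably be obtained with something like numpy.asarray(ann.X.sum(axis=0)).ravel().tolist(), assuming ann.X is laid out as cells by features. A long zero run that changes position between runs on the same input would be consistent with the behaviour reported here.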
