epi.ct.bld_mtx_bed drops information from certain threads #138

I am using episcanpy to build h5ad files from snATAC-seq data. I use the following code with episcanpy v0.4.0:

epi_peak = epi.ct.load_peaks( peaks_file )

ann = epi.ct.bld_mtx_bed( fragments_file,
                          feature_region = epi_peak,
                          chromosomes = list( epi_peak.keys() ),
                          thread = 6 )

ann.write_h5ad( outdir + samp + '_pruned.h5ad' )

This runs within a Snakemake pipeline, where peaks_file is a sample-specific MACS2 peaks file and fragments_file is the BED-format output from cellranger-arc (chrom, start, end, cell_id/barcode).
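For an independent check of the counts, here is a minimal sketch that tallies fragments per peak straight from the four-column fragments format described above, without going through episcanpy. The function name count_fragments_in_peaks and the peak layout (a dict mapping chromosome to a list of (start, end) tuples) are assumptions for illustration; the object returned by epi.ct.load_peaks may be structured differently.

```python
def count_fragments_in_peaks(fragment_lines, peaks):
    """Count fragments overlapping each peak.

    fragment_lines: iterable of tab-separated lines (chrom, start, end, barcode).
    peaks: hypothetical layout, dict of chrom -> list of (start, end) tuples.
    Returns a dict of (chrom, start, end) -> number of overlapping fragments.
    """
    counts = {(chrom, s, e): 0 for chrom, ivs in peaks.items() for s, e in ivs}
    for line in fragment_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        for p_start, p_end in peaks.get(chrom, []):
            # Half-open interval overlap test, as in BED coordinates.
            if start < p_end and end > p_start:
                counts[(chrom, p_start, p_end)] += 1
    return counts


# Toy data only; real input would be the cellranger-arc fragments file.
example_fragments = [
    "chr1\t100\t200\tAAAC-1\n",
    "chr1\t150\t400\tAAAG-1\n",
    "chr2\t10\t50\tAAAC-1\n",
]
example_peaks = {"chr1": [(120, 180), (500, 600)], "chr2": [(0, 20)]}
example_counts = count_fragments_in_peaks(example_fragments, example_peaks)
```

The inner loop scans every peak on the chromosome linearly, which is fine for spot-checking a handful of suspicious peaks but too slow for a whole genome.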

During QC, I noticed a contiguous block of peaks/features with 0 counts across all cells/barcodes. This affected only a small fraction of reads (<1% of the total read count), but it left peaks/features in those regions with no reads/fragments at all in the affected samples. However, I can see reads/fragments for those regions in the input fragments file, and the MACS2 peaks were called from the same fragments file, so this should not happen. The result is not reproducible: subsequent runs do not skip the same region. I suspect that a thread's output is occasionally lost into the ether, resulting in 0 read/fragment counts for a small section of the genome.
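The symptom described above can be screened for automatically. Here is a minimal sketch, assuming the features are stored in genomic order and that per-feature totals are available as a 1-D array; the helper name zero_count_runs and the min_len threshold are hypothetical, not part of episcanpy.

```python
import numpy as np


def zero_count_runs(feature_totals, min_len=2):
    """Return (start, end) index pairs, end exclusive, for each run of
    at least min_len consecutive features whose total count is zero."""
    totals = np.asarray(feature_totals).ravel()
    is_zero = totals == 0
    # Pad with False on both sides so every run has a start and an end edge.
    padded = np.concatenate(([False], is_zero, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])
    starts, ends = edges[0::2], edges[1::2]
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s >= min_len]


# Toy totals: one run of three zeros, one isolated zero, one run of two zeros.
example_totals = [3, 0, 0, 0, 5, 0, 2, 0, 0]
example_runs = zero_count_runs(example_totals, min_len=2)
```

For the AnnData object above, the totals could be computed with np.asarray(ann.X.sum(axis=0)).ravel(), assuming cells are rows and features are columns. Isolated zero-count peaks are expected in sparse data, so only long runs are worth comparing against the fragments file.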
