
Discrepancy in calls when running individual modified base pileup and combined modbase pileup #606

@gsukrit

Dear @ArtRand,

I ran modkit pileup on the modBAM files to analyze 5mC and 5hmC individually, and again with the --combine-mods option. The global methylation percentages reported are:

[Image: global methylation percentages for the individual and combined runs]

I calculated each percentage as: % = (total modified reads) / (total modified reads + total canonical reads) × 100
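For reference, the calculation above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: it assumes the bedMethyl column layout in which column 4 is the modification code, column 12 is N_mod, and column 13 is N_canonical. The function name `global_percent` and the sample rows are hypothetical and are not taken from the actual data.

```python
from collections import defaultdict


def global_percent(bedmethyl_lines):
    """Return {mod_code: percent modified}, summing counts over all rows.

    Assumes column 4 = mod code, column 12 = N_mod, column 13 = N_canonical.
    """
    n_mod = defaultdict(int)
    n_canonical = defaultdict(int)
    for line in bedmethyl_lines:
        # Skip blank lines and comment/header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()  # handles tab- or space-separated columns
        if len(fields) < 13:
            continue
        code = fields[3]
        n_mod[code] += int(fields[11])
        n_canonical[code] += int(fields[12])

    result = {}
    for code in n_mod:
        total = n_mod[code] + n_canonical[code]
        result[code] = 100.0 * n_mod[code] / total if total else float("nan")
    return result


# Hypothetical rows, for illustration only.
rows = [
    "chr1\t10\t11\tm\t10\t.\t10\t11\t255,0,0\t10\t30.00\t3\t6\t1\t0\t0\t0\t0",
    "chr1\t10\t11\th\t10\t.\t10\t11\t255,0,0\t10\t10.00\t1\t6\t3\t0\t0\t0\t0",
]
print(global_percent(rows))
```

Note that this denominator counts only modified and canonical calls, exactly as in the formula above; calls assigned to the other modification are not included.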

It is odd that the methylation percentages for 5mC and 5hmC are higher than the one reported in the modkit output with the --combine-mods option. The complete command I ran for the combined modification calls was:
$ modkitv0.6.1/dist_modkit_v0.6.1_481e3c9/modkit pileup --cpg --combine-strands --sampling-frac 1.0 --ref /PlasmoDB64_Pfalciparum3D7_Genome.fasta -t 8 /P55_sorted.bam --combine-mods ./P55_combinemh_CpG.bed --log ./P55.log --modified-bases 5mC 5hmC

The initial runs that generated the individual 5mC and 5hmC values used default parameters (filtering at the 10th percentile, i.e. --filter-percentile 0.1).

If the combined run used a stricter probability threshold, so that reads weakly called as "m" or "h" in the individual runs were rejected when the two modifications competed against each other, how do we explain this biologically?
I would expect the combined run to give a methylation percentage similar to or higher than those of the individual runs.
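
This expectation follows from the formula itself, provided both runs keep the same set of calls. A minimal sketch with made-up counts (the values of `m`, `h`, and `c` are hypothetical and not from this dataset):

```python
def percent(modified, canonical):
    """Percent modified, using modified / (modified + canonical) * 100."""
    return 100.0 * modified / (modified + canonical)


# Hypothetical call counts that pass filtering: 5mC, 5hmC, canonical C.
m, h, c = 30, 10, 60

pct_m = percent(m, c)             # 5mC on its own
pct_h = percent(h, c)             # 5hmC on its own
pct_combined = percent(m + h, c)  # both modifications pooled

# With identical calls, pooling can only raise the percentage, or leave it
# unchanged when the other modification has no calls.
assert pct_combined >= max(pct_m, pct_h)
print(pct_m, pct_h, pct_combined)
```

If the individual and combined runs apply different pass thresholds, the counts entering each calculation differ, and this ordering is no longer guaranteed.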

I would appreciate your help with this.

Thank you.
