modkit pileup --include-bed silently rejects BED4 files, produces zero valid positions with no warning #605
Description
When --include-bed is passed a BED4 file (4 columns: chr, start, end, name), modkit pileup silently parses zero valid positions and produces an empty or near-empty bedMethyl output. No warning or error is emitted to indicate that the BED file was rejected or that positions were skipped.
Steps to reproduce
# BED4 file — standard UCSC CpG island download format
# chr start end name
# chr1 10468 11240 CpG: 111
modkit pileup input.bam output.bedmethyl --ref reference.fasta --include-bed cpg_islands.bed4 --modified-bases 5mC --cpg
Result: output.bedmethyl is empty (or contains only a header). No warning is printed.
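To confirm that an include file really is BED4 before running the pileup, its column counts can be tallied with awk. This is only a sketch: `bed_column_counts` is a hypothetical helper, not part of modkit, and it assumes a tab-delimited file in which any header lines start with `#`, `track`, or `browser`. The demo row is the one shown above.

```shell
# Hypothetical helper (not part of modkit): tally data lines by column count
# in a tab-delimited BED file, skipping comment, track, browser, and empty lines.
bed_column_counts() {
    awk -F'\t' '
        /^(#|track|browser)/ || NF == 0 { next }
        { counts[NF]++ }
        END { for (n in counts) printf "%d columns: %d lines\n", n, counts[n] }
    ' "$1" | sort -n
}

# Small stand-in for the UCSC export, written to a temporary file.
demo_bed=$(mktemp)
printf 'chr1\t10468\t11240\tCpG: 111\n' > "$demo_bed"
bed_column_counts "$demo_bed"   # prints "4 columns: 1 lines"
```

Splitting on tabs rather than on any whitespace matters here, because the name field (`CpG: 111`) contains a space and would otherwise be counted as two columns.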
Expected behaviour
modkit should either:
- Emit a clear warning: "WARNING: --include-bed file has 4 columns; expected BED3 (chr/start/end) or BED6 (chr/start/end/name/score/strand). Zero positions parsed.", or
- Accept BED4 by treating columns 1–3 as the interval and ignoring column 4
Context
This was discovered while using modkit pileup with a standard UCSC CpG island BED file downloaded from the UCSC Table Browser. UCSC CpG island exports are BED4 by default. The silent failure is particularly confusing because the tool exits with code 0 and the output file is created — it just contains no data.
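Since the exit code does not reveal the problem, a pipeline can inspect the output file itself. The guard below is a minimal sketch, not something modkit provides: `require_bed_records` is a hypothetical helper that treats any line whose second field is an integer (the BED start position) as a record, so a header-only file counts as empty.

```shell
# Hypothetical helper (not part of modkit): fail unless a BED-like file has at
# least one record, i.e. a line whose second field is an integer start position.
require_bed_records() {
    records=$(awk '$2 ~ /^[0-9]+$/ { n++ } END { print n + 0 }' "$1")
    if [ "${records:-0}" -eq 0 ]; then
        echo "ERROR: $1 contains no records; check the --include-bed file format" >&2
        return 1
    fi
    echo "$1: $records records"
}

# Usage after a pileup run (arguments elided as in this report):
# modkit pileup ... && require_bed_records output.bedmethyl
```

Returning a non-zero status lets a workflow manager or a `set -e` script stop at this step instead of passing an empty bedMethyl file downstream.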
The workaround is to strip column 4 before passing to --include-bed:
cut -f1-3 cpg_islands.bed4 > cpg_islands.bed3
modkit pileup ... --include-bed cpg_islands.bed3
Environment
- modkit version: 0.6.1
- Reported via nf-core/modules PR nf-core/modules#10341: docs(modkit/pileup): clarify BAM index, BED format, and --cpg requirements