
Issue parsing bed file #21

@SeAudet

At this point, I'm unsure what the tool expects or what is wrong. From the README I gathered that it is not very tolerant of impure STR regions, so I made sure to extract only pure repeat expansion regions from hg38. I checked that they were retrievable from the reference FASTA (samtools faidx) and that some reads were present in the sorted BAM for each region (some even directly containing expansions). I then made a tab-delimited BED file in the format of the example data:
chr1 57367043 57367118 AAAAT
chr1 146228800 146228812 GCC
chr1 149390802 149390829 GGC
...
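
A minimal sketch of how lines like the ones above could be sanity-checked before running the tool. This is not part of NanoRepeat: the function name `check_bed_line` is hypothetical, and the check that the interval is a whole number of motif copies assumes the usual half-open BED convention (0-based start, exclusive end); whether the tool requires this is an assumption.

```python
def check_bed_line(line):
    """Return a list of problems found in one 4-column repeat BED line."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 4:
        # A line pasted with spaces instead of tabs ends up here.
        return [f"expected 4 tab-separated fields, found {len(fields)}"]
    chrom, start, end, motif = fields
    if not (start.isdigit() and end.isdigit()):
        return ["start and end must be non-negative integers"]
    start, end = int(start), int(end)
    problems = []
    if start >= end:
        problems.append("start must be smaller than end")
    if not motif:
        problems.append("motif is empty")
    elif set(motif.upper()) - set("ACGTN"):
        problems.append("motif contains characters other than A, C, G, T, N")
    elif (end - start) % len(motif) != 0:
        # Assumption: half-open interval covering whole motif copies only.
        problems.append("interval length is not a whole number of motif copies")
    return problems


for line in ["chr1\t57367043\t57367118\tAAAAT",
             "chr1 146228800 146228812 GCC"]:
    print(repr(line), check_bed_line(line) or "ok")
```

The second example line uses spaces rather than tabs, so it is reported as having a single field; this is an easy mistake to introduce when copying a BED file out of a web page or an editor.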

And yet, I keep facing the same issue:

Image

I've tried buffering the input positions (start and end; +1 and -1; I also tried +2 at the end to match the HTT example) and I always receive the same message. The only thing that changes the error is limiting the interval to a single repeat (i.e. chr1 57367043 57367048 AAAAT instead of ending at 57367118). However, for every repeat it then just outputs "ERROR! No reads were found for repeat region: chr1-57367043-57367048-AAAAT", even though reads are present (since it moves from step 1 to step 4 within 1 second, I gather it probably doesn't actually check).
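
To make the effect of the buffering concrete, here is a small sketch that counts how many whole motif copies an interval spans, assuming half-open BED coordinates. The helper `repeat_units` and the list of offsets are illustrative only; the offsets are one reading of the attempts described above, not an exact record of them.

```python
def repeat_units(start, end, motif):
    """Return (whole motif copies, leftover bases) for a half-open interval."""
    return divmod(end - start, len(motif))


start, end, motif = 57367043, 57367118, "AAAAT"

# Offsets applied to (start, end), mirroring the kind of buffering tried.
for d_start, d_end in [(0, 0), (-1, 1), (1, -1), (0, 2)]:
    s, e = start + d_start, end + d_end
    print(s, e, repeat_units(s, e, motif))
```

Under this convention the original interval is exactly 15 copies of AAAAT and the single-repeat interval 57367043-57367048 is exactly 1 copy, while every buffered variant leaves leftover bases.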

I also tried changing the contig naming from chr1 to 1 but, unsurprisingly, got "ValueError: invalid contig 1", since the reference used for alignment uses the chr naming. Command for reference (I also tried passing explicit paths for minimap2/samtools, but that doesn't seem to change anything):
Running command: /opt/conda/bin/nanoRepeat.py -i /Path/Sample_sorted.bam -t bam -d ont_sup -r /Path/GRCh38.primary_assembly.genome.fa -b /Path/nano_pathogenic.hg38.bed -c 8 -o /Path/results/nanorepeat/01922-utm
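
The contig-naming mismatch mentioned above can be checked up front by comparing the BED contig names with the first column of the reference's `.fai` index (the file written by samtools faidx). A minimal sketch, working on lines of text so it needs no extra libraries; `missing_contigs` is a hypothetical helper, and the `.fai` line below is illustrative rather than copied from the actual index.

```python
def missing_contigs(bed_lines, fai_lines):
    """Return BED contig names that are absent from a .fai index."""
    # The first tab-separated column of a .fai line is the sequence name.
    known = {line.split("\t")[0] for line in fai_lines if line.strip()}
    used = {
        line.split()[0]
        for line in bed_lines
        if line.strip() and not line.startswith("#")
    }
    return sorted(used - known)


# Illustrative index line: name, length, offset, bases per line, bytes per line.
fai = ["chr1\t248956422\t8\t60\t61"]

print(missing_contigs(["chr1\t57367043\t57367118\tAAAAT"], fai))  # []
print(missing_contigs(["1\t57367043\t57367118\tAAAAT"], fai))     # ['1']
```

In practice the two lists would be read from the BED file and from the `.fai` file sitting next to the reference FASTA.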

Cheers,
