Hi Eoghan et al,
After sorting out my Python env issues, I've managed to run Pore-C-Snakemake through to its salsa bed outputs, and that part worked beautifully (good news!).
The bad news for my specific experiment is that the coverage from any single enzyme is too low for SALSA scaffolding to help much: it runs, but barely changes the assembly contiguity. I think this is to be expected, since the throughput of the experiment wasn't great. If I map all my reads (split across 4 enzymes, each with different throughputs), the average coverage on my reference genome is 11X, although only 3% of it is not covered at all and 90% is covered by more than 5 reads. I can collect more data if I need to, but first I'd like to get the most out of the data I have at the moment (realizing that this is all very experimental).
As far as I can gather from the documentation and help files, PoreC and its snakemake wrapper are configured to run multiple experiments and different enzymes at once, but they write a separate bed output file for each when run with "to_salsa_bed". Is there any way to tell it to merge across multiple enzymes when writing a salsa bed file?
I tried using samtools merge to combine the per-enzyme bam files from the mapping step, then bamtobed as the salsa documentation suggests, but SALSA fails with these bed files, I guess because they don't have the form of "simulated" paired-end reads that poreC generates.
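To make the question concrete, here is a minimal sketch of the kind of merge I mean. It assumes (untested) that the per-enzyme salsa bed files can simply be concatenated and re-sorted by read name in column 4, with mates named like "read/1" and "read/2". The helper names `merge_bed_records` and `merge_bed_files` and the file paths in the usage note are hypothetical, not part of Pore-C or SALSA.

```python
from pathlib import Path


def merge_bed_records(bed_texts):
    """Concatenate BED text blobs and sort the records by read name (4th column)."""
    records = []
    for text in bed_texts:
        for line in text.splitlines():
            if not line.strip():
                continue  # skip blank lines
            fields = line.split("\t")
            if len(fields) < 4:
                raise ValueError(f"BED line has no name column: {line!r}")
            records.append(fields)
    # Sorting on the name column keeps mates such as "read/1" and "read/2" adjacent.
    records.sort(key=lambda fields: fields[3])
    return ["\t".join(fields) for fields in records]


def merge_bed_files(paths, out_path):
    """Read several BED files, merge them, and write one name-sorted BED file."""
    texts = [Path(p).read_text() for p in paths]
    merged = merge_bed_records(texts)
    Path(out_path).write_text("\n".join(merged) + "\n")
```

Usage would be something like `merge_bed_files(["enzymeA.salsa.bed", "enzymeB.salsa.bed"], "merged.salsa.bed")`, with placeholder file names. Whether SALSA accepts contacts pooled from different enzymes this way is exactly what I'm unsure about.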
Thanks again for your consideration,
Regards,
Chris L
PS - FYI in the Pore-C-Snakemake documentation there's one line
"snakemake --use-conda salsa2_bed"
that doesn't run, because it should be "salsa_bed". Nitpicky but it threw me for a moment.