improve annotation in pipeline

Annotation calling is currently one of the largest bottlenecks of the pipeline. It is split into several rules and accompanying scripts.

Rules
- peak_Transcripts
- peak_ExonIntron
- peak_RMSK
- peak_junctions
- peak_process
- project_annotations

Scripts 
- [05_Anno_ExonIntron.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_Anno_ExonIntron.R)
- [05_Anno_Process.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_Anno_Process.R)
- [05_Anno_RMSK.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_Anno_RMSK.R)
- [05_Anno_Transcript.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_Anno_Transcript.R)
- [05_Anno_junctions.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_Anno_junctions.R)
- [05_get_site2peak_lookup.sh](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_get_site2peak_lookup.sh)
- [05_jcounts2peakconnections.py](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_jcounts2peakconnections.py)
- [05_peak_annotation.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_peak_annotation.R)
- [05_peak_annotation_functions.R](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/05_peak_annotation_functions.R)
- [06_annotation.Rmd](https://github.com/RBL-NCI/iCLIP/blob/activeDev/workflow/scripts/06_annotation.Rmd)

The general workflow is to run each annotation type separately and then merge the results in a single Rmd file. This takes a significant amount of time and generates an individual job per sample per rule, which uses more Biowulf resources than may be necessary.
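
To make the scheduling cost concrete, here is a rough back-of-the-envelope sketch in Python. The sample count, the assumption that every `peak_*` rule runs once per sample while `project_annotations` runs once per project, and the merged rule name `peak_annotation` are all illustrative guesses, not measurements or names taken from the pipeline.

```python
# Rough job-count estimate. Which rules run per sample is an assumption.
PER_SAMPLE_RULES = [
    "peak_Transcripts",
    "peak_ExonIntron",
    "peak_RMSK",
    "peak_junctions",
    "peak_process",
]
PER_PROJECT_RULES = ["project_annotations"]


def count_jobs(n_samples, per_sample_rules, per_project_rules):
    """Return the number of cluster jobs submitted for one pipeline run."""
    return n_samples * len(per_sample_rules) + len(per_project_rules)


# Hypothetical project with 12 samples and the current rule layout.
current = count_jobs(12, PER_SAMPLE_RULES, PER_PROJECT_RULES)

# Same project if the per-sample rules were merged into one
# hypothetical rule called "peak_annotation".
merged = count_jobs(12, ["peak_annotation"], PER_PROJECT_RULES)

print(current, merged)
```

Under these assumptions the job count grows with the number of samples times the number of per-sample rules, so merging rules reduces scheduler load even before any script is made faster.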

Goals for the re-write
1. Speed up performance
2. Reduce the number of input/output files required for execution
3. Transfer all file creation from R files to snakemake
4. Reduce the number of rules required without considerably sacrificing speed
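
One possible shape for goals 2 and 4 is a single per-sample step that applies every annotation type in memory and writes one table, instead of one job and one intermediate file per type. The sketch below is illustrative only: the annotator functions, column names, and peak record are hypothetical stand-ins and do not reproduce the logic of the existing R scripts.

```python
import csv
import io


def annotate_transcript(peak):
    # Hypothetical placeholder for the transcript annotation step.
    return {"transcript": "NA"}


def annotate_rmsk(peak):
    # Hypothetical placeholder for the RepeatMasker annotation step.
    return {"repeat": "NA"}


ANNOTATORS = [annotate_transcript, annotate_rmsk]


def annotate_peaks(peaks, annotators=ANNOTATORS):
    """Apply every annotator to each peak and merge the results into one row."""
    rows = []
    for peak in peaks:
        row = dict(peak)  # copy so the input records are not modified
        for annotate in annotators:
            row.update(annotate(peak))
        rows.append(row)
    return rows


def write_table(rows, handle):
    """Write the merged rows as a single tab-separated table."""
    if not rows:
        return  # nothing to write for a sample without peaks
    writer = csv.DictWriter(handle, fieldnames=list(rows[0]), delimiter="\t")
    writer.writeheader()
    writer.writerows(rows)


# Made-up peak record, used only to exercise the sketch.
peaks = [{"chrom": "chr1", "start": 100, "end": 150, "strand": "+"}]
rows = annotate_peaks(peaks)
buffer = io.StringIO()
write_table(rows, buffer)
```

With this layout each sample produces one output file, so the workflow would need to track a single target per sample rather than one per annotation type.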



improve annotation in pipeline #125
