improve annotation in pipeline #125

Description

@slsevilla

Annotation calling is currently one of the largest bottlenecks in the pipeline. It is split across several rules and their accompanying scripts.

Rules

  • peak_Transcripts
  • peak_ExonIntron
  • peak_RMSK
  • peak_junctions
  • peak_process
  • project_annotations

Scripts

The general workflow is to run each annotation type separately and then merge the results in a single R Markdown (Rmd) file. This takes a significant amount of time and generates an individual job per sample per rule, which uses more Biowulf resources than may be necessary.

Goals for the re-write

  1. Speed up performance
  2. Reduce the number of input/output files required for execution
  3. Move all file creation from the R scripts into Snakemake
  4. Reduce the number of rules required without a considerable loss of speed
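
As a rough illustration of goals 2 and 4, the sketch below annotates each peak against several annotation sources in one pass and emits a single combined row per peak, rather than one intermediate file per annotation type. It is not the pipeline's code: the `Interval` type, the `annotate_peaks` function, the source names, and the example coordinates are all hypothetical, and the linear scan over features is only for readability (a real implementation would likely rely on an indexed overlap tool instead).

```python
from collections import namedtuple

# Hypothetical record type: half-open interval [start, end) with a label.
Interval = namedtuple("Interval", ["chrom", "start", "end", "label"])


def overlaps(a, b):
    """Return True when two half-open intervals on the same chromosome overlap."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate_peaks(peaks, sources):
    """Annotate every peak against every source in a single pass.

    peaks:   list of Interval (label is the peak ID)
    sources: dict mapping a source name to a list of Interval features
    Returns one dict per peak, with one key per annotation source.
    """
    rows = []
    for peak in peaks:
        row = {"peak": peak.label}
        for name, features in sources.items():
            # Collect the distinct labels of all overlapping features.
            hits = sorted({f.label for f in features if overlaps(peak, f)})
            row[name] = ",".join(hits) if hits else "NA"
        rows.append(row)
    return rows


if __name__ == "__main__":
    # Made-up example data; source names only loosely mirror the rules above.
    peaks = [
        Interval("chr1", 100, 200, "peak1"),
        Interval("chr1", 500, 600, "peak2"),
    ]
    sources = {
        "transcripts": [Interval("chr1", 150, 400, "GENE_A")],
        "exon_intron": [
            Interval("chr1", 150, 180, "exon"),
            Interval("chr1", 180, 400, "intron"),
        ],
        "rmsk": [Interval("chr1", 550, 560, "L1")],
    }
    for row in annotate_peaks(peaks, sources):
        print(row)
```

With this shape, a single rule per sample could write the combined table directly, so the merge step would only need to read one file per sample.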

Metadata

Labels

enhancement: New feature or request
