Added the ImmuScope class II prediction algorithm #1371
Open
ldhtnp wants to merge 19 commits into griffithlab:7.0.0 from ldhtnp:add-immunoscope
19 commits
f54e6bb (ldhtnp) added ImmuScope algorithm to prediction class, added allele file
1f13441 (ldhtnp) added immuscope to necessary lib files
433e49f (ldhtnp) updated docs
45aac74 (ldhtnp) updated documentation to use new download command
806e361 (ldhtnp) corrected ImmuScope algorithm casing
d091768 (ldhtnp) added immuscope predictor test
b94e787 (ldhtnp) added ImmuScope algorithm to prediction class, added allele file
276fb5a (ldhtnp) added immuscope prediction algorithm
4554b58 (ldhtnp) updated docs
b17ab7e (ldhtnp) updated documentation to use new download command
da14653 (ldhtnp) corrected ImmuScope algorithm casing
b31c519 (ldhtnp) added immuscope predictor test
027df2f (ldhtnp) resolved conflicts
64d5e2c (ldhtnp) implemented requested changes/fixes
a02cc0a (ldhtnp) added ImmuScope to output parser test
a483b5d (susannasiebert) Merge remote-tracking branch 'origin/7.0.0' into add-immunoscope
a8d6e39 (susannasiebert) Update class ii aggregate report creation test
9588b27 (ldhtnp) modified seq_num, start handling for ImmuScope
eeb6fb9 (ldhtnp) Merge branch 'add-immunoscope' of github.com:ldhtnp/pVACtools into ad…
predictor_tests/test_data/output_immuscope_im.tsv (2,974 changes: 2,974 additions & 0 deletions)
Does the predictor require `seq_num` and `start` in the input file? If not, I think these columns can be removed.
Yes, they are currently required by the predictor/wrapper interface as implemented. If you would rather they be excluded, I can update the fork to make them optional.
Gotcha. I think we could generate the file with these columns filled in by creating it in the same block of code above where we read in the fasta file (line 878+). The `determine_neoepitopes` method returns a hash with the start position as the key and the epitope as the value. The fasta sequence header can be used as the `seq_num`.
I assume that the output includes these two columns as well, so that would save us from having to map each epitope back to its `seq_num` and start position (line 934+). This would come at the expense of potentially having duplicate epitopes in that file if there are repetitive regions etc., which could make ImmuScope slower (not sure if they accounted for this).
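For illustration, the suggestion above could look roughly like the sketch below. It is not the pVACtools code: `enumerate_epitopes` is a hypothetical stand-in for `determine_neoepitopes` (assumed here to use 1-based start positions), and the `peptide` column name and TSV layout are assumptions. Note that the same peptide can appear in several rows, which is the duplication trade-off mentioned in the comment.

```python
import csv
import io


def enumerate_epitopes(sequence, length):
    """Hypothetical stand-in: map 1-based start position -> epitope."""
    return {
        i + 1: sequence[i:i + length]
        for i in range(len(sequence) - length + 1)
    }


def build_input_rows(fasta_records, length):
    """Yield one row per epitope occurrence, duplicates included."""
    for seq_num, sequence in fasta_records:
        for start, peptide in enumerate_epitopes(sequence, length).items():
            yield {"seq_num": seq_num, "start": start, "peptide": peptide}


def write_input_tsv(rows, handle):
    """Write the rows as a tab-separated file with a header line."""
    writer = csv.DictWriter(
        handle,
        fieldnames=["seq_num", "start", "peptide"],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)


# Toy records: (FASTA header used as seq_num, sequence).
records = [("1", "MKTMKTM"), ("2", "KTMK")]
rows = list(build_input_rows(records, 3))

buffer = io.StringIO()
write_input_tsv(rows, buffer)
```

With these toy sequences the file holds seven rows but only three distinct peptides, so a predictor that does not deduplicate on its own would score more than twice as many entries as necessary.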
I pushed a commit that keeps the deduped peptide set for scoring, but captures seq_num and start during the initial FASTA parsing and then merges them back onto the ImmuScope output. This lets us drop the remapping loop while still preserving those fields cleanly.
The performance of ImmuScope would be impacted if we passed every epitope occurrence directly to the wrapper with seq_num/start filled in, since it would score duplicates instead of just unique peptides. This approach avoids that by keeping the input deduplicated and only expanding back afterward.
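A minimal sketch of the dedupe-then-expand approach described here, under stated assumptions: the row layout, the `peptide` and `score` column names, and `placeholder_score` (which stands in for the actual ImmuScope call and returns made-up values) are all hypothetical and not taken from the commit.

```python
from collections import defaultdict


def collect_occurrences(rows):
    """Map each unique peptide to every (seq_num, start) where it occurs."""
    occurrences = defaultdict(list)
    for row in rows:
        occurrences[row["peptide"]].append((row["seq_num"], row["start"]))
    return occurrences


def placeholder_score(peptides):
    """Placeholder for the predictor call; the scores are made up."""
    return {peptide: float(len(set(peptide))) for peptide in peptides}


def expand_scores(occurrences, scores):
    """Attach each unique peptide's score to all of its occurrences."""
    expanded = [
        {"seq_num": seq_num, "start": start, "peptide": peptide,
         "score": scores[peptide]}
        for peptide, positions in occurrences.items()
        for seq_num, start in positions
    ]
    return sorted(expanded, key=lambda r: (r["seq_num"], r["start"]))


# Positions captured while parsing the FASTA (toy data).
rows = [
    {"seq_num": "1", "start": 1, "peptide": "MKT"},
    {"seq_num": "1", "start": 2, "peptide": "KTM"},
    {"seq_num": "1", "start": 4, "peptide": "MKT"},
    {"seq_num": "2", "start": 1, "peptide": "KTM"},
]

occurrences = collect_occurrences(rows)
unique_peptides = list(occurrences)           # only these are scored
scores = placeholder_score(unique_peptides)
expanded = expand_scores(occurrences, scores)
```

Only the two unique peptides are scored, yet all four occurrences come back with `seq_num` and `start` attached, so no separate remapping pass over the output is needed.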
I wasn't sure whether ImmuScope was being smart and deduplicating epitopes on its end.