Pull request overview
This PR fixes input handling for the main Nextflow workflow and harmonizes SpaceMarkers output format from CSV to RDS for consistency with other pipeline outputs.
- Changes spaceMarkers output from CSV to RDS format for consistency
- Updates data structure to use row names for gene identifiers instead of a separate column
- Modernizes Groovy syntax from tuple() to bracket notation
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 1 comment.
sm <- readRDS("$spaceMarkers")
plot_names <- colnames(sm)
After changing the data format to RDS with genes as rownames (lines 67-69), the data frame no longer contains a "Gene" column. However, the plotIMScores function on line 134 expects the input data frame to have a "Gene" column (see R/utils.R line 156: df$genes <- df$Gene). This will cause the plotting code to fail.
Consider converting rownames back to a "Gene" column before calling plotIMScores, for example:
sm <- readRDS("$spaceMarkers")
sm$Gene <- rownames(sm)
plot_names <- colnames(sm)
plot_names <- plot_names[plot_names != "Gene"]
atuldeshpande left a comment
Looks good! Do we want a temporary colnames sanity check in hdPipeline.R?
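One possible shape for such a check is sketched below. It is only an illustration: `check_score_table` is a hypothetical helper, not an existing function in hdPipeline.R, and the toy gene and column names are made up. It assumes the score table is a data frame with gene names as row names and numeric score columns only.

```r
# Hypothetical helper (not part of the pipeline): returns a character vector
# of problems found; an empty vector means the table looks as expected.
check_score_table <- function(scores) {
  problems <- character(0)
  if (!is.data.frame(scores)) {
    return("input is not a data frame")
  }
  # After the switch to RDS, gene names should live in the row names
  if ("Gene" %in% colnames(scores)) {
    problems <- c(problems, "unexpected 'Gene' column; gene names should be row names")
  }
  # Automatic row names ("1", "2", ...) suggest the gene names were lost
  if (nrow(scores) > 0 &&
      identical(rownames(scores), as.character(seq_len(nrow(scores))))) {
    problems <- c(problems, "row names look automatic, not gene names")
  }
  # Every remaining column is assumed to hold numeric scores
  non_numeric <- colnames(scores)[!vapply(scores, is.numeric, logical(1))]
  if (length(non_numeric) > 0) {
    problems <- c(problems,
                  paste("non-numeric columns:", paste(non_numeric, collapse = ", ")))
  }
  problems
}

# Toy tables with made-up names: new layout (ok) and old CSV-style layout (bad)
ok <- data.frame(A_B = c(1.2, 0.4), row.names = c("GeneX", "GeneY"))
bad <- data.frame(Gene = c("GeneX", "GeneY"), A_B = c(1.2, 0.4))

ok_problems <- check_score_table(ok)
bad_problems <- check_score_table(bad)
```

Returning the list of problems rather than stopping immediately would let the pipeline log everything wrong with a table in one pass; wrapping the call in `stopifnot(length(...) == 0)` turns it into a hard check.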
Actually, I tried integrating this into the latest version of the pipeline, and it does not look good: SD does not output LRscores. HD outputs IMscores as [...] and HD outputs LRscores as [...], so IMscores for SD look similar to LRscores of HD, and IMscores of HD look unique. Needs fixing by future gen!
fix input for main.nf, close #87
in addition, harmonize SD outputs, before:
// IMScores: IMScores.rds with row names holding gene names,
// followed by cell_type1_near_cell_typeN columns, values are IMScores
// LRscores: LRscores.rds with row names holding ligand-receptor pair names,
// followed by cell_type1_near_cell_typeN columns, values are LRscores
and
// IMScores: spacemarkers.csv first column is Gene with gene name,
// followed by cell_type1_cell_typeN columns, values are spacemarkers
now:
// IMScores: IMScores.rds with row names holding gene names,
// followed by cell_type1_near_cell_typeN columns, values are IMScores
// LRscores: LRscores.rds with row names holding ligand-receptor pair names,
// followed by cell_type1_near_cell_typeN columns, values are LRscores
and
// IMScores: spacemarkers.rds with row names holding gene names,
// followed by cell_type1_cell_typeN columns, values are spacemarkers
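The difference between the old spacemarkers.csv layout and the new spacemarkers.rds layout can be illustrated with a small round trip. This is a sketch on a toy table, not the pipeline's actual code: the gene names, the `A_B` column, and the temporary file paths are all made up for the example.

```r
# Toy table in the old layout: first column is Gene, then score columns
old <- data.frame(Gene = c("GeneX", "GeneY"), A_B = c(1.2, 0.4))
csv_path <- tempfile(fileext = ".csv")
write.csv(old, csv_path, row.names = FALSE)

# Convert to the new layout: move the Gene column into the row names
sm <- read.csv(csv_path, check.names = FALSE)
rownames(sm) <- sm$Gene
sm$Gene <- NULL

# Save and reload as RDS; row names and column names survive the round trip
rds_path <- tempfile(fileext = ".rds")
saveRDS(sm, rds_path)
new <- readRDS(rds_path)
```

With the new layout, `colnames(new)` contains only score columns, so downstream code no longer needs to filter out a `Gene` entry, but any consumer that still expects a `Gene` column has to rebuild it from `rownames(new)`.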