Skip to content

77 sparseoptimization pattern discrepancy#125

Draft
dimalvovs wants to merge 1601 commits intomasterfrom
77-sparseoptimization-pattern-discrepancy
Draft

77 sparseoptimization pattern discrepancy#125
dimalvovs wants to merge 1601 commits intomasterfrom
77-sparseoptimization-pattern-discrepancy

Conversation

@dimalvovs
Copy link
Contributor

  • add comments to chi-squared calc
  • add 1 to denominator and numerator to avoid NaN in chi-squared calc

@dimalvovs dimalvovs linked an issue Nov 15, 2024 that may be closed by this pull request
Copy link

@drbergman drbergman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't comment on how the 1+'s will affect calculations and their accuracy for a chisq calculation. I'll only ask: isn't the purpose of the SparseIterator that it will skip all 0 values? So dsq = get<1>(it) * get<1>(it) must be > 0?

Also, though it's clearly outside the scope of this PR, it feels like there's the possibility for computational savings by storing the results of the dot products inside the for loop to be re-used in the while loop. Similarly, I think the dsq values are constant throughout the CoGAPS run (they're just the element-wise square of the genes x sample matrix, right?). Could a static variable be used to store these?

@dimalvovs
Copy link
Contributor Author

I can't comment on how the 1+'s will affect calculations and their accuracy for a chisq calculation. I'll only ask: isn't the purpose of the SparseIterator that it will skip all 0 values? So dsq = get<1>(it) * get<1>(it) must be > 0?

that is exactly right, I think the root cause is the implementation of the sparseIterator, so maybe this draft PR should be more of a proof that that the NaN comes from the denominator and demonstrate that it is curable by +1ing.

@dimalvovs dimalvovs force-pushed the 77-sparseoptimization-pattern-discrepancy branch from f25f63c to b06f01b Compare November 22, 2024 17:12
@dimalvovs
Copy link
Contributor Author

dimalvovs commented Nov 22, 2024

additional to-dos from a call with @favorov:

  • change comments for clarity: mDMatrix is data (D); mMatrix is A (left mult matrix) if D is samples x genes; and P is D is genes x samples; *mOtherMatrix contans const pointer to vis-a-vis of mMatrix;

Matrix mDMatrix; // samples by genes for A, genes by samples for P

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

SparseOptimization pattern discrepancy

6 participants