Refactor C++ logic to correctly handle NaN sentinel initializations#1268
Refactor C++ logic to correctly handle NaN sentinel initializations#1268madhavgairola wants to merge 2 commits intoNOAA-FIMS:devfrom
Conversation
|
Thank you for contributing to FIMS and opening your first PR here! We are happy to have your contributions. Please ensure that the PR is made to the dev branch and let us know if you need any help! Also, we encourage you to introduce yourself to the community on the introduction thread in our Discussions. |
|
@madhavgairola thanks for making these changes so quickly. I am wondering if I told you to format the code incorrectly though 🤦♀️ because it looks like many files changed just because of formatting and not because of actual changes that you made. Here are my instructions that I gave on the issue and for example there are changes where the header files are resorted even those SortIncludes: false is in the instructions. Do you know why there are so many formatting changes? Like I mentioned, I might have specified how to run clang-format incorrectly. If needed, I can try and run it on your branch to see if it reduces the number of changes but I thought I would ask you first. Thanks. |
|
Thanks for pointing that out |
|
Why don't you try and minimize the formatting changes the best you can and then we will go from there 😁 |
75c96c2 to
93475fb
Compare
|
Thanks for the suggestion! I reverted the formatting-only changes so the PR now only includes the files with the actual logic updates. The diff should be much smaller now. |
|
I installed your branch and changed an input value for landings in the data frame from a viable number to NA, NA_real_, and 0 and all three things led to the model not being able to optimize. So, a quick question. What value should we be using in R for missing values that should not be evaluated by the likelihood? |
|
I took a closer look at how the C++ side decides whether to skip evaluating data in the likelihood. |
|
Okay, I got back around to testing this and I am not getting a successful json. I ran a model with -999 values in the catch stream which used to lead to -inf in the json which was causing an error. I am still getting the following error message because {jsonlite} cannot handle reading the file. |
|
@madhavgairola I just wanted to check that you received my previous comment and what the status of this PR is? |
|
Thanks for the follow-up, and sorry for the delay — I missed your earlier message. I see the issue now.I checked and found out that with the switch to NaN on the C++ side, those values are getting written directly into the JSON output, but since JSON doesn’t support NaN, it breaks parsing on the R side ( I haven’t handled the JSON serialization layer yet in this PR, so that part is definitely incomplete right now. Would you prefer that I extend this PR to handle NaN/Inf serialization (e.g., mapping to null or a string), or should that be handled in a separate follow-up PR? |
|
That would be great if we can do it all in one PR because we cannot merge it in if downstream workflows are breaking. |
🎨 Chore: code formatting workflowOur automated workflows cannot run on forks because of permission issues, and thus, we ask that you run the following code locally and push any changes that are created to your feature branch. You will only be reminded of this once per PR. Thank you! Format C++ code
Format R code
styler::style_pkg() # Style R code
roxygen2::roxygenise() # Update documentation
styler::style_pkg() # Style R code again
roxygen2::roxygenise() # Update documentation again
usethis::use_tidy_description() # Style DESCRIPTION filePush changes
|
|
Hey! I just pushed a fix for the JSON serialization issue.Could you try it out and see if it works on your end? |
15e7e4c to
52c7851
Compare
|
Okay, I tested this again using Then, I had to augment I can now read the json but I am running into problems with other helpers inside of So, I am honestly not sure what we have gained by changing the code in the C++ because I am still running into all the same problems that I was before. Can you please explain in a little bit more plain language and less C++ speak what the changes you have made are trying to do so I make sure that I am testing the right thing? |
|
Thanks for running those tests and bringing this up. I can explain exactly what’s happening here, as there are two separate things colliding in your test script that are causing this confusion.
To fix this natively, I just pushed an update to the C++ code to output standard unquoted JSON null instead of text strings. R's jsonlite will now seamlessly convert the internal null emissions directly into NA without coercing your helper arrays into strings, so you shouldn't need your custom regex anymore. Regarding the Scope of the Issue: On a separate note, I wanted to respectfully bring up the scope of this work. I originally picked this up precisely because it was labeled as a "good first issue" focusing strictly on C++ sentinel replacement. However, replacing the internal sentinels has inevitably cascaded into touching cross-language optimization bindings, R-side downstream architecture constraints, and JSON deserialization workflows. While I am happy to navigate these layers and help implement this properly, I would respectfully suggest removing the "good first issue" label, as the downstream complexity involved here is far beyond an intro-level task. If you pull the newest commit and test the model using -999 for your R inputs, does the JSON output parse natively into your helpers? |
This PR replaces the floating-point sentinel value
-999with properNaNhandling in the C++ codebase.Summary of changes
std::numeric_limits<double>::quiet_NaN()-999to usestd::isnan(...)<cmath>,<limits>) where neededNaNwithEXPECT_TRUE(std::isnan(...))EXPECT_DOUBLE_EQtoEXPECT_NEARto account for small floating-point precision differences introduced by NaN initializationScope
-999used with integer types) were intentionally left unchanged, as the issue suggests handling them in a later refactor usingstd::optionalValidation
ctest→ 100%)Future work (not included in this PR)
NaN/Inf-999withNA_real_na_valueafter the integer refactor