I have resisted understanding how interpolants are applied for too long, probably to the detriment of cubie. Currently, the loop truncates a step so that it lands on the next save point, and the truncation isn't communicated to the step controller. The controller therefore receives an error estimate that is misleadingly low for the step size it believes was taken, which can prompt a step size increase that is then rejected.
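To make the failure mode concrete, here is a minimal sketch of one reading of the problem. It is not cubie's controller: `local_error`, `propose_step`, the error constant `c`, and all the numbers are hypothetical, and the controller is the textbook rule `h_new = h * safety * (tol / err) ** (1 / (order + 1))`. The error is measured on the truncated step, but the controller scales the step size it originally proposed.

```python
def local_error(h, order=4, c=1.0e3):
    """Toy local error model: err = c * h**(order + 1). c is made up."""
    return c * h ** (order + 1)


def propose_step(h_assumed, err, tol, order=4, safety=0.9):
    """Textbook step-size update, scaled from the step the controller
    believes was taken (h_assumed)."""
    return h_assumed * safety * (tol / err) ** (1.0 / (order + 1))


tol = 1.0e-6
h_full = 0.015   # step the controller proposed
h_trunc = 0.005  # step actually taken after truncating to the save point

# The error estimate comes from the truncated step.
err = local_error(h_trunc)

# Controller not told about the truncation: it scales h_full.
h_uninformed = propose_step(h_full, err, tol)
# Controller told about the truncation: it scales h_trunc.
h_informed = propose_step(h_trunc, err, tol)

# Would the next step pass the tolerance check under the toy error model?
would_reject_uninformed = local_error(h_uninformed) > tol
would_reject_informed = local_error(h_informed) > tol
```

Under this toy model the uninformed proposal overshoots and fails the tolerance check on the next step, while the informed one passes. The real behaviour depends on how cubie's controllers are actually fed, which this sketch does not attempt to reproduce.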
Changing this will require a moderate refactor of the loop and step structure. Loops will need their truncation logic removed, and will instead pass next_save to steps. Steps must then check whether the next save point lies between the current time and the end of the step (a cheap check, already done in the loop), and evaluate the interpolant if so. This should probably be toggled at compile time, as some algorithms won't have easy interpolants. There is potential for warp divergence, as threads will reach the save point at different times. That divergence is already present in the save logic path - I'm unsure how it is currently handled at the CUDA execution level.
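A rough CPU-side sketch of the proposed flow, in plain Python rather than a CUDA device function. Everything here is an assumption for illustration: the names `rk4_step`, `hermite_interpolate`, and `integrate_with_interpolated_saves` are hypothetical, the state is a scalar, the step size is fixed, and a generic cubic Hermite interpolant stands in for whatever algorithm-specific interpolant a given stepper would provide.

```python
import math


def rk4_step(f, t, y, h):
    """One classical RK4 step; stands in for any step algorithm."""
    k1 = f(t, y)
    k2 = f(t + 0.5 * h, y + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, y + 0.5 * h * k2)
    k4 = f(t + h, y + h * k3)
    return y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)


def hermite_interpolate(t0, y0, f0, t1, y1, f1, t):
    """Cubic Hermite interpolant through (t0, y0) and (t1, y1)
    with end-point slopes f0 and f1, evaluated at t."""
    h = t1 - t0
    s = (t - t0) / h
    h00 = (1.0 + 2.0 * s) * (1.0 - s) ** 2
    h10 = s * (1.0 - s) ** 2
    h01 = s * s * (3.0 - 2.0 * s)
    h11 = s * s * (s - 1.0)
    return h00 * y0 + h * h10 * f0 + h01 * y1 + h * h11 * f1


def integrate_with_interpolated_saves(f, t0, y0, t_end, h, save_every):
    """Take full, untruncated steps; fill save points by interpolation."""
    saves = []
    t, y = t0, y0
    k = 1  # index of the next save point
    next_save = t0 + k * save_every
    while t < t_end:
        t_new = t + h
        y_new = rk4_step(f, t, y, h)
        # Cheap check: does a save point fall inside this step?
        if next_save <= t_new:
            f0, f1 = f(t, y), f(t_new, y_new)
            # A single step may cover more than one save point.
            while next_save <= t_new and next_save <= t_end:
                y_save = hermite_interpolate(
                    t, y, f0, t_new, y_new, f1, next_save
                )
                saves.append((next_save, y_save))
                k += 1
                next_save = t0 + k * save_every
        t, y = t_new, y_new
    return saves


# Example: y' = -y, with a step size that does not line up with the saves.
saves = integrate_with_interpolated_saves(
    lambda t, y: -y, 0.0, 1.0, 1.0, 0.3, 0.25
)
max_abs_error = max(abs(y - math.exp(-t)) for t, y in saves)
```

Because the step is never shortened, a step controller would always see the error for the step size it actually proposed. The `if next_save <= t_new` branch is where threads in a warp could diverge, and it is also the natural place for a compile-time toggle for algorithms without a usable interpolant.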