Guard solve_up Enzyme reverse rule against runtime-activity aliased shadows#3740
Guard solve_up Enzyme reverse rule against runtime-activity aliased shadows#3740SebastianM-C wants to merge 3 commits into
Conversation
…hadows Under `set_runtime_activity`, a runtime-inactive value reaches the rule as Duplicated/MixedDuplicated whose shadow IS the primal (dval === val). The reverse rule for `DiffEqBase.solve_up` accumulated every non-Const cotangent into `ptr.dval` unconditionally, so e.g. the du0 cotangent was broadcast-added into the caller's primal u0 whenever the solved problem's u0 aliased an array reachable from a Const argument (the common `setsym_oop`/`remake` pattern in MTK loss functions). The first gradient call returned correct results while silently corrupting the Const problem's u0; subsequent calls were garbage. Skip accumulation when the shadow aliases the primal — a runtime-inactive value accumulates nowhere, exactly as if it were Const. Found while reducing SciML/SciMLSensitivity.jl#1477 (failure mode B). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Seems reasonable but this should definitely get a test. |
Tests the runtime-activity scenario fixed in the previous commit: a loss that reuses the Const problem's own u0 via remake must keep correct gradients across repeated calls without mutating the primal problem, and genuinely active u0 must still receive its cotangent. Verified against ForwardDiff with a real SciMLSensitivity adjoint (GaussAdjoint); on the unpatched rule the first call silently corrupts prob.u0 and the second call's gradient is garbage. Runs in the Downstream group (adds SciMLSensitivity and Enzyme to that test environment), guarded out on prerelease Julia like DiffEqBaseEnzymeExt itself. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Sure, but where would that go? From what I see SciMLSensitivity related testing is done via downstream. Should I PR the test to SciMLSensitivity? |
|
either there or the downstream ad test group |
| # corrupting subsequent calls. Skip them: a runtime-inactive value | ||
| # accumulates nowhere, exactly as if it were `Const`. | ||
| if ptr isa MixedDuplicated | ||
| ptr.dval[] === ptr.val && continue |
There was a problem hiding this comment.
@SebastianM-C / @ChrisRackauckas you should only do this check if runtime activity is set (see https://github.com/EnzymeAD/Enzyme.jl/blob/a26b7e07696f771056ad70abdf147cc2f7739b19/lib/EnzymeCore/src/rules.jl#L83 to determine if it is)
|
this is a likely culrprit for errors, like mentioned earlier. Anything we can do to get this merged? |
|
I added a test, but I'm not sure if it's in the right place, there is also an |
While looking into reducing SciML/SciMLSensitivity.jl#1477 with Claude, I found a potential bug in the Enzyme rule under
set_runtime_activity(Reverse)when theu0is aliased inremake(which happens becauseSII.setsym_oopreturns theprob.u0if there are no changes). The fix that Claude proposed was to skip accumulation when the shadow aliases the primal, exactly as if it wereConst.@ChrisRackauckas Would this be appropriate or should we always copy in
SII.setsym_oop(or both)?Checklist
contributor guidelines, in particular the SciML Style Guide and
COLPRAC.
Additional context
Add any other context about the problem here.