Fix adaptive noise accounting #807
@facebook-github-bot has imported this pull request. If you are a Meta employee, you can view this in D92044480. (Because this pull request was imported automatically, there will not be any future comments.) |
…nting_noise_multiplier` property and simplify privacy accounting logic
|
Thank you @david-stan for fixing this subtle bug. The fix looks good to me. We also have an implementation of adaptive clipping + fast gradient clipping here, which incurs a similar bug. I'm curious if you've used adaptive clipping with fast gradient clipping? Let me know if you'd like to fix that one too. If not I can open an issue. Please see the lint error. |
I will fix those as well. We use fast gradient clipping as well, except for trl! |
… noise multiplier for adaptive clipping
|
Also, a kinda (un)related question regarding the optimizer step: in opacus/opacus/optimizers/optimizer.py, line 504 in f17f254, why do we divide .grads by the statistical average (expected_batch_size) if at that moment we already know the exact number of Poisson samples? |
This comes from the DP-SGD algorithm and its analysis (https://arxiv.org/pdf/1607.00133). Using the expected batch size ensures we have an unbiased estimator of the gradient. The expected_batch_size is computed from the probability with which we sample datapoints. |
|
@david-stan this looks good to me, can you please fix the failing lint error? |
Actually, the more important reason we use expected_batch_size is that the actual batch size is nonprivate and we do not want to leak it. |
|
@david-stan one last lint fix is needed to pass the isort check |
|
This pull request has been merged in 9d3c2b0. |
Types of changes
Motivation and Context / Related issue
This PR fixes incorrect privacy accounting for `AdaClipDPOptimizer`.
The AdaClip optimizer internally adjusts the `noise_multiplier` based on Theorem 1 from the AdaClip paper:
`σ_eff = (σ^-2 - (2σ_u)^-2)^(-1/2)`
where `σ` is the user-provided noise multiplier and `σ_u` is the `unclipped_num_std` parameter.
The adjusted `σ_eff` is used internally for noise generation during training. However, privacy accountants (rdp, prv, ...) were using this adjusted value instead of the original user-provided `σ` for privacy budget calculations, resulting in incorrect epsilon values.
Introduced an `accounting_noise_multiplier` property that returns:
- `noise_multiplier` for standard `DPOptimizer` (backward compatible)
- the original user-provided value for `AdaClipDPOptimizer` (before adjustment)
The internal noise generation continues to use the adjusted value as intended by the AdaClip algorithm, while privacy accounting now uses the correct original value.
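As a rough illustration of the adjustment formula quoted above, the following minimal sketch computes `σ_eff` from `σ` and `σ_u` in plain Python. The function name `adaclip_effective_sigma` and the example values are hypothetical and are not part of Opacus; the guard simply reflects that the formula is only defined when `σ < 2σ_u`.

```python
def adaclip_effective_sigma(sigma: float, sigma_u: float) -> float:
    """Hypothetical helper: sigma_eff = (sigma^-2 - (2*sigma_u)^-2)^(-1/2)."""
    inner = sigma ** -2 - (2.0 * sigma_u) ** -2
    if inner <= 0:
        # The expression under the root must be positive, i.e. sigma < 2 * sigma_u.
        raise ValueError("formula requires sigma < 2 * sigma_u")
    return inner ** -0.5


# Illustrative values only (not taken from the PR).
sigma = 1.0      # user-provided noise multiplier
sigma_u = 1.0    # unclipped_num_std
sigma_eff = adaclip_effective_sigma(sigma, sigma_u)

print(f"sigma = {sigma}, sigma_eff = {sigma_eff:.4f}")
```

Because the subtracted term is positive, `σ_eff` is always larger than `σ`. An accountant fed `σ_eff` would therefore assume more noise than the user requested and report a smaller epsilon than one fed `σ`.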
Update!!
As of commit 1f9283a
Refactored the noise adjustment to be calculated locally within the `add_noise()` method where it's actually needed, rather than modifying `self.noise_multiplier` at initialization.
Now:
- `self.noise_multiplier` always holds the original user-provided value (σ)
This is simpler and more maintainable than introducing separate properties or modifying the accounting system.
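The shape of this refactor can be sketched with a toy class. This is an assumption-laden illustration, not the Opacus implementation: `ToyAdaClipOptimizer`, `_effective_noise_multiplier`, and the list-based gradients are hypothetical stand-ins, and only the idea (adjust locally in `add_noise()`, leave `self.noise_multiplier` untouched) comes from the description above.

```python
import random


class ToyAdaClipOptimizer:
    """Hypothetical stand-in for illustration; not the Opacus class."""

    def __init__(self, noise_multiplier, unclipped_num_std, max_grad_norm):
        # Stored exactly as provided by the user and never overwritten.
        self.noise_multiplier = noise_multiplier
        self.unclipped_num_std = unclipped_num_std
        self.max_grad_norm = max_grad_norm

    def _effective_noise_multiplier(self):
        # sigma_eff = (sigma^-2 - (2*sigma_u)^-2)^(-1/2), computed on demand.
        return (
            self.noise_multiplier ** -2 - (2 * self.unclipped_num_std) ** -2
        ) ** -0.5

    def add_noise(self, summed_grad, rng):
        # The adjusted value lives only in this local variable.
        sigma_eff = self._effective_noise_multiplier()
        std = sigma_eff * self.max_grad_norm
        return [g + rng.gauss(0.0, std) for g in summed_grad]


opt = ToyAdaClipOptimizer(noise_multiplier=1.0, unclipped_num_std=1.0, max_grad_norm=1.0)
noisy = opt.add_noise([0.0, 0.0, 0.0], random.Random(0))

# A privacy accountant reading the attribute still sees the original sigma.
sigma_for_accounting = opt.noise_multiplier
```

With this layout, anything that reads `noise_multiplier` after noise has been added still gets the user-provided value, so the accounting path needs no special case.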
How Has This Been Tested (if it applies)
Created comprehensive test suite in `opacus/tests/accounting_noise_multiplier_test.py` with 7 test cases.
Checklist