
WeightedRandomSampler causes silent epsilon miscalculation #813

@MN-Noor

🐛 Bug

When DataLoader uses WeightedRandomSampler, make_private_with_epsilon
silently computes sample_rate from len(sampler) (e.g. 128 = num_samples)
instead of the full dataset size. This produces sample_rate ≈ 0.31 instead
of ≈ 0.000007 (a 45,000x difference), causing the entire privacy budget
to burn in a single epoch with no warning or error.

Related to #600 (which notes the sampler is replaced) but this issue
focuses on the privacy accounting consequence: epsilon tracking is
silently invalid.

### To Reproduce

import torch
from torch.utils.data import TensorDataset, DataLoader, WeightedRandomSampler
from opacus import PrivacyEngine

# Simulate a dataset of 100,000 samples
X = torch.randn(100_000, 10)
y = torch.randint(0, 2, (100_000,))
dataset = TensorDataset(X, y)

# WeightedRandomSampler with 128 samples drawn per epoch
weights = torch.ones(100_000)
sampler = WeightedRandomSampler(weights, num_samples=128, replacement=True)
loader = DataLoader(dataset, batch_size=16, sampler=sampler)

model = torch.nn.Linear(10, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

privacy_engine = PrivacyEngine()
model, optimizer, loader = privacy_engine.make_private_with_epsilon(
    module=model,
    optimizer=optimizer,
    data_loader=loader,
    epochs=1,
    target_epsilon=8.0,
    target_delta=1e-5,
    max_grad_norm=1.0,
)

print("real sample_rate (correct):      ", 16 / len(dataset))
print("sample_rate used by accountant:  ", 
      optimizer.expected_batch_size / len(loader.dataset))
print("original batch_size:", 16)
print("expected_batch_size:", optimizer.expected_batch_size)
print("len(dataset):       ", len(dataset))
print("len(loader.sampler):", len(loader.sampler))
# Expected: ~0.00016  (16 / 100_000)
# Actual:   0.125     (16 / 128)  

Observed:
No warning or error is raised.
Confirmed output from running the reproduction script:
sample_rate used by accountant: 0.125
The correct sample_rate should be 16 / 100_000 = 0.00016.
Ratio: 0.125 / 0.00016 ≈ 781x, i.e. the sample rate used for accounting is
about 781x larger than intended, so epsilon is consumed far faster than expected.
(The earlier estimate of 45,000x assumed a num_samples=1 edge case;
the real multiplier depends on num_samples in the sampler.
With num_samples=128 and batch_size=16, the ratio is about 781x.)
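
The arithmetic behind this ratio can be checked with a few lines of plain Python. This is only a sketch using the numbers from the reproduction script; the variable names are illustrative and nothing here calls Opacus.

```python
# Figures taken from the reproduction script above.
batch_size = 16
num_samples = 128        # draws per epoch configured on WeightedRandomSampler
dataset_size = 100_000   # len(dataset)

# Rate implied by the sampler length (what the accountant is reported to use).
observed_rate = batch_size / num_samples

# Rate this report expects, relative to the full dataset.
expected_rate = batch_size / dataset_size

# The multiplier reduces to dataset_size / num_samples, so it grows
# as num_samples shrinks relative to the dataset.
ratio = observed_rate / expected_rate

print(f"observed sample_rate: {observed_rate}")
print(f"expected sample_rate: {expected_rate}")
print(f"ratio: {ratio:.2f}x")
```

Because batch_size cancels out, the multiplier depends only on how small num_samples is compared with the dataset.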

Expected behavior

Either:

  1. Raise a warning when WeightedRandomSampler is detected, explaining
    that privacy accounting may be incorrect

Suggested warning (minimal fix):

import warnings

from torch.utils.data import WeightedRandomSampler

# Warn before the sampler is replaced, so the user knows accounting may be off.
if isinstance(data_loader.sampler, WeightedRandomSampler):
    warnings.warn(
        "WeightedRandomSampler detected. Opacus replaces it with "
        "UniformWithReplacementSampler for Poisson sampling. "
        "Privacy accounting uses batch_size/dataset_size as sample_rate. "
        "Your epsilon tracking may be incorrect.",
        UserWarning,
    )
or

  2. Recompute sample_rate after the sampler has been replaced:

```python
# After sampler replacement, recompute sample_rate:
sample_rate = batch_size / len(new_data_loader.dataset)
```

Workaround until fixed: replace `WeightedRandomSampler` with `shuffle=True`.

### Environment

- PyTorch Version: 2.11.0+cu128
- OS: Linux
- Python version: 3.10.20 
- CUDA/cuDNN version: 12.8
- GPU: NVIDIA RTX 5090 (Blackwell architecture, 32GB VRAM)
- Opacus version: 1.5.4
## Additional context

After `make_private_with_epsilon`, `len(loader.sampler)` is 100000, which
shows that Opacus replaced the sampler. The reported `sample_rate` of 0.125
shows that the rate was computed from the old sampler before replacement.
The inconsistency between these two values is the bug.
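
One way to surface this inconsistency would be a value-based check. The sketch below is a hypothetical helper (`warn_on_sample_rate_mismatch` and its `rel_tol` parameter are not part of Opacus). It assumes the intended rate is batch_size / dataset_size, as argued in this report, and it only compares numbers rather than inspecting the sampler.

```python
import warnings


def warn_on_sample_rate_mismatch(sample_rate, batch_size, dataset_size, rel_tol=0.5):
    """Warn when sample_rate differs markedly from batch_size / dataset_size.

    Hypothetical helper, not part of Opacus.
    Returns True if a warning was issued, False otherwise.
    """
    expected = batch_size / dataset_size
    # Compare on a relative scale so the check still works for very small rates.
    if abs(sample_rate - expected) > rel_tol * expected:
        warnings.warn(
            f"sample_rate={sample_rate:g} differs from "
            f"batch_size / dataset_size={expected:g}; "
            "privacy accounting may not match the intended sampling.",
            UserWarning,
            stacklevel=2,
        )
        return True
    return False


# Values from this report: 0.125 was used for batch_size=16 on 100,000 rows.
flagged = warn_on_sample_rate_mismatch(0.125, 16, 100_000)
# A rate that matches batch_size / dataset_size passes silently.
consistent = warn_on_sample_rate_mismatch(16 / 100_000, 16, 100_000)
print(flagged, consistent)
```

The relative tolerance is an arbitrary choice for illustration; a real fix would need to decide what counts as a legitimate difference (for example, a partial last batch).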

Happy to open a PR fixing the sample_rate recomputation in 
`privacy_engine.py`.
