refactor(discriminatory-behavior): migrate to new defense architecture #79

Open
asim29 wants to merge 1 commit into pr/4.5-unauth-model-ownership from pr/4.6-discriminatory-behavior

asim29 (Collaborator) commented Apr 21, 2026

Context

Sixth in the refactor/risk-modules stack. Stacked on #78 (unauth model ownership vertical).

Changes

amulet/discriminatory_behavior/defenses/adversarial_debiasing.py

  • AdversarialDebiasing subclasses DiscriminatoryBehaviorDefense(ABC) and delegates model, criterion, optimizer, train_loader, test_loader, device to super().__init__().
  • Dead imports (numpy, sklearn.metrics) removed — no longer used after adversary_auc was moved to the metrics class.
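
For reviewers unfamiliar with the pattern, the subclass-and-delegate shape described above might look roughly like the sketch below. Only the class names and the six delegated arguments come from this PR; the abstract `train_fair()` on the base class, the `lambdas` parameter, and the method bodies are illustrative assumptions, not the actual Amulet code.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional


class DiscriminatoryBehaviorDefense(ABC):
    """Base class holding the state shared by all defenses in this module."""

    def __init__(self, model, criterion, optimizer, train_loader, test_loader, device):
        self.model = model
        self.criterion = criterion
        self.optimizer = optimizer
        self.train_loader = train_loader
        self.test_loader = test_loader
        self.device = device

    @abstractmethod
    def train_fair(self) -> Any:
        """Assumed abstract hook; the real base class may declare something else."""


class AdversarialDebiasing(DiscriminatoryBehaviorDefense):
    def __init__(self, model, criterion, optimizer, train_loader, test_loader,
                 device, lambdas: Optional[Any] = None):
        # Shared arguments are handed to the base class instead of being
        # re-assigned attribute by attribute in every defense.
        super().__init__(model, criterion, optimizer, train_loader, test_loader, device)
        self.lambdas = lambdas  # hypothetical defense-specific parameter

    def train_fair(self) -> Any:
        # Placeholder body: the real method runs adversarial training.
        return self.model
```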

amulet/discriminatory_behavior/metrics/discriminatory_behavior.py

  • adversary_auc() extracted from the defense into DiscriminatoryBehavior as a @staticmethod, keeping attacks/defenses free of metric logic per API contract.
  • p_rule() now returns 0.0 on divide-by-zero instead of propagating nan.
  • Docstrings converted to Google style; RST markup removed.
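
To illustrate the `p_rule()` change, here is a minimal sketch of a divide-by-zero guard, assuming the common p%-rule definition (the smaller of the two ratios between the groups' positive-prediction rates, times 100). The signature, the pure-Python inputs, and the choice to return 0.0 for an empty group are assumptions made for illustration; the method in `DiscriminatoryBehavior` may differ.

```python
from typing import Sequence


def p_rule(y_pred: Sequence[int], z: Sequence[int]) -> float:
    """Return the p%-rule for binary predictions y_pred and binary group labels z."""
    pos_rate = {}
    for group in (0, 1):
        preds = [p for p, g in zip(y_pred, z) if g == group]
        if not preds:
            # Assumption: an empty group is treated like a zero denominator.
            return 0.0
        pos_rate[group] = sum(preds) / len(preds)

    # If either rate is zero, one of the two ratios would divide by zero;
    # return 0.0 rather than letting nan propagate.
    if pos_rate[0] == 0 or pos_rate[1] == 0:
        return 0.0

    return min(pos_rate[0] / pos_rate[1], pos_rate[1] / pos_rate[0]) * 100.0
```

Returning 0.0 keeps downstream aggregation and comparisons well defined, at the cost of not distinguishing "maximally unfair" from "undefined".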

examples/defense_pipelines/run_adversarial_debiasing.py

  • Updated to use load_or_train instead of manual save/load blocks.
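
As a rough picture of what a `load_or_train`-style helper replaces, the sketch below caches a trained object on disk and only trains when no cached copy exists. The signature, the use of `pickle`, and the path handling are hypothetical; the actual `load_or_train` helper in this repository is not shown in the PR and may take different arguments.

```python
import pickle
from pathlib import Path
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_train(path: Path, train_fn: Callable[[], T]) -> T:
    """Load a cached object from path, or call train_fn and cache its result."""
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)

    obj = train_fn()
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as f:
        pickle.dump(obj, f)
    return obj
```

A helper like this lets an example script collapse each manual "if checkpoint exists, load, else train and save" block into a single call.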

tests/integration/test_discriminatory_behavior.py (new)

  • Smoke test for AdversarialDebiasing.train_fair() verifying model weights change and discmodel attribute is set.
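
The shape of such a smoke test can be sketched without the real training loop. In the sketch below, `StubDefense` is a hypothetical stand-in for `AdversarialDebiasing` that uses plain lists instead of tensors; only the two checks (weights change, `discmodel` is set) mirror what this PR describes.

```python
import copy


class StubDefense:
    """Hypothetical stand-in for the real defense; not part of Amulet."""

    def __init__(self, weights):
        self.weights = list(weights)
        self.discmodel = None

    def train_fair(self):
        # Pretend one training step nudges every weight.
        self.weights = [w - 0.1 for w in self.weights]
        # Pretend the adversary (discriminator) was created during training.
        self.discmodel = object()
        return self.weights


def test_train_fair_smoke():
    defense = StubDefense([0.5, -0.2])
    before = copy.deepcopy(defense.weights)  # snapshot before training

    defense.train_fair()

    # Training should have modified the weights...
    assert defense.weights != before
    # ...and left an adversary attached to the defense.
    assert getattr(defense, "discmodel", None) is not None
```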

tests/unit/test_metrics_discriminatory.py (new)

  • Unit tests for p_rule, demographic_parity, true_positive_parity, false_positive_parity, and accuracy covering edge cases including divide-by-zero.

Test plan

  • uv run pytest tests/integration/test_discriminatory_behavior.py -v -m integration
  • uv run pytest tests/unit/test_metrics_discriminatory.py -v
  • uv run pre-commit run --all-files
  • CI passes on this branch

…hitecture

AdversarialDebiasing subclasses DiscriminatoryBehaviorDefense(ABC). Dead
imports (numpy, sklearn.metrics) removed. adversary_auc() extracted from
the defense into DiscriminatoryBehavior metrics class as a static method
with best-threshold AUC logic. p_rule() now returns 0.0 on divide-by-zero.
Example updated to use load_or_train.

asim29 (Collaborator, Author) commented Apr 22, 2026

Fixes #59 and #60

asim29 requested a review from sebszyller on April 22, 2026 15:20
asim29 self-assigned this on Apr 22, 2026