refactor: add abstract base classes for all risk module attack and defense entry points #72

Open
asim29 wants to merge 1 commit into pr/model-base-shared-utilities-tooling from pr/abstract-base-classes

asim29 (Collaborator) commented Apr 21, 2026

Summary

This PR introduces ABC base classes for every risk module's attacks and defenses, establishing a uniform lifecycle contract across security, privacy, and fairness modules. It also ships the DistributionInferenceAttack and DistributionSplits data utilities, fixes the discriminatory behavior defense class naming typo, and migrates MI shadow-model training away from the removed InferenceModel wrapper.

Stacked on: pr/model-base-shared-utilities-tooling


What changed and why

Abstract base classes — all risk modules

Each risk module now has an explicit ABC with a single @abstractmethod entry point, replacing the previous pattern of duck-typed or convention-only classes.

| File | Change |
| --- | --- |
| amulet/evasion/attacks/evasion_attack.py | EvasionAttack(ABC) with @abstractmethod attack() -> DataLoader |
| amulet/evasion/defenses/evasion_defense.py | EvasionDefense(ABC) with @abstractmethod train_robust() -> nn.Module |
| amulet/poisoning/attacks/poisoning_attack.py | New. PoisoningAttack(ABC) with @abstractmethod poison_train() and @abstractmethod poison_test() |
| amulet/poisoning/defenses/poisoning_defense.py | PoisoningDefense(ABC) with @abstractmethod train_robust() -> nn.Module |
| amulet/membership_inference/attacks/membership_inference_attack.py | MembershipInferenceAttack(ABC); adds _load_shadow_model() helper; removes InferenceModel inner class; migrates torch.cuda.amp -> torch.amp (PyTorch 2.3 deprecation) |
| amulet/membership_inference/defenses/membership_inference_defense.py | MembershipInferenceDefense(ABC) with @abstractmethod train_private() -> nn.Module |
| amulet/discriminatory_behavior/defenses/discr_behavior_defense.py | Fixed typo: DicriminatoryBehaviorDefense -> DiscriminatoryBehaviorDefense; renamed model_criterion -> criterion and model_optimizer -> optimizer to match standard naming |
| amulet/attribute_inference/attacks/attribute_inference_attack.py | AttributeInferenceAttack(ABC) with @abstractmethod attack() -> dict[str, np.ndarray] |
| amulet/data_reconstruction/attacks/data_reconstruction_attack.py | DataReconstructionAttack(ABC) with @abstractmethod attack() -> dict[str, np.ndarray] |
| amulet/unauth_model_ownership/attacks/unauth_model_ownership_attack.py | New. UnauthModelOwnershipAttack(ABC) with @abstractmethod attack() |
| amulet/unauth_model_ownership/defenses/unauth_model_ownership_defense.py | New. Two separate ABCs: WatermarkDefense with @abstractmethod watermark() and FingerprintDefense with @abstractmethod fingerprint(). Intentionally no shared parent: watermarking and fingerprinting have incompatible return types and lifecycles. |
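
To illustrate the contract described above, here is a minimal sketch that uses only the standard library. EvasionAttackSketch, its constructor argument, and both subclasses are hypothetical stand-ins rather than the amulet classes. The real attack() returns a DataLoader; a plain list is used here so the snippet runs without PyTorch.

```python
from abc import ABC, abstractmethod


class EvasionAttackSketch(ABC):
    """Hypothetical stand-in for a base class with one abstract entry point."""

    def __init__(self, epsilon: float) -> None:
        self.epsilon = epsilon

    @abstractmethod
    def attack(self) -> list[float]:
        """Return the perturbed inputs (a DataLoader in the real class)."""


class IncompleteAttack(EvasionAttackSketch):
    """Does not implement attack(), so the class stays abstract."""


class ShiftAttack(EvasionAttackSketch):
    """Toy attack that shifts every input by epsilon."""

    def __init__(self, epsilon: float, inputs: list[float]) -> None:
        super().__init__(epsilon)
        self.inputs = inputs

    def attack(self) -> list[float]:
        return [x + self.epsilon for x in self.inputs]


def try_instantiate(cls, *args) -> str:
    """Report whether a class can be instantiated with the given arguments."""
    try:
        cls(*args)
    except TypeError as exc:
        return f"TypeError: {exc}"
    return "ok"
```

Calling try_instantiate(IncompleteAttack, 0.1) reports a TypeError before any attack logic runs, while ShiftAttack instantiates normally. That is the difference from a duck-typed base, where the missing method would only surface when attack() is called.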

Distribution Inference lifecycle

DistributionInferenceAttack mirrors the shadow-model lifecycle already present in MembershipInferenceAttack.

| File | Change |
| --- | --- |
| amulet/distribution_inference/attacks/distribution_inference_attack.py | New. DistributionInferenceAttack(ABC) with train_model_population(), prepare_model_populations(), and @abstractmethod attack() -> dict[str, np.ndarray] |
| amulet/distribution_inference/dataset_utils.py | New. DistributionSplits dataclass and prepare_distribution_splits(); required by the DI ABC |
| amulet/distribution_inference/attacks/__init__.py | Exports DistributionInferenceAttack alongside existing DistributionInference |
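
The split between concrete lifecycle helpers and one abstract attack() can be sketched as follows. This is an illustration under assumptions, not the amulet implementation: the class names, the constructor, all method signatures, and the idea of representing a trained model as a single float are hypothetical.

```python
from abc import ABC, abstractmethod


class PopulationAttackSketch(ABC):
    """Hypothetical base: concrete lifecycle helpers plus one abstract attack()."""

    def __init__(self, num_models: int) -> None:
        self.num_models = num_models
        self.populations: dict[str, list[float]] = {}

    def train_model_population(self, ratio: float) -> list[float]:
        """Stand-in for training: each 'model' is just a number near the ratio."""
        return [ratio + 0.01 * i for i in range(self.num_models)]

    def prepare_model_populations(self, ratios: dict[str, float]) -> None:
        """Shared helper: build one model population per candidate distribution."""
        for name, ratio in ratios.items():
            self.populations[name] = self.train_model_population(ratio)

    @abstractmethod
    def attack(self) -> dict[str, float]:
        """Subclasses decide how to turn the populations into a result."""


class MeanAttack(PopulationAttackSketch):
    """Toy subclass that reports the mean of each population."""

    def attack(self) -> dict[str, float]:
        return {
            name: sum(models) / len(models)
            for name, models in self.populations.items()
        }
```

A subclass only supplies attack(); the population bookkeeping lives once on the base class, which is the same division of labour the shadow-model lifecycle uses.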

Concrete class migrations (minimal — only what PR 2 breaks)

| File | Change |
| --- | --- |
| amulet/membership_inference/attacks/lira.py | Removed InferenceModel import; replaced with _load_shadow_model(shadow_id) returning a (model, in_data) tuple |
| amulet/discriminatory_behavior/defenses/adversarial_debiasing.py | Updated import to DiscriminatoryBehaviorDefense; updated attribute references (model_criterion -> criterion, model_optimizer -> optimizer) |
| amulet/unauth_model_ownership/attacks/__init__.py | Exports UnauthModelOwnershipAttack |
| amulet/unauth_model_ownership/defenses/__init__.py | Exports WatermarkDefense, FingerprintDefense |

Test plan

  • uv run pre-commit run --all-files passes (ruff, ruff-format, basedpyright, prettier, markdownlint all green)
  • After from amulet.membership_inference.attacks import MembershipInferenceAttack, instantiating the class without implementing attack() raises TypeError
  • After from amulet.distribution_inference.attacks import DistributionInferenceAttack, instantiating the class without implementing attack() raises TypeError
  • from amulet.unauth_model_ownership.defenses import WatermarkDefense, FingerprintDefense resolves correctly
  • from amulet.distribution_inference import dataset_utils resolves correctly
  • Existing LiRA concrete class instantiates without error with valid args

🤖 Generated with Claude Code

asim29 force-pushed the pr/abstract-base-classes branch from c45212c to db745b5 on April 21, 2026 21:35
asim29 requested a review from sebszyller on April 21, 2026 21:36
…fense entry points

All existing base classes now subclass ABC with a single @abstractmethod
that enforces the entry-point contract. Concrete classes that don't
implement it will raise TypeError at instantiation, not at call time.

Per-risk changes:

Evasion:
- EvasionAttack(ABC): @abstractmethod attack() -> DataLoader
- EvasionDefense(ABC): @abstractmethod train_robust() -> nn.Module

Poisoning:
- PoisoningAttack(ABC) [new file]: @abstractmethod poison_train() and
  poison_test(); the base was previously an empty shell
- PoisoningDefense(ABC): @abstractmethod train_robust() -> nn.Module

Membership inference:
- MembershipInferenceAttack(ABC): shadow-model lifecycle centralised on
  the base class; InferenceModel removed (LSP violation: it wrapped
  nn.Module but was used as a data container). Replaced by private
  _load_shadow_model(shadow_id) -> (model, in_data) which owns checkpoint
  loading, device placement, and grad freezing. torch.cuda.amp migrated
  to torch.amp (PyTorch 2.3). @abstractmethod attack() -> dict.
- MembershipInferenceDefense(ABC): @abstractmethod train_private() -> nn.Module
- lira.py: updated to use _load_shadow_model() in place of InferenceModel.
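
A helper that owns checkpoint loading, device placement, and grad freezing could look roughly like the sketch below. Everything specific is an assumption made for illustration: the free function load_shadow_model, the build_model architecture, the shadow_{id}.pt file naming, and the checkpoint layout with "state_dict" and "in_data" keys are hypothetical, and in_data is taken here to be a boolean membership mask.

```python
import tempfile
from pathlib import Path

import torch
from torch import nn


def build_model() -> nn.Module:
    """Hypothetical architecture shared by every shadow model."""
    return nn.Linear(4, 2)


def load_shadow_model(
    checkpoint_dir: Path, shadow_id: int, device: torch.device
) -> tuple[nn.Module, torch.Tensor]:
    """Load one shadow model and its membership mask (assumed checkpoint layout)."""
    path = checkpoint_dir / f"shadow_{shadow_id}.pt"
    # Checkpoint loading: tensors are mapped straight onto the target device.
    checkpoint = torch.load(path, map_location=device, weights_only=True)

    model = build_model()
    model.load_state_dict(checkpoint["state_dict"])
    # Device placement.
    model.to(device)
    # Grad freezing: the shadow model is only used for inference.
    model.eval()
    for param in model.parameters():
        param.requires_grad_(False)

    return model, checkpoint["in_data"]


def demo() -> tuple[nn.Module, torch.Tensor]:
    """Write a toy checkpoint to a temporary directory and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        checkpoint_dir = Path(tmp)
        torch.save(
            {
                "state_dict": build_model().state_dict(),
                "in_data": torch.tensor([True, False, True]),
            },
            checkpoint_dir / "shadow_0.pt",
        )
        return load_shadow_model(checkpoint_dir, 0, torch.device("cpu"))
```

Returning a plain (model, in_data) tuple keeps the nn.Module and the membership data as two separate values, instead of hiding the data inside a module wrapper.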

Attribute inference:
- AttributeInferenceAttack(ABC): @abstractmethod attack() -> dict

Distribution inference:
- DistributionInferenceAttack(ABC) [new file]: owns the full model-population
  lifecycle: train_model_population(), prepare_model_populations() (concrete
  shared helpers), and @abstractmethod attack(). Mirrors the MI lifecycle
  pattern for consistency.
- dataset_utils.py [new file]: DistributionSplits dataclass and
  prepare_distribution_splits() — ratio-preserving subsampling helpers
  required by the DI base class.
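
One possible reading of "ratio-preserving subsampling" is drawing a subset in which a binary attribute appears at a requested ratio. The sketch below follows that reading only as an assumption. SplitsSketch, its fields, and both function signatures are hypothetical and do not describe the actual DistributionSplits dataclass or prepare_distribution_splits().

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitsSketch:
    """Hypothetical container: index lists for two candidate distributions."""

    ratio_a: float
    ratio_b: float
    indices_a: list[int]
    indices_b: list[int]


def subsample_with_ratio(
    attribute: list[int], ratio: float, size: int, rng: random.Random
) -> list[int]:
    """Pick `size` indices so that round(ratio * size) have attribute == 1."""
    positives = [i for i, a in enumerate(attribute) if a == 1]
    negatives = [i for i, a in enumerate(attribute) if a == 0]
    n_pos = round(ratio * size)
    n_neg = size - n_pos
    if n_pos > len(positives) or n_neg > len(negatives):
        raise ValueError("not enough samples to reach the requested ratio")
    return rng.sample(positives, n_pos) + rng.sample(negatives, n_neg)


def prepare_splits_sketch(
    attribute: list[int], ratio_a: float, ratio_b: float, size: int, seed: int = 0
) -> SplitsSketch:
    """Build two equally sized subsamples that differ only in attribute ratio."""
    rng = random.Random(seed)
    return SplitsSketch(
        ratio_a,
        ratio_b,
        subsample_with_ratio(attribute, ratio_a, size, rng),
        subsample_with_ratio(attribute, ratio_b, size, rng),
    )
```

With a pool of 50 positive and 50 negative samples, prepare_splits_sketch(attribute, 0.2, 0.8, size=20) yields two 20-element index lists containing 4 and 16 positives respectively.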

Data reconstruction:
- DataReconstructionAttack(ABC): @abstractmethod attack() -> list[Tensor]

Discriminatory behavior:
- Renamed DicriminatoryBehaviorDefense (typo) -> DiscriminatoryBehaviorDefense.
- model_criterion/model_optimizer attributes renamed to criterion/optimizer
  for consistency with all other defense base classes.
- @abstractmethod train_fair() -> nn.Module
- adversarial_debiasing.py: import and attribute references updated to match.

Unauthorized model ownership:
- UnauthModelOwnershipAttack(ABC) [new file]: @abstractmethod attack() -> nn.Module
- WatermarkDefense(ABC) and FingerprintDefense(ABC) [new file]: two separate
  ABCs with no shared parent. WatermarkNN and DatasetInference serve distinct
  verification strategies; a shared base would obscure that.
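
The choice of two unrelated ABCs can be illustrated with the sketch below. The class names, return types, and toy subclasses are hypothetical, since the source does not state the real signatures of watermark() and fingerprint().

```python
from abc import ABC, abstractmethod


class WatermarkDefenseSketch(ABC):
    """Hypothetical base for defenses that embed a mark into a model."""

    @abstractmethod
    def watermark(self) -> dict[str, float]:
        """Return the (toy) watermarked model parameters."""


class FingerprintDefenseSketch(ABC):
    """Hypothetical base for defenses that test a suspect model after the fact."""

    @abstractmethod
    def fingerprint(self) -> bool:
        """Return True if the suspect model appears derived from the victim."""


class ToyWatermark(WatermarkDefenseSketch):
    def watermark(self) -> dict[str, float]:
        return {"weight": 1.0, "trigger_bias": 0.5}


class ToyFingerprint(FingerprintDefenseSketch):
    def __init__(self, distance: float, threshold: float) -> None:
        self.distance = distance
        self.threshold = threshold

    def fingerprint(self) -> bool:
        return self.distance < self.threshold
```

Because the two bases share no parent, a type checker or an isinstance check cannot mistake one kind of defense for the other, and neither entry point has to be bent to fit a common return type.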

Closes #26
asim29 force-pushed the pr/abstract-base-classes branch from db745b5 to 63758d2 on April 21, 2026 21:46
asim29 self-assigned this on Apr 22, 2026