refactor(poisoning): migrate to new attack/defense architecture #77

Open
asim29 wants to merge 1 commit into pr/4.3-evasion from pr/4.4-poisoning
asim29 (Collaborator) commented Apr 21, 2026

Context

Fourth in the refactor/risk-modules stack. Stacked on #76 (evasion vertical).

Changes

amulet/poisoning/attacks/badnets.py

  • Subclasses PoisoningAttack(ABC) and implements the new poison_train(dataset) / poison_test(dataset) split entry points, replacing the old attack(mode=...) dispatch.
  • Constructor delegates random_seed to super().__init__(random_seed).
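
The split described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this PR: the `PoisoningAttack` base class here is a stand-in written for the example, `ToyBadNets`, `target_label`, `portion`, and `_stamp` are hypothetical names, the dataset is a plain list of `(pixels, label)` pairs, and relabeling every test sample to the target class is assumed rather than taken from the diff.

```python
from abc import ABC, abstractmethod
import random


class PoisoningAttack(ABC):
    """Stand-in for the base class named in this PR (assumed shape)."""

    def __init__(self, random_seed: int = 0) -> None:
        self.random_seed = random_seed
        self.rng = random.Random(random_seed)

    @abstractmethod
    def poison_train(self, dataset):
        """Return a training set in which some samples are poisoned."""

    @abstractmethod
    def poison_test(self, dataset):
        """Return a test set in which the trigger is applied."""


class ToyBadNets(PoisoningAttack):
    """Hypothetical toy attack: the trigger sets the last pixel to 1.0."""

    def __init__(self, target_label: int, portion: float, random_seed: int = 0) -> None:
        # Seed handling lives in the base class, as the PR describes.
        super().__init__(random_seed)
        self.target_label = target_label
        self.portion = portion

    @staticmethod
    def _stamp(pixels):
        # Build a new list so the clean sample is left untouched.
        return pixels[:-1] + [1.0]

    def poison_train(self, dataset):
        # Poison a seeded random subset and flip its labels to the target.
        n_poison = int(len(dataset) * self.portion)
        chosen = set(self.rng.sample(range(len(dataset)), n_poison))
        return [
            (self._stamp(x), self.target_label) if i in chosen else (x, y)
            for i, (x, y) in enumerate(dataset)
        ]

    def poison_test(self, dataset):
        # Assumption: every test sample is triggered and relabeled.
        return [(self._stamp(x), self.target_label) for x, _ in dataset]
```

With two explicit methods, a caller no longer passes a mode string, and a subclass that omits either method cannot be instantiated because both are abstract.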

amulet/poisoning/defenses/outlier_removal.py

  • Removes pandas dependency; filtering now uses pure NumPy.
  • Fixes threshold direction: previously kept samples below the (100 - percent) percentile (wrong); now keeps samples at or above the percent-th percentile, so the lowest-scoring percent% are removed as intended.
  • Renames self.train -> self._train_fn to avoid shadowing the built-in.
  • Fixes typo: _knn_shapely -> _knn_shapley.
  • Honors self.criterion and self.optimizer from the base class instead of creating new ones internally.
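
The corrected filter can be illustrated with a few lines of NumPy. This is a minimal sketch of the idea, not the code from the diff: `keep_high_scores` and its parameter names are hypothetical, and the scores are assumed to be already normalized, with higher meaning more valuable.

```python
import numpy as np


def keep_high_scores(scores, inputs, targets, percent):
    """Drop the lowest-scoring `percent`% of samples (hypothetical helper)."""
    scores = np.asarray(scores, dtype=float)
    # Threshold is the percent-th percentile of the scores themselves.
    threshold = np.percentile(scores, percent)
    # Keep samples at or above the threshold; the low tail is removed.
    keep = scores >= threshold
    return np.asarray(inputs)[keep], np.asarray(targets)[keep]
```

For scores 0 through 9 and `percent=20`, the threshold is 1.8, so the two lowest-scoring samples are dropped and eight remain. A boolean mask applied to each array keeps inputs and targets aligned without a pandas wrapper.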

examples/attack_pipelines/run_poisoning.py

  • Updated to the new poison_train / poison_test API.

tests/integration/test_poisoning.py (new)

  • Smoke tests for BadNets and OutlierRemoval covering the full poison_train -> poison_test lifecycle, output shape, label-flip correctness, and defense coverage.

Test plan

  • uv run pytest tests/integration/test_poisoning.py -v -m integration
  • uv run pre-commit run --all-files
  • CI passes on this branch

BadNets (attacks/badnets.py):
  - Now subclasses PoisoningAttack(ABC).
  - Splits attack(mode="train"|"test") into the two required ABC methods:
    poison_train(dataset) and poison_test(dataset).
  - poison_train and poison_test each keep the existing logic of their former mode.
  - Constructor delegates random_seed to super().__init__(random_seed).
  - torch.tensor -> torch.as_tensor in train path to avoid an extra copy.

OutlierRemoval (defenses/outlier_removal.py):
  - Removes pandas dependency; replaces pd.Series wrapper with direct
    numpy operations on normalized_scores, train_inputs, train_targets.
  - Fixes threshold direction: was keeping samples BELOW the (100-percent)
    percentile (wrong); now keeps samples AT OR ABOVE the percent-th
    percentile, so the lowest-scoring self.percent% are removed as intended.
  - Renames self.train -> self._train_fn to avoid shadowing the built-in.
  - Fixes typo: _knn_shapely -> _knn_shapley (method name).
  - Honors self.criterion and self.optimizer from the base class instead of
    silently creating new CrossEntropyLoss and Adam instances.
  - DataLoader shuffle=False -> shuffle=True for the cleaned training set.
asim29 force-pushed the pr/4.4-poisoning branch from 5b66d42 to b1cca7a, April 21, 2026 22:19
asim29 changed the title from "refactor(poisoning): migrate poisoning vertical to new attack/defense architecture" to "refactor(poisoning): migrate to new attack/defense architecture", Apr 21, 2026
asim29 self-assigned this, Apr 22, 2026
asim29 requested a review from sebszyller, April 22, 2026 15:22