daletoniris/adversarial-ml-attacks

Adversarial Attacks on Neural Networks

Implementation of adversarial example generation using the Fast Gradient Sign Method (FGSM) against pre-trained neural networks. Demonstrates how imperceptible perturbations can fool state-of-the-art image classifiers.


What are Adversarial Examples?

Adversarial examples are inputs intentionally designed to cause a machine learning model to make mistakes. By adding carefully crafted noise to an image, we can make a neural network misclassify it with high confidence — while the changes are invisible to the human eye.

  Original Image        +   FGSM Noise   =   Adversarial Image
  "Labrador" (99.8%)         (epsilon)        "Missile" (99.9%)

How it Works

The FGSM attack computes the gradient of the loss with respect to the input image, then adds a perturbation of magnitude epsilon in the direction that maximizes the loss:

perturbation = epsilon * sign(∇_x J(θ, x, y))
adversarial_image = original_image + perturbation
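The formula above can be illustrated numerically. A minimal NumPy sketch with a toy 2×2 "image" and a made-up gradient (all values here are illustrative, not taken from the actual script):

```python
import numpy as np

# Toy loss gradient w.r.t. a 2x2 "image" (illustrative values)
gradient = np.array([[ 0.30, -1.20],
                     [-0.05,  0.80]])

epsilon = 0.1  # perturbation budget

# FGSM: a step of size epsilon in the sign direction of the gradient
perturbation = epsilon * np.sign(gradient)

original = np.full((2, 2), 0.5)
# Clip so the adversarial image stays in the valid pixel range [0, 1]
adversarial = np.clip(original + perturbation, 0.0, 1.0)
```

Note that every pixel moves by exactly ±epsilon: FGSM uses only the sign of the gradient, so the perturbation is bounded in the L∞ norm regardless of the gradient's magnitude.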

Implementation

  • Target model: MobileNetV2 (pre-trained on ImageNet)
  • Attack method: FGSM (Fast Gradient Sign Method)
  • Framework: TensorFlow 2.x / Keras

adversarial.py

  • Loads pre-trained MobileNetV2
  • Computes loss gradient w.r.t. input
  • Generates adversarial perturbation with configurable epsilon
  • Visualizes original vs. adversarial classification
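The core of those steps can be sketched with TensorFlow's `GradientTape` (a hedged sketch, not the repository's actual code; `fgsm_attack` and its argument names are illustrative, and the function assumes a batched, preprocessed image and a one-hot label):

```python
import tensorflow as tf

def fgsm_attack(model, image, label, epsilon=0.01):
    """Return an adversarial copy of `image` using FGSM.

    image: batched float tensor, already preprocessed for the model
    label: one-hot encoded label tensor
    """
    image = tf.convert_to_tensor(image)
    loss_fn = tf.keras.losses.CategoricalCrossentropy()
    with tf.GradientTape() as tape:
        tape.watch(image)  # the input is not a variable, so watch it explicitly
        prediction = model(image)
        loss = loss_fn(label, prediction)
    gradient = tape.gradient(loss, image)       # dJ/dx
    perturbation = epsilon * tf.sign(gradient)  # FGSM step
    return image + perturbation

# With the pre-trained target model described above, the call would look like:
# model = tf.keras.applications.MobileNetV2(weights="imagenet")
# adversarial = fgsm_attack(model, preprocessed_image, one_hot_label, epsilon=0.01)
```

Because the attack only needs one forward and one backward pass, it runs in roughly the cost of a single training step, which is what makes FGSM a popular fast baseline.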

Usage

python adversarial.py

Why This Matters

Understanding adversarial attacks is critical for:

  • AI Security — building robust models that resist manipulation
  • Autonomous systems — ensuring self-driving cars cannot be fooled by adversarial stickers on road signs
  • Content moderation — detecting adversarial bypass attempts
  • WAF/IDS — defending against AI-powered evasion techniques

References

  • Goodfellow et al. — "Explaining and Harnessing Adversarial Examples" (2014)
  • Kurakin et al. — "Adversarial Examples in the Physical World" (2016)

Year

2021

Part of a series of talks on AI security presented at Chubut Hack cybersecurity conference.
