
Targeted Adversarial Examples

An adversarial example is an instance with small, intentional feature perturbations that cause a machine-learning model to make a false prediction. Adversarial examples are counterfactual examples whose aim is to deceive the model, not to interpret it. Many techniques exist for creating adversarial examples. Most approaches minimise the distance between the adversarial example and the instance being manipulated while shifting the prediction towards the desired (adversarial) outcome. Some methods require access to the model's gradients, which only works for gradient-based models such as neural networks; other methods only require access to the prediction function, which makes them model-agnostic. This library currently implements two basic gradient-based methods, both sketched briefly after the list below:

  1. Fast Gradient Sign Method (FGSM)
  2. Basic Iterative Method (BIM): iterative application of FGSM
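
Purely as an illustration of the two ideas (this is not the library's own code; the implementations under src/ are the source of truth), a targeted FGSM step and its iterative BIM extension can be sketched in PyTorch as follows. The model, the epsilon and alpha values, and the assumption that pixel values lie in [0, 1] are placeholders:

import torch
import torch.nn.functional as F

def targeted_fgsm(model, image, target_class, epsilon=0.01):
    # One-step targeted FGSM: step *against* the gradient of the loss taken
    # w.r.t. the desired (adversarial) class, then clip to the valid pixel range.
    image = image.clone().detach().requires_grad_(True)
    target = torch.full((image.shape[0],), target_class, dtype=torch.long, device=image.device)
    loss = F.cross_entropy(model(image), target)
    loss.backward()
    # Subtracting the signed gradient pushes the prediction towards the target class.
    adversarial = image - epsilon * image.grad.sign()
    return adversarial.clamp(0.0, 1.0).detach()

def targeted_bim(model, image, target_class, epsilon=0.01, alpha=0.002, iterations=10):
    # BIM: repeated small FGSM steps of size alpha, with the accumulated
    # perturbation projected back into an epsilon-ball around the original image.
    original = image.clone().detach()
    adversarial = original.clone()
    for _ in range(iterations):
        adversarial = targeted_fgsm(model, adversarial, target_class, epsilon=alpha)
        adversarial = torch.min(torch.max(adversarial, original - epsilon), original + epsilon)
        adversarial = adversarial.clamp(0.0, 1.0)
    return adversarial

Because each small step re-evaluates the gradient, BIM is typically a stronger attack than single-step FGSM for the same perturbation budget.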

Setup Instructions

  1. make sure you have conda or Miniconda installed
  2. clone this project: git clone https://github.com/chrisgks/targeted_adversarial_perturbations.git
  3. open a terminal and navigate to the root folder: cd targeted_adversarial_perturbations
  4. install the library and its dependencies by running either zsh -i ./install.sh or bash -i ./install.sh, depending on the shell you use

The install.sh script

The install.sh script performs the following actions (a rough illustrative sketch follows the list):

  1. creates a new conda environment
  2. installs dependencies found in requirements.txt
  3. activates the newly created environment
  4. runs all the tests with pytest
  5. runs an adversarial attack and saves the attack visual under src/adversarial_outputs
  6. deactivates and deletes the newly created environment
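
The actual install.sh in the repository is the source of truth here. Purely as a rough illustration of the sequence above, and assuming the script drives conda, pip, and pytest directly (the environment name and Python version are placeholders, and the real script may order or implement the steps differently):

# illustrative sketch only -- not the repository's actual install.sh
# relies on conda being initialised in the calling shell, hence the -i flag when invoking the script
conda create -y -n adversarial_env python=3.10
conda activate adversarial_env
pip install -r requirements.txt
pytest
python src/run_adversarial_engine.py <path/to/image/file> <class_id>
conda deactivate
conda remove -y -n adversarial_env --all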

Usage

The engine can be invoked from the root folder as follows:

python src/run_adversarial_engine.py <path/to/image/file> <class_id>

where <path/to/image/file> is the path to the input image and <class_id> is the target ImageNet class index. The default attack method is BIM. The attack method can also be specified explicitly with an additional argument:

python src/run_adversarial_engine.py <path/to/image/file> <class_id> fgsm

or

python src/run_adversarial_engine.py <path/to/image/file> <class_id> bim
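
A concrete invocation might look like this (the image path is hypothetical, and 281 is assumed here to be the "tabby cat" index in the standard ImageNet-1k class mapping):

python src/run_adversarial_engine.py images/my_cat.jpg 281 bim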

Other parameters that can be tweaked:

  • iterations (BIM only): the number of FGSM iterations
  • epsilon (both methods): the perturbation intensity

Contributions

Anyone who would like to contribute to this library is welcome to reach out to the author or to raise a PR.

To-do & future features

  • add basic logging
  • add more tests
  • handle exceptions
  • implement the Projected Gradient Descent (PGD) method
  • consider implementing other gradient-based methods, such as momentum-boosted FGSM (MI-FGSM), the Carlini-Wagner L2 attack, and others
  • re-think the project structure once more methods have been implemented
  • think about methods applied to LLMs and experiment with them
  • write documentation
