This repository contains the code to evaluate models on the image obfuscation benchmark, first presented in Benchmarking Robustness to Adversarial Image Obfuscations (Stimberg et al., 2023).
The dataset consists of 22 obfuscations plus the Clean data. Of the obfuscations, 19 are training obfuscations and 3 are hold-out obfuscations. Each obfuscation is applied to each image in the ILSVRC2012 dataset. All images are center-cropped to 224 x 224 and saved as compressed JPEG images. The file_name, label and obfuscation hyper-parameters are stored with each image.
The dataset can be loaded through the TensorFlow Datasets API. Each combination of train / validation and an obfuscation is its own split. For example, to load the validation split obfuscated with the StyleTransfer obfuscation, run:
import tensorflow_datasets as tfds
ds = tfds.load('obfuscated_imagenet', split='validation_StyleTransfer', data_dir='/path/to/extracted/dataset/')
The splits must be present in the /path/to/extracted/dataset/obfuscated_imagenet/1.0.0 directory.
To load multiple obfuscations together, e.g. for training, use the sample_from_datasets function from tf.data.
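As a minimal sketch of mixing several obfuscation splits, the helper below builds split names in the `validation_StyleTransfer` pattern shown above and samples from the loaded datasets. The names `split_name` and `load_mixture` are hypothetical and not part of this repository. The train split names are assumed to follow the same `train_<Obfuscation>` pattern, and uniform sampling is an assumption rather than the paper's training recipe. `tf.data.Dataset.sample_from_datasets` is available in recent TensorFlow releases; older ones expose it as `tf.data.experimental.sample_from_datasets`.

```python
def split_name(subset, obfuscation):
    """Builds a split name such as 'validation_StyleTransfer'."""
    if subset not in ("train", "validation"):
        raise ValueError(f"Unknown subset: {subset!r}")
    return f"{subset}_{obfuscation}"


def load_mixture(obfuscations, data_dir, subset="train", seed=0):
    """Samples uniformly from the given obfuscation splits (hypothetical helper)."""
    # Imported lazily so split_name() can be used without TensorFlow installed.
    import tensorflow as tf
    import tensorflow_datasets as tfds

    datasets = [
        tfds.load(
            "obfuscated_imagenet",
            split=split_name(subset, name),
            data_dir=data_dir,
        )
        for name in obfuscations
    ]
    # Equal weights are an assumption; pass `weights=` to change the mix.
    return tf.data.Dataset.sample_from_datasets(datasets, seed=seed)
```

For example, `load_mixture(["Clean", "StyleTransfer"], "/path/to/extracted/dataset/")` would return a single `tf.data.Dataset` that interleaves examples from both train splits.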
Clean | AdversarialPatches | BackgroundBlurComposition |
---|---|---|
ColorNoiseBlocks | ColorPatternOverlay | Halftoning |
HighContrastBorder | IconOverlay | ImageOverlay |
Interleave | InvertLines | LineShift |
LowContrastTriangles | PerspectiveComposition | PerspectiveTransform |
PhotoComposition | RotateBlocks | RotateImage |
StyleTransfer | SwirlWarp | TextOverlay |
Texturize | WavyColorWarp | |
You can download the validation and train splits for all the obfuscations below. If you want to load them with the TensorFlow Datasets API as described above, you also need to download these two JSON files: dataset_info.json and features.json.
Obfuscation | Validation | Train |
---|---|---|
Clean | tar (1.2 GB) | tar (31 GB) |
AdversarialPatches | tar (1.4 GB) | tar (36 GB) |
BackgroundBlurComposition | tar (0.5 GB) | tar (12 GB) |
ColorNoiseBlocks | tar (1.9 GB) | tar (48 GB) |
ColorPatternOverlay | tar (1.8 GB) | tar (45 GB) |
Halftoning | tar (2.4 GB) | tar (54 GB) |
HighContrastBorder | tar (2.1 GB) | tar (55 GB) |
IconOverlay | tar (1.7 GB) | tar (43 GB) |
ImageOverlay | tar (1.2 GB) | tar (30 GB) |
Interleave | tar (1.5 GB) | tar (38 GB) |
InvertLines | tar (1.4 GB) | tar (35 GB) |
LineShift | tar (1.5 GB) | tar (37 GB) |
LowContrastTriangles | tar (0.9 GB) | tar (23 GB) |
PerspectiveComposition | tar (1.1 GB) | tar (29 GB) |
PerspectiveTransform | tar (0.4 GB) | tar (9.5 GB) |
PhotoComposition | tar (1.2 GB) | tar (31 GB) |
RotateBlocks | tar (1.5 GB) | tar (37 GB) |
RotateImage | tar (1.0 GB) | tar (24 GB) |
StyleTransfer | tar (1.3 GB) | tar (34 GB) |
SwirlWarp | tar (1.2 GB) | tar (29 GB) |
TextOverlay | tar (2.1 GB) | tar (55 GB) |
Texturize | tar (1.3 GB) | tar (34 GB) |
WavyColorWarp | tar (1.3 GB) | tar (33 GB) |
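Before calling `tfds.load`, it can help to confirm that the two JSON files are in place. The sketch below assumes they belong in the same `obfuscated_imagenet/1.0.0` directory as the extracted splits, which is the usual TensorFlow Datasets layout but is not stated explicitly here. The helper name `missing_metadata` is hypothetical.

```python
from pathlib import Path

# The two metadata files named in this README.
REQUIRED_METADATA = ("dataset_info.json", "features.json")


def missing_metadata(data_dir, name="obfuscated_imagenet", version="1.0.0"):
    """Lists the required JSON files that are absent from the version directory."""
    # Assumed layout: <data_dir>/obfuscated_imagenet/1.0.0/<file>
    version_dir = Path(data_dir) / name / version
    return [f for f in REQUIRED_METADATA if not (version_dir / f).is_file()]
```

An empty list means both files were found. Any returned names still need to be downloaded and placed in that directory.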
Download the eval dataset and extract it to a folder.
Clone this repository.
git clone https://github.com/google-deepmind/image_obfuscation_benchmark.git
Execute run.sh to create and activate a virtualenv, install all necessary dependencies and run a test program to ensure that you can import all the modules.
cd image_obfuscation_benchmark
sh image_obfuscation_benchmark/run.sh
source /tmp/image_obfuscation_benchmark/image_obfuscation_benchmark/bin/activate
Then run:
python3 -m image_obfuscation_benchmark.eval.predict \
--dataset_path=/path/to/the/downloaded/dataset/ \
--model_path=https://tfhub.dev/google/imagenet/resnet_v2_50/classification/1 \
--evaluate_obfuscation=Clean \
--normalization=zero_one \
--output_dir=/tmp/
This writes predictions to /tmp/Clean.csv. Repeat the command for every obfuscation, then run:
python3 -m image_obfuscation_benchmark.eval.gather_results \
--output_dir=/tmp/
This loads all the predictions, calculates the metrics and saves them to /tmp/metrics.csv.
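Running the predict step for every obfuscation can be scripted. The following is a rough sketch that reuses only the modules and flags shown above. It assumes that `--evaluate_obfuscation` accepts each name exactly as spelled in the download table, and the helpers `predict_command` and `evaluate_all` are hypothetical names, not part of this repository.

```python
import subprocess
import sys

# Names as listed in the download table (Clean plus 22 obfuscations).
OBFUSCATIONS = [
    "Clean", "AdversarialPatches", "BackgroundBlurComposition",
    "ColorNoiseBlocks", "ColorPatternOverlay", "Halftoning",
    "HighContrastBorder", "IconOverlay", "ImageOverlay",
    "Interleave", "InvertLines", "LineShift",
    "LowContrastTriangles", "PerspectiveComposition", "PerspectiveTransform",
    "PhotoComposition", "RotateBlocks", "RotateImage",
    "StyleTransfer", "SwirlWarp", "TextOverlay",
    "Texturize", "WavyColorWarp",
]


def predict_command(obfuscation, dataset_path, model_path, output_dir,
                    normalization="zero_one"):
    """Builds the predict invocation for one obfuscation."""
    return [
        sys.executable, "-m", "image_obfuscation_benchmark.eval.predict",
        f"--dataset_path={dataset_path}",
        f"--model_path={model_path}",
        f"--evaluate_obfuscation={obfuscation}",
        f"--normalization={normalization}",
        f"--output_dir={output_dir}",
    ]


def evaluate_all(dataset_path, model_path, output_dir):
    """Runs predict for every obfuscation, then gathers the metrics."""
    for name in OBFUSCATIONS:
        subprocess.run(
            predict_command(name, dataset_path, model_path, output_dir),
            check=True,  # Stop at the first failing obfuscation.
        )
    subprocess.run(
        [sys.executable, "-m",
         "image_obfuscation_benchmark.eval.gather_results",
         f"--output_dir={output_dir}"],
        check=True,
    )
```

Call `evaluate_all(...)` from inside the activated virtualenv so that `sys.executable` points at the interpreter with the dependencies installed.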
We do not currently supply code to train models on the dataset, but it can be easily loaded with tensorflow_datasets into any pipeline.
The specific obfuscations that we use in our benchmark may have the potential to fool automatic filters and therefore increase the amount of harmful content on digital platforms. To reduce this risk, we decided against releasing the code to create the obfuscations systematically and instead release only the precomputed dataset and the code to evaluate on it.
If you use this code (or any derived code) in your work, please cite the accompanying paper:
@misc{stimberg2023benchmarking,
title={Benchmarking Robustness to Adversarial Image Obfuscations},
author={Florian Stimberg and Ayan Chakrabarti and Chun-Ta Lu and Hussein Hazimeh and Otilia Stretcu and Wei Qiao and Yintao Liu and Merve Kaya and Cyrus Rashtchian and Ariel Fuxman and Mehmet Tek and Sven Gowal},
year={2023},
eprint={2301.12993},
archivePrefix={arXiv},
}
Copyright 2023 DeepMind Technologies Limited.
All software is licensed under the Apache License, Version 2.0 (Apache 2.0); you may not use this file except in compliance with the License. You may obtain a copy of the Apache 2.0 license at
https://www.apache.org/licenses/LICENSE-2.0
All non-code materials are licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0). You may obtain a copy of the CC BY-NC License at:
https://creativecommons.org/licenses/by-nc/4.0/legalcode
You may not use the non-code portions of this file except in compliance with the CC BY-NC License.
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
This is not an official Google product.