Avi Schwarzschild, Micah Goldblum, Arjun Gupta, John Dickerson, and Tom Goldstein
University of Maryland, College Park
This repository is the official implementation of *Just How Toxic is Data Poisoning? A Unified Benchmark for Backdoor and Data Poisoning Attacks*.
## CIFAR-10
### Transfer Learning
Attack | White-box (%) | Black-box (%) |
---|---|---|
Feature Collision | 22.0 | 7.0 |
Convex Polytope | 33.0 | 7.0 |
Bullseye Polytope | 85.0 | 8.5 |
Clean Label Backdoor | 5.0 | 6.5 |
Hidden Trigger Backdoor | 10.0 | 9.5 |
### From Scratch Training
Attack | ResNet-18 (%) | MobileNetV2 (%) | VGG11 (%) | Average (%) |
---|---|---|---|---|
Feature Collision | 0 | 1 | 3 | 1.33 |
Convex Polytope | 0 | 1 | 1 | 0.67 |
Bullseye Polytope | 3 | 3 | 1 | 2.33 |
Witches’ Brew | 45 | 25 | 8 | 26.00 |
Clean Label Backdoor | 0 | 1 | 2 | 1.00 |
Hidden Trigger Backdoor | 0 | 4 | 1 | 2.67 |
## TinyImageNet
### Transfer Learning
Attack | White-box (%) | Black-box (%) |
---|---|---|
Feature Collision | 49.0 | 32.0 |
Convex Polytope | 14.0 | 1.0 |
Bullseye Polytope | 100.0 | 10.5 |
Clean Label Backdoor | 3.0 | 1.0 |
Hidden Trigger Backdoor | 3.0 | 0.5 |
### From Scratch Training
Attack | VGG11 (%) |
---|---|
Feature Collision | 4 |
Convex Polytope | 0 |
Bullseye Polytope | 44 |
Witches’ Brew | 32 |
Clean Label Backdoor | 0 |
Hidden Trigger Backdoor | 0 |
For more information on each attack, consult our paper and the original sources listed there.
## Getting Started
### Requirements
To install requirements:
pip install -r requirements.txt
Then download the TinyImageNet dataset (also available on our drive). In learning_module.py, change the line
TINYIMAGENET_ROOT = "/fs/cml-datasets/tiny_imagenet"
to point to your unzipped TinyImageNet directory. (The path above matches our filesystem and will likely not work with yours.)
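As a quick sanity check that the path is set correctly, something like the following can be run before training (the environment-variable fallback here is our own illustration, not part of the repository):

```python
# Sanity-check sketch: confirm TINYIMAGENET_ROOT points at an unzipped copy of
# TinyImageNet. The environment-variable fallback is illustrative only.
import os

TINYIMAGENET_ROOT = os.environ.get("TINYIMAGENET_ROOT", "/path/to/tiny-imagenet-200")

for split in ("train", "val"):
    split_dir = os.path.join(TINYIMAGENET_ROOT, split)
    if not os.path.isdir(split_dir):
        raise FileNotFoundError(f"{split_dir} not found; check TINYIMAGENET_ROOT")
```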
### Pre-trained Models
Pre-trained checkpoints used in this benchmark can be downloaded from here. Copy them into the pretrained_models folder (which is empty until the downloaded models are added).
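Assuming the checkpoints are standard PyTorch state dicts, loading one looks roughly like the sketch below; the file name and the torchvision ResNet-18 here are placeholders, and real evaluations should go through test_model.py with the model definitions shipped in this repository.

```python
# Minimal loading sketch, assuming a checkpoint is a plain PyTorch state_dict.
# The file name and torchvision architecture below are placeholders.
import torch
from torchvision.models import resnet18

model = resnet18(num_classes=10)  # e.g. a CIFAR-10 classifier
state_dict = torch.load("pretrained_models/example_checkpoint.pth", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```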
### Testing
To test a model, run:
python test_model.py --model <model> --model_path <path_to_model_file>
See the code for additional optional arguments.
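Testing a model here amounts to measuring its accuracy on the clean test set. For reference, a stand-alone sketch of that computation (independent of test_model.py, using torchvision's CIFAR-10 loader; use the normalization statistics the checkpoint was trained with) might look like this:

```python
# Stand-alone sketch of clean CIFAR-10 test accuracy; test_model.py handles this
# (plus model construction and checkpoint loading) for the benchmark.
import torch
import torchvision
import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.ToTensor(),
    # Use the same normalization the model was trained with.
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
])
testset = torchvision.datasets.CIFAR10(root="./data", train=False, download=True,
                                       transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=256, shuffle=False)

def clean_accuracy(model, loader, device="cpu"):
    """Return top-1 accuracy (%) of `model` on `loader`."""
    model.eval().to(device)
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return 100.0 * correct / total
```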
### Crafting Poisons With Our Setups
See the How To for full details and sample code.
### Evaluating A Single Batch of Poison Examples
We have included one sample folder of poisons in poison_examples. To evaluate a batch of poisons, run:
python poison_test.py --model <model> --model_path <model_path> --poisons_path <path_to_poisons_dir>
This allows users to test their poisons in a variety of settings, not only the benchmark setups. See the file poison_test.py for a comprehensive list of arguments.
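The expected on-disk format of a poison batch is documented in the How To. Purely as an illustration, if a batch directory stores its poisons as a pickled list of (image, label) pairs, the batch can be inspected as below; the file name poisons.pickle, the directory path, and the pair structure are assumptions here.

```python
# Illustration only: assumes the batch directory holds a pickle of (image, label)
# pairs named poisons.pickle. Check the How To for the format the benchmark expects.
import pickle
from pathlib import Path

poisons_dir = Path("poison_examples/example_batch")  # replace with your batch directory
with open(poisons_dir / "poisons.pickle", "rb") as f:
    poisons = pickle.load(f)

print(f"{len(poisons)} poison examples in {poisons_dir}")
image, label = poisons[0]
print(f"first poison label: {label}")
```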
### Benchmarking A Backdoor or Triggerless Attack
To compute benchmark scores, craft 100 batches of poisons using the setup pickles (poison_setups_transfer_learning.pickle for transfer learning, poison_setups_from_scratch.pickle for from-scratch training), and run the commands below.
**Important note:** To be on the leaderboard, new submissions must host their poisoned datasets online for public access so that results can be corroborated without crafting new poisons. Consider a Dropbox or Google Drive folder containing all 100 batches of poisons.
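Before crafting, it can help to peek at what a setup pickle contains; a minimal sketch (the exact structure of each entry is documented in the How To, not assumed here):

```python
# Each setup pickle holds one entry per benchmark trial; print one entry to see
# what a single trial specifies.
import pickle

with open("poison_setups_transfer_learning.pickle", "rb") as f:
    setups = pickle.load(f)

print(f"{len(setups)} poisoning setups")
print(setups[0])
```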
For one trial of transfer learning poisons:
python benchmark_test.py --poisons_path <path_to_poison_directory> --dataset <dataset>
For one trial of from-scratch training poisons:
python benchmark_test.py --poisons_path <path_to_poison_directory> --dataset <dataset> --from_scratch
To benchmark 100 batches of poisons, run:
bash benchmark_all.sh <path_to_directory_with_100_batches>
or
bash benchmark_all.sh <path_to_directory_with_100_batches> from_scratch
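If you prefer to drive the runs from Python instead of bash, the loop below is equivalent in spirit to benchmark_all.sh; it is a sketch that assumes each of the 100 batches is a subdirectory of one parent folder, and the --dataset value shown is only an example.

```python
# Sketch of a Python driver equivalent in spirit to benchmark_all.sh.
# Assumes each poison batch is a subdirectory of `root`.
import subprocess
import sys
from pathlib import Path

root = Path(sys.argv[1])                      # directory containing the 100 batches
from_scratch = "from_scratch" in sys.argv[2:]  # mirror the optional bash argument

for batch_dir in sorted(p for p in root.iterdir() if p.is_dir()):
    cmd = ["python", "benchmark_test.py",
           "--poisons_path", str(batch_dir),
           "--dataset", "cifar10"]            # example dataset value
    if from_scratch:
        cmd.append("--from_scratch")
    subprocess.run(cmd, check=True)
```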