Download imagenet_captions.zip from https://github.com/mlfoundations/imagenet-captions and unzip it to obtain imagenet_captions.json.
Run preprocess_imagenet_captions.py. This creates a dataframe in this folder and a dictionary containing the WNID of each image at imagenet-captions/processed/labels.
Download the ILSVRC2012 training images from https://www.image-net.org/download.php and place them under ilsvrc2012/ILSVRC2012_img_train. This is only a temporary location.
Run drop_imagenet_examples_wo_caption.py. This copies the ILSVRC training images that are present in ImageNet-Captions to ilsvrc2012/ILSVRC2012_img_train_selected. You can then remove the images left in ilsvrc2012/ILSVRC2012_img_train.
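
For orientation, the selection in the last step can be sketched roughly as follows. This is not the code of drop_imagenet_examples_wo_caption.py, only a minimal illustration of the idea. The helper name copy_selected_images is hypothetical, and the commented usage assumes that imagenet_captions.json is a list of records with a "filename" key and that the training images sit in per-class sub-folders; check both against your copy of the data.

```python
import shutil
from pathlib import Path


def copy_selected_images(filenames, src_dir, dst_dir):
    """Copy every file under src_dir whose basename is in filenames to dst_dir.

    Hypothetical helper for illustration. Any sub-folder layout below
    src_dir is preserved. Returns the number of files copied.
    """
    wanted = set(filenames)
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    copied = 0
    for path in src_dir.rglob("*"):
        if path.is_file() and path.name in wanted:
            target = dst_dir / path.relative_to(src_dir)
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            copied += 1
    return copied


# Example usage (assumes each JSON record has a "filename" key):
#
#   import json
#   with open("imagenet_captions.json") as f:
#       records = json.load(f)
#   names = [r["filename"] for r in records]
#   copy_selected_images(
#       names,
#       "ilsvrc2012/ILSVRC2012_img_train",
#       "ilsvrc2012/ILSVRC2012_img_train_selected",
#   )
```

Because the images are copied rather than moved, the source folder can be deleted only after the copy has finished.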
