Error at: caffe2/core/context_gpu.cu:343: out of memory #5

Description
Hi, thanks for the great work!
We ran into an out-of-memory issue while running test_net.py on the COCO dataset with a 2x TITAN X setup. Installation went fine and the COCO dataset was placed in /lib/datasets/data/coco. A line was added in test_net.py so the job would use the 3rd and 4th GPUs:
os.environ['CUDA_VISIBLE_DEVICES'] = "2,3"
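For context, a minimal sketch of how this line behaves (the renumbering described in the comments is standard CUDA behavior; the script structure around it is illustrative, not Detectron's actual code):

```python
import os

# CUDA_VISIBLE_DEVICES must be set before any CUDA context is created,
# i.e. before the caffe2.python GPU modules are imported; setting it
# later has no effect and the job silently stays on GPUs 0/1.
os.environ['CUDA_VISIBLE_DEVICES'] = "2,3"

# With the variable set, CUDA renumbers the visible devices: physical
# GPUs 2 and 3 appear to the process as device 0 and device 1, which is
# why the log still shows gpu_0 even though GPU 2 is being used.
visible = os.environ['CUDA_VISIBLE_DEVICES'].split(',')
print(len(visible))  # number of GPUs the process can see -> 2
```

Note that NUM_GPUS 2 in the command below then refers to these two visible devices, not to physical device IDs.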
We tried to run the code:
./tools/test_net.py --cfg configs/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml --multi-gpu-testing TEST.WEIGHTS https://s3-us-west-2.amazonaws.com/detectron/35861858/12_2017_baselines/e2e_mask_rcnn_R-101-FPN_2x.yaml.02_32_51.SgT4y1cO/output/train/coco_2014_train:coco_2014_valminusminival/generalized_rcnn/model_final.pkl NUM_GPUS 2
and encountered the following error:
terminate called after throwing an instance of 'caffe2::EnforceNotMet'
what(): [enforce fail at context_gpu.cu:343] error == cudaSuccess. 2 vs 0. Error at: /home/user/default/caffe2/caffe2/core/context_gpu.cu:343: out of memory Error from operator:
input: "gpu_0/roi_feat_fpn2" input: "gpu_0/roi_feat_fpn3" input: "gpu_0/roi_feat_fpn4" input: "gpu_0/roi_feat_fpn5" output: "gpu_0/roi_feat_shuffled" output: "gpu_0/_concat_roi_feat" name: "" type: "Concat" arg { name: "axis" i: 0 } device_option { device_type: 1 cuda_gpu_id: 0 }
*** Aborted at 1516700071 (unix time) try "date -d @1516700071" if you are using GNU date ***
PC: @ 0x7faf48c0c428 gsignal
*** SIGABRT (@0x3e800001c16) received by PID 7190 (TID 0x7fae72ffd700) from PID 7190; stack trace: ***
@ 0x7faf48fb2390 (unknown)
@ 0x7faf48c0c428 gsignal
@ 0x7faf48c0e02a abort
@ 0x7faf45bf484d __gnu_cxx::__verbose_terminate_handler()
@ 0x7faf45bf26b6 (unknown)
@ 0x7faf45bf2701 std::terminate()
@ 0x7faf45c1dd38 (unknown)
@ 0x7faf48fa86ba start_thread
@ 0x7faf48cde3dd clone
@ 0x0 (unknown)
Aborted (core dumped)
Traceback (most recent call last):
File "./tools/test_net.py", line 168, in <module>
main(ind_range=args.range, multi_gpu_testing=args.multi_gpu_testing)
File "./tools/test_net.py", line 133, in main
results = parent_func(multi_gpu=multi_gpu_testing)
File "/home/user/default/Detectron/lib/core/test_engine.py", line 59, in test_net_on_dataset
num_images, output_dir
File "/home/user/default/Detectron/lib/core/test_engine.py", line 82, in multi_gpu_test_net_on_dataset
'detection', num_images, binary, output_dir
File "/home/user/default/Detectron/lib/utils/subprocess.py", line 83, in process_in_parallel
log_subprocess_output(i, p, output_dir, tag, start, end)
File "/home/user/default/Detectron/lib/utils/subprocess.py", line 121, in log_subprocess_output
assert ret == 0, 'Range subprocess failed (exit code: {})'.format(ret)
AssertionError: Range subprocess failed (exit code: 134)
Are there any settings we are missing? Thank you!