The code for the quantized object-detection network used for hardware acceleration is under the dnn/CoDeNet submodule. We also provide non-quantized models for studying the impact of our hardware-friendly deformable convolution modifications. These models are compatible with the Detectron2 library and are located under dnn/CoDeNet_Detectron2.
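To make the idea concrete, here is a minimal PyTorch sketch of a hardware-friendly deformable convolution: the predicted offsets are rounded to integers and clamped to a small square region, and the convolution itself is depthwise. This is an illustrative assumption rather than the repository's implementation; the class name, the max_offset bound, and the weight initialization are invented for the example.

# Sketch only: rounded, bounded, depthwise deformable convolution.
import torch
import torch.nn as nn
from torchvision.ops import deform_conv2d

class RoundedDepthwiseDeformConv(nn.Module):
    def __init__(self, channels, kernel_size=3, max_offset=4):
        super().__init__()
        self.kernel_size = kernel_size
        self.max_offset = max_offset          # assumed square bound on offsets
        # depthwise weight: one filter per channel (groups == channels)
        self.weight = nn.Parameter(
            torch.randn(channels, 1, kernel_size, kernel_size) * 0.01)
        # a plain conv predicts 2 * k * k (x, y) offsets per spatial location
        self.offset_conv = nn.Conv2d(
            channels, 2 * kernel_size * kernel_size,
            kernel_size, padding=kernel_size // 2)

    def forward(self, x):
        offset = self.offset_conv(x)
        # round to integers and clamp to a square region so hardware can skip
        # bilinear interpolation and use a fixed-size line buffer
        # (rounding is non-differentiable; training would need a
        # straight-through estimator, omitted in this sketch)
        offset = torch.clamp(torch.round(offset), -self.max_offset, self.max_offset)
        return deform_conv2d(x, offset, self.weight, padding=self.kernel_size // 2)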
Command to run Config a:
python test.py ctdet --arch shufflenetv2 --exp_id pascal_shufflenetv2_256_new1_1 --dataset pascal --input_res 256 --resume --flip_test --gpu 0
Command to run Config b:
python test.py ctdet --arch shufflenetv2 --exp_id pascal_shufflenetv2_256_new3_1 --dataset pascal --input_res 256 --resume --flip_test --gpu 0 --maxpool
Command to run Config c:
python test.py ctdet --arch shufflenetv2 --exp_id pascal_shufflenetv2_512_new14_1_test --dataset pascal --input_res 512 --resume --flip_test --gpu 0
Command to run Config d:
python test.py ctdet --arch shufflenetv2 --exp_id pascal_shufflenetv2_512_new17_1 --dataset pascal --input_res 512 --resume --flip_test --gpu 0 --w2
Command to run Config e:
python test.py ctdet --arch shufflenetv2 --exp_id pascal_shufflenetv2_512_new15_1 --dataset pascal --input_res 512 --resume --flip_test --gpu 0 --w2 --maxpool
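If you want to run all five configurations in one go, a small driver script like the following can wrap the commands above with subprocess. It is a convenience sketch, not part of the repository, and assumes it is launched from the directory that contains test.py.

# Hypothetical helper: run Configs a-e sequentially.
import subprocess

CONFIGS = {
    "a": ("pascal_shufflenetv2_256_new1_1", 256, []),
    "b": ("pascal_shufflenetv2_256_new3_1", 256, ["--maxpool"]),
    "c": ("pascal_shufflenetv2_512_new14_1_test", 512, []),
    "d": ("pascal_shufflenetv2_512_new17_1", 512, ["--w2"]),
    "e": ("pascal_shufflenetv2_512_new15_1", 512, ["--w2", "--maxpool"]),
}

for name, (exp_id, res, extra) in CONFIGS.items():
    cmd = ["python", "test.py", "ctdet",
           "--arch", "shufflenetv2",
           "--exp_id", exp_id,
           "--dataset", "pascal",
           "--input_res", str(res),
           "--resume", "--flip_test",
           "--gpu", "0"] + extra
    print(f"Running Config {name}: {' '.join(cmd)}")
    subprocess.run(cmd, check=True)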
Please follow the instructions in the CoDeNet_Detectron2 Installation guide to set up the environment. We also provide a remote server for evaluating the trained models.
Command to reproduce the VOC result with the modified deformable convolution (Table 1, last row):
python tools/train_net.py --num-gpus 1 --config-file configs/centernet/voc/V2_1.0x_voc_512_4gpus_1x_deform_conv_square_depthwise.yaml --eval-only MODEL.WEIGHTS output/centernet/voc/V2_1.0x_voc_512_4gpus_1x_deform_conv_square_depthwise/model_final.pth
# result: AP: 41.7 AP50: 64.5 AP75: 43.8
Command to reproduce the COCO result with the modified deformable convolution (Table 1, last row):
python tools/train_net.py --num-gpus 1 --config-file configs/centernet/coco/V2_1.0x_coco_512_10gpus_1x_deform_conv_square_depthwise.yaml --eval-only MODEL.WEIGHTS output/centernet/coco/V2_1.0x_coco_512_10gpus_1x_deform_conv_square_depthwise/model_final.pth
# result: AP: 21.6 AP50: 37.4 AP75: 21.8 APs: 6.5 APm: 23.7 APl: 34.8
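The same evaluations can also be driven from Python instead of the CLI. The sketch below uses standard Detectron2 APIs and mirrors the COCO command above; it assumes the project's custom CenterNet meta-architecture and config keys have been registered beforehand (as tools/train_net.py does when it imports the project modules), so treat it as a rough template rather than a drop-in script.

# Sketch: programmatic evaluation with Detectron2 (assumes the project's
# CenterNet modules are imported/registered before this runs).
from detectron2.config import get_cfg
from detectron2.modeling import build_model
from detectron2.checkpoint import DetectionCheckpointer
from detectron2.data import build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

cfg = get_cfg()
cfg.merge_from_file(
    "configs/centernet/coco/V2_1.0x_coco_512_10gpus_1x_deform_conv_square_depthwise.yaml")
cfg.MODEL.WEIGHTS = ("output/centernet/coco/"
                     "V2_1.0x_coco_512_10gpus_1x_deform_conv_square_depthwise/model_final.pth")

model = build_model(cfg)                       # build the architecture named in the config
model.eval()
DetectionCheckpointer(model).load(cfg.MODEL.WEIGHTS)

dataset_name = cfg.DATASETS.TEST[0]
# passing cfg keeps compatibility with both older and newer COCOEvaluator signatures
evaluator = COCOEvaluator(dataset_name, cfg, False, output_dir="./eval_output")
loader = build_detection_test_loader(cfg, dataset_name)
print(inference_on_dataset(model, loader, evaluator))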
We evaluate the latency of our network on the Ultra96 PYNQ platform.
Please refer to the C++ source files and the system files under ./hls. The precompiled FPGA image is under ./bitfile. The project file can be downloaded here. The HLS project can be downloaded here.
The source code for measuring the first-layer latency is under sw/tvm; please follow sw/tvm/README.md to run it. The source code for calling the accelerator is in codenet.ipynb.
Please connect to the Ultra96 board and browse to the Jupyter notebook page at http://192.168.2.1:9090/. Upload sw/codenet.ipynb and the sw/bitfile folder to the remote FPGA, then run the notebook to see the latency results.
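For reference, a latency-measurement cell on the board typically looks like the hedged sketch below, which uses the standard PYNQ Overlay/allocate APIs. The bitstream filename, the buffer shapes, and the run_once() placeholder are assumptions; the actual driver calls are in sw/codenet.ipynb.

# Sketch: load the precompiled bitstream and time repeated accelerator calls.
import time
import numpy as np
from pynq import Overlay, allocate

overlay = Overlay("bitfile/codenet.bit")       # assumed filename under sw/bitfile

# contiguous buffers the accelerator's DMA can reach (shapes are illustrative)
inp = allocate(shape=(1, 3, 512, 512), dtype=np.int8)
out = allocate(shape=(1, 64, 128, 128), dtype=np.int8)

def run_once():
    # placeholder for the register writes / DMA transfers done in the notebook;
    # replace with the project's driver calls from sw/codenet.ipynb
    pass

n = 100
start = time.time()
for _ in range(n):
    run_once()
print(f"average latency: {(time.time() - start) / n * 1000:.2f} ms")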