
DNN: make MatMul support 3D or 4D with broadcast #22828


Merged (1 commit merged into opencv:4.x on Dec 15, 2022)

Conversation

@WanliZhong (Member) commented Nov 18, 2022

Merge with extra: opencv/opencv_extra#1018

This PR follows up on #22775.

The main purpose of this PR is to make MatMul support broadcasting when the second input has fewer dimensions than the first one, and to let the operation use SIMD and multi-threading. Because 1-D Mat is not supported, only MatMul cases like the following are handled (a small shape-inference sketch follows the list below):

2x3x4 mul 4x5 -> 2x3x5
2x3x4x5 mul 3x5x6 -> 2x3x4x6
  • 2 const inputs: create a virtual layer for the first input.
  • 1 const input with CPU (with or without broadcast): use the SIMD and multi-threaded path used for InnerProduct.
  • 1 const input with CUDA: broadcast inputs fall back to CPU; inputs with the same shape use the CUDA backend.
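
The following is a minimal, illustrative sketch (not code from this PR) of the output-shape rule implied by the examples above. The name matMulOutShape is hypothetical, and the sketch assumes B's leading dimensions must already match A's trailing batch dimensions (no tiling of B):

```cpp
#include <cassert>
#include <cstdio>
#include <vector>

// Hypothetical helper (not OpenCV code): infer the MatMul output shape when
// the second input B has fewer (or the same number of) dimensions than A.
// Both inputs must be at least 2-D, matching the "no 1-D Mat" constraint,
// and B's leading dimensions are assumed to match A's trailing batch dims.
static std::vector<int> matMulOutShape(const std::vector<int>& A,
                                       const std::vector<int>& B)
{
    assert(A.size() >= 2 && B.size() >= 2 && B.size() <= A.size());
    assert(A.back() == B[B.size() - 2]);           // inner dimensions must agree

    std::vector<int> out(A.begin(), A.end() - 2);  // batch dimensions come from A
    size_t offset = A.size() - B.size();
    for (size_t i = 0; i + 2 < B.size(); ++i)
        assert(B[i] == out[offset + i]);           // e.g. the 3 in 2x3x4x5 vs 3x5x6

    out.push_back(A[A.size() - 2]);                // rows of A's trailing matrix
    out.push_back(B.back());                       // cols of B's trailing matrix
    return out;
}

int main()
{
    for (int d : matMulOutShape({2, 3, 4}, {4, 5}))       printf("%d ", d);  // 2 3 5
    printf("\n");
    for (int d : matMulOutShape({2, 3, 4, 5}, {3, 5, 6})) printf("%d ", d);  // 2 3 4 6
    printf("\n");
    return 0;
}
```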

Pull Request Readiness Checklist

See details at https://github.com/opencv/opencv/wiki/How_to_contribute#making-a-good-pull-request

  • I agree to contribute to the project under Apache 2 License.
  • To the best of my knowledge, the proposed patch is not based on code under GPL or another license that is incompatible with OpenCV
  • The PR is proposed to the proper branch
  • There is a reference to the original bug report and related work
  • There are accuracy tests, performance tests and test data in the opencv_extra repository, if applicable
    The patch to opencv_extra has the same branch name.
  • The feature is well documented and sample code can be built with the project CMake

@WanliZhong added the category: dnn and category: dnn (onnx) (ONNX support issues in DNN module) labels on Nov 18, 2022
@WanliZhong force-pushed the improve_matmul branch 2 times, most recently from e72deee to 05508ee on November 24, 2022 09:32
@WanliZhong force-pushed the improve_matmul branch 2 times, most recently from c731aaf to 34da3c0 on December 1, 2022 07:23
@WanliZhong requested review from zihaomu and rogday on December 1, 2022 08:21
@WanliZhong marked this pull request as ready for review on December 1, 2022 10:02
@WanliZhong (Member, Author) commented Dec 9, 2022

Because upstream now supports transA and transB, I will fix the conflicts.

@WanliZhong marked this pull request as draft on December 9, 2022 07:48
@WanliZhong marked this pull request as ready for review on December 9, 2022 09:27
@asmorkalov (Contributor) commented:

@zihaomu @rogday Your turn.

@zihaomu (Member) left a comment

Thanks for your contribution. LGTM! 👍

@alalek (Member) commented Dec 14, 2022

Merge branch '4.x' into improve_matmul
fix conflicts

Please use git rebase instead of merge commits to keep the changes clear; GitHub has issues with handling PRs that include merge commits.
The PR should contain a single commit, as stated in the contribution guidelines.

@WanliZhong (Member, Author) commented Dec 14, 2022 via email

@rogday (Member) left a comment

👍

@asmorkalov added this to the 4.7.0 milestone Dec 15, 2022
@asmorkalov merged commit ac6fb17 into opencv:4.x Dec 15, 2022
@@ -921,6 +921,7 @@ TEST_P(Test_ONNX_layers, MatMul_init)
testONNXModels("matmul_4d_init");

testONNXModels("matmul_init_2");
testONNXModels("matmul_init_bcast");
A project member commented on this diff:

There is a failed OpenCL FP16 test:

[ RUN      ] Test_ONNX_layers.MatMul_init/1, where GetParam() = OCV/OCL_FP16
[ INFO:0@189.433] global onnx_importer.cpp:822 populateNet DNN/ONNX: loading ONNX v8 model produced by 'matmul_2d_init'. Number of nodes = 1, initializers = 1, inputs = 2, outputs = 1
[ INFO:0@189.433] global onnx_importer.cpp:724 parseOperatorSet DNN/ONNX: ONNX opset version = 17
[ INFO:0@189.433] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node_output_0!output) from domain='ai.onnx'
[ INFO:0@189.434] global onnx_importer.cpp:822 populateNet DNN/ONNX: loading ONNX v8 model produced by 'matmul_3d_init'. Number of nodes = 1, initializers = 1, inputs = 2, outputs = 1
[ INFO:0@189.434] global onnx_importer.cpp:724 parseOperatorSet DNN/ONNX: ONNX opset version = 17
[ INFO:0@189.434] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node_output_0!output) from domain='ai.onnx'
[ INFO:0@189.434] global onnx_importer.cpp:822 populateNet DNN/ONNX: loading ONNX v8 model produced by 'matmul_4d_init'. Number of nodes = 1, initializers = 1, inputs = 2, outputs = 1
[ INFO:0@189.434] global onnx_importer.cpp:724 parseOperatorSet DNN/ONNX: ONNX opset version = 17
[ INFO:0@189.434] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node_output_0!output) from domain='ai.onnx'
[ INFO:0@189.434] global onnx_importer.cpp:822 populateNet DNN/ONNX: loading ONNX v8 model produced by 'matmul_init_2'. Number of nodes = 2, initializers = 2, inputs = 3, outputs = 1
[ INFO:0@189.434] global onnx_importer.cpp:724 parseOperatorSet DNN/ONNX: ONNX opset version = 17
[ INFO:0@189.434] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node_output_0!outputY) from domain='ai.onnx'
[ INFO:0@189.434] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [Add]:(onnx_node_output_0!output) from domain='ai.onnx'
[ INFO:0@189.435] global onnx_importer.cpp:822 populateNet DNN/ONNX: loading ONNX v8 model produced by 'matmul_init_bcast'. Number of nodes = 1, initializers = 1, inputs = 2, outputs = 1
[ INFO:0@189.435] global onnx_importer.cpp:724 parseOperatorSet DNN/ONNX: ONNX opset version = 17
[ INFO:0@189.435] global onnx_importer.cpp:991 handleNode DNN/ONNX: processing node with 2 inputs and 1 outputs: [MatMul]:(onnx_node_output_0!output) from domain='ai.onnx'
/build/precommit_opencl_linux/4.x/opencv/modules/dnn/test/test_common.impl.hpp:74: Failure
Expected: (normL1) <= (l1), actual: 1.22411 vs 0.004
  |ref| = 6.9979562759399414
/build/precommit_opencl_linux/4.x/opencv/modules/dnn/test/test_common.impl.hpp:77: Failure
Expected: (normInf) <= (lInf), actual: 6.99796 vs 0.02
  |ref| = 6.9979562759399414
[ INFO:0@189.435] global ts.cpp:850 testTearDown Memory_usage (OpenCL): 3960 (base=0  current=0)
[  FAILED  ] Test_ONNX_layers.MatMul_init/1, where GetParam() = OCV/OCL_FP16 (2 ms)
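
For context, the two failed assertions compare the mean absolute error (normL1) and the maximum absolute error (normInf) of the produced output against the reference, as in OpenCV's normAssert check. The sketch below is illustrative rather than the actual code in test_common.impl.hpp, and checkAccuracy is a hypothetical name:

```cpp
#include <opencv2/core.hpp>
#include <opencv2/ts.hpp>   // brings in gtest's EXPECT_LE and cvtest::norm

// Illustrative sketch (hypothetical helper, not the real normAssert): the
// mean absolute error (normL1) and the maximum absolute error (normInf) of
// the output vs. the reference are compared against per-target thresholds.
static void checkAccuracy(const cv::Mat& ref, const cv::Mat& out,
                          double l1, double lInf)
{
    double normL1  = cvtest::norm(ref, out, cv::NORM_L1) / ref.total();
    double normInf = cvtest::norm(ref, out, cv::NORM_INF);
    EXPECT_LE(normL1, l1);    // log above: 1.22411 vs 0.004  -> fails
    EXPECT_LE(normInf, lInf); // log above: 6.99796 vs 0.02   -> fails
}
```

Since normInf equals |ref| in the log, the OpenCL FP16 path appears to return a completely wrong (possibly all-zero) output for the new broadcast case, rather than a small half-precision rounding error.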
