Discrepancy in Parameter Count and FLOPs in Paper #61

Open
npurson opened this issue Nov 7, 2023 · 0 comments

In the paper, on page 8, in Section 4.5, the reported statistics are as follows:

> Specifically, TPVFormer has only 6.0M parameters versus 15.7M for MonoScene, and 128G FLOPS per image versus 500G for MonoScene.

However, these figures appear to be inaccurate: the backbone of MonoScene, EfficientNet-B7, alone contains 66M parameters, as documented in the paper and as verified by our own measurements.

Furthermore, the correct term here is "FLOPs" (floating-point operations) rather than "FLOPS", which stands for "floating-point operations per second".
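
To make the distinction concrete, here is a minimal sketch relating the two quantities: FLOPs is a count of operations, FLOPS is a rate, and dividing one by the other gives a time. The 10 TFLOPS device rate and the helper name `seconds_for` are hypothetical and chosen only for illustration; the 128G figure is the per-image number quoted from the paper.

```python
def seconds_for(flops_count: float, flops_per_second: float) -> float:
    """Time needed to execute `flops_count` operations at a sustained rate."""
    return flops_count / flops_per_second


# 128G FLOPs per image (figure quoted from the paper) on a hypothetical
# device that sustains 10 TFLOPS. The device rate is an assumption.
per_image_flops = 128e9
device_rate = 10e12

t = seconds_for(per_image_flops, device_rate)
print(f"{t * 1e3:.1f} ms per image")
```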

For reference, here is our measurement for MonoScene, obtained with fvcore:

| module                                         | #parameters or shape   | #flops     |
|:-----------------------------------------------|:-----------------------|:-----------|
| model                                          | 0.149G                 | 0.99T      |
|  encoder                                       |  0.133G                |  0.461T    |
|   encoder.encoder.backbone                     |   63.787M              |   45.949G  |
|    encoder.encoder.backbone.conv_stem          |    1.728K              |    0.195G  |
|    encoder.encoder.backbone.bn1                |    0.128K              |    14.445M |
|    encoder.encoder.backbone.blocks             |    62.142M             |    45.053G |
|    encoder.encoder.backbone.conv_head          |    1.638M              |    0.685G  |
|    encoder.encoder.backbone.bn2                |    5.12K               |    2.14M   |
|   encoder.decoder                              |   68.801M              |   0.415T   |
|    encoder.decoder.conv                        |    6.556M              |    3.408G  |
|    encoder.decoder.resizes                     |    77.056K             |    4.328G  |
|    encoder.decoder.upsamples                   |    62.167M             |    0.407T  |
|  decoder                                       |  16.861M               |  0.53T     |
|   decoder.process_l1                           |   28.896K              |   4.474G   |
|    decoder.process_l1.0.blks                   |    13.824K             |    3.624G  |
|    decoder.process_l1.1.btnk                   |    15.072K             |    0.85G   |
|   decoder.process_l2                           |   0.113M               |   2.178G   |
|    decoder.process_l2.0.blks                   |    53.76K              |    1.762G  |
|    decoder.process_l2.1.btnk                   |    58.816K             |    0.416G  |
|   decoder.CP_mega_voxels                       |   15.346M              |   54.459G  |
|    decoder.CP_mega_voxels.mega_context         |    3.539M              |    1.812G  |
|    decoder.CP_mega_voxels.context_prior_logits |    0.526M              |    2.147G  |
|    decoder.CP_mega_voxels.aspp.blks            |    10.62M              |    43.499G |
|    decoder.CP_mega_voxels.resize               |    0.66M               |    2.705G  |
|   decoder.up_13_l2.up_bn                       |   0.885M               |   3.632G   |
|    decoder.up_13_l2.up_bn.0                    |    0.885M              |    3.624G  |
|    decoder.up_13_l2.up_bn.1                    |    0.256K              |    8.389M  |
|   decoder.up_12_l1.up_bn                       |   0.221M               |   7.281G   |
|    decoder.up_12_l1.up_bn.0                    |    0.221M              |    7.248G  |
|    decoder.up_12_l1.up_bn.1                    |    0.128K              |    33.554M |
|   decoder.up_l1_full.up_bn                     |   55.392K              |   14.63G   |
|    decoder.up_l1_full.up_bn.0                  |    55.328K             |    14.496G |
|    decoder.up_l1_full.up_bn.1                  |    64                  |    0.134G  |
|   decoder.ssc_head                             |   0.211M               |   0.443T   |
|    decoder.ssc_head.conv0                      |    27.68K              |    57.982G |
|    decoder.ssc_head.aspp.blks                  |    0.166M              |    0.349T  |
|    decoder.ssc_head.conv_cls                   |    17.3K               |    36.239G |
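
As a quick sanity check on the table, the sketch below parses a few of the values above and confirms that the sub-module totals add up to the top-level `model` row, then compares them with the MonoScene figures quoted from the paper. The helper names (`parse`, `children_sum`) are hypothetical; only the numbers are taken from the table and the quoted sentence.

```python
# Unit suffixes as they appear in the table above.
UNITS = {"K": 1e3, "M": 1e6, "G": 1e9, "T": 1e12}


def parse(value: str) -> float:
    """Convert a string such as '63.787M' or '0.415T' to a plain number."""
    value = value.strip()
    if value[-1] in UNITS:
        return float(value[:-1]) * UNITS[value[-1]]
    return float(value)


# (parameters, flops) copied from the table above.
rows = {
    "model": ("0.149G", "0.99T"),
    "encoder": ("0.133G", "0.461T"),
    "encoder.encoder.backbone": ("63.787M", "45.949G"),
    "encoder.decoder": ("68.801M", "0.415T"),
    "decoder": ("16.861M", "0.53T"),
}


def children_sum(names, col):
    """Sum one column (0 = parameters, 1 = flops) over the given rows."""
    return sum(parse(rows[name][col]) for name in names)


enc_params = children_sum(["encoder.encoder.backbone", "encoder.decoder"], 0)
total_params = enc_params + parse(rows["decoder"][0])
total_flops = children_sum(["encoder", "decoder"], 1)

# MonoScene figures as quoted from the paper.
paper_params, paper_flops = 15.7e6, 500e9

print(f"measured params: {total_params / 1e6:.3f}M "
      f"({total_params / paper_params:.1f}x the paper's figure)")
print(f"measured flops:  {total_flops / 1e12:.3f}T "
      f"({total_flops / paper_flops:.1f}x the paper's figure)")
```

The small differences from the `model` row (0.149G, 0.99T) come from rounding in the displayed table values.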