Clarification of the BPD results on ImageNet32/ImageNet64 #7

Open
zhengkw18 opened this issue Nov 29, 2023 · 2 comments

Comments

@zhengkw18

Congratulations on your good work! I think DenseFlow is the SOTA among normalizing flows, but I would like to make some clarifications regarding its comparison with other methods (such as diffusion models).

I was comparing DenseFlow against VDM on ImageNet64x64.

DenseFlow: 3.35 BPD, 130M parameters, 1 V100 for ~2 weeks
VDM: 3.4 BPD, ?M parameters, 128 TPUv3 for ? weeks

It looks as though DenseFlow achieves a better BPD with roughly 100x less compute.

I think the reason DenseFlow achieves such a good BPD on ImageNet32/ImageNet64 at a distinctly lower computational cost is that the wrong version of downsampled ImageNet was used. I recently uploaded the code for our ICML 2023 paper, Improved Techniques for Maximum Likelihood Estimation for Diffusion ODEs (https://github.com/thu-ml/i-DODE), where this issue is highlighted as follows:

There are two different versions of ImageNet32 dataset. For fair comparisons, we use both versions of ImageNet32, one is downloaded from https://image-net.org/data/downsample/Imagenet32_train.zip, following Flow Matching [3], and the other is downloaded from http://image-net.org/small/train_32x32.tar (old version, no longer available), following ScoreSDE and VDM. The former dataset applies anti-aliasing and is easier for maximum likelihood training.

Clearly, DenseFlow uses the new version of ImageNet32/64 (https://github.com/matejgrcic/DenseFlow/blob/473220a9c02b262b481fbaa50a947e40bad3f99c/denseflow/data/datasets/image/imagenet32.py), which favors a lower BPD. I therefore suggest that the authors clarify this and remove the BPD result from the ranking list (https://paperswithcode.com/paper/densely-connected-normalizing-flows), where the other methods use the old version of ImageNet, which makes the comparison unfair and confusing.

@zhengkw18
Author

zhengkw18 commented Nov 29, 2023

We ran experiments on both versions of ImageNet32 and found that the new version typically yields a BPD about 0.3 lower than the old version: 3.43 (new version, batch size 128, A40 GPU) vs. 3.69 (old version, batch size 512, A100 GPU), a gap of 0.26. So the dataset difference is rather notable.
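
For reference, here is a minimal sketch of what a gap of this size means in absolute terms. It is not taken from this thread or from either codebase; it only assumes the standard definition BPD = NLL in nats / (D · ln 2), with D = 32 · 32 · 3 for ImageNet32. The helper names `nats_to_bpd` and `bpd_to_nats` are hypothetical.

```python
import math

def nats_to_bpd(nll_nats: float, num_dims: int) -> float:
    """Convert a per-image negative log-likelihood (nats) to bits per dimension."""
    return nll_nats / (num_dims * math.log(2))

def bpd_to_nats(bpd: float, num_dims: int) -> float:
    """Inverse of nats_to_bpd: per-image negative log-likelihood in nats."""
    return bpd * num_dims * math.log(2)

# Assumption: ImageNet32 images are 32x32 RGB, so D = 3072 dimensions.
NUM_DIMS_32 = 32 * 32 * 3

bpd_new = 3.43  # new (anti-aliased) ImageNet32, value quoted in this thread
bpd_old = 3.69  # old ImageNet32, value quoted in this thread

gap_bpd = bpd_old - bpd_new
gap_nats = bpd_to_nats(gap_bpd, NUM_DIMS_32)

print(f"BPD gap: {gap_bpd:.2f}")
print(f"Per-image NLL gap: {gap_nats:.1f} nats")
```

Seen this way, a BPD difference that looks small in a leaderboard table corresponds to several hundred nats per image, which is why results on the two dataset versions should not be mixed in one ranking.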

It seems that Efficient-VDVAE on https://paperswithcode.com/sota/image-generation-on-imagenet-64x64 also uses the wrong version of ImageNet, which likewise leads to an unfair comparison.

Under fair comparison, VDM is still the current SOTA likelihood model on CIFAR10/ImageNet32/ImageNet64.

@matejgrcic
Owner

matejgrcic commented Dec 17, 2023

Hi, thanks for pointing out the mismatch between the two versions of IN32. As far as I know, this is mostly unknown in the community, and the fact that the old version is no longer available doesn't help. I will update the README to make it clearer that we trained on the new version of IN32. Cheers!
