Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Clarify interpolation algorithms for resample2d #816

Merged
merged 9 commits into from
Feb 13, 2025

Conversation

inexorabletash
Copy link
Member

@inexorabletash inexorabletash commented Feb 12, 2025

This gives formal definitions for the nearest-neighbor and linear interpolation modes. The definitions are based on text given by @fdwr and baseline implementation by @BruceDai and independently verified.

Resolves #358


Preview | Diff

Copy link
Collaborator

@fdwr fdwr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice. Thanks for improving the spec. 🙏

@@ -7245,6 +7284,33 @@ partial dictionary MLOpSupportLimits {
1. Return |output|.
</details>


<div class="note">
The specific sampling algorithms are based on those widely used in existing Machine Learning frameworks. For example, when performing {{MLInterpolationMode/linear}} resampling from the following *[4, 4]* input tensor (considering only spatial dimensions):
Copy link
Collaborator

@fdwr fdwr Feb 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The specific sampling algorithms are based on those widely used in existing Machine Learning frameworks.

Note

Some ML libraries got this wrong historically and did things like stretch the centers of the input corner pixels to the centers of the output corner pixels (rather than the corner extents, including the whole pixel box rectangle rather than just a point sample), which graphics experts know is incorrect 😉 and gives you poor results. Imaging libraries like OpenCV though do the right thing, and thankfully newer versions of TF and PyTorch have fixed this behavior by default. e.g. #1 #2.

Wrong:
image
Right:
image

(no action - resolve me)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to say more in the spec we can!

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated comment with visualization - think it would help? I should probably recreate it from scratch to avoid directly reusing Jacob Richeimer's figure https://jricheimer.github.io/tensorflow/2019/02/11/resize-confusion/.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking it will be hard to capture more of the history here without it turning into an essay equivalent to these linked resources. Maybe we should just link to these blog posts? @anssiko - any thoughts on non-normative links to potentially ephemeral resources?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to stall on this aspect. Happy to merge if you say go.

Copy link
Contributor

@huningxin huningxin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with nits, thanks much!

Copy link
Collaborator

@fdwr fdwr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

inexorabletash and others added 9 commits February 13, 2025 15:20
This gives formal definitions for the `nearest-neighbor` and `linear`
interpolation modes. The definitions are based on text given by @fdwr
and baseline implementation by @BruceDai and independently verified.

Resolves webmachinelearning#358
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Dwayne Robinson <dwayner@microsoft.com>
Co-authored-by: Ningxin Hu <ningxin.hu@intel.com>
Co-authored-by: Ningxin Hu <ningxin.hu@intel.com>
Co-authored-by: Ningxin Hu <ningxin.hu@intel.com>
@inexorabletash
Copy link
Member Author

If it still looks good @fdwr can you squash-merge ?

@fdwr fdwr merged commit 4c34b9e into webmachinelearning:main Feb 13, 2025
2 checks passed
github-actions bot added a commit that referenced this pull request Feb 13, 2025
SHA: 4c34b9e
Reason: push, by fdwr

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@huningxin
Copy link
Contributor

👍

@inexorabletash inexorabletash deleted the resample-algos branch February 14, 2025 03:05
# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Please clarify interpolation algorithm for resample2d
3 participants