make image similarity check less sensitive (#258)
Summary:
Pull Request resolved: #258

as part of my personal side quest to make augly's tests pass again, i am making a change to our tests.

currently, to assess image similarity, we use the `np.allclose` function. while that's less sensitive than an MD5 hash, it's not much better, because visually imperceptible changes can still produce large differences in values between the numpy image arrays.
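
to make that concrete, here is a minimal sketch of why a tolerance check trips on tiny pixel changes. it mirrors the element-wise rule that `np.allclose` documents (`|a - b| <= atol + rtol * |b|`, with defaults `rtol=1e-05` and `atol=1e-08`) in pure python so it has no dependencies. the helper name and the pixel values are made up for illustration and are not part of this change.

```python
# Pure-Python mirror of the element-wise rule documented for np.allclose:
#   |a - b| <= atol + rtol * |b|   (defaults: rtol=1e-05, atol=1e-08)
# Assumption: flat, equal-length sequences of pixel intensities.
# The helper and the pixel values are hypothetical, for illustration only.


def allclose(a, b, rtol=1e-05, atol=1e-08):
    """Return True if every pair of values is within the tolerance."""
    return len(a) == len(b) and all(
        abs(x - y) <= atol + rtol * abs(y) for x, y in zip(a, b)
    )


original = [120, 121, 119, 200, 201, 199]

# Same pixels, each one intensity level brighter (out of 255).
shifted = [p + 1 for p in original]

same = allclose(original, original)      # identical arrays pass
off_by_one = allclose(original, shifted)  # a one-level shift fails
```

with the default tolerances, the allowed difference for a pixel value near 120 is roughly 0.0012, so even a one-level shift that nobody could see fails the check.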

thus, to make augly's tests less affected by minor version updates to PIL or other dependencies, we are switching to imagehash.

we're specifically using the phash - you can read about it here: https://www.hackerfactor.com/blog/index.php?/archives/432-Looks-Like-It.html
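
as a rough sketch of what the new comparison does: subtracting two imagehash hashes gives the number of bits that differ between them (the Hamming distance), and the check in this change accepts images whose hashes are fewer than 2 bits apart. the snippet below imitates that on plain integers so it runs without imagehash. the function names and hash values are hypothetical, and it assumes a 64-bit hash, which is what `phash` produces with its default hash size.

```python
# Hypothetical sketch: compare two 64-bit perceptual hashes by Hamming
# distance (the number of bit positions where they differ). Plain integers
# stand in for imagehash's hash objects; the values are made up.


def hamming_distance(hash_a: int, hash_b: int) -> int:
    """Count the bits that differ between two integer hashes."""
    return bin(hash_a ^ hash_b).count("1")


def hashes_match(hash_a: int, hash_b: int, max_distance: int = 2) -> bool:
    """Strict '<' comparison, the same shape as the check in this change."""
    return hamming_distance(hash_a, hash_b) < max_distance


base = 0xF0F0F0F0F0F0F0F0       # made-up hash of a reference image
one_bit_off = base ^ 0b1        # differs from base in 1 bit
two_bits_off = base ^ 0b11      # differs from base in 2 bits

match_identical = hashes_match(base, base)
match_one_bit = hashes_match(base, one_bit_off)
match_two_bits = hashes_match(base, two_bits_off)
```

because the comparison is strict, hashes may differ in at most one bit. a two-bit difference already counts as a mismatch.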

phash isn't a perfect fit long term, though. it's not sensitive to color, scaling, or aspect ratio changes. to deal with the latter two, i'm keeping the size equality check. for color, i want to do some more research on an efficient way to check it. nonetheless, this is still better than what we have right now.

Differential Revision: D70137163
jbitton authored and facebook-github-bot committed Feb 25, 2025
1 parent c245017 commit 7778f95
Showing 2 changed files with 5 additions and 3 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/test_python.yml
@@ -14,7 +14,7 @@ jobs:
  python-version: '3.9'
  - run: sudo apt-get update
  - run: sudo apt-get install --fix-missing ffmpeg python3-soundfile
- - run: pip install pyre-check pytest
+ - run: pip install pyre-check pytest imagehash
  - run: pip install -e .[all]
  - run: pyre --source-directory "." --noninteractive check || true
  - run: pytest --durations=10 .
6 changes: 4 additions & 2 deletions augly/tests/image_tests/base_unit_test.py
@@ -10,14 +10,16 @@
  import unittest
  from typing import Any, Callable, Dict, List, Optional

- import numpy as np
+ import imagehash
  from augly.tests import ImageAugConfig
  from augly.utils import pathmgr, TEST_URI
  from PIL import Image


  def are_equal_images(a: Image.Image, b: Image.Image) -> bool:
-     return a.size == b.size and np.allclose(np.array(a), np.array(b))
+     a_hash = imagehash.phash(a)
+     b_hash = imagehash.phash(b)
+     return a.size == b.size and a_hash - b_hash < 2


  def are_equal_metadata(