
Added the ndcg metric [WIP] #2632


Draft: wants to merge 38 commits into master

Conversation

@kamalojasv181 (Contributor) commented Jul 23, 2022

Related #2631

Description: This is the [WIP] implementation of the NDCG metric.

Check list:

  • Create a basic implementation.
  • Handle rating ties.
  • Write and parameterize tests.
  • Add comments about its use.
  • Add the exponential implementation.
  • Add docs.
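
For context, a minimal reference computation of NDCG under the standard linear-gain definition (a hedged sketch; the exact gain/discount variants implemented in this PR, e.g. the exponential form and the tie handling from the checklist, may differ):

import numpy as np

def dcg(relevances):
    # DCG = sum_i rel_i / log2(i + 2), where i is the 0-based rank position
    discounts = np.log2(np.arange(len(relevances)) + 2)
    return (np.asarray(relevances, dtype=float) / discounts).sum()

# true relevances of the items, in the order ranked by the model
ranked = [3, 2, 3, 0, 1]
ideal = sorted(ranked, reverse=True)  # the best possible ordering
ndcg = dcg(ranked) / dcg(ideal)       # NDCG in [0, 1]; here ~0.97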

@github-actions github-actions bot added the module: metrics Metrics module label Jul 23, 2022
@kamalojasv181 kamalojasv181 marked this pull request as draft July 23, 2022 16:45
@vfdev-5 (Collaborator) left a comment:

@kamalojasv181 thanks for the PR. I left a few comments on a few points, but I haven't yet explored the code that computes NDCG.

Can you please write a few tests against scikit-learn as a reference?
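
Something along these lines, for instance (a hedged sketch: the NDCG class name, its import path, and the (y_pred, y_true) update format are assumptions based on the rest of this thread, not the final API):

import pytest
import torch
from sklearn.metrics import ndcg_score

def test_output_vs_sklearn():
    y_true = torch.rand(8, 10)
    y_pred = torch.rand(8, 10)

    ndcg = NDCG()  # class under development in this PR; import path not final
    ndcg.update((y_pred, y_true))
    res = ndcg.compute()

    # scikit-learn expects (y_true, y_score)
    assert pytest.approx(res) == ndcg_score(y_true.numpy(), y_pred.numpy())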

@vfdev-5 (Collaborator) left a comment:

Thanks a lot for the update @kamalojasv181!
I left a few other comments on the implementation and the API.

Let's also start working on docs and tests.

@kamalojasv181 (Contributor, Author)

Thanks for all the feedback. I will revert with a pull request addressing everything!

@vfdev-5 (Collaborator) commented Jul 24, 2022

Thanks for all the feedback. I will revert with a pull request addressing everything!

You can just continue working with this pull request, no need to revert anything.

@vfdev-5 (Collaborator) left a comment:

Thanks for the updates @kamalojasv181!
I have a few other code update suggestions.

@kamalojasv181 (Contributor, Author)

@vfdev-5 there are a bunch of things I did in this commit:

  1. Put the data on the GPU for tests.
  2. Replaced torch.add with +.
  3. Changed the naming and argument order to follow ignite conventions, i.e. (y_pred, y_true, ...).
  4. Added a function to handle ties and wrote corresponding tests (a part of it is un-vectorized; TODO: think of an implementation to vectorize it).
  5. Added a check that log_base is > 0 and not equal to 1 (see the sketch after this list).
  6. Added tests for log_base and exponential.
  7. Separated _ndcg_sample_score and _dcg_sample_score from the NDCG class.
    NOTE: the log_base tests could have been done with the other tests, but I put them separately so that if sklearn changes their code, we know about it (as the tests would fail).

If there is anything else, let me know before I finally add some comments against the class and documentation.
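
For reference, a minimal sketch of the check described in item 5 (the exact message and placement are assumptions, not the PR's verbatim code):

# hypothetical sketch of the log_base validation
if log_base <= 0 or log_base == 1:
    raise ValueError(f"Argument log_base should be positive and not equal to one, but got {log_base}")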

Comment on lines +52 to +54
discounted_gains = torch.tensor(
    [_tie_averaged_dcg(y_p, y_t, discount_cumsum, device) for y_p, y_t in zip(y_pred, y_true)], device=device
)
Collaborator:

So, there is no way to make it vectorized, i.e. without a for-loop?

@kamalojasv181 (Contributor, Author):

I haven't checked yet. For now I have added this implementation; it's a TODO.
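
For reference, a per-sample helper in torch ops that mirrors scikit-learn's _tie_averaged_dcg could look like the sketch below (an assumption, not necessarily this PR's code: tied predictions share the average discount of the rank positions they occupy, which removes the inner Python loop over items, although the outer per-sample loop above remains):

import torch

def _tie_averaged_dcg(y_pred: torch.Tensor, y_true: torch.Tensor, discount_cumsum: torch.Tensor, device) -> torch.Tensor:
    # group equal predictions; inv maps each item to its tie group, ordered by descending score
    _, inv, counts = torch.unique(-y_pred, return_inverse=True, return_counts=True)
    # average true relevance within each tie group (inputs assumed already on `device`)
    ranked = torch.zeros(counts.shape[0], dtype=torch.float64, device=device)
    ranked.scatter_add_(0, inv, y_true.double())
    ranked /= counts
    # last rank position occupied by each group -> total discount assigned to the group
    groups = torch.cumsum(counts, dim=0) - 1
    discount_sums = torch.empty_like(ranked)
    discount_sums[0] = discount_cumsum[groups[0]]
    discount_sums[1:] = torch.diff(discount_cumsum[groups])
    return (ranked * discount_sums).sum()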

@puhuk (Contributor) commented Aug 30, 2022

@vfdev-5 @kamalojasv181

Below is a checklist for tests in a DDP configuration:

  • Generate data with a different random seed per process
    • torch.manual_seed(12 + rank + i)
  • The generated data size per rank should be n_iters * batch_size, so that update() consumes one batch_size chunk per iteration
    • y_true = torch.randint(0, n_classes, size=(n_iters * batch_size,))
  • Feed the data batch-by-batch in update()
    • y_pred[i * batch_size : (i+1) * batch_size]
  • Check that the data generated on each process is gathered for the computation (after running the Engine)
    •    engine.run(data=data, max_epochs=n_epochs)
         y_true = idist.all_gather(y_true)
         y_preds = idist.all_gather(y_preds)
  • Check that the computed metric matches the reference
    • pytest.approx(calculated_value) == reference_value

The whole process should look something like the following (or you can refer to ignite/tests/ignite/metrics/test_accuracy.py):

def _test_distrib_integration_list_of_tensors_or_numbers(device):
    rank = idist.get_rank()
    metric_device = idist.device()
    n_iters, batch_size, n_classes, n_epochs = 80, 16, 10, 1

    # data generation: a different seed per process
    torch.manual_seed(12 + rank)
    y_true = torch.randint(0, n_classes, size=(n_iters * batch_size,)).to(device)
    y_preds = torch.rand(n_iters * batch_size, n_classes).to(device)

    # feed one batch per iteration
    def update(engine, i):
        return (
            y_preds[i * batch_size : (i + 1) * batch_size, :],
            y_true[i * batch_size : (i + 1) * batch_size],
        )

    # initialize the Engine and attach the metric
    engine = Engine(update)
    acc = Accuracy(device=metric_device)
    acc.attach(engine, "acc")

    data = list(range(n_iters))
    engine.run(data=data, max_epochs=n_epochs)

    # gather the data from all processes
    y_true = idist.all_gather(y_true)
    y_preds = idist.all_gather(y_preds)

    res = engine.state.metrics["acc"]

    # calculate the reference value with scikit-learn and compare
    true_res = sklearn.metrics.accuracy_score(y_true.cpu().numpy(), torch.argmax(y_preds, dim=1).cpu().numpy())
    assert pytest.approx(res) == true_res

Comment on lines 163 to 166
return (
    [v for v in y_preds[i * batch_size : (i + 1) * batch_size, ...]],
    [v for v in y_true[i * batch_size : (i + 1) * batch_size]],
)
@vfdev-5 (Collaborator) commented Aug 31, 2022:

@kamalojasv181 Why do you return a tuple of two lists instead of a tuple of two tensors?

@kamalojasv181 (Contributor, Author):

I took inspiration from the code provided by @puhuk. Here each element of the list is a batch; we feed our engine one batch at a time, hence using a list is also OK. To maintain uniformity across the code, I have kept it this way.

Collaborator:

Oh, I see, he provided a wrong link. Yes, in accuracy we also have a test on lists of tensors and numbers, but this is atypical. Here is a typical example:

def _test_distrib_integration_multilabel(device):
    rank = idist.get_rank()

    def _test(n_epochs, metric_device):
        metric_device = torch.device(metric_device)
        n_iters = 80
        batch_size = 16
        n_classes = 10
        torch.manual_seed(12 + rank)
        y_true = torch.randint(0, 2, size=(n_iters * batch_size, n_classes, 8, 10)).to(device)
        y_preds = torch.randint(0, 2, size=(n_iters * batch_size, n_classes, 8, 10)).to(device)

        def update(engine, i):
            return (
                y_preds[i * batch_size : (i + 1) * batch_size, ...],
                y_true[i * batch_size : (i + 1) * batch_size, ...],
            )

        engine = Engine(update)
        acc = Accuracy(is_multilabel=True, device=metric_device)
        acc.attach(engine, "acc")
        data = list(range(n_iters))
        engine.run(data=data, max_epochs=n_epochs)

        y_true = idist.all_gather(y_true)
        y_preds = idist.all_gather(y_preds)

        assert (
            acc._num_correct.device == metric_device
        ), f"{type(acc._num_correct.device)}:{acc._num_correct.device} vs {type(metric_device)}:{metric_device}"

        assert "acc" in engine.state.metrics
        res = engine.state.metrics["acc"]
        if isinstance(res, torch.Tensor):
            res = res.cpu().numpy()

        true_res = accuracy_score(to_numpy_multilabel(y_true), to_numpy_multilabel(y_preds))
        assert pytest.approx(res) == true_res

    metric_devices = ["cpu"]
    if device.type != "xla":
        metric_devices.append(idist.device())
    for metric_device in metric_devices:
        for _ in range(2):
            _test(n_epochs=1, metric_device=metric_device)
            _test(n_epochs=2, metric_device=metric_device)

@vfdev-5 (Collaborator) commented Aug 31, 2022

@sadra-barikbin can you check why tests/ignite/metrics/test_precision.py::test_binary_input[None] is failing systematically?

E       assert array([1.]) == approx([1.0 ± 1.0e-06, 0.0 ± 1.0e-12])
E         Impossible to compare arrays with different shapes.
E         Shapes: (2,) and (1,)

tests/ignite/metrics/test_precision.py:130: AssertionError

@puhuk (Contributor) commented Dec 7, 2022

@kamalojasv181 Hi, do you need any help to finalize this PR? Please feel free to let me and @vfdev-5 know :)

@ili0820 commented Aug 27, 2023

Any updates on this? If it is not finished yet, I'd love to contribute @vfdev-5

@vfdev-5 (Collaborator) commented Aug 27, 2023

Yes, this PR is not finished, unfortunately. @ili0820, if you can help get it landed, that would be great!

@ili0820 commented Aug 30, 2023

After reviewing all the comments, as far as I understand the testing part is the last and only problem that has not been resolved yet: this seems resolved and this is the problem. Is this correct? @vfdev-5

@vfdev-5 (Collaborator) commented Aug 30, 2023

After reviewing all the comments, as far as I understand the testing part is the last and only problem that has not been resolved yet: this seems resolved and this is the problem. Is this correct? @vfdev-5

@ili0820 yes, a few things remain here:

In case you are not familiar with DDP, please check https://pytorch-ignite.ai/tutorials/advanced/01-collective-communication/. As for testing best practices, we would now like to use the distributed fixture, as here:

def test_distrib_integration(distributed, metric_device):

If you have any questions, you can reach out to us on Discord in the #start-contributing channel.

@ili0820 ili0820 mentioned this pull request Aug 31, 2023
@exitflynn exitflynn mentioned this pull request Mar 19, 2025