
Failure to reproduce paper's recall on IVF Product quantization #1045

Closed

mayya-sharipova opened this issue Nov 27, 2019 · 6 comments

@mayya-sharipova
I am trying to reproduce results from the paper “Product quantization for nearest neighbor search”, Jégou et al., PAMI 2011.
The paper reports that IVFADC on the SIFT dataset (1 million vectors, 128 dimensions) with k'=1024, m=8, w=8 produces a recall of around 0.9 (Fig. 7).

Running ann-benchmarks on the SIFT dataset with nprobe=8:

ncentroids = 1024  # number of coarse centroids (k' in the paper)
m = 8              # number of PQ sub-quantizers
code_size = 8      # passed as the nbits argument of IndexIVFPQ (bits per sub-quantizer)
self.coarse_quantizer = faiss.IndexFlatL2(X.shape[1])
index = faiss.IndexIVFPQ(self.coarse_quantizer, X.shape[1], ncentroids, m, code_size)

produces a recall of 0.372:
FaissIVFPQ(n_list=1024, n_probe=8) 0.372 4493.587

What would be the way to achieve recall as reported by the paper?
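For reference, the exact ground truth that both evaluations rely on can be computed by brute-force L2 search. A minimal numpy sketch with synthetic toy data (the sizes below are placeholders, not the SIFT1M shapes):

```python
import numpy as np

rng = np.random.default_rng(0)
nb, nq, d = 1000, 10, 16  # toy sizes; SIFT1M uses nb=1_000_000, d=128
xb = rng.standard_normal((nb, d)).astype("float32")  # base vectors
xq = rng.standard_normal((nq, d)).astype("float32")  # query vectors

# squared L2 distance between every query and every base vector
d2 = ((xq[:, None, :] - xb[None, :, :]) ** 2).sum(-1)

# top-100 exact nearest neighbors per query, nearest first
gt = np.argsort(d2, axis=1)[:, :100]
```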

@mdouze
Contributor

mdouze commented Nov 29, 2019

Here is code that reproduces the result:
repro_IVF_paper.ipynb
It gives 87.3% recall. There is probably an error in your eval code.

@mayya-sharipova
Author

Thanks so much for your reply. I will investigate further.

@mayya-sharipova
Author

mayya-sharipova commented Dec 1, 2019

@mdouze I am not clear on the way you calculate recall:
recall_at_100 = (I[:, :100] == gt[:, :1]).sum() / float(xq.shape[0])

For every query, we have a ground truth of 100 expected documents to find. So we need to compare these 100 expected docs with the 100 documents found during the search, count how many documents they have in common (the intersection), and that is the recall for this query. We do the same for all queries to get the average recall over the 10K queries.

But in your calculation, you only check whether the 1st expected document is present in the 100 documents found during the search. That is not the traditional definition of recall. Do you have references for where this definition of recall is used?

So, using your code in repro_IVF_paper.ipynb and calculating the recall in the traditional way:

total_matches = 0
for q in range(xq.shape[0]):
    # intersection of the retrieved IDs with the 100 ground-truth IDs
    matches = set(I[q]) & set(gt[q])
    total_matches += len(matches)
recall_at_100 = (total_matches / float(xq.shape[0])) / 100
print("recall@100: %.3f" % recall_at_100)

I am getting the result of:
recall@100: 0.436 or 43.6%
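For clarity, the intersection-based recall computed above can be illustrated on tiny hypothetical arrays (the IDs below are made up):

```python
import numpy as np

# toy example: 2 queries, top-5 retrieved IDs vs. 5 ground-truth IDs each
I = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10]])
gt = np.array([[1, 2, 30, 40, 50],
               [6, 70, 80, 90, 100]])

# per-query intersection sizes (2 and 1), averaged and normalized by depth
total_matches = sum(len(set(I[q]) & set(gt[q])) for q in range(I.shape[0]))
intersection_recall = total_matches / float(I.shape[0]) / gt.shape[1]
print(intersection_recall)  # (2 + 1) / 2 / 5 = 0.3
```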

@mdouze
Contributor

mdouze commented Dec 2, 2019

Check the definition of recall@100 in the paper:

The search quality is measured with recall@R, i.e., the
proportion of query vectors for which the nearest neighbor
is ranked in the first R positions. This measure indicates
the fraction of queries for which the nearest neighbor is
retrieved correctly, if a short-list of R vectors is verified
using Euclidean distances. Furthermore, the curve obtained by
varying R corresponds to the distribution function of the ranks,
and the point R=1 corresponds to the “precision” measure used
in [9] to evaluate ANN methods.
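Under that definition, recall@R reduces to checking whether the single true nearest neighbor appears in the top R results. A toy sketch with made-up IDs:

```python
import numpy as np

nq, R = 4, 5
# toy ground truth: gt[:, 0] is the true nearest neighbor of each query
gt = np.array([[10], [20], [30], [40]])
# toy search results: the true NN is retrieved for queries 0, 1, 2 but not 3
I = np.array([
    [10, 1, 2, 3, 4],
    [5, 20, 6, 7, 8],
    [9, 11, 12, 30, 13],
    [14, 15, 16, 17, 18],
])

# paper's recall@R: fraction of queries whose true NN is among the first R results
recall_at_R = (I[:, :R] == gt[:, :1]).sum() / float(nq)
print(recall_at_R)  # 0.75
```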

@mayya-sharipova
Author

@mdouze Thank you so much for the clarification, I had missed this.
This definition of recall looks very relaxed, and I am not sure how useful it really is in practice.

@mdouze
Contributor

mdouze commented Jan 6, 2020

no activity, closing.

@mdouze mdouze closed this as completed Jan 6, 2020