Fix ScalarQuantizer to use full bucket range #4074

ddrcoder · 2024-12-07T01:27:51Z

Summary:
Scalar quantizers have an off-by-one bug. Even when the quantizer is trained to cover the full range of the training data, it emits the maximum quantized value only for inputs exactly equal to the trained maximum, instead of treating the trained maximum as the upper edge of the last bucket. As a result, a 4-bit quantizer often uses only 15 of its 16 possible values, effectively log2(15) ≈ 3.9 bits.

Existing tests show significant movements for 4-bit quantization results; +9% recall in one of the unit tests.

The fix is to use `2^n - eps` everywhere `2^n - 1` is used.
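To make the off-by-one concrete, here is a minimal standalone sketch in Python. It is not the faiss implementation: it assumes the encoder clips a normalized value `x` to [0, 1] and then truncates `x * scale`, and the helper `quantize` and the value of `eps` are illustrative choices only.

```python
import math
from collections import Counter

def quantize(x, scale):
    """Clip x to [0, 1], then truncate x * scale to an integer code (assumed model)."""
    x = min(max(x, 0.0), 1.0)
    return int(x * scale)

nbits = 4
levels = 1 << nbits           # 16 possible codes for 4 bits
eps = 1e-3                    # illustrative value, not taken from the diff

# Evenly spaced samples covering [0, 1], including both endpoints.
samples = [i / 9999 for i in range(10000)]

# Old scaling: only x == 1.0 reaches the top code.
old_counts = Counter(quantize(x, levels - 1) for x in samples)

# Proposed scaling: the top code gets a bucket as wide as the others.
new_counts = Counter(quantize(x, levels - eps) for x in samples)

# Information capacity when only 15 of 16 codes are effectively used.
effective_bits = math.log2(levels - 1)

print("old count for top code:", old_counts[levels - 1])
print("new count for top code:", new_counts[levels - 1])
print("effective bits with 15 codes:", round(effective_bits, 3))
```

With the old scaling, the top code is hit by a single sample (the exact maximum), while the other 15 codes split the rest of the range. With `2^n - eps`, all 16 codes receive roughly equal shares.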

This would break existing stored indices, though, so a backwards compatibility fix is included; ranges are adjusted at deserialization time to simulate old behavior.
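As a rough illustration of how a range adjustment can reproduce the old codes, consider the sketch below. It is an assumption about the mechanism, not the code in this diff: it models the encoder as `int(clip((v - vmin) / vdiff) * scale)` and rescales a stored `vdiff` by `(2^n - eps) / (2^n - 1)`. The names `encode` and `adjusted_vdiff` are hypothetical, and exact rational arithmetic is used so the comparison is not disturbed by floating-point rounding.

```python
from fractions import Fraction

def encode(v, vmin, vdiff, scale):
    """Assumed encoder model: normalize, clip to [0, 1], truncate x * scale."""
    x = (v - vmin) / vdiff
    x = min(max(x, 0), 1)
    return int(x * scale)

nbits = 4
levels = 1 << nbits
eps = Fraction(1, 1000)                  # illustrative value

# Range as it might have been stored by the old code (hypothetical numbers).
vmin, vdiff = Fraction(-2), Fraction(5)

# Stretch the range so that the new scale maps every value to its old code.
adjusted_vdiff = vdiff * (levels - eps) / (levels - 1)

# Values below, inside, and above the trained range.
values = [Fraction(i, 100) for i in range(-400, 500)]

old_codes = [encode(v, vmin, vdiff, levels - 1) for v in values]
new_codes = [encode(v, vmin, adjusted_vdiff, levels - eps) for v in values]

print("codes identical:", old_codes == new_codes)
```

Under this model the products `x * scale` agree wherever neither side clips, and values above the old maximum still truncate to the top code, so the old and adjusted encoders agree on every input.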

Differential Revision: D66909688

Summary:
`TestScalarQuantizer.test_4variants_ivf` shouldn't mix IVF probing misses in with its evaluation. As is, it probes 4/64 centroids, so FP16 has only 73% recall. This fixes it to *still exercise residual encoding* and the resulting distributions, but exhaustively scan the index.

Differential Revision: D66909687


facebook-github-bot · 2024-12-07T01:28:13Z

This pull request was exported from Phabricator. Differential Revision: D66909688

alexanderguzhva · 2024-12-11T00:52:09Z

@ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

mdouze · 2024-12-16T10:26:55Z

> @ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

I agree that the current bucket allocation is suboptimal.

The bucket training is parameterized by `RangeStat`, defined here:
https://github.com/facebookresearch/faiss/blob/main/faiss/impl/ScalarQuantizer.h#L54

Could you define a new `RangeStat` entry rather than a new `QuantizationType`?

ddrcoder · 2024-12-18T18:07:19Z

> @ddrcoder this causes multiple backward compatibility problems. Add a new quantization type?

That's addressed; the ranges are adapted to produce the same values as before. I do still need to prove that with a unit test, though.

mdouze · 2025-01-08T11:16:14Z

Please address my comment above.
Only the training code should need to be adapted; there is no reason to change the file format or the distance computation code.
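One possible reading of a training-only change is sketched below in Python. It is a hypothetical illustration, not faiss code: it assumes the encoder keeps truncating `x * (2^n - 1)` after clipping and that the decoder reconstructs `(code + 0.5) / (2^n - 1)`. The function `train_full_range` stands in for what a new `RangeStat` entry might compute.

```python
from collections import Counter

def encode(v, vmin, vdiff, nbits):
    """Unchanged encoder model (assumed): clip, then truncate x * (2^n - 1)."""
    scale = (1 << nbits) - 1
    x = min(max((v - vmin) / vdiff, 0.0), 1.0)
    return int(x * scale)

def decode(code, vmin, vdiff, nbits):
    """Unchanged decoder model (assumed): reconstruct at code + 0.5."""
    scale = (1 << nbits) - 1
    return vmin + (code + 0.5) / scale * vdiff

def train_minmax(data):
    """Existing-style training: the range spans exactly [min, max]."""
    lo, hi = min(data), max(data)
    return lo, hi - lo

def train_full_range(data, nbits):
    """Hypothetical training rule: shrink vdiff so clipping fills the top bucket."""
    lo, hi = min(data), max(data)
    levels = 1 << nbits
    return lo, (hi - lo) * (levels - 1) / levels

nbits = 4
data = [i / 9999 for i in range(10000)]

vmin_a, vdiff_a = train_minmax(data)
vmin_b, vdiff_b = train_full_range(data, nbits)

counts_minmax = Counter(encode(v, vmin_a, vdiff_a, nbits) for v in data)
counts_full = Counter(encode(v, vmin_b, vdiff_b, nbits) for v in data)
top_center = decode(15, vmin_b, vdiff_b, nbits)

print("top-code count, min/max training:", counts_minmax[15])
print("top-code count, shrunk range:", counts_full[15])
print("reconstruction for top code:", top_center)
```

Under these assumptions, shrinking the trained range by a factor of 15/16 makes every value in the top sixteenth of the data clip to the top code, and that code decodes to the center of its bucket, with no change to the stored format or the encode and decode routines.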

ddrcoder added 2 commits December 6, 2024 17:27

facebook-github-bot added the CLA Signed label Dec 7, 2024

facebook-github-bot added the fb-exported label Dec 7, 2024

satymish added the bug and Implementation labels Dec 26, 2024
