Revert naive compression format #32

Merged: 1 commit into main on Jul 22, 2024

Conversation

@Satrat (Contributor) commented on Jul 22, 2024

SUMMARY:
Rather than having float and int quantization share a single format, we now infer the int-quantized format for uniform integer quantization and the float-quantized format for uniform fp8 quantization. Any non-uniform quantization still defaults to naive-quantized.
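
As a concrete illustration of the rule above, here is a minimal sketch of the format dispatch. The helper `infer_compression_format` and its parameters are hypothetical names for this example, not the actual compressed-tensors API; only the three format names come from this PR.

```python
# Minimal sketch, not the actual compressed-tensors API.
from enum import Enum


class CompressionFormat(Enum):
    INT_QUANTIZED = "int-quantized"
    FLOAT_QUANTIZED = "float-quantized"
    NAIVE_QUANTIZED = "naive-quantized"


def infer_compression_format(
    quant_type: str, num_bits: int, is_uniform: bool
) -> CompressionFormat:
    """Hypothetical helper: choose a format from the quantization scheme."""
    if not is_uniform:
        # Non-uniform quantization keeps the previous catch-all format.
        return CompressionFormat.NAIVE_QUANTIZED
    if quant_type == "int":
        return CompressionFormat.INT_QUANTIZED
    if quant_type == "float" and num_bits == 8:
        # Uniform fp8 now gets its own format instead of sharing one with int.
        return CompressionFormat.FLOAT_QUANTIZED
    return CompressionFormat.NAIVE_QUANTIZED
```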

TEST PLAN:
Updated the unit tests with the new expected defaults.
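
The updated expectations might look roughly like this pytest-style check, reusing the hypothetical helper from the sketch above (the exact test names and defaults in the repository may differ):

```python
# Hypothetical test sketch of the new expected defaults.
def test_inferred_format_defaults():
    assert infer_compression_format("int", 8, True) is CompressionFormat.INT_QUANTIZED
    assert infer_compression_format("float", 8, True) is CompressionFormat.FLOAT_QUANTIZED
    # Anything non-uniform still falls back to naive-quantized.
    assert infer_compression_format("int", 4, False) is CompressionFormat.NAIVE_QUANTIZED
```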

@Satrat changed the title from "Revert naive-compression format" to "Revert naive compression format" on Jul 22, 2024
@robertgshaw2-neuralmagic merged commit 07c1fd7 into main on Jul 22, 2024 (8 of 12 checks passed)
@markmc pushed a commit to markmc/llm-compressor that referenced this pull request on Nov 13, 2024. Its commit message lists the following changes:
* group size

* add logic in base observer

* group size full lifecycle run

* before vectorize the for loop

* comments, todo add channelwise

* chan wise impl

* comments

* fix channel wise

* comments, validators

* fix typo

* tensor return error fix

* fix sparseml-side of code and add per channel

* pydantic defaults

* token wise quant

* Update src/compressed_tensors/quantization/quant_args.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* comments

* update dim

* shape consistency

* Update src/compressed_tensors/quantization/lifecycle/forward.py

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>

* comments

* pass test_quant_args

---------

Co-authored-by: Benjamin Fineran <bfineran@users.noreply.github.com>