2024-05-30 Add FP8 PTQ #1877
base: develop
Conversation
Thanks for your contribution!
Please add usage documentation and a usage example.
@@ -21,6 +21,7 @@
- `EMDObserver`: collects the maximum absolute value and computes the quantization scale by minimizing the EMD error
- `HistObserver`: collects tensor values into a histogram and computes the quantization scale from a percentile
- `KLObserver`: computes the quantization scale by minimizing the Kullback-Leibler divergence between the float-value distribution and the quantized-value distribution
- `AbsmaxObserver`: collects the maximum absolute value over the target weight tensor's dimension as the quantization scale; the quantized data type can be adjusted via quant_bits, with FP8 supported
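As a rough illustration of the absmax idea described above, here is a minimal NumPy sketch (not the PaddleSlim implementation; the helper name `absmax_fp8_scale` and the dict of format maxima are assumptions for this example). It derives a per-tensor scale from the maximum absolute value and the largest value representable in the chosen FP8 format, where `quant_bits=(4, 3)` corresponds to float8_e4m3 (max magnitude 448) and `quant_bits=(5, 2)` to float8_e5m2 (max magnitude 57344):

```python
import numpy as np

# Largest representable magnitude per FP8 format, keyed by
# (exponent bits, mantissa bits); these are assumed constants
# for the e4m3 and e5m2 formats commonly used in ML.
FP8_MAX = {(4, 3): 448.0, (5, 2): 57344.0}


def absmax_fp8_scale(tensor, quant_bits=(4, 3)):
    """Hypothetical helper: map the tensor's absmax onto the FP8 range.

    The scale is chosen so that tensor / scale fits within the
    target format's representable magnitude.
    """
    absmax = float(np.max(np.abs(tensor)))
    return absmax / FP8_MAX[quant_bits]


x = np.array([-3.5, 0.25, 7.0])
print(absmax_fp8_scale(x, quant_bits=(4, 3)))  # 7.0 / 448 = 0.015625
```

Dividing by this scale maps the observed absmax exactly onto the format's maximum, which is the defining property of an absmax observer.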
Is AbsmaxObserver the only observer that supports FP8?
@@ -60,8 +61,8 @@ model = mobilenet_v1()
q_config = QuantConfig(activation=None, weight=None)

# define act_quanter and weight_quanter
act_quanter = MSEObserver()
weight_quanter = MSEObserver()
activation = AbsmaxObserver(quant_bits=(4, 3))  # quant_bits=(4, 3) and quant_bits=(5, 2) select the float8_e4m3 and float8_e5m2 quantization formats.
Don't modify the existing example in place; add a new example demonstrating FP8 quantization, along with documentation for it.
Please also post the FP8 quantization experiment results for the different observers.
Modified Uniform Observer to support FP8 quantization

