Run test_base_fp8 for compute capability 8.9 or later #3164

Merged 1 commit on Oct 11, 2024

HolyWu
Contributor

@HolyWu HolyWu commented Sep 17, 2024

Description

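According to the NVIDIA TensorRT support matrix (https://docs.nvidia.com/deeplearning/tensorrt/support-matrix/index.html#hardware-precision-matrix), FP8 precision is supported on GPUs with compute capability (SM) 8.9 or later, so test_base_fp8 should run on those GPUs. I confirmed that the tests pass locally on an RTX 4060 Ti.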
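
For readers who want to see what gating a test on compute capability can look like, here is a minimal sketch. It is not the diff from this PR: the helper names `supports_fp8` and `current_capability`, the constant `FP8_MIN_CAPABILITY`, and the placeholder test body are all hypothetical. The only PyTorch calls it assumes are `torch.cuda.is_available()` and `torch.cuda.get_device_capability()`, and it falls back to skipping when PyTorch or CUDA is not present.

```python
import unittest

# Minimum compute capability for FP8, taken from the description above.
FP8_MIN_CAPABILITY = (8, 9)


def supports_fp8(capability):
    """Return True if a (major, minor) compute capability is 8.9 or later."""
    # Tuples compare element by element, so (9, 0) and (8, 9) pass
    # while (8, 6) does not.
    return tuple(capability) >= FP8_MIN_CAPABILITY


def current_capability():
    """Return the current GPU's (major, minor), or None without PyTorch/CUDA."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability()


_cap = current_capability()


class TestFP8Gate(unittest.TestCase):
    @unittest.skipIf(
        _cap is None or not supports_fp8(_cap),
        "FP8 requires compute capability 8.9 or later",
    )
    def test_base_fp8(self):
        # Hypothetical placeholder body; the real test lives in the PR diff.
        self.assertTrue(supports_fp8(_cap))
```

Comparing `(major, minor)` tuples rather than a float such as `8.9` keeps the check correct for capabilities with a two-digit minor version.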

Checklist:

  • My code follows the style guidelines of this project (You can use the linters)
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas and hacks
  • I have made corresponding changes to the documentation
  • I have added tests to verify my fix or my feature
  • New and existing unit tests pass locally with my changes
  • I have added the relevant labels to my PR so that relevant reviewers are notified

Collaborator

@lanluo-nvidia lanluo-nvidia left a comment

LGTM.
I can confirm that I am also able to run FP8 PTQ on my RTX 4080.

@lanluo-nvidia lanluo-nvidia merged commit 6ba44fa into pytorch:main Oct 11, 2024
51 of 70 checks passed
@HolyWu HolyWu deleted the test_base_fp8 branch October 11, 2024 16:28

3 participants