Skip to content

Refactor ggml.c for future tensor types #1001

New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

Merged
merged 2 commits into from
Apr 15, 2023
Merged

Conversation

sw
Copy link
Contributor

@sw sw commented Apr 15, 2023

This makes two somewhat tedious changes to ggml.c, that should help us with trying out other tensor types, as has been discussed recently in various issues/PRs.

The argument against the latter by @ggerganov was #951 (comment):

It's easier to search for all the places that a type is used.
It does get annoying, so maybe we should reconsider. Overall, there is a lot of room for simplification and reduction of code duplication in ggml.c

My arguments for the change are:

  • shortens the file by >200 lines
  • makes diffs less noisy when you add a new type
  • the assert now also catches invalid values caused by memory corruption or other bugs, instead of the function silently exiting
  • "all the places that a type is used" -> it's not a "usage" in any practical sense as it just causes an abort().

I have not updated tests/test-quantize.c with QK, as that may be removed in pending #953.

@sw sw requested a review from ggerganov April 15, 2023 15:43
@slaren
Copy link
Member

slaren commented Apr 15, 2023

I think this is a good change, but I am concerned that we won't be able to change QK without breaking backwards compatibility, and if we ever want to support a different value it will require much deeper changes anyway. I think this could be very easily solved with templates, but with C I am afraid that we may have to choose between converting everything into macro hell or accept the runtime cost of a dynamic QK value. Or just never change QK at all.

@sw
Copy link
Contributor Author

sw commented Apr 15, 2023

I don't think the idea was to change QK for an existing type, rather to add e.g. Q4_2 which will have its own QK42 != 32

@slaren
Copy link
Member

slaren commented Apr 15, 2023

Right, I was thinking of @qwopqwop200's implementation that showed some benefits from using a group size of 128 (if I understood that correctly). But as you say that is a completely different use case, my bad.

Copy link
Member

@ggerganov ggerganov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if QK4_0, QK4_1 and QK8_0 wouldn't be better - this way I can search for 4_1 for example and get all related functions, usages, etc.

@sw
Copy link
Contributor Author

sw commented Apr 15, 2023

Wondering if QK4_0, QK4_1 and QK8_0 wouldn't be better

Sure, let's do that. That way we can also support >10 variants without confusion ;-) (at least in the numbering, the size of ggml.c would be another matter)

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants