
Add support for running on Apple GPUs #181

Open
kacpnowak opened this issue Apr 14, 2025 · 2 comments
Labels
enhancement New feature or request

Comments

@kacpnowak
Contributor

Is your feature request related to a problem? Please describe.

Currently, testing the model requires an Nvidia GPU. In principle it should be possible to run it on an Apple GPU via Metal as well; llama.cpp already implements this.

Describe the solution you'd like

When the model is installed on a Mac it should use this dependency: https://github.com/philipturner/metal-flash-attention
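
A minimal sketch of what the Mac path could look like on the PyTorch side, assuming the model selects its device at startup; the function name `pick_device` is hypothetical, and the metal-flash-attention kernels themselves would still need separate bindings:

```python
# Hypothetical device-selection sketch: prefer CUDA, then Apple's Metal
# backend (MPS), then CPU. This only covers device placement, not the
# attention kernels themselves.
import torch

def pick_device() -> torch.device:
    if torch.cuda.is_available():
        return torch.device("cuda")
    if torch.backends.mps.is_available():  # Apple GPU via Metal
        return torch.device("mps")
    return torch.device("cpu")
```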

Describe alternatives you've considered

No response

Additional context

No response

Organisation

AWI

@kacpnowak kacpnowak added the enhancement New feature or request label Apr 14, 2025
@clessig
Collaborator

clessig commented Apr 14, 2025

Being able to run locally on a Mac would be useful for development. But how sustainable would this solution be? How much time would we need to make it work?

@kacpnowak
Contributor Author

In pyproject.toml, flash_attn is listed as optional, so one could just replace it with flex_attn. If we don't plan to do that, I think it should be changed, as it can be confusing.
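
A rough sketch of how the optional dependency could be handled at runtime, assuming the attention call goes through a single helper; `attention` is a hypothetical wrapper, and PyTorch's built-in scaled_dot_product_attention is used here as the portable fallback (flex_attention, as suggested above, could slot in the same way):

```python
# Hypothetical fallback sketch: use flash_attn when it is installed and the
# tensors are on CUDA, otherwise fall back to PyTorch's native attention,
# which also runs on Apple's MPS backend.
import torch
import torch.nn.functional as F

try:
    from flash_attn import flash_attn_func  # CUDA-only optional dependency
    HAVE_FLASH_ATTN = True
except ImportError:
    HAVE_FLASH_ATTN = False

def attention(q, k, v, causal: bool = False):
    """q, k, v: (batch, heads, seq, head_dim)."""
    if HAVE_FLASH_ATTN and q.is_cuda:
        # flash_attn expects (batch, seq, heads, head_dim)
        out = flash_attn_func(q.transpose(1, 2), k.transpose(1, 2),
                              v.transpose(1, 2), causal=causal)
        return out.transpose(1, 2)
    return F.scaled_dot_product_attention(q, k, v, is_causal=causal)
```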
