idea: Support Apple CoreML #1103
This is about running LLMs locally on Apple Silicon. Core ML is a framework that can distribute a workload across the CPU, GPU, and Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones and Macs with an A14 or newer or an M1 or newer chip. Ideally, we want to run LLMs on the ANE alone, since it is optimized for ML workloads compared to the GPU. Apple claims that deploying Transformer models on Apple devices with an A14 or newer and M1 or newer chip achieves "up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".
https://machinelearning.apple.com/research/neural-engine-transformers
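For reference, choosing which compute units Core ML may use is a one-line configuration in its Swift API. A minimal sketch, assuming a model already compiled to the `.mlmodelc` format (the path is hypothetical):

```swift
import CoreML
import Foundation

// Hypothetical path to a compiled model; produce it from a .mlpackage with
// `xcrun coremlcompiler compile Model.mlpackage .`
let modelURL = URL(fileURLWithPath: "Model.mlmodelc")

let config = MLModelConfiguration()
// .cpuAndNeuralEngine (macOS 13+ / iOS 16+) keeps work off the GPU entirely;
// .all instead lets Core ML schedule across CPU, GPU, and ANE.
config.computeUnits = .cpuAndNeuralEngine

do {
    let model = try MLModel(contentsOf: modelURL, configuration: config)
    print("Loaded:", model.modelDescription)
} catch {
    print("Failed to load model:", error)
}
```

Note that `computeUnits` is a hint: Core ML decides per layer whether the ANE actually supports the operation and falls back to CPU/GPU otherwise.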
There is work in progress on a Core ML implementation for whisper.cpp, which sees roughly 3x performance improvements for some models (ggerganov/whisper.cpp#548); you might be interested in it.
Updated the title to better reflect the ask.
Would it be of any help that LM Studio has implemented MLX? There is also Anemll, an ANE library that works with MLX (MIT licensed), and FastMLX (Apache 2.0 licensed). @freelerobot
There is also an Apple MPS (Metal GPU) flash attention implementation in Swift, by the way.
Problem
Please consider adding support for the Core ML model package format to utilize the Apple Silicon Neural Engine and GPU.
Success Criteria
Utilize both ANE & GPU, not just GPU on Apple Silicon
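As a sketch of that criterion: requesting `.all` compute units lets Core ML schedule layers across CPU, GPU, and ANE together. The model path, input name, and shape below are assumptions, since they depend on how the model was converted:

```swift
import CoreML
import Foundation

let config = MLModelConfiguration()
config.computeUnits = .all  // allow CPU, GPU, and ANE scheduling

do {
    let model = try MLModel(
        contentsOf: URL(fileURLWithPath: "LLM.mlmodelc"),  // hypothetical path
        configuration: config
    )
    // "input_ids" and the shape [1, 128] are assumptions tied to the
    // specific converted model.
    let tokens = try MLMultiArray(shape: [1, 128], dataType: .int32)
    let input = try MLDictionaryFeatureProvider(dictionary: ["input_ids": tokens])
    let output = try model.prediction(from: input)
    print(output.featureNames)
} catch {
    print("Prediction failed:", error)
}
```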
Additional context
List of Core ML package format models
https://github.com/likedan/Awesome-CoreML-Models