
idea: Support Apple CoreML #1103

Open
qdrddr opened this issue Apr 25, 2024 · 5 comments
Labels
category: engine management (Related to engine abstraction), P3: nice to have (Nice to have feature)

Comments

@qdrddr

qdrddr commented Apr 25, 2024

Problem

Please consider adding support for the Core ML model package format to utilize the Apple Silicon Neural Engine (ANE) + GPU.

Success Criteria
Utilize both ANE & GPU, not just GPU on Apple Silicon

Additional context
List of Core ML package format models

https://github.com/likedan/Awesome-CoreML-Models
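
To make the success criterion concrete: when an engine loads a Core ML package, the compute-units setting determines whether the ANE is used at all, rather than just the GPU. A minimal sketch using coremltools (the model path, input name, and shape are placeholders, not anything from this project):

```python
import numpy as np
import coremltools as ct

# ComputeUnit.ALL lets Core ML schedule work across CPU, GPU and the Neural
# Engine; ComputeUnit.CPU_AND_GPU would leave the ANE unused.
model = ct.models.MLModel("Model.mlpackage",          # placeholder path
                          compute_units=ct.ComputeUnit.ALL)

# Prediction runs on macOS only; input name and shape are model-specific.
output = model.predict({"input": np.zeros((1, 3, 224, 224), dtype=np.float32)})
print(output)
```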

@qdrddr qdrddr added the type: feature request A new feature label Apr 25, 2024
@qdrddr
Author

qdrddr commented Apr 25, 2024

This is about running LLMs locally on Apple Silicon. Core ML is a framework that can distribute a workload across the CPU, GPU & Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones & Macs (A14 or newer and M1 or newer). Ideally, we want to run LLMs on the ANE only, as it is optimized for ML tasks compared to the GPU. Apple claims deploying "your Transformer models on Apple devices with an A14 or newer and M1 or newer chip" can "achieve up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".

  1. To utilize Core ML, you first need to convert a model from TensorFlow or PyTorch to the Core ML model package format using coremltools (or simply use existing models already in Core ML package format); a minimal conversion sketch follows below.
  2. Second, you use that converted package with an implementation designed for Apple devices. Here is the Apple Xcode reference PyTorch implementation:

https://machinelearning.apple.com/research/neural-engine-transformers
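
As a rough illustration of step 1, here is a minimal conversion sketch with coremltools; the toy PyTorch model and shapes are placeholders, and real Transformer/LLM models need the ANE-oriented restructuring described in the Apple article above:

```python
import torch
import coremltools as ct

class TinyNet(torch.nn.Module):
    """Placeholder model standing in for a real network."""
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(64, 64)

    def forward(self, x):
        return torch.relu(self.linear(x))

example_input = torch.rand(1, 64)
traced = torch.jit.trace(TinyNet().eval(), example_input)

# Convert to the ML Program (.mlpackage) format; compute_units=ALL allows
# Core ML to dispatch to the CPU, GPU and Neural Engine at inference time.
mlmodel = ct.convert(
    traced,
    inputs=[ct.TensorType(name="x", shape=example_input.shape)],
    convert_to="mlprogram",
    compute_units=ct.ComputeUnit.ALL,
)
mlmodel.save("TinyNet.mlpackage")
```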

@qdrddr
Author

qdrddr commented Apr 25, 2024

There is work in progress on a Core ML implementation for whisper.cpp; they see 3x performance improvements for some models (ggerganov/whisper.cpp#548), which you might be interested in.
You might also be interested in another implementation, Swift Transformers. An example Core ML application:
https://github.com/huggingface/swift-chat

@qdrddr qdrddr changed the title feat: Apple Silicone Nural Engine, Core ML model package format support feat: Apple Silicone Neural Engine, Core ML model package format support Apr 26, 2024
@freelerobot freelerobot changed the title feat: Apple Silicone Neural Engine, Core ML model package format support feat: Support Apple CoreML Sep 5, 2024
@freelerobot
Contributor

Updated the title to better reflect the ask.

@freelerobot freelerobot transferred this issue from janhq/jan Sep 5, 2024
@freelerobot freelerobot added P3: nice to have Nice to have feature type: engine request and removed type: feature request A new feature labels Sep 5, 2024
@freelerobot freelerobot changed the title feat: Support Apple CoreML epic: Support Apple CoreML Sep 6, 2024
@dan-menlo dan-menlo changed the title epic: Support Apple CoreML idea: Support Apple CoreML Sep 8, 2024
@dan-menlo dan-menlo moved this to Icebox in Menlo Oct 13, 2024
@freelerobot freelerobot added category: engine management Related to engine abstraction and removed type: engine request labels Oct 17, 2024
@qdrddr
Author

qdrddr commented Feb 20, 2025

Would it be of any help that LM Studio has implemented MLX? There is also Anemll, an ANE library that works with MLX (MIT licensed), and FastMLX (Apache 2.0 license). @freelerobot

@qdrddr
Author

qdrddr commented Mar 5, 2025

By the way, there is also an Apple MPS (Metal GPU) flash attention implementation in Swift.
