idea: Support Apple CoreML #1103
This is about running LLMs locally on Apple Silicon. Core ML is a framework that can distribute a workload across the CPU, GPU, and Neural Engine (ANE). The ANE is available on all modern Apple devices: iPhones and Macs with an A14 or newer or an M1 or newer chip. Ideally, we want to run LLMs on the ANE alone, since it is optimized for ML workloads compared to the GPU. Apple claims that deploying Transformer models on Apple devices with an A14 or newer and M1 or newer chip achieves "up to 10 times faster and 14 times lower peak memory consumption compared to baseline implementations".
https://machinelearning.apple.com/research/neural-engine-transformers
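For reference, choosing which compute units Core ML may use is a one-line configuration in its Swift API. A minimal sketch, assuming a model already compiled to the `.mlmodelc` format (the path is hypothetical):

```swift
import CoreML
import Foundation

// Hypothetical path to a compiled model; produce it from a .mlpackage with
// `xcrun coremlcompiler compile Model.mlpackage .`
let modelURL = URL(fileURLWithPath: "Model.mlmodelc")

let config = MLModelConfiguration()
// .cpuAndNeuralEngine (macOS 13+ / iOS 16+) keeps work off the GPU entirely;
// .all instead lets Core ML schedule across CPU, GPU, and ANE.
config.computeUnits = .cpuAndNeuralEngine

do {
    let model = try MLModel(contentsOf: modelURL, configuration: config)
    print("Loaded:", model.modelDescription)
} catch {
    print("Failed to load model:", error)
}
```

Note that `computeUnits` is a hint: Core ML decides per layer whether the ANE actually supports the operation and falls back to CPU/GPU otherwise.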
There is work in progress on a Core ML implementation for whisper.cpp, which sees roughly 3x performance improvements for some models (ggerganov/whisper.cpp#548); you might be interested in it.
Updated the title to better reflect the ask.
Would it be of any help that LM Studio has implemented MLX? There is also Anemll, an ANE library that works with MLX (MIT licensed), and FastMLX (Apache 2.0 licensed). @freelerobot
There is also an Apple MPS (Metal GPU) flash attention implementation in Swift, by the way.
Problem
Please consider adding support for the Core ML model package format to utilize the Apple Silicon Neural Engine and GPU.
Success Criteria
Utilize both ANE & GPU, not just GPU on Apple Silicon
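As a sketch of that criterion: requesting `.all` compute units lets Core ML schedule layers across CPU, GPU, and ANE together. The model path, input name, and shape below are assumptions, since they depend on how the model was converted:

```swift
import CoreML
import Foundation

let config = MLModelConfiguration()
config.computeUnits = .all  // allow CPU, GPU, and ANE scheduling

do {
    let model = try MLModel(
        contentsOf: URL(fileURLWithPath: "LLM.mlmodelc"),  // hypothetical path
        configuration: config
    )
    // "input_ids" and the shape [1, 128] are assumptions tied to the
    // specific converted model.
    let tokens = try MLMultiArray(shape: [1, 128], dataType: .int32)
    let input = try MLDictionaryFeatureProvider(dictionary: ["input_ids": tokens])
    let output = try model.prediction(from: input)
    print(output.featureNames)
} catch {
    print("Prediction failed:", error)
}
```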
Additional context
List of Core ML package format models
https://github.com/likedan/Awesome-CoreML-Models