These end-to-end pipelines demonstrate the power of MAX for accelerating common AI workloads. Each supported pipeline can be served via an OpenAI-compatible endpoint.
MAX can also serve most PyTorch-based large language models available on Hugging Face, although not at the same performance as native MAX Graph versions.
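For example, once a pipeline is being served (see the steps below), any OpenAI-compatible client can talk to it. Here is a minimal sketch using curl; the `http://localhost:8000` address is an assumption, so check the serve command's output for the actual address:

```sh
# Send a chat completion request to a locally served pipeline.
# The host and port are assumptions; the serve command prints the real address.
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ai/DeepSeek-R1-Distill-Llama-8B",
    "messages": [
      {"role": "user", "content": "Write a haiku about GPUs."}
    ]
  }'
```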
The easiest way to try out any of the pipelines is with our Magic command-line tool.
- Install Magic on macOS and Linux with this command:

  ```sh
  curl -ssL https://magic.modular.com | bash
  ```

  Then run the `source` command that's printed in your terminal. To see the available commands, run `magic --help`. Learn more about Magic here.
- Install the `max-pipelines` command to run the pipelines:

  ```sh
  magic global install max-pipelines
  ```
- Serve a model:

  ```sh
  max-pipelines serve --huggingface-repo-id deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  ```
See https://builds.modular.com/ to discover many of the models supported by MAX.
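Once the server reports that it's ready, a quick sanity check is to ask it which models it's serving. This sketch assumes the endpoint also exposes the standard OpenAI `/v1/models` route at the same assumed address:

```sh
# List the models the server is currently serving (standard OpenAI route).
# The address is an assumption; check the serve command's output.
curl http://localhost:8000/v1/models
```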