mops is a Python library for ML Operations.
Jump to Quickstart if you are impatient or prefer examples, like me!
mops solves for four core design goals:
- Efficient transfer of pure function execution to remote execution environments with more and/or different compute resources.
- Everything is written in standard Python with basic Python primitives; no frameworks, YAML, DSLs…
- Memoization (i.e. reproducibility and fault tolerance) for individual functions.
- Droppability: mops shouldn't entangle itself with your code, and you should always be able to run your code with or without mops in the loop.
It is used by decorating or wrapping your pure function and then calling it like a normal function.
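To make the decorate-then-call pattern concrete, here is a minimal sketch of a memoizing decorator. It is a hypothetical stand-in written for illustration, not the mops API: the name `memoize_to_dir`, the on-disk cache layout, and the example function `add_squares` are all assumptions. It only shows the general idea of wrapping a pure function, keying results on its pickled arguments, and still being able to call the undecorated function.

```python
import functools
import hashlib
import pickle
import tempfile
from pathlib import Path


# Hypothetical stand-in for illustration only; this is NOT the mops API.
def memoize_to_dir(cache_dir):
    """Decorator factory: cache a pure function's results on disk."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # The cache key covers the function name and its pickled arguments.
            payload = pickle.dumps(
                (func.__qualname__, args, sorted(kwargs.items()))
            )
            key = hashlib.sha256(payload).hexdigest()
            result_path = cache_dir / f"{key}.pkl"
            if result_path.exists():
                # Already computed once: reuse the stored result.
                return pickle.loads(result_path.read_bytes())
            result = func(*args, **kwargs)
            result_path.write_bytes(pickle.dumps(result))
            return result

        return wrapper

    return decorator


calls = []  # records real executions, only to make the caching visible


@memoize_to_dir(tempfile.mkdtemp())
def add_squares(a, b):
    calls.append((a, b))
    return a * a + b * b


first = add_squares(3, 4)   # executes the function and stores the result
second = add_squares(3, 4)  # served from the on-disk cache
print(first, second, len(calls))
```

Because `functools.wraps` sets `__wrapped__`, the original function stays reachable as `add_squares.__wrapped__`, which is one simple way to run the same code with the wrapper out of the loop.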
It requires:
- Python >= 3.8
- mops itself to be installed in the remote execution context as well as in the local environment.
- your function and its arguments to be serializable with pickle.
- (if using remote compute) ADLS read+write access on the local/orchestrator and the remote runtime.
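The pickle requirement can be checked up front with the standard library alone. The sketch below uses a hypothetical helper, `is_picklable`, which is not part of mops; it simply attempts `pickle.dumps` and reports whether it succeeded. Importable functions and plain data structures pass, while lambdas and objects such as locks do not.

```python
import math
import pickle
import threading


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        return False
    return True


checks = {
    "importable function": is_picklable(math.sqrt),
    "plain arguments": is_picklable({"rows": [1, 2, 3], "scale": 0.5}),
    "lambda": is_picklable(lambda x: x + 1),
    "thread lock": is_picklable(threading.Lock()),
}
for label, ok in checks.items():
    print(f"{label}: {'ok' if ok else 'NOT picklable'}")
```

A check like this is most useful on the arguments, since functions defined at module level are pickled by reference and generally pass.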
It is usually used with remote compute on:
- Kubernetes, with code distributed as a Docker image
---or---
- dbxtend (currently internal-only), with code distributed to Databricks as Python wheels.
It optionally integrates with:
- joblib for local parallelism.
It has some limitations.
- Here are some tools for debugging your functions that are running under mops.
If making changes to the library, please bump the version in pyproject.toml accordingly.
Also look at our changelog.
To run the tests:
- poetry run pytest tests --test-uri-root file://./mops-tests
- poetry run pytest tests -m integration --test-uri-root file://./mops-tests
If you want to run tests against a non-bundled blob store, you will need to make sure that blob store is installed in the venv before running the tests.