A very common use of mops
is to run Python functions on Kubernetes. mops
can’t provide the cluster for you,
but if you have access to one that’s already up and running, it should be easy to get started.
You must add mops to your project with the k8s
extras, or your imports of the k8s
module will fail.
pip install thds.mops[k8s]
This approach is preferred for cases where the code to be run on Kubernetes can be considered a pure function and is implemented in, or wrapped by, Python.
If you’re trying to wrap a Docker image that you can’t or don’t want to install
mops
on, you’ll need to use the low-level imperative API instead.
You will need to construct a k8s_shell
with configuration appropriate to the Python function that will
ultimately be invoked remotely. This includes specifying your Docker image as well as providing resource requests or limits. For example, you might create a module called k8s_shim.py:
from thds import mops

df_k8s_spot_11g_shim = mops.k8s.shim(
    # gotta specify which container this runs on
    mops.k8s.autocr('ds/demand-forecast:latest'),
    node_narrowing=dict(
        resource_requests={"cpu": "3.6", "memory": "11G"},
        tolerations=[mops.k8s.tolerates_spot()],
    ),
)
mops does not distribute your code for you, and neither does the provided k8s shim. You must handle code distribution yourself by building and pushing a Docker image to a container registry that Kubernetes can reach, and by configuring the K8s runtime shim so that it knows which pre-published image to use. Additionally, that image must be given read and write access to your blob store, since all data will be transferred back and forth from your local environment via the blob store.
Once you have constructed an appropriate runtime shim, you need only pass it to the pure.magic() decorator factory, then apply the resulting decorator to any pure function to run that function in the remote K8s environment and inject its result right back into the local/orchestrator context.
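As a rough sketch, assuming pure.magic is reachable as mops.pure.magic and accepts the shim as its argument (the function name and signature below are placeholders; adjust the import paths to your version of mops):

from thds import mops

from k8s_shim import df_k8s_spot_11g_shim  # the module defined above

@mops.pure.magic(df_k8s_spot_11g_shim)
def forecast_demand(region: str) -> float:
    # This body executes inside the remote K8s pod; arguments and the return
    # value travel through your blob store.
    ...

Calling forecast_demand(...) from your orchestrator then launches the work on Kubernetes and hands the result back to the local caller as if the function had run in-process.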
Declarative usage is recommended because it confers additional capabilities such as memoization (and therefore a form of fault tolerance), but if you have a non-Python image you want to launch, see the docs on the low-level imperative interface here.
The basic container specification will be the same regardless of which approach is used.
For Kubernetes only, mops
will automatically try to scrape and print the logs from each launched pod.
ANSI terminal colors will be chosen automatically for each separate pod being scraped.
If you are running many pods in parallel, it may be desirable to turn off log scraping: a separate OS thread is used for each pod being scraped, and this has been observed to leak memory over the course of long runs.
Either set the MOPS_NO_K8S_LOGS
environment variable, or programmatically pass suppress_logs=True
to
either k8s_shell
(declarative API) or launch
(imperative API).
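For example, to disable scraping for an entire run via the environment variable (a minimal sketch; any non-empty value is assumed sufficient here, so check your version of mops for the exact semantics):

import os

# Set this before mops launches any pods to disable per-pod log scraping.
os.environ["MOPS_NO_K8S_LOGS"] = "1"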
Kubernetes will often kill pods that use more memory than they requested, but what makes this tricky to diagnose is that exceeding your request will not, by itself, get a pod killed. Pod death only occurs when other pods scheduled on the same node also use more memory than they requested, the node itself runs out of RAM, and K8s has to choose a pod to kill.
If you are CPU bound, you may as well request roughly 4 GB of RAM for each CPU you request, since that is the ratio CPU-optimized nodes generally provide. If you are strictly memory bound, you will simply need to request enough memory for the worst-case scenario for each of your Jobs.
If you have a hybrid workload (different types of Jobs running concurrently) and are willing to occasionally let partially-complete pods die from OOM, you can set your memory requests lower than your true worst-case usage – may the odds be ever in your favor ;)
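For example, reusing the shim constructor from earlier, a CPU-bound Job might stick to roughly that 4 GB-per-CPU ratio (the image name and the exact numbers here are illustrative):

from thds import mops

cpu_bound_shim = mops.k8s.shim(
    mops.k8s.autocr('ds/demand-forecast:latest'),
    node_narrowing=dict(
        # roughly 4 GB of RAM per requested CPU, matching typical CPU-optimized nodes
        resource_requests={"cpu": "3.6", "memory": "14G"},
    ),
)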
We’re working to make this easier to identify when using mops
, but on a basic level, the Kubernetes API
does not ever reject a request to launch a Job or Pod when provided with a Docker image tag that it can’t
find. It will happily accept your request, and then there will just be endless ImagePullBackOff errors that can be found by querying the API (or using a console like k9s). This is fundamentally because K8s wants to allow for the possibility that your container registry might go down for periods of time, which shouldn’t prevent you from submitting a request that will work once the registry comes back up.
In any case, if you’re running a new process, or have recently changed something about your Docker image,
keep an eye on the API or console for ImagePullBackOff errors, which may be an indication that you’ve asked mops to create a Job with an image that simply does not exist.
Alternatively, you can take a look at the machinery we have made available that may help you detect this.
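If you prefer a quick do-it-yourself check, here is a sketch using the official kubernetes Python client (not part of mops; the namespace and label selector are placeholders):

from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() when running inside a pod
v1 = client.CoreV1Api()

# "default" and the label selector are placeholders; point them at your mops Jobs.
for pod in v1.list_namespaced_pod("default", label_selector="job-name").items:
    for cs in pod.status.container_statuses or []:
        waiting = cs.state.waiting
        if waiting and waiting.reason in ("ImagePullBackOff", "ErrImagePull"):
            print(f"{pod.metadata.name}: {waiting.reason} - {waiting.message}")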
If you’re running a big or long-running process with remote workers in K8s, common network limitations may make it preferable to run your orchestrator from a pod that is also hosted in Kubernetes.
See here.
See the source README for more specifics on things you can do with
Kubernetes via mops
.