What is an orchestrator?

Simply put, it’s a process that tells other processes what to do. Often, it then collects results from those processes and organizes them, or sends those results to other processes for another round of orchestrated task completion.
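As a minimal sketch of that definition (not mops code), the following uses only the Python standard library. The helper names `run_task` and `orchestrate` are hypothetical, and each "task" is a throwaway one-line Python program run in a child process. The tasks run one after another here, to keep the focus on the dispatch, collect, and redispatch shape.

```python
import subprocess
import sys


def run_task(code: str) -> str:
    """Run one task as a separate Python process and return its stdout."""
    completed = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return completed.stdout.strip()


def orchestrate(numbers: list[int]) -> int:
    # Round one: tell one child process per input what to do, then collect.
    squares = [run_task(f"print({n} * {n})") for n in numbers]
    # Round two: send the collected results to another process.
    total = run_task(f"print(sum([{', '.join(squares)}]))")
    return int(total)
```

Calling `orchestrate([1, 2, 3])` squares each number in its own process and then sums the squares in a final process.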

A good internal example of an orchestrator is demand-forecast. It orchestrates the extraction of various slices of Unified Asset data, then the training of various models on that data, and finally the prediction/forecast step based on those models.

As an analogy, PySpark code is usually a form of orchestration, though it acts at a somewhat higher level than mops, since your Spark code will often go through a query planner of some sort before the underlying tasks are generated. mops takes a lower-level approach that assumes you already have known tasks that you want to execute, and that you’re willing to write your own code to 'plan' or 'order' their execution.
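To illustrate what writing your own 'plan' can look like, here is a small sketch that orders known tasks by their dependencies using `graphlib` from the Python standard library. This is not part of mops. The task names are hypothetical and loosely follow the extract, train, forecast stages described above.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key depends on the tasks in its set.
DEPENDENCIES = {
    "extract": set(),
    "train": {"extract"},
    "forecast": {"train"},
}


def plan(dependencies: dict[str, set[str]]) -> list[str]:
    """Return the task names in an order that respects the dependencies."""
    return list(TopologicalSorter(dependencies).static_order())
```

An orchestrator could then walk the list returned by `plan(DEPENDENCIES)` and launch each task only after the tasks before it have finished.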

Rules for orchestrators

Concurrency

When running many mops.pure functions concurrently, you should prefer to use threads rather than parallel processes. Shared memory within the orchestration parts of the process allows for much simpler reuse of various contexts, and since the Runner you are using almost certainly builds in process-level parallelism, there’s no additional advantage (and many possible disadvantages) to layering extra polling processes on top of the underlying processes.
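The sketch below shows the general pattern with the standard library only. `remote_task` is a hypothetical stand-in for a blocking call to a function that actually runs elsewhere, and `SharedContext` is a hypothetical example of in-process state that threads can reuse without any serialization.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class SharedContext:
    """Hypothetical in-process state that every thread can reuse."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.completed: list[str] = []

    def record(self, name: str) -> None:
        with self._lock:  # guard the shared list against concurrent writes
            self.completed.append(name)


def remote_task(name: str, context: SharedContext) -> str:
    # Stand-in for a blocking call to a function that runs elsewhere.
    time.sleep(0.05)
    context.record(name)
    return name.upper()


def run_all(names: list[str]) -> tuple[list[str], SharedContext]:
    context = SharedContext()
    with ThreadPoolExecutor(max_workers=8) as pool:
        # pool.map returns results in the same order as the inputs.
        results = list(pool.map(lambda n: remote_task(n, context), names))
    return results, context
```

Because the threads spend their time waiting on remote work, the orchestrating process stays light, and the same context object is visible to every call.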

To help with this, we’ve provided thds.mops.parallel:parallel_yield_results, which should be general enough for many use cases. If it is not, feel free to bring your own concurrency primitives.
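If you do bring your own primitives, one possible shape is a generator that yields results as tasks finish. The sketch below is built on `concurrent.futures` and is not the signature or implementation of `parallel_yield_results`. The name `yield_results_as_completed` and its parameters are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def yield_results_as_completed(
    thunks: Iterable[Callable[[], T]], max_workers: int = 8
) -> Iterator[T]:
    """Hypothetical helper: run zero-argument callables in threads and
    yield each result as soon as it is ready (not in submission order)."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(thunk) for thunk in thunks]
        for future in as_completed(futures):
            yield future.result()  # re-raises any exception from the task
```

A caller can consume it lazily, for example `for result in yield_results_as_completed(thunks): ...`, and start the next round of work as soon as the first results arrive.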

Joblib backend

A simple joblib backend is also provided, for cases where you are already using joblib, or where a library you depend on (e.g. scikit-learn) uses it under the hood.

Please note that running thousands of very short (e.g. ten-second) tasks is not something Kubernetes (K8s) excels at…
