
Deploy the Kubeflow Pipelines Service

Pavel Dournov edited this page Nov 7, 2018 · 10 revisions

This page guides you through the steps to deploy Kubeflow, including the Kubeflow pipelines service.

Requirements

This guide assumes you already have a GCP project. You can use Cloud Shell to run all the commands in this guide.

Alternatively, if you prefer to install and interact with GKE from your local machine, make sure you have the gcloud CLI and kubectl installed locally.
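
If you work from a local machine, a quick check that both tools are on your PATH can save a failed step later. The following is a minimal sketch; the require_tools helper name is hypothetical and is not part of gcloud or kubectl.

```shell
# require_tools is a hypothetical helper: it reports every named command
# that cannot be found on PATH and returns non-zero if any is missing.
require_tools() {
  missing=0
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run from your workstation shell):
#   require_tools gcloud kubectl
```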

Set up a GKE cluster

Follow the instructions to create a GCP project.

Enable the GKE API on this page. You can find more details about enabling billing, as well as activating the GKE API.

We recommend that you use Cloud Shell from the GCP console to run the commands below. Cloud Shell starts with an environment that is already logged in to your account and set to the currently selected project. The following two commands are required only in a workstation shell environment; they are not needed in Cloud Shell.

gcloud auth login
gcloud config set project [your-project-id]

You need a GKE cluster to run Kubeflow pipelines. To start a new GKE cluster, first set a default compute zone (us-central1-a in this case):

gcloud config set compute/zone us-central1-a

Then start a GKE cluster:

# Specify your cluster name
CLUSTER_NAME=[YOUR-CLUSTER-NAME]
gcloud container clusters create $CLUSTER_NAME \
  --zone us-central1-a \
  --scopes cloud-platform \
  --enable-cloud-logging \
  --enable-cloud-monitoring \
  --machine-type n1-standard-2 \
  --num-nodes 4

Here we choose the cloud-platform scope so the cluster can invoke GCP APIs. You can find all the options for creating a cluster here.

Next, grant your user account permission to create new cluster roles. This step is necessary because installing Kubeflow pipelines includes installing a few clusterroles.

kubectl create clusterrolebinding ml-pipeline-admin-binding --clusterrole=cluster-admin --user=$(gcloud config get-value account)
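
To confirm that the binding took effect before deploying, you can ask the API server whether your account may create cluster roles. This is a minimal sketch built on kubectl auth can-i; the can_create_clusterroles wrapper name is hypothetical.

```shell
# can_create_clusterroles is a hypothetical wrapper: it returns 0 only when
# kubectl reports "yes" for creating cluster roles as the current user.
can_create_clusterroles() {
  answer=$(kubectl auth can-i create clusterroles 2>/dev/null)
  [ "$answer" = "yes" ]
}

# Usage:
#   can_create_clusterroles && echo "ready to deploy" || echo "binding missing"
```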

Deploy Kubeflow Pipelines

Go to the release page to find a version of the pipelines library. Deploy Kubeflow pipelines to your cluster.

For example:

PIPELINE_VERSION=0.1.1
kubectl create -f https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml

Run kubectl get job. You should see a job that deploys Kubeflow pipelines along with all its dependencies in the cluster. Wait for the number of successful job runs to reach 1:

NAME                      DESIRED   SUCCESSFUL   AGE
deploy-ml-pipeline-wjqwt  1         1            9m
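
Rather than rerunning kubectl get job by hand, the wait can be scripted. The sketch below polls the job's .status.succeeded field until it reaches 1 or a timeout expires. The wait_for_job function, its default timeout of 600 seconds, and its polling interval of 10 seconds are illustrative assumptions, not part of the pipelines release.

```shell
# wait_for_job JOB_NAME [TIMEOUT_SECONDS] [INTERVAL_SECONDS]
# Hypothetical helper: polls the job until one run has succeeded.
wait_for_job() {
  job=$1
  timeout=${2:-600}
  interval=${3:-10}
  elapsed=0
  while [ "$elapsed" -lt "$timeout" ]; do
    # .status.succeeded is empty until the first successful run.
    succeeded=$(kubectl get job "$job" -o jsonpath='{.status.succeeded}' 2>/dev/null)
    if [ "${succeeded:-0}" -ge 1 ]; then
      echo "job $job succeeded"
      return 0
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
  echo "timed out waiting for job $job" >&2
  return 1
}

# Usage, with the job name taken from kubectl get job:
#   wait_for_job deploy-ml-pipeline-wjqwt
```

Newer kubectl versions also provide kubectl wait --for=condition=complete, which serves the same purpose without a polling loop.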

If the deployment fails, check the deployment log, replacing [JOB_NAME] with the name of the job shown above:

kubectl logs $(kubectl get pods -l job-name=[JOB_NAME] -o jsonpath='{.items[0].metadata.name}')

By default, the Kubeflow pipelines service is deployed with usage collection turned on. We use Spartakus, which does not report any personally identifiable information (PII).

When deployment is successful, forward its port to visit the Kubeflow pipelines UI:

export NAMESPACE=kubeflow
kubectl port-forward -n ${NAMESPACE} $(kubectl get pods -n ${NAMESPACE} --selector=service=ambassador -o jsonpath='{.items[0].metadata.name}') 8080:80

If you are using Cloud Shell, you can view the UI by clicking the Web Preview button and navigating to the /pipeline URL. Make sure the preview is set to port 8080.

If you are using a local console instead of Cloud Shell, you can access the Kubeflow Pipelines UI at localhost:8080/pipeline.
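
While the port-forward is running, you can probe the UI endpoint from a second terminal before opening a browser. This is a minimal sketch using curl; the ui_status helper name is hypothetical, and the exact status code the UI returns is not stated in this guide, so the sketch only prints it.

```shell
# ui_status [URL]
# Hypothetical helper: prints the HTTP status code returned by the UI.
# curl prints 000 when it cannot connect, for example if the
# port-forward is not running.
ui_status() {
  url=${1:-http://localhost:8080/pipeline}
  curl -s -o /dev/null -w '%{http_code}' "$url"
}

# Usage:
#   ui_status
```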

Run your first pipeline

Follow the samples guide to compile a sample and deploy your first pipeline.

Disable usage reporting

If you want to turn off the usage report, you can download the bootstrapper file and change the arguments to the deployment job.

For example, download the bootstrapper file:

PIPELINE_VERSION=0.0.42
curl https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml --output bootstrapper.yaml

Then update the arguments in the file:

        args: [
          ... 
          # uncomment following line
          "--report_usage", "false",
          ...
        ]

Then create the job from the updated YAML by running kubectl create -f bootstrapper.yaml.
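
If you would rather script the edit than open the file by hand, sed can uncomment the argument. The sketch below assumes the flag is present in bootstrapper.yaml as a line commented out with a leading "#", as the excerpt above suggests. The enable_arg helper name is hypothetical. It keeps a backup copy of the original file with a .bak suffix.

```shell
# enable_arg FILE FLAG
# Hypothetical helper: removes the leading "#" from the line that starts
# with the quoted FLAG, keeping the original indentation.
# Assumes the line looks like:   # "--report_usage", "false",
enable_arg() {
  file=$1
  flag=$2
  sed -i.bak -E "s|^([[:space:]]*)#[[:space:]]*(\"$flag\")|\\1\\2|" "$file"
}

# Usage:
#   enable_arg bootstrapper.yaml --report_usage
#   kubectl create -f bootstrapper.yaml
```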

Uninstall

To uninstall Kubeflow pipelines, download the bootstrapper file and change the arguments to the deployment job.

For example, download the bootstrapper file:

PIPELINE_VERSION=0.0.42
curl https://storage.googleapis.com/ml-pipeline/release/$PIPELINE_VERSION/bootstrapper.yaml --output bootstrapper.yaml

Then update the arguments in the file:

        args: [
          ... 
          # uncomment following line
          "--uninstall",
          ...
        ]

Then create the job from the updated YAML by running kubectl create -f bootstrapper.yaml.
