It provisions all necessary Nebius AI Cloud resources:
- Managed Service for Kubernetes cluster
- Object Storage bucket
- Managed Service for PostgreSQL cluster
After that, it deploys Metaflow services on the Managed Kubernetes cluster.
- Install Terraform.
- Install kubectl.
- Install and configure the Nebius AI Cloud CLI.
After installing the prerequisite tools, source the environment script for your shell: `source ./.envrc.zsh` for zsh, or the corresponding bash script if you run bash.
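Optionally, you can confirm the tools are available before continuing. This is a minimal sketch; the Nebius CLI binary name and version flag are assumptions based on common conventions, so adjust them to match your installation:

```bash
terraform version          # Terraform CLI
kubectl version --client   # kubectl client only; no cluster connection needed yet
nebius --version           # Nebius AI Cloud CLI (assumed binary name and flag)
```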
The templates are organized into two modules, `infra` and `services`.
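For orientation, the module addresses used later with `-target` (`module.infra` and `module.services`) refer to module blocks in the root configuration, roughly along these lines. This is a hypothetical sketch, not the actual template; the source paths and inputs are illustrative only:

```hcl
# Hypothetical root configuration: the module names below are what
# -target="module.infra" and -target="module.services" address.
module "infra" {
  source     = "./infra"          # illustrative path
  org_prefix = var.org_prefix
}

module "services" {
  source = "./services"           # illustrative path
  # In practice this module would consume outputs of module.infra
  # (cluster, bucket, and database connection details).
}
```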
- Create `terraform.tfvars` with Terraform variables:

      org_prefix = "yourorg"

  This is used to help generate unique names for the following resources:

  - Object Storage bucket
  - Managed PostgreSQL cluster

  You can also add `tenant_id`, `project_id` and `vpc_subnet_id` to `terraform.tfvars`:

      tenant_id = "tenant-***"
      project_id = "project-***"
      vpc_subnet_id = "vpcsubnet-***"

  - To get the tenant and project IDs, open the project menu at the top of the web console and click ︙ → Copy tenant ID next to the tenant name and ︙ → Copy project ID next to the project name.
  - To get the subnet ID, in the web console, go to Network and click ︙ → Copy subnet ID next to the subnet name.
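  If you include the optional IDs, the finished file might look like the sketch below; the `***` values are placeholders for the IDs copied from the web console:

  ```hcl
  org_prefix    = "yourorg"
  tenant_id     = "tenant-***"
  project_id    = "project-***"
  vpc_subnet_id = "vpcsubnet-***"
  ```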
- Apply the `infra` module that creates the Nebius AI Cloud resources:

      terraform init
      terraform apply -target="module.infra" -var-file=terraform.tfvars
- Set up authentication and authorization for the service account created with `infra`:

  - In the web console, go to Access → Service accounts.
  - Find the service account whose name starts with `stmetaflow` and click on it.
  - Under the service account name, click Add to group.
  - Add the account to the `editors` group and click Close.
  - Click Create access key.
  - Copy the key ID and the secret key and add them to `terraform.tfvars`:

        aws_access_key_id = "" # Key ID
        aws_secret_access_key = "" # Secret key
- Apply the `services` module:

      terraform apply -target="module.services" -var-file=terraform.tfvars
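Once both modules are applied and your kubeconfig points at the new Managed Kubernetes cluster, a quick sanity check can confirm that the Metaflow services came up. This is a sketch only; the namespaces and pod names depend on the defaults in the `services` module:

```bash
# List all workloads on the cluster; look for the Metaflow service pods in the output.
kubectl get pods --all-namespaces
```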
Note: This template only provides a quick start for testing purposes. We do not recommend it for real production deployments.
By default, this Terraform template does not deploy Airflow. To deploy it, set the `deploy_airflow` variable to `true` in `terraform.tfvars`:

    deploy_airflow = true
If `deploy_airflow` is set to `true`, the `services` module will deploy Airflow on the Managed Kubernetes cluster deployed by the `infra` module. It uses the official Helm chart.
The Terraform template deploys Airflow configured with a `LocalExecutor` for simplicity. Metaflow can work with any Airflow executor.
If you changed the value of `deploy_airflow` for an existing deployment, reapply both the `infra` and `services` modules as described in the instructions.
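For example, after changing `deploy_airflow` in `terraform.tfvars`, rerunning the same apply commands used in the deployment steps picks up the change:

```bash
terraform apply -target="module.infra" -var-file=terraform.tfvars
terraform apply -target="module.services" -var-file=terraform.tfvars
```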
Airflow expects Python files with Airflow DAGs to be present in the `dags_folder`. By default, this Terraform template uses the default path set in the Airflow Helm chart, which is `{AIRFLOW_HOME}/dags` (`/opt/airflow/dags`).
The metaflow-tools repository also ships an `airflow_dag_upload.py` file that can help sync Airflow DAG files generated by Metaflow to the Airflow scheduler deployed by this template. Under the hood, `airflow_dag_upload.py` uses the `kubectl cp` command to copy files from your local machine to the Airflow scheduler's container. Example of how to use the file:

    python airflow_dag_upload.py my-dag.py /opt/airflow/dags/my-dag.py
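Putting it together, a typical workflow is to compile a flow into an Airflow DAG with Metaflow and then push it to the scheduler with the helper script. In this sketch, `my_flow.py` is a hypothetical Metaflow flow file and the DAG file name is up to you:

```bash
# Generate an Airflow DAG file from a Metaflow flow...
python my_flow.py airflow create my-dag.py
# ...and copy it into the scheduler's dags_folder via kubectl cp.
python airflow_dag_upload.py my-dag.py /opt/airflow/dags/my-dag.py
```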
By default, Terraform manages the state of the Nebius AI Cloud resources in local `tfstate` files.
If you plan to maintain the minimal stack for any significant period of time, it is highly recommended to store the state files in cloud storage instead, such as Object Storage in Nebius AI Cloud. This is especially useful in the following cases:
- More than one person needs to manage the stack by using Terraform. Everyone should work off a single copy of the state file.
- You want to mitigate the risk of data loss on your local disk.
For more details, see the Terraform documentation.
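As an illustration, a remote backend that keeps the state in an Object Storage bucket could look roughly like the sketch below. The bucket, key, region, and endpoint are placeholders to replace with your own values; because Nebius Object Storage is S3-compatible, Terraform's `s3` backend is used with a custom endpoint, and the exact set of `skip_*` options depends on your Terraform version:

```hcl
terraform {
  backend "s3" {
    bucket = "yourorg-terraform-state"   # placeholder bucket name
    key    = "metaflow/terraform.tfstate"
    region = "eu-north1"                 # placeholder region

    endpoints = {
      s3 = "https://storage.eu-north1.nebius.cloud"  # placeholder endpoint; check the Nebius docs for your region
    }

    # Credentials are read from the usual AWS mechanisms, e.g. the
    # AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables.

    # The backend talks to an S3-compatible service rather than AWS itself,
    # so AWS-specific checks are disabled.
    skip_credentials_validation = true
    skip_region_validation      = true
    skip_requesting_account_id  = true
    skip_metadata_api_check     = true
    skip_s3_checksum            = true
  }
}
```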
To destroy `infra`, run:

    terraform destroy -target="module.infra" -var-file=terraform.tfvars

To destroy `services`, run:

    terraform destroy -target="module.services" -var-file=terraform.tfvars