CI: add build and push models workflow #474

Draft
wants to merge 2 commits into
base: main
62 changes: 62 additions & 0 deletions .github/workflows/142be17d7563c3499b548dae913cabd7b8242f78.patch
@@ -0,0 +1,62 @@
From 142be17d7563c3499b548dae913cabd7b8242f78 Mon Sep 17 00:00:00 2001
From: Jordi Massaguer Pla <jmassaguerpla@suse.com>
Date: Tue, 14 Nov 2023 10:30:15 +0100
Subject: [PATCH] Fix using no-cache option for the container build

If we specify no-cache, we should not add the local cache with the
from-cache and to-cache parameters. Otherwise, we get the error

```
WARNING: local cache import at /home/adminuser/.holoscan_build_cache
not found due to err: could not read
/home/adminuser/.holoscan_build_cache/index.json: open
/home/adminuser/.holoscan_build_cache/index.json: no such file or directory
```
where adminuser is the user that runs the build.

This is important for CI, where we do not have any cache to start with.

Signed-off-by: Jordi Massaguer Pla <jmassaguerpla@suse.com>
---
python/holoscan/cli/packager/container_builder.py | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/python/holoscan/cli/packager/container_builder.py b/python/holoscan/cli/packager/container_builder.py
index 92edd91..ca6e141 100644
--- a/python/holoscan/cli/packager/container_builder.py
+++ b/python/holoscan/cli/packager/container_builder.py
@@ -89,9 +89,11 @@ def _build_internal(
builder = create_and_get_builder(Constants.LOCAL_BUILDX_BUILDER_NAME)

build_result = PlatformBuildResults(platform_parameters)
-
- cache_to = {"type": "local", "dest": self._build_parameters.build_cache}
- cache_from = [{"type": "local", "src": self._build_parameters.build_cache}]
+ cache_to = {}
+ cache_from = []
+ if not self._build_parameters.no_cache:
+ cache_to = {"type": "local", "dest": self._build_parameters.build_cache}
+ cache_from = [{"type": "local", "src": self._build_parameters.build_cache}]
if platform_parameters.base_image is not None:
cache_from.append({"type": "registry", "ref": platform_parameters.base_image})
if platform_parameters.build_image is not None:
@@ -99,8 +101,6 @@ def _build_internal(
builds = {
"builder": builder,
"cache": not self._build_parameters.no_cache,
- "cache_from": cache_from,
- "cache_to": cache_to,
"context_path": self._temp_dir,
"file": dockerfile,
"platforms": [platform_parameters.docker_arch],
@@ -108,6 +108,10 @@ def _build_internal(
"pull": True,
"tags": [platform_parameters.tag],
}
+ if cache_to != {}:
+ builds["cache_to"] = cache_to
+ if cache_from != []:
+ builds["cache_from"] = cache_from

export_to_tar_ball = False
if self._build_parameters.tarball_output is not None:
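
The cache handling introduced by this patch can be read in isolation as a small pure function. The following is a minimal sketch, not the Holoscan code itself: the function name `cache_build_args` and its flat parameters are hypothetical stand-ins for `self._build_parameters` and `platform_parameters`, and only the `base_image` registry entry visible in the hunk is modeled.

```python
from typing import Any, Dict, List, Optional


def cache_build_args(
    no_cache: bool,
    build_cache: str,
    base_image: Optional[str] = None,
) -> Dict[str, Any]:
    """Return only the cache-related build arguments (hypothetical helper)."""
    cache_to: Dict[str, str] = {}
    cache_from: List[Dict[str, str]] = []

    # The local cache is only wired in when caching is enabled.
    if not no_cache:
        cache_to = {"type": "local", "dest": build_cache}
        cache_from = [{"type": "local", "src": build_cache}]

    # As in the patched hunk, the registry entry is appended in both cases.
    if base_image is not None:
        cache_from.append({"type": "registry", "ref": base_image})

    args: Dict[str, Any] = {"cache": not no_cache}
    # Empty values are left out instead of being passed to the builder.
    if cache_to:
        args["cache_to"] = cache_to
    if cache_from:
        args["cache_from"] = cache_from
    return args
```

With `no_cache=True` and no base image, the result contains neither `cache_to` nor `cache_from`, which is the case that matters on a fresh CI machine with no cache directory.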
129 changes: 129 additions & 0 deletions .github/workflows/build_and_push_models.yml
@@ -0,0 +1,129 @@
# This workflow installs the Python dependencies, builds the latest models as containers, and pushes the resulting containers to the registry
# TODO: Cache the docker images to speed up the build
# TODO: Can we store the dependencies somewhere (predownloaded, a custom image, a container registry, our artifact server...) so that the build is always reproducible?

name: build_and_push_models

# This is triggered manually. It could be changed to be triggered by new pushed tags.
on: workflow_dispatch

# The version could be inferred from the tag if this workflow were triggered by a tag push
# FIXME: The Python version could be inferred by running "python --version" inside the containers; CP is the Python version without the '.'
# FIXME: The wheel name could be generated dynamically, for example by running "ls" on the download folder
# ARM environment variables are used by the terraform azure provider for authentication using a client secret.
# See https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/guides/service_principal_client_secret
# https://github.com/Azure-Samples/terraform-github-actions/blob/main/.github/workflows/tf-plan-apply.yml
env:
ARM_CLIENT_ID: "${{ secrets.AZURE_CLIENT_ID }}"
ARM_SUBSCRIPTION_ID: "${{ secrets.AZURE_SUBSCRIPTION_ID }}"
ARM_TENANT_ID: "${{ secrets.AZURE_TENANT_ID }}"
ARM_CLIENT_SECRET: "${{ secrets.AZURE_CLIENT_SECRET }}"
VERSION: "0.6.0"
PYTHON_VERSION: "3.8"
CP_VERSION: "38"
DOCKER_IMAGE_TAG: "latest"
APP_IMAGE_NAME: "simple_app"
PLATFORM: "x64-workstation"
DOCKER_IMAGE_NAME: "simple_app-x64-workstation-dgpu-linux-amd64-latest"
DOCKER_IMAGE_NAME_SHORT: "simple_app-x64-workstation-dgpu-linux-amd64"
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
APP: "examples/apps/simple_imaging_app"

jobs:
do:
runs-on: ubuntu-latest
permissions:
contents: read
packages: write
id-token: write
steps:
- uses: actions/checkout@v2
- name: Disclaimers
run: |
echo "!!! WARNING !!! This is a hackweek project (https://hackweek.opensuse.org/23/projects/package-monai-machine-learning-models-for-medical-applications). It is not meant for production or clinical use and comes with no guarantee of any kind; use at your own risk. !!! WARNING !!!"
- name: Show disk space
run: df -h
# Install the latest version of the Terraform CLI
- name: Setup Terraform
uses: hashicorp/setup-terraform@v2
with:
terraform_wrapper: false
- name: Initialize a new Terraform working directory
run: terraform init
- name: Check Terraform configuration files format
run: terraform fmt -check
- name: Generate unique SSH Key
run: ssh-keygen -t rsa -f /tmp/ssh_id_gh -N ""
- name: Terraform Apply
run: terraform apply -auto-approve
- name: Get IP address
run: echo "AZURE_IPADDRESS=$(terraform output | grep instance_public_ip | cut -d\" -f2)" >> $GITHUB_ENV
- name: Output ip address
run: echo "AZURE_IPADDRESS=$AZURE_IPADDRESS"
- name: Test connection
# We use StrictHostKeyChecking=no to accept the SSH fingerprint on the first connection
run: ssh -i /tmp/ssh_id_gh -o StrictHostKeyChecking=no adminuser@$AZURE_IPADDRESS "sudo uname -a"
- name: Add fixed libseccomp package
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "sudo zypper ar -G https://download.opensuse.org/repositories/home:/jordimassaguerpla:/branches:/openSUSE:/Leap:/15.5:/Update/pool-leap-15.5/home:jordimassaguerpla:branches:openSUSE:Leap:15.5:Update.repo && sudo zypper ref && sudo zypper -n install --from home_jordimassaguerpla_branches_openSUSE_Leap_15.5_Update --allow-vendor-change libseccomp"
- name: Install Deps
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "sudo zypper ar -G https://developer.download.nvidia.com/compute/cuda/repos/opensuse15/x86_64/ nvidia && sudo zypper ref && sudo zypper --non-interactive install patch python39 docker-buildx nvidia-container-toolkit nvidia-computeG05 cuda-cudart-devel-11-0 libyaml-cpp0_6 trivy && wget -c https://bootstrap.pypa.io/get-pip.py && python3.9 get-pip.py && python3.9 --version"
- name: Setup Nvidia container
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "sudo usermod -G docker,video adminuser && sudo nvidia-ctk runtime configure --runtime=docker && sudo nvidia-ctk runtime configure --runtime=containerd && sudo systemctl start docker && sudo systemctl start containerd && sudo sed -e \"s/user = \\\"\\\"/user = \\\"adminuser:video\\\"/g \" -i /etc/nvidia-container-runtime/config.toml && sudo modprobe nvidia"
- name: Check nvidia
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "sudo systemctl start docker && nvidia-smi && docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi"
- name: Install Monai Deploy Sdk and Holoscan
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "python3.9 -m pip install monai-deploy-app-sdk holoscan"
- name: Fix Holoscan
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} 'cd /home/adminuser/.local/lib/python3.9/site-packages/holoscan/lib ; to_link=$(ls libholoscan_*.so*);for i in $to_link; do name=$(echo $i | cut -d. -f1); ln -sv $name.so.'"$VERSION"' $name.so.0;done'
- name: Copy holoscan patch
run: scp -i /tmp/ssh_id_gh .github/workflows/142be17d7563c3499b548dae913cabd7b8242f78.patch adminuser@${AZURE_IPADDRESS}:/home/adminuser/.local/lib/python3.9/site-packages/holoscan
- name: Patch Holoscan
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "cd /home/adminuser/.local/lib/python3.9/site-packages/holoscan; patch -p3 < 142be17d7563c3499b548dae913cabd7b8242f78.patch"
- name: Make work dir
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "mkdir /home/adminuser/work"
- name: Download wheels
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "cd /home/adminuser/work && python3.9 -m pip download --no-deps --python-version=$PYTHON_VERSION holoscan==$VERSION && python3.9 -m pip download --no-deps monai-deploy-app-sdk==$VERSION"
- name: Copy example code
run: scp -i /tmp/ssh_id_gh -r * adminuser@${AZURE_IPADDRESS}:/home/adminuser/work
- name: Monai Deploy package
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "mkdir /home/adminuser/work/output && cd /home/adminuser/work && monai-deploy package --no-cache /home/adminuser/work/$APP -c /home/adminuser/work/$APP/app.yaml -t $APP_IMAGE_NAME:$DOCKER_IMAGE_TAG --platform $PLATFORM -l DEBUG --holoscan-sdk-file=/home/adminuser/work/holoscan-$VERSION-cp$CP_VERSION-cp$CP_VERSION-manylinux2014_x86_64.whl --monai-deploy-sdk-file=/home/adminuser/work/monai_deploy_app_sdk-$VERSION-py3-none-any.whl --platform-config dgpu --gid 1000 --output /home/adminuser/work/output"
- name: Build SBOM
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "trivy image --format spdx-json --input /home/adminuser/work/output/$DOCKER_IMAGE_NAME.tar > /home/adminuser/work/output/sbom.spdx.json"
- name: Size of docker image
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "du -hs /home/adminuser/work/output/*"
- name: Compress docker image
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "cd /home/adminuser/work/output && gzip $DOCKER_IMAGE_NAME.tar"
- name: Size of docker image
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "du -hs /home/adminuser/work/output/*"
- name: Show disk space
run: df -h
- name: Load docker image
run: ssh -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS} "cat /home/adminuser/work/output/$DOCKER_IMAGE_NAME.tar.gz" | docker load
- name: Get digest
run: echo "IMAGE_DIGEST=$(docker images --no-trunc -q $DOCKER_IMAGE_NAME_SHORT:$DOCKER_IMAGE_TAG)" >> $GITHUB_ENV
- name: Copy SBOM
run: scp -i /tmp/ssh_id_gh adminuser@${AZURE_IPADDRESS}:/home/adminuser/work/output/sbom.spdx.json .
- name: Log in to the Container registry
uses: docker/login-action@v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Tag Docker image
run: docker tag $DOCKER_IMAGE_NAME_SHORT:$DOCKER_IMAGE_TAG $REGISTRY/$IMAGE_NAME/$DOCKER_IMAGE_NAME_SHORT:$DOCKER_IMAGE_TAG
- name: Push Docker image
run: docker push $REGISTRY/$IMAGE_NAME/$DOCKER_IMAGE_NAME_SHORT:$DOCKER_IMAGE_TAG
- name: Install sigstore cosign
uses: sigstore/cosign-installer@main
- name: Sign image
env:
COSIGN_EXPERIMENTAL: "true"
run: cosign sign --yes ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ env.IMAGE_DIGEST }}
- name: Sign attestations
env:
COSIGN_EXPERIMENTAL: "true"
run: cosign attest --yes --type spdx --predicate sbom.spdx.json ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}@${{ env.IMAGE_DIGEST }}
- name: Terraform Destroy
if: ${{ always() }}
run: terraform destroy -auto-approve
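
The FIXME comments in the `env` block note that `CP_VERSION` and the wheel file names could be derived rather than hard-coded. The following is a minimal sketch of that derivation in Python. The helper names are hypothetical, the wheel name patterns are copied from the `monai-deploy package` step, and the image name pattern is an assumption inferred from the `DOCKER_IMAGE_NAME` value, not a documented naming rule.

```python
def cp_tag(python_version: str) -> str:
    """Return the Python version without the dot, e.g. "3.8" -> "38"."""
    major, minor = python_version.split(".")[:2]
    return f"{major}{minor}"


def holoscan_wheel(version: str, python_version: str) -> str:
    """Wheel file name as used by --holoscan-sdk-file in the workflow."""
    cp = cp_tag(python_version)
    return f"holoscan-{version}-cp{cp}-cp{cp}-manylinux2014_x86_64.whl"


def monai_deploy_wheel(version: str) -> str:
    """Wheel file name as used by --monai-deploy-sdk-file in the workflow."""
    return f"monai_deploy_app_sdk-{version}-py3-none-any.whl"


def image_tarball(app_image_name: str, platform: str,
                  platform_config: str, tag: str) -> str:
    """Tarball name; the pattern is an assumption based on DOCKER_IMAGE_NAME."""
    return f"{app_image_name}-{platform}-{platform_config}-linux-amd64-{tag}.tar"


if __name__ == "__main__":
    print(holoscan_wheel("0.6.0", "3.8"))
    print(monai_deploy_wheel("0.6.0"))
    print(image_tarball("simple_app", "x64-workstation", "dgpu", "latest"))
```

Deriving these values from `VERSION`, `PYTHON_VERSION`, `APP_IMAGE_NAME`, and `PLATFORM` would leave fewer variables to keep in sync by hand.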
139 changes: 139 additions & 0 deletions main.tf
@@ -0,0 +1,139 @@
/*
Export these environment variables before running Terraform with this file:
ARM_CLIENT_ID
ARM_SUBSCRIPTION_ID
ARM_TENANT_ID
ARM_CLIENT_SECRET
*/

# We strongly recommend using the required_providers block to set the
# Azure Provider source and version being used
terraform {
required_providers {
azurerm = {
source = "hashicorp/azurerm"
version = "=3.0.0"
}
}
}

# Configure the Microsoft Azure Provider
provider "azurerm" {
features {}
}

# Create a resource group
resource "azurerm_resource_group" "gh-actions-build-monai-models-resource-group" {
name = "gh-actions-build-monai-models-resource-group"
location = "West Europe"
}

# Create a virtual network within the resource group
resource "azurerm_virtual_network" "gh-actions-build-monai-models-virtual-network" {
name = "gh-actions-build-monai-models-virtual-network"
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name
location = azurerm_resource_group.gh-actions-build-monai-models-resource-group.location
address_space = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "gh-actions-build-monai-models-internal-subnet" {
name = "gh-actions-build-monai-models-internal-subnet"
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name
virtual_network_name = azurerm_virtual_network.gh-actions-build-monai-models-virtual-network.name
address_prefixes = ["10.0.2.0/24"]
}

# Create public IPs
resource "azurerm_public_ip" "gh-actions-build-monai-models-public-ip" {
name = "gh-actions-build-monai-models-public-ip"
location = azurerm_resource_group.gh-actions-build-monai-models-resource-group.location
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name
allocation_method = "Dynamic"
}

resource "azurerm_network_interface" "gh-actions-build-monai-models-network-interface" {
name = "gh-actions-build-monai-models-network-interface"
location = azurerm_resource_group.gh-actions-build-monai-models-resource-group.location
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name

ip_configuration {
name = "gh-actions-build-monai-models-network-interface-ip-configuration"
subnet_id = azurerm_subnet.gh-actions-build-monai-models-internal-subnet.id
private_ip_address_allocation = "Dynamic"
public_ip_address_id = azurerm_public_ip.gh-actions-build-monai-models-public-ip.id
}
}

# Create Network Security Group and rule
resource "azurerm_network_security_group" "gh-actions-build-monai-models-nsg" {
name = "gh-actions-build-monai-models-nsg"
location = azurerm_resource_group.gh-actions-build-monai-models-resource-group.location
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name

security_rule {
name = "SSH"
priority = 1001
direction = "Inbound"
access = "Allow"
protocol = "Tcp"
source_port_range = "*"
destination_port_range = "22"
source_address_prefix = "*"
destination_address_prefix = "*"
}
}

# Connect the security group to the network interface
resource "azurerm_network_interface_security_group_association" "gh-actions-build-monai-models-ga" {
network_interface_id = azurerm_network_interface.gh-actions-build-monai-models-network-interface.id
network_security_group_id = azurerm_network_security_group.gh-actions-build-monai-models-nsg.id
}

resource "azurerm_linux_virtual_machine" "gh-actions-build-monai-models-vm" {
name = "gh-actions-build-monai-models-vm"
resource_group_name = azurerm_resource_group.gh-actions-build-monai-models-resource-group.name
location = azurerm_resource_group.gh-actions-build-monai-models-resource-group.location
// Standard_NC4as_T4_v3 includes a GPU. This size has a cost associated with it!
size = "Standard_NC4as_T4_v3"
admin_username = "adminuser"
network_interface_ids = [
azurerm_network_interface.gh-actions-build-monai-models-network-interface.id,
]

admin_ssh_key {
username = "adminuser"
public_key = file("/tmp/ssh_id_gh.pub") // This file must exist on the machine where Terraform runs
}

os_disk {
caching = "ReadWrite"
storage_account_type = "StandardSSD_LRS"
# With the default 30 GB disk, docker fails to load and export the image
disk_size_gb = 64
}

source_image_reference {
publisher = "SUSE"
offer = "opensuse-leap-15-5"
sku = "gen2"
version = "latest"
}
}

resource "null_resource" "example" {
provisioner "remote-exec" {
connection {
host = azurerm_linux_virtual_machine.gh-actions-build-monai-models-vm.public_ip_address
user = "adminuser"
private_key = file("/tmp/ssh_id_gh")
}

inline = ["echo 'connected!'"]
}
}

output "instance_public_ip" {
description = "Public IP address"
value = azurerm_linux_virtual_machine.gh-actions-build-monai-models-vm.public_ip_address
}
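
The workflow reads the `instance_public_ip` output by filtering the human-readable `terraform output` text. The following is a minimal sketch of reading the same value from structured output instead. It assumes the JSON printed by `terraform output -json` maps each output name to an object with a `value` field; the helper name `public_ip_from_outputs` and the sample address are hypothetical.

```python
import json


def public_ip_from_outputs(outputs_json: str,
                           name: str = "instance_public_ip") -> str:
    """Extract one output value from `terraform output -json` text."""
    outputs = json.loads(outputs_json)
    value = outputs[name]["value"]
    # Fail early rather than passing an empty address on to ssh.
    if not value:
        raise ValueError(f"Terraform output {name!r} is empty")
    return value


if __name__ == "__main__":
    # Sample payload for illustration only; 203.0.113.10 is a documentation address.
    sample = json.dumps({
        "instance_public_ip": {
            "sensitive": False,
            "type": "string",
            "value": "203.0.113.10",
        }
    })
    print(public_ip_from_outputs(sample))
```

Raising on an empty value makes a missing address fail at this step instead of at the first `ssh` call.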