Overview of the Issue
If the Packer VM is:
- configured to use startup-script metadata to perform customization
- given a service account that does not have the power to modify its own metadata

then the packer process gets stuck in an infinite loop. The guidance to the user is not very informative. My thoughts:
1. Modify retry.Config to put a limit on the number of Tries or StartTimeout
2. Improve the guidance to the user at "Error getting startup script status" to help them understand that the service account probably needs the permission to modify its own instance metadata
3. Whatever process attempts to update the instance metadata should probably have a retry mechanism

These could be done separately. 1 and 2 are probably obvious. The reasoning behind 3 may not be. If you create a service account on Google Cloud and assign it IAM roles, those roles are not applied immediately but have a known propagation delay. Thus an automation pipeline might create the service account, assign it adequate permissions, and Packer might nevertheless fail.
Each timeout might reasonably be 10 minutes to account for worst-case propagation delay.
Reproduction Steps
Begin by creating a service account without any IAM roles:
gcloud iam service-accounts create failure \
--description="SA" \
--display-name="failure"
Then supply that project_id and service account to the template below.
Plugin and Packer version
Packer v1.10.2
Plugin latest as shown below
Simplified Packer Buildfile
source "googlecompute" "toolkit_image" {
  project_id            = var.project_id
  communicator          = "none"
  image_name            = "repro-fail"
  machine_type          = "n2-standard-8"
  disk_size             = 32
  disk_type             = "pd-balanced"
  omit_external_ip      = true
  use_internal_ip       = true
  subnetwork            = "default"
  zone                  = "us-central1-c"
  service_account_email = var.service_account_email
  scopes                = ["https://www.googleapis.com/auth/cloud-platform"]
  source_image_family   = "debian-12"
  metadata = {
    startup-script = <<-EOD
      #!/bin/bash
      /bin/true
    EOD
  }
}

build {
  name    = "test"
  sources = ["sources.googlecompute.toolkit_image"]
}

variable "project_id" {
  description = "Project in which to create VM and image"
  type        = string
}

variable "service_account_email" {
  description = "Service account email address"
  type        = string
}

packer {
  required_version = ">= 1.7.9, < 2.0.0"

  # packer plugin 1.0.16 and above includes HPC VM Image
  required_plugins {
    googlecompute = {
      version = "~> 1.1.0"
      source  = "github.com/hashicorp/googlecompute"
    }
  }
}
Log Fragments and crash.log files
tpdownes@poreef ~/repro> packer build -var project_id=my-project -var service_account_email=failure@my-project.iam.gserviceaccount.com .
test.googlecompute.toolkit_image: output will be in this color.
==> test.googlecompute.toolkit_image: Checking image does not exist...
==> test.googlecompute.toolkit_image: Creating temporary RSA SSH key for instance...
==> test.googlecompute.toolkit_image: no persistent disk to create
==> test.googlecompute.toolkit_image: Using image: debian-12-bookworm-v20240312
==> test.googlecompute.toolkit_image: Creating instance...
test.googlecompute.toolkit_image: Loading zone: us-central1-c
test.googlecompute.toolkit_image: Loading machine type: n2-standard-8
test.googlecompute.toolkit_image: Requesting instance creation...
test.googlecompute.toolkit_image: Waiting for creation operation to complete...
test.googlecompute.toolkit_image: Instance has been created!
==> test.googlecompute.toolkit_image: Waiting for the instance to become running...
test.googlecompute.toolkit_image: IP: 10.128.0.10
==> test.googlecompute.toolkit_image: Waiting for any running startup script to finish...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
Cancelling build after receiving interrupt
test.googlecompute.toolkit_image: Metadata startup-script-status on instance packer-6615be2c-4509-e09b-a563-a2a3fcc15cf6 not available. Waiting...
==> test.googlecompute.toolkit_image: Error waiting for startup script to finish: Error getting startup script status: Instance metadata key, startup-script-status, not found.
tpdownes changed the title from "Startup script solution gets stuck in loop with infinite timeout" to "Startup script solution gets stuck in infinite loop" on Apr 9, 2024
Another thought: I believe you can eliminate the need for IAM permissions entirely by modifying and polling VM guest attributes rather than instance metadata.
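As a rough illustration of the guest-attributes idea, the sketch below builds (without sending) the request that a process inside the VM could issue to record a status value through the metadata server. It assumes guest attributes are enabled on the instance (the enable-guest-attributes metadata key set to TRUE). The "packer" namespace, the "done" value, and the newGuestAttributeRequest helper are hypothetical; the plugin does not necessarily work this way.

```go
package main

import (
	"fmt"
	"net/http"
	"strings"
)

// metadataBase is the in-VM metadata server path for guest attributes.
const metadataBase = "http://metadata.google.internal/computeMetadata/v1/instance/guest-attributes"

// newGuestAttributeRequest builds, but does not send, a PUT request that
// would set <namespace>/<key> to value. Hypothetical helper for illustration.
func newGuestAttributeRequest(base, namespace, key, value string) (*http.Request, error) {
	url := fmt.Sprintf("%s/%s/%s", strings.TrimRight(base, "/"), namespace, key)
	req, err := http.NewRequest(http.MethodPut, url, strings.NewReader(value))
	if err != nil {
		return nil, err
	}
	// The metadata server rejects requests that lack this header.
	req.Header.Set("Metadata-Flavor", "Google")
	return req, nil
}

func main() {
	// Namespace and value are assumptions; only the key name comes from the log.
	req, err := newGuestAttributeRequest(metadataBase, "packer", "startup-script-status", "done")
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Because the write goes through the local metadata server, it would not depend on the roles granted to the VM's service account. The builder would then poll the value from outside, for example through the Compute API's getGuestAttributes method, using its own credentials.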