Breaking change with image_architecture in 1.1.5 #232

Closed
rifelpet opened this issue Jul 30, 2024 · 6 comments · Fixed by #234

@rifelpet

Overview of the Issue

The recent 1.1.5 release included a breaking change in #214.

When image_architecture is unset, it now defaults to X86_64. In prior versions, the value implicitly defaulted to the machine_type's architecture. This means that when building with an ARM64 machine_type and source_image, upgrading to 1.1.5 introduces this error:

Error waiting for image: googleapi: Error 400: Invalid value for field 'resource.architecture': 'X86_64'. Requested architecture must be the same as the source resource architecture (ARM64)., invalid

The error is fixed by setting image_architecture = "ARM64". This required change is not mentioned in the release notes.
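
A minimal sketch of that workaround, assuming the same source block as the simplified buildfile in this report. Only the image_architecture line is new; the elided values are left as they are:

```hcl
source "googlecompute" "example" {
  source_image_family = "ubuntu-minimal-2204-lts-arm64"
  machine_type        = "t2a-standard-1"

  # Workaround for 1.1.5: state the architecture explicitly so the image
  # request matches the ARM64 source disk instead of the X86_64 default.
  image_architecture = "ARM64"

  ssh_username = "ubuntu"
  project_id   = "..."
  image_name   = "..."
  zone         = "..."
  subnetwork   = "..."
}
```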

I think a better default for the image_architecture field is to use the machine_type's architecture, since it would be one less field that users need to add when building ARM64 images.

Reproduction Steps

Build an image from an ARM64 machine type and source image but without image_architecture set.

Plugin and Packer version

Packer v1.11.2

Simplified Packer Buildfile

The below configuration fails to build. Adjusting the required_plugins to 1.1.4 succeeds.

packer {
  required_plugins {
    googlecompute = {
      source  = "github.com/hashicorp/googlecompute"
      version = "= 1.1.5"
    }
  }
}

source "googlecompute" "example" {
  source_image_family = "ubuntu-minimal-2204-lts-arm64"
  machine_type        = "t2a-standard-1"
  ssh_username        = "ubuntu"
  project_id          = "..."
  image_name          = "..."
  zone                = "..."
  subnetwork          = "..."
}

build {
  name    = "ansible"
  sources = ["source.googlecompute.example"]
}
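
As an alternative to setting image_architecture, the plugin can be pinned to the release that builds successfully in this report. A sketch of the only block that changes, assuming the rest of the buildfile stays the same:

```hcl
packer {
  required_plugins {
    googlecompute = {
      source = "github.com/hashicorp/googlecompute"
      # Pin to 1.1.4, the release observed to build ARM64 images here.
      version = "= 1.1.4"
    }
  }
}
```

After changing the constraint, running packer init again should fetch the pinned plugin version.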

Operating system and Environment details

OS, Architecture, and any other information you can provide about the
environment.

Log Fragments and crash.log files

Here are logs from a successful build with 1.1.4:

ansible.googlecompute.example: output will be in this color.

==> ansible.googlecompute.example: Checking image does not exist...
==> ansible.googlecompute.example: Creating temporary RSA SSH key for instance...
==> ansible.googlecompute.example: no persistent disk to create
==> ansible.googlecompute.example: Using image: ubuntu-minimal-2204-jammy-arm64-v20240725
==> ansible.googlecompute.example: Creating instance...
    ansible.googlecompute.example: Loading zone: us-central1-a
    ansible.googlecompute.example: Loading machine type: t2a-standard-1
    ansible.googlecompute.example: Requesting instance creation...
    ansible.googlecompute.example: Waiting for creation operation to complete...
    ansible.googlecompute.example: Instance has been created!
==> ansible.googlecompute.example: Waiting for the instance to become running...
    ansible.googlecompute.example: IP: ...
==> ansible.googlecompute.example: Using SSH communicator to connect: ...
==> ansible.googlecompute.example: Waiting for SSH to become available...
==> ansible.googlecompute.example: Connected to SSH!
==> ansible.googlecompute.example: Deleting instance...
    ansible.googlecompute.example: Instance has been deleted!
==> ansible.googlecompute.example: Creating image...
==> ansible.googlecompute.example: Deleting disk...
    ansible.googlecompute.example: Disk has been deleted!
Build 'ansible.googlecompute.example' finished after 2 minutes 16 seconds.

==> Wait completed after 2 minutes 16 seconds

==> Builds finished. The artifacts of successful builds are:
--> ansible.googlecompute.example: A disk image was created in the '...' project: ...

Here are logs from a failing build with 1.1.5:

ansible.googlecompute.example: output will be in this color.

==> ansible.googlecompute.example: Checking image does not exist...
==> ansible.googlecompute.example: Creating temporary RSA SSH key for instance...
==> ansible.googlecompute.example: no persistent disk to create
==> ansible.googlecompute.example: Using image: ubuntu-minimal-2204-jammy-arm64-v20240725
==> ansible.googlecompute.example: Creating instance...
    ansible.googlecompute.example: Loading zone: us-central1-a
    ansible.googlecompute.example: Loading machine type: t2a-standard-1
    ansible.googlecompute.example: Requesting instance creation...
    ansible.googlecompute.example: Waiting for creation operation to complete...
    ansible.googlecompute.example: Instance has been created!
==> ansible.googlecompute.example: Waiting for the instance to become running...
    ansible.googlecompute.example: IP: ...
==> ansible.googlecompute.example: Using SSH communicator to connect: ...
==> ansible.googlecompute.example: Waiting for SSH to become available...
==> ansible.googlecompute.example: Connected to SSH!
==> ansible.googlecompute.example: Deleting instance...
    ansible.googlecompute.example: Instance has been deleted!
==> ansible.googlecompute.example: Creating image...
==> ansible.googlecompute.example: Error waiting for image: googleapi: Error 400: Invalid value for field 'resource.architecture': 'X86_64'. Requested architecture must be the same as the source resource architecture (ARM64)., invalid
==> ansible.googlecompute.example: Deleting disk...
    ansible.googlecompute.example: Disk has been deleted!
==> ansible.googlecompute.example: Provisioning step had errors: Running the cleanup provisioner, if present...
Build 'ansible.googlecompute.example' errored after 1 minute 44 seconds: Error waiting for image: googleapi: Error 400: Invalid value for field 'resource.architecture': 'X86_64'. Requested architecture must be the same as the source resource architecture (ARM64)., invalid

==> Wait completed after 1 minute 44 seconds

==> Some builds didn't complete successfully and had errors:
--> ansible.googlecompute.example: Error waiting for image: googleapi: Error 400: Invalid value for field 'resource.architecture': 'X86_64'. Requested architecture must be the same as the source resource architecture (ARM64)., invalid

==> Builds finished but no artifacts were created.

@rifelpet rifelpet added the bug label Jul 30, 2024
@ianchesal

I am also seeing this issue trying to build ARM64 images with the 1.1.5 release.

@lbajolet-hashicorp
Contributor

Hi @rifelpet,

Thanks for the call-out. I remember pointing this out in my review at the time, and opted to merge it as-is since it felt safe at first glance, but I was wrong. Sorry about this.

Looking at the API docs, it's not clear what information we can get about the instances or the image (the APIs don't seem to expose architecture on anything other than disks), but maybe we can just default to not specifying it. There is an ARCHITECTURE_UNSPECIFIED value in the enum, and I suspect it used to work in your case because that is the default value. I'm surprised, though, that @BrennenMM7 didn't experience the same behaviour in their case; maybe there is some subtlety in the image/instance involved?

Regarding deriving the architecture from the instance type, I don't think the API/SDK exposes that, unfortunately. The image should, however, so maybe we can use that if the arch is undefined in the configs.

I'll update the thread when I've got something. If possible, would you be able to test it once it's up, before we release? I'd like to make sure the change works before rolling it in.

Thanks!

@rifelpet
Author

rifelpet commented Aug 5, 2024

Hi @lbajolet-hashicorp

Yes I'm happy to test a potential fix.

@lbajolet-hashicorp
Contributor

Hi @rifelpet,

I've opened PR #234, which addresses this: the default value for image_architecture is now the empty string again, which is what gets sent to the APIs. I've added some acceptance tests that should make sure we don't end up with this again in the future, but if possible I'd suggest building the plugin and running some tests with your existing configs.

Not sure if you have seen the changes to how plugins are handled with Packer 1.11.x, but for reference I'll leave a link to our docs, I'd suggest using packer plugins install --path <binary> github.com/hashicorp/googlecompute to install it.

Thanks in advance!

@rifelpet
Author

rifelpet commented Aug 7, 2024

I confirmed that #234 fixes the problem 👍🏻

@lbajolet-hashicorp
Contributor

Thanks for the update and the test @rifelpet!

I've merged the change and will release the plugin today; hopefully that'll fix things for everyone.

Thanks again for reporting this, and sorry for the blunder in the first place!
