tools/gke-disk-image-builder: Fails with Message: Quota 'CPUS_ALL_REGIONS' exceeded. #849

Open
katilp opened this issue Oct 14, 2024 · 1 comment

katilp commented Oct 14, 2024

What happens?

The image-build script fails with the error Quota 'CPUS_ALL_REGIONS' exceeded, both when run directly with Go and when run through gcloud builds ...:

[...]
[secondary-disk-image]: 2024-10-14T22:45:07+02:00 Step "create-disk" (CreateDisks) successfully finished.
[secondary-disk-image]: 2024-10-14T22:45:07+02:00 Running step "create-instance" (CreateInstances)
[secondary-disk-image.create-instance]: 2024-10-14T22:45:07+02:00 CreateInstances: Creating instance "secondary-disk-image-instance".
[secondary-disk-image]: 2024-10-14T22:45:14+02:00 Error running workflow: step "create-instance" run error: operation failed &{ClientOperationId: CreationTimestamp: Description: EndTime:2024-10-14T13:45:15.481-07:00 Error:0xc000328230 HttpErrorMessage:FORBIDDEN HttpErrorStatusCode:403 Id:3138785068221217850 InsertTime:2024-10-14T13:45:09.545-07:00 InstancesBulkInsertOperationMetadata:<nil> Kind:compute#operation Name:operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 OperationGroupId: OperationType:insert Progress:100 Region: SelfLink:https://www.googleapis.com/compute/v1/projects//zones/europe-west4-a/operations/operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 SetCommonInstanceMetadataOperationMetadata:<nil> StartTime:2024-10-14T13:45:09.546-07:00 Status:DONE StatusMessage: TargetId:7167288959635045435 TargetLink:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a/instances/secondary-disk-image-instance User:<MY_EMAIL> Warnings:[] Zone:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a ServerResponse:{HTTPStatusCode:200 Header:map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Mon, 14 Oct 2024 20:45:15 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} ForceSendFields:[] NullFields:[]}:
Code: QUOTA_EXCEEDED
Message: Quota 'CPUS_ALL_REGIONS' exceeded.  Limit: 12.0 globally.
[secondary-disk-image]: 2024-10-14T22:45:14+02:00 Workflow "secondary-disk-image" cleaning up (this may take up to 2 minutes).
[secondary-disk-image]: 2024-10-14T22:45:15+02:00 Workflow "secondary-disk-image" finished cleanup.
2024/10/14 22:45:15 unable to generate disk image: step "create-instance" run error: operation failed &{ClientOperationId: CreationTimestamp: Description: EndTime:2024-10-14T13:45:15.481-07:00 Error:0xc000328230 HttpErrorMessage:FORBIDDEN HttpErrorStatusCode:403 Id:3138785068221217850 InsertTime:2024-10-14T13:45:09.545-07:00 InstancesBulkInsertOperationMetadata:<nil> Kind:compute#operation Name:operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 OperationGroupId: OperationType:insert Progress:100 Region: SelfLink:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a/operations/operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 SetCommonInstanceMetadataOperationMetadata:<nil> StartTime:2024-10-14T13:45:09.546-07:00 Status:DONE StatusMessage: TargetId:7167288959635045435 TargetLink:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a/instances/secondary-disk-image-instance User:<MY_EMAIL> Warnings:[] Zone:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a ServerResponse:{HTTPStatusCode:200 Header:map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Mon, 14 Oct 2024 20:45:15 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} ForceSendFields:[] NullFields:[]}:
Code: QUOTA_EXCEEDED
Message: Quota 'CPUS_ALL_REGIONS' exceeded.  Limit: 12.0 globally.
panic: unable to generate disk image: step "create-instance" run error: operation failed &{ClientOperationId: CreationTimestamp: Description: EndTime:2024-10-14T13:45:15.481-07:00 Error:0xc000328230 HttpErrorMessage:FORBIDDEN HttpErrorStatusCode:403 Id:3138785068221217850 InsertTime:2024-10-14T13:45:09.545-07:00 InstancesBulkInsertOperationMetadata:<nil> Kind:compute#operation Name:operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 OperationGroupId: OperationType:insert Progress:100 Region: SelfLink:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a/operations/operation-1728938708258-62475e98b4d71-14d19ec0-defb4629 SetCommonInstanceMetadataOperationMetadata:<nil> StartTime:2024-10-14T13:45:09.546-07:00 Status:DONE StatusMessage: TargetId:7167288959635045435 TargetLink:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a/instances/secondary-disk-image-instance User:<MY EMAIL> Warnings:[] Zone:https://www.googleapis.com/compute/v1/projects/<PROJECT_ID>/zones/europe-west4-a ServerResponse:{HTTPStatusCode:200 Header:map[Alt-Svc:[h3=":443"; ma=2592000,h3-29=":443"; ma=2592000] Cache-Control:[private] Content-Type:[application/json; charset=UTF-8] Date:[Mon, 14 Oct 2024 20:45:15 GMT] Server:[ESF] Vary:[Origin X-Origin Referer] X-Content-Type-Options:[nosniff] X-Frame-Options:[SAMEORIGIN] X-Xss-Protection:[0]]} ForceSendFields:[] NullFields:[]}:
Code: QUOTA_EXCEEDED
Message: Quota 'CPUS_ALL_REGIONS' exceeded.  Limit: 12.0 globally.

How to reproduce?

In a new project, create a bucket for logs:

gcloud storage buckets create gs://<BUCKET_FOR_LOGS>/ --location europe-west4

Enable services:

gcloud services enable cloudbuild.googleapis.com compute.googleapis.com

Add IAM policy bindings as instructed:

gcloud projects add-iam-policy-binding <PROJECT_ID> --member serviceAccount:<PROJECT_NR>@cloudbuild.gserviceaccount.com --role roles/compute.serviceAgent
gcloud projects add-iam-policy-binding <PROJECT_ID> --member serviceAccount:<PROJECT_NR>@cloudbuild.gserviceaccount.com --role roles/compute.admin

Add the bucket access:

gcloud storage buckets add-iam-policy-binding gs://<BUCKET_FOR_LOGS>/ --project=<PROJECT_ID> --member=serviceAccount:<PROJECT_NR>-compute@developer.gserviceaccount.com --role=roles/storage.objectCreator

Create the credentials with

gcloud auth application-default login

and run the script:

go run ./cli --project-name=<PROJECT_ID> --image-name=<MY_IMAGE_NAME> --zone=europe-west4-a --gcs-path=gs://<BUCKET_FOR_LOGS> --disk-size-gb=50 --container-image=<MY_SOURCE_IMAGE> --timeout 100m

What would I expect?

I used the script successfully some weeks ago (on Sept 21) in the same region and it went smoothly. The output around that point was:

[...]
[secondary-disk-image]: 2024-09-21T14:18:25+02:00 Step "create-disk" (CreateDisks) successfully finished.
[secondary-disk-image]: 2024-09-21T14:18:25+02:00 Running step "create-instance" (CreateInstances)
[secondary-disk-image.create-instance]: 2024-09-21T14:18:25+02:00 CreateInstances: Creating instance "secondary-disk-image-instance".
[secondary-disk-image.create-instance]: 2024-09-21T14:18:31+02:00 CreateInstances: Streaming instance "secondary-disk-image-instance" serial port 1 output to https://storage.cloud.google.com/pfnano-disk-image-build-logs/daisy-secondary-disk-image-20240921-12:18:17-ny44j/logs/secondary-disk-image-instance-serial-port1.log
[secondary-disk-image]: 2024-09-21T14:18:31+02:00 Step "create-instance" (CreateInstances) successfully finished.
[...]

It is not clear to me why the CPU limit would be an issue for this script.
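One way to check whether the quota really is the limiting factor is to compare current usage against the limit. The following is only a sketch: quota_headroom is a hypothetical helper, not part of gke-disk-image-builder, and the gcloud invocation in the trailing comment is an assumption about how to export the quota rows as CSV. Verify the --flatten and --format expressions against your gcloud version before relying on it.

```shell
# quota_headroom METRIC REQUESTED
# Hypothetical helper: reads "metric,usage,limit" CSV rows on stdin and reports
# whether REQUESTED additional units would still fit under the limit of METRIC.
# Exits with status 2 if METRIC does not appear in the input.
quota_headroom() {
  awk -F',' -v metric="$1" -v want="$2" '
    $1 == metric {
      found = 1
      free = $3 - $2
      if (want + 0 <= free) {
        printf "%s: ok (%g free, %g requested)\n", metric, free, want
      } else {
        printf "%s: exceeded (%g free, %g requested)\n", metric, free, want
      }
    }
    END { if (!found) { printf "%s: not found\n", metric; exit 2 } }
  '
}

# Assumed way to feed it real data (not run here):
#   gcloud compute project-info describe --project <PROJECT_ID> \
#     --flatten='quotas[]' \
#     --format='csv[no-heading](quotas.metric,quotas.usage,quotas.limit)' \
#     | quota_headroom CPUS_ALL_REGIONS 2
```

The second argument is the number of vCPUs the build VM would need, for example 2 for an e2-standard-2. If the helper reports free headroom while the build still fails with QUOTA_EXCEEDED, the quota is probably not the real cause.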

Any suggestions on how to get this fixed?

Thank you!

katilp commented Oct 16, 2024

Replying to my own question:

The cause was that no machines of the requested type were available in the requested zone.
Changing --zone and/or --machine-type solves the problem.
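The zone change can be automated with a small fallback loop. This is a minimal sketch under stated assumptions: first_working_zone and build_in_zone are hypothetical names, the environment variables are placeholders for the values used in the reproduction steps, and the zone list is only an example.

```shell
# first_working_zone BUILD_CMD ZONE...
# Runs BUILD_CMD with each zone in turn and prints the first zone for which it
# succeeds. Returns non-zero if the build fails in every zone.
first_working_zone() {
  build_cmd="$1"
  shift
  for zone in "$@"; do
    # Send the build's own output to stderr so stdout carries only the zone name.
    if "$build_cmd" "$zone" >&2; then
      printf '%s\n' "$zone"
      return 0
    fi
    printf 'build failed in %s, trying the next zone\n' "$zone" >&2
  done
  return 1
}

# Hypothetical wrapper around the command from the reproduction steps; the
# environment variables are placeholders to be set by the caller.
build_in_zone() {
  go run ./cli --project-name="$PROJECT_ID" --image-name="$IMAGE_NAME" \
    --zone="$1" --gcs-path="$GCS_PATH" --disk-size-gb=50 \
    --container-image="$CONTAINER_IMAGE" --timeout 100m
}

# Usage (not run here):
#   first_working_zone build_in_zone europe-west4-a europe-west4-b europe-west4-c
```

Note that the loop moves on after any failure, not only after a capacity error, so a build that is broken for another reason will simply fail in every zone.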

The error message

Code: QUOTA_EXCEEDED
Message: Quota 'CPUS_ALL_REGIONS' exceeded.

or

Code: QUOTA_EXCEEDED
Message: Quota 'N2_CPUS' exceeded.

is misleading: most likely Cloud Build distorted the original error from the VM creation, which looks like this:

ERROR: (gcloud.compute.instances.create) Could not fetch resource:
---
code: ZONE_RESOURCE_POOL_EXHAUSTED
errorDetails:
- help:
    links:
    - description: Troubleshooting documentation
      url: https://cloud.google.com/compute/docs/resource-error
- localizedMessage:
    locale: en-US
    message: A e2-standard-2 VM instance is currently unavailable in the europe-west4-a
      zone. Alternatively, you can try your request again with a different VM hardware
      configuration or at a later time. For more information, see the troubleshooting
      documentation.
- errorInfo:
    domain: compute.googleapis.com
    metadatas:
      attachment: ''
      vmType: e2-standard-2
      zone: europe-west4-a
      zonesAvailable: ''
    reason: resource_availability
message: The zone 'projects/<PROJECT_ID>/zones/europe-west4-a' does not have
  enough resources available to fulfill the request.  Try a different zone, or try
  again later.
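To tell the two error codes above apart in a script, something like the following could be used. It is a sketch only: classify_create_error is a hypothetical helper, the hint texts are my own wording, and the probe commands in the trailing comment are an assumption about how to reproduce the underlying error directly (probe-vm is a made-up instance name).

```shell
# classify_create_error TEXT
# Hypothetical helper: maps the error code found in TEXT to a short hint.
classify_create_error() {
  case "$1" in
    *ZONE_RESOURCE_POOL_EXHAUSTED*)
      echo "stockout: try another --zone or --machine-type, or retry later" ;;
    *QUOTA_EXCEEDED*)
      echo "quota reported: check quota usage, then probe the zone directly" ;;
    *)
      echo "unrecognized: read the full error" ;;
  esac
}

# Assumed direct probe (not run here; it creates a real VM, which should be
# deleted again afterwards):
#   gcloud compute instances create probe-vm --project=<PROJECT_ID> \
#     --zone=europe-west4-a --machine-type=e2-standard-2
#   gcloud compute instances delete probe-vm --project=<PROJECT_ID> \
#     --zone=europe-west4-a --quiet
```

The stockout pattern is checked first so that a message containing both codes is reported as a capacity problem.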

May I suggest that someone paid by Google propagates this to the Cloud Build team so that they can make sure that Cloud Build errors retain the original content of the VM error? That would help users understand what happens.
