[Bug]: beam_CleanupGCPResources is failing with log "The operation was canceled" #33435

Open
17 tasks
chamikaramj opened this issue Dec 20, 2024 · 9 comments

@chamikaramj
Contributor

What happened?

https://github.com/apache/beam/actions/workflows/beam_CleanUpGCPResources.yml

For example,
https://github.com/apache/beam/actions/runs/12363928955/job/34506222289

> Task :beam-test-tools:removeStaleK8sWorkload
namespace "beam-performancetests-singlestoreio-12352184229" deleted
Error: The operation was canceled.

Issue Priority

Priority: 1 (data loss / total loss of function)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam YAML
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Infrastructure
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@chamikaramj
Contributor Author

cc: @Abacn

chamikaramj changed the title from "[Bug]: beam_CleanupGCPResources is failing due to "The operation was canceled"" to "[Bug]: beam_CleanupGCPResources is failing with log "The operation was canceled"" on Dec 20, 2024
@chamikaramj
Contributor Author

Unfortunately, the error log doesn't seem very useful.

@chamikaramj
Contributor Author

@Abacn assigning to you since you probably have the most context regarding this workflow. Feel free to forward if needed.

@Abacn
Contributor

Abacn commented Dec 27, 2024

For future reference: the failure is caused by a namespace (beam-performancetests-singlestoreio-12352184229) whose finalizer got stuck during teardown.

To fix, follow https://cloud.google.com/knowledge/kb/deleted-namespace-remains-in-terminating-status-000004867

@chamikaramj
Contributor Author

Where would you run kubectl from? It would be good to document the exact commands you used to resolve the error.

@Abacn
Contributor

Abacn commented Dec 27, 2024

The command I used was:

gcloud container clusters get-credentials io-datastores --zone us-central1-a --project apache-beam-testing

It is provided by the GCP console and can be run in Cloud Shell or locally.
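
For anyone repeating this, here is a minimal sketch of what could follow the get-credentials step to locate the stuck namespace. Only the cluster, zone, project, and namespace name come from this thread. The helper function names are hypothetical, and the inspection commands are an assumption about a reasonable workflow, not a record of what was actually run.

```shell
# Sketch only: helper names are hypothetical and not part of the Beam repository.

# Fetch kubeconfig credentials for the cluster named in this thread.
connect_to_io_datastores() {
  gcloud container clusters get-credentials io-datastores \
    --zone us-central1-a --project apache-beam-testing
}

# Print namespaces that are currently in the Terminating phase.
list_terminating_namespaces() {
  kubectl get namespaces --field-selector status.phase=Terminating -o name
}

# Print the finalizers still recorded on a namespace (first argument).
show_namespace_finalizers() {
  kubectl get namespace "$1" -o jsonpath='{.spec.finalizers}'
}

# List every namespaced resource that still exists in the namespace;
# leftover resources with their own finalizers can also block deletion.
list_remaining_resources() {
  kubectl api-resources --verbs=list --namespaced -o name \
    | xargs -n 1 kubectl get --show-kind --ignore-not-found -n "$1"
}
```

A possible session would call `connect_to_io_datastores`, then `list_terminating_namespaces`, then `show_namespace_finalizers beam-performancetests-singlestoreio-12352184229` to see what is holding the deletion.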

@Abacn
Contributor

Abacn commented Jan 23, 2025

Reopening this issue to track the task of making the cleanup script work for workloads like this one with a non-empty finalizer; see #31846 (comment).
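
As a rough illustration of what such handling might look like, the sketch below force-finalizes a namespace by submitting its manifest with an empty `spec.finalizers` list to the namespace `finalize` subresource. This is an assumption about one possible approach, not the cleanup script's actual behavior. The function names are hypothetical and `jq` is assumed to be installed. It only addresses finalizers on the namespace itself, so a finalizer on a resource inside the namespace would need a separate patch. Bypassing finalizers can leave orphaned resources, so it is best treated as a last resort.

```shell
# Sketch only: helper names are hypothetical; assumes kubectl and jq are on PATH.

# Build the REST path of the finalize subresource for a namespace.
finalize_path() {
  printf '/api/v1/namespaces/%s/finalize\n' "$1"
}

# Read the namespace manifest, empty its spec.finalizers list, and submit
# the result to the finalize subresource so deletion can complete.
force_finalize_namespace() {
  kubectl get namespace "$1" -o json \
    | jq '.spec.finalizers = []' \
    | kubectl replace --raw "$(finalize_path "$1")" -f -
}
```

For the namespace from this issue, the call would be `force_finalize_namespace beam-performancetests-singlestoreio-12352184229`, run only after confirming that the namespace is stuck in Terminating.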

Abacn reopened this on Jan 23, 2025
Abacn added the P2 label and removed the P1 label on Jan 23, 2025
@jrmccluskey
Contributor

@Abacn is this bug a release blocker for 2.63.0? It doesn't appear to be, but I'd like to confirm before moving it off the milestone.

Abacn removed this from the 2.63.0 Release milestone on Feb 4, 2025
@Abacn
Contributor

Abacn commented Feb 4, 2025

Yeah, it isn't a release blocker. The milestone was added because the issue got reopened.
