feat(cli): cdk rollback
#31407
Add a CLI feature to roll a stuck change back.

This is mostly useful for deployments performed using `--no-rollback`: if a failure occurs, the stack gets stuck in an `UPDATE_FAILED` state from which there are 2 options:

- Try again using a new template
- Roll back to the last stable state

There used to be no way to perform the second operation using the CDK CLI, but there now is.

`cdk rollback` works in 2 situations:

- A paused fail state; it will initiate a fresh rollback.
- A paused rollback state; it will retry the rollback, optionally skipping some resources.

`cdk rollback --force` will look up all failed resources and continue skipping them until the rollback has finished.
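The `--force` behavior described above (retry, look up failed resources, skip them, repeat) can be sketched as a small loop. This is only an illustration under stated assumptions, not the CLI's actual implementation: `RollbackClient`, `RollbackOutcome`, `forceRollback`, and the `maxAttempts` cap are hypothetical names introduced here, and the client is kept synchronous for brevity where a real one would make asynchronous CloudFormation calls.

```typescript
// Hypothetical result of one rollback attempt (not a CDK CLI type).
type RollbackOutcome =
  | { kind: 'complete' }
  | { kind: 'failed'; failedResources: string[] };

// Hypothetical client. A real one would wrap CloudFormation calls and be async.
interface RollbackClient {
  // Start or continue a rollback, skipping the given logical IDs.
  attemptRollback(stackName: string, skip: string[]): RollbackOutcome;
}

// Retry the rollback, adding newly failed resources to the skip list each
// round, until the rollback completes. Returns the logical IDs that were skipped.
function forceRollback(
  client: RollbackClient,
  stackName: string,
  maxAttempts = 10, // assumed safety cap, not taken from the PR
): string[] {
  const skipped: string[] = [];
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const outcome = client.attemptRollback(stackName, skipped.slice());
    if (outcome.kind === 'complete') {
      return skipped;
    }
    // Only resources we are not already skipping count as progress.
    const newFailures = outcome.failedResources.filter(
      (id) => skipped.indexOf(id) < 0,
    );
    if (newFailures.length === 0) {
      throw new Error(`Rollback of ${stackName} failed with no new resources to skip`);
    }
    for (const id of newFailures) {
      skipped.push(id);
    }
  }
  throw new Error(`Rollback of ${stackName} did not finish after ${maxAttempts} attempts`);
}
```

The loop stops with an error when an attempt fails without reporting any new resource to skip, since repeating the same call would not make progress.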
The pull request linter has failed. See the aws-cdk-automation comment below for failure reasons. If you believe this pull request should receive an exemption, please comment and provide a justification.
A comment requesting an exemption should contain the text `Exemption Request`. Additionally, if clarification is needed, add `Clarification Request` to a comment.
✅ Updated pull request passes all PRLinter validations. Dismissing previous PRLinter review.
packages/@aws-cdk-testing/cli-integ/tests/cli-integ-tests/cli.integtest.ts
print('\n✨ Rollback time: %ss\n', formatTime(elapsedRollbackTime));
} catch (e: any) {
  error('\n ❌ %s failed: %s', chalk.bold(stack.displayName), e.message);
  throw new Error('Rollback failed (use --force to orphan failing resources)');
Would be better to accumulate errors to avoid a poison pill.
What do you mean?
Oh, so I think I was commenting from the position that if a stack cannot be rolled back, we should throw an error. In this case a single faulty stack can prevent rolling back others, so we need to accumulate errors.
If you insist on no-oping for an unrollable stack, this is fine, but I still think we should error out.
So I see that right now you do throw an error on `ROLLBACK_FAILED` state - doesn't this mean you need to swallow the error here and proceed, to avoid the poison pill?
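The "accumulate errors" suggestion in this thread could look roughly like the sketch below: each stack is attempted in turn, failures are collected instead of aborting the loop, and one combined error is thrown at the end. The names `rollbackAll`, `rollbackOne`, and `StackRollbackError` are hypothetical and only illustrate the pattern; this is not the code that was merged.

```typescript
// Hypothetical record of one failed stack rollback.
interface StackRollbackError {
  stackName: string;
  message: string;
}

// Attempt to roll back every stack. A failure in one stack does not stop
// the others (no "poison pill"); all failures are reported together at the end.
function rollbackAll(
  stackNames: string[],
  rollbackOne: (stackName: string) => void, // hypothetical per-stack rollback
): void {
  const failures: StackRollbackError[] = [];

  for (const stackName of stackNames) {
    try {
      rollbackOne(stackName);
    } catch (e) {
      const message = e instanceof Error ? e.message : String(e);
      failures.push({ stackName, message });
    }
  }

  if (failures.length > 0) {
    const summary = failures
      .map((f) => `${f.stackName}: ${f.message}`)
      .join('; ');
    throw new Error(`Rollback failed for ${failures.length} stack(s): ${summary}`);
  }
}
```

With this shape the command still exits with an error when any stack fails, which matches the reviewer's preference to error out, while the remaining stacks still get their rollback attempt.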
 * It contains resources r1 and r2, where r1 gets deployed first.
 *
 * - PHASE = 1: both resources deploy regularly.
 * - PHASE = 2: r1 gets updated, r2 will fail to update, and r1 will fail its rollback.
Needs to be adjusted for phases `2a` and `2b`. I'm good with just removing this and letting the tests speak for themselves.
Thank you for contributing! Your pull request will be updated from main and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).
Add a CLI feature to roll a stuck change back.

This is mostly useful for deployments performed using `--no-rollback`: if a failure occurs, the stack gets stuck in an `UPDATE_FAILED` state from which there are 2 options:

- Try again using a new template
- Roll back to the last stable state

There used to be no way to perform the second operation using the CDK CLI, but there now is.

`cdk rollback` works in 2 situations:

- A paused fail state (`CREATE_FAILED`, `UPDATE_FAILED`); it will initiate a fresh rollback.
- A paused rollback state (`UPDATE_ROLLBACK_FAILED`; it seems there is no way to continue a rollback in `ROLLBACK_FAILED` state); it will retry the rollback, optionally skipping some resources.

`cdk rollback --orphan <logicalid>` can be used to skip resource rollbacks that are causing problems.

`cdk rollback --force` will look up all failed resources and continue skipping them until the rollback has finished.

This change requires new bootstrap permissions, so the bootstrap stack is updated to add IAM permissions to the `deploy-action` role. These are necessary to call the 2 CloudFormation APIs that start and continue a rollback.
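The mapping from stack state to rollback behavior described above can be summarized in a small decision function. This is a minimal sketch rather than the CLI's code: `RollbackAction` and `chooseRollbackAction` are hypothetical names, applying the `--orphan` logical IDs only when continuing a rollback is an assumption, and so is the handling of every state the description does not mention.

```typescript
// Hypothetical description of what a rollback command would do next.
type RollbackAction =
  | { kind: 'start-rollback' }
  | { kind: 'continue-rollback'; resourcesToSkip: string[] }
  | { kind: 'not-rollbackable'; reason: string };

// Decide what to do based on the CloudFormation stack status.
// `orphanLogicalIds` stands in for the values passed via `--orphan`.
function chooseRollbackAction(
  stackStatus: string,
  orphanLogicalIds: string[] = [],
): RollbackAction {
  switch (stackStatus) {
    // Paused fail state: initiate a fresh rollback.
    case 'CREATE_FAILED':
    case 'UPDATE_FAILED':
      return { kind: 'start-rollback' };

    // Paused rollback state: retry, optionally skipping some resources.
    case 'UPDATE_ROLLBACK_FAILED':
      return { kind: 'continue-rollback', resourcesToSkip: orphanLogicalIds };

    // The description notes there seems to be no way to continue from here.
    case 'ROLLBACK_FAILED':
      return {
        kind: 'not-rollbackable',
        reason: 'a rollback cannot be continued from ROLLBACK_FAILED',
      };

    // Assumption: any other state has nothing to roll back.
    default:
      return {
        kind: 'not-rollbackable',
        reason: `stack is in state ${stackStatus}; nothing to roll back`,
      };
  }
}
```

For example, `chooseRollbackAction('UPDATE_ROLLBACK_FAILED', ['MyBucket'])` yields a `continue-rollback` action whose `resourcesToSkip` contains the hypothetical logical ID `MyBucket`.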
Relates to (but does not close yet) #30546.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license