(pipelines): changeset conflicts when running multiple self-mutating pipelines (e.g. triggered from same repository) #31060
Labels
@aws-cdk/pipelines, bug, effort/small, p3
Describe the bug
We have a single GitHub repository that defines 7 separate CDK pipelines, each in its own pipeline stack. All of the pipelines are set up to trigger on an update of the main branch of that repository.
This means that all the pipelines trigger at the same time whenever there is a change in the main branch.
This usually works fine. However, if a change causes the self-mutation step to update the pipeline, the step can fail because it tries to delete a change set that is in state CREATE_IN_PROGRESS.
It looks as if the pipeline tries to delete a change set belonging to a different stack than the one it should be deleting from.
So there seems to be a collision or race condition when multiple pipelines self-mutate at the same time, possibly also because cross-region execution is involved.
As can be seen in the included error message, the self-mutation of tanmbz-pipeline tries to delete the change set cdk-deploy-change-set on arz-pipeline-support-us-east-1, which fails.
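The suspected race can be illustrated with a small in-memory model. This is only a sketch under assumptions: `FakeCloudFormation` and `startDeploy` are hypothetical stand-ins, not the real AWS SDK or CDK CLI, and the idea that several pipelines deploy the same cross-region support stack is inferred from the log rather than confirmed. The only details taken from the log are the stack name, the fixed change set name `cdk-deploy-change-set`, and the error text.

```typescript
// Toy in-memory model of CloudFormation change sets. NOT the real AWS API.
type ChangeSetStatus = "CREATE_IN_PROGRESS" | "CREATE_COMPLETE";

class FakeCloudFormation {
  // Keyed by "<stackName>/<changeSetName>".
  private readonly changeSets = new Map<string, ChangeSetStatus>();

  createChangeSet(stackName: string, changeSetName: string): void {
    // Creation is asynchronous in reality; the change set stays in
    // CREATE_IN_PROGRESS until finishCreation() is called.
    this.changeSets.set(`${stackName}/${changeSetName}`, "CREATE_IN_PROGRESS");
  }

  finishCreation(stackName: string, changeSetName: string): void {
    this.changeSets.set(`${stackName}/${changeSetName}`, "CREATE_COMPLETE");
  }

  deleteChangeSet(stackName: string, changeSetName: string): void {
    const key = `${stackName}/${changeSetName}`;
    if (this.changeSets.get(key) === "CREATE_IN_PROGRESS") {
      // Mirrors the error text from the log.
      throw new Error(
        "InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS",
      );
    }
    this.changeSets.delete(key); // deleting a missing change set is a no-op here
  }
}

// Hypothetical stand-in for the first steps of a deploy: remove any
// existing change set with the given name, then create a new one.
function startDeploy(
  cfn: FakeCloudFormation,
  stackName: string,
  changeSetName: string = "cdk-deploy-change-set",
): string {
  try {
    cfn.deleteChangeSet(stackName, changeSetName);
    cfn.createChangeSet(stackName, changeSetName);
    return "started";
  } catch (err) {
    return (err as Error).message;
  }
}

const cfn = new FakeCloudFormation();
const sharedStack = "arz-pipeline-support-us-east-1";

// Two pipelines self-mutate at the same moment and both touch the shared stack.
const first = startDeploy(cfn, sharedStack);
const second = startDeploy(cfn, sharedStack);

console.log(first);
console.log(second);
```

In this model the second run fails only because both runs use the same stack and the same change set name while the first change set is still being created. Whether that is the actual mechanism behind the failure above is an assumption.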
Expected Behavior
I would expect the pipelines to run normally without errors.
Current Behavior
Example error message, with project name, stack name, and AWS account details replaced:
[Container] 2024/07/31 08:57:49.811612 Running command cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose
[08:57:50] CDK toolkit version: 2.150.0 (build 3f93027)
[08:57:50] Command line arguments: {
_: [ 'deploy' ],
a: '.',
app: '.',
'require-approval': 'never',
requireApproval: 'never',
verbose: 1,
v: 1,
lookups: true,
'ignore-errors': false,
ignoreErrors: false,
json: false,
j: false,
debug: false,
ec2creds: undefined,
i: undefined,
'version-reporting': undefined,
versionReporting: undefined,
'path-metadata': undefined,
pathMetadata: undefined,
'asset-metadata': undefined,
assetMetadata: undefined,
'role-arn': undefined,
r: undefined,
roleArn: undefined,
staging: true,
'no-color': false,
noColor: false,
ci: false,
all: false,
'build-exclude': [],
E: [],
buildExclude: [],
force: false,
f: false,
parameters: [ {} ],
'previous-parameters': true,
previousParameters: true,
logs: true,
concurrency: 1,
'asset-prebuild': true,
assetPrebuild: true,
'ignore-no-stacks': false,
ignoreNoStacks: false,
'$0': '/usr/local/bin/cdk',
STACKS: [ 'tanmbz-pipeline' ],
'S-t-a-c-k-s': [ 'tanmbz-pipeline' ]
}
[08:57:50] merged settings: {
versionReporting: true,
assetMetadata: true,
pathMetadata: true,
output: 'cdk.out',
app: '.',
context: {},
debug: false,
requireApproval: 'never',
toolkitBucket: {},
staging: true,
bundlingStacks: [ '' ],
lookups: true,
assetPrebuild: true,
ignoreNoStacks: false
}
[08:57:50] Toolkit stack: CDKToolkit
[08:57:50] Setting "CDK_DEFAULT_REGION" environment variable to eu-west-1
[08:57:50] Resolving default credentials
[08:57:50] Looking up default account ID from STS
[08:57:50] Notices refreshed
[08:57:50] Failed to store notices in the cache: Error: ENOENT: no such file or directory, open '/root/.cdk/cache/notices.json'
[08:57:50] Default account ID: 123456789012
[08:57:50] Setting "CDK_DEFAULT_ACCOUNT" environment variable to 123456789012
[08:57:50] context: {
'aws:cdk:enable-path-metadata': true,
'aws:cdk:enable-asset-metadata': true,
'aws:cdk:version-reporting': true,
'aws:cdk:bundling-stacks': [ '' ]
}
[08:57:50] --app points to a cloud assembly, so we bypass synth
Including dependency stacks: cross-region-stack-123456789012:us-east-1
✨ Synthesis time: 0.18s
[08:57:50] Checking for previously published assets
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-file-publishing-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-file-publishing-role-123456789012-eu-west-1'.
[08:57:50] tanmbz-pipeline: check: Check s3://cdk-hnb659fds-assets-123456789012-eu-west-1/940f7968a86058a112f2a09ea3ca569fde773c6e2f78677bb843922e63bb86f3.zip
[08:57:50] tanmbz-pipeline: check: Check s3://cdk-hnb659fds-assets-123456789012-eu-west-1/2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5.json
[08:57:50] tanmbz-pipeline: found: Found s3://cdk-hnb659fds-assets-123456789012-eu-west-1/940f7968a86058a112f2a09ea3ca569fde773c6e2f78677bb843922e63bb86f3.zip
[08:57:50] 2 total assets, 1 still need to be published
cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1)
cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1): deploying... [1/2]
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
tanmbz-pipeline: start: Building 2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5:123456789012-eu-west-1
tanmbz-pipeline: success: Built 2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5:123456789012-eu-west-1
[08:57:51] arz-pipeline-support-us-east-1: checking if we can skip deploy
[08:57:51] arz-pipeline-support-us-east-1: template has changed
[08:57:51] arz-pipeline-support-us-east-1: deploying...
[08:57:51] Removing existing change set with name cdk-deploy-change-set if it exists
[08:57:52] Call failed: deleteChangeSet({"StackName":"arz-pipeline-support-us-east-1","ChangeSetName":"cdk-deploy-change-set"}) => Cannot delete ChangeSet in status CREATE_IN_PROGRESS (code=InvalidChangeSetStatus)
❌ cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1) failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945) {
code: 'InvalidChangeSetStatus',
time: 2024-07-31T08:57:52.026Z,
requestId: '51170276-b619-4a84-840d-1311ac045d32',
statusCode: 400,
retryable: false,
retryDelay: 211.6141931879587
}
❌ Deployment failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945) {
code: 'InvalidChangeSetStatus',
time: 2024-07-31T08:57:52.026Z,
requestId: '51170276-b619-4a84-840d-1311ac045d32',
statusCode: 400,
retryable: false,
retryDelay: 211.6141931879587
}
[08:57:52] Notices refreshed
Cannot delete ChangeSet in status CREATE_IN_PROGRESS
[08:57:52] InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945)
[Container] 2024/07/31 08:57:52.051808 Command did not exit successfully cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose exit status 1
[Container] 2024/07/31 08:57:52.056035 Phase complete: BUILD State: FAILED
[Container] 2024/07/31 08:57:52.056051 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose. Reason: exit status 1
[Container] 2024/07/31 08:57:52.082089 Entering phase POST_BUILD
[Container] 2024/07/31 08:57:52.083763 Phase complete: POST_BUILD State: SUCCEEDED
[Container] 2024/07/31 08:57:52.083788 Phase context status code: Message:
[Container] 2024/07/31 08:57:52.119967 Set report auto-discover timeout to 5 seconds
[Container] 2024/07/31 08:57:52.120006 Expanding base directory path: .
[Container] 2024/07/31 08:57:52.121634 Assembling file list
[Container] 2024/07/31 08:57:52.121647 Expanding .
[Container] 2024/07/31 08:57:52.123259 Expanding file paths for base directory .
[Container] 2024/07/31 08:57:52.123272 Assembling file list
[Container] 2024/07/31 08:57:52.123276 Expanding */
[Container] 2024/07/31 08:57:52.125540 No matching auto-discover report paths found
[Container] 2024/07/31 08:57:52.125561 Report auto-discover file discovery took 0.005594 seconds
[Container] 2024/07/31 08:57:52.125577 Phase complete: UPLOAD_ARTIFACTS State: SUCCEEDED
[Container] 2024/07/31 08:57:52.125587 Phase context status code: Message:
Reproduction Steps
It is a bit tricky to set up something short and simple that would reliably reproduce this behaviour.
Possible Solution
No response
Additional Information/Context
No response
CDK CLI Version
2.150.0
Framework Version
No response
Node.js Version
20
OS
Amazon Linux 2
Language
TypeScript
Language Version
No response
Other information
It is not the end of the world when this happens: if the pipelines are executed again, and presumably no longer need to self-mutate, they work fine.
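Since a rerun succeeds and the log marks the error as `retryable: false` (so the SDK does not retry it by itself), one conceivable workaround is to retry the failing operation after a short delay. The following is a minimal sketch of that idea only: `isChangeSetConflict`, `backoffDelayMs`, and `retryOnChangeSetConflict` are hypothetical helpers, not part of the CDK or the AWS SDK, and the attempt count and delays are arbitrary assumptions.

```typescript
// Hypothetical helpers; nothing here is provided by the CDK or the AWS SDK.

// True when the error carries the error code seen in the log.
function isChangeSetConflict(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    (err as { code?: unknown }).code === "InvalidChangeSetStatus"
  );
}

// Exponential backoff: baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelayMs(attempt: number, baseMs: number): number {
  return baseMs * 2 ** (attempt - 1);
}

// Runs `operation`, retrying only on change set conflicts, up to
// `maxAttempts` attempts in total. Any other error is rethrown at once.
async function retryOnChangeSetConflict<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 5,
  baseMs: number = 1000,
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      if (!isChangeSetConflict(err) || attempt >= maxAttempts) {
        throw err;
      }
      const delay = backoffDelayMs(attempt, baseMs);
      await new Promise<void>((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

A caller would wrap whatever function performs the deploy, for example `retryOnChangeSetConflict(() => runDeploy())`, where `runDeploy` is again a hypothetical function. This only papers over the collision; it does not address why the pipelines touch the same change set in the first place.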