(pipelines): changeset conflicts when running multiple self-mutating pipelines (e.g. triggered from same repository) #31060

Open
eriklztiqqe opened this issue Aug 8, 2024 · 2 comments
Labels
@aws-cdk/pipelines CDK Pipelines library bug This issue is a bug. effort/small Small work item – less than a day of effort p3

Comments

@eriklztiqqe

Describe the bug

We have a repository which creates 7 separate CDK pipelines, all of them in their own pipeline stack, but also residing in the same GitHub repository. The pipelines are set up to trigger on an update of the main branch in the repository.

This means that all the pipelines trigger at the same time whenever there is a change in the main branch.

This usually works fine. However, if a change causes the self-mutation step to update the pipeline, the step can fail because it tries to delete a change set that is in state CREATE_IN_PROGRESS.
It appears to be trying to delete a change set that belongs to a different stack than the one it should be operating on.

So there seems to be some collision/race condition when multiple pipelines self-mutate at the same time, possibly also because there is cross-region execution involved.

As can be seen in the included error message, the tanmbz-pipeline tries to delete a changeset for arz-pipeline-support, which fails.
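
One possible reading of the log is that every pipeline's self-mutation also deploys the same cross-region support stack, always under the fixed change set name cdk-deploy-change-set. The following is a minimal sketch of that suspected race using a toy in-memory model. It is not CDK or AWS SDK code; `FakeStack` and `simulateConcurrentSelfMutation` are hypothetical names introduced only for illustration.

```typescript
// Toy in-memory model of the change sets of one CloudFormation stack.
// NOT CDK or AWS SDK code; every name in this block is hypothetical.
type ChangeSetStatus = "CREATE_IN_PROGRESS" | "CREATE_COMPLETE";

class FakeStack {
  private changeSets = new Map<string, ChangeSetStatus>();

  createChangeSet(name: string): void {
    this.changeSets.set(name, "CREATE_IN_PROGRESS");
  }

  finishChangeSet(name: string): void {
    if (this.changeSets.has(name)) {
      this.changeSets.set(name, "CREATE_COMPLETE");
    }
  }

  deleteChangeSet(name: string): void {
    const status = this.changeSets.get(name);
    if (status === "CREATE_IN_PROGRESS") {
      // Mirrors the error message seen in the log.
      throw new Error(`Cannot delete ChangeSet in status ${status}`);
    }
    this.changeSets.delete(name); // no-op if the change set does not exist
  }
}

// Fixed change set name taken from the log output.
const CHANGE_SET_NAME = "cdk-deploy-change-set";

// Two pipelines (A and B) deploy the same shared stack.
// When `serialized` is true, B starts only after A's change set has finished creating.
function simulateConcurrentSelfMutation(serialized: boolean): string[] {
  const sharedSupportStack = new FakeStack();
  const events: string[] = [];

  // Pipeline A: remove any existing change set, then start creating a new one.
  sharedSupportStack.deleteChangeSet(CHANGE_SET_NAME);
  sharedSupportStack.createChangeSet(CHANGE_SET_NAME);
  events.push("A: created change set");

  if (serialized) {
    sharedSupportStack.finishChangeSet(CHANGE_SET_NAME);
  }

  // Pipeline B: performs the same "remove existing change set" step.
  try {
    sharedSupportStack.deleteChangeSet(CHANGE_SET_NAME);
    events.push("B: deleted change set");
  } catch (err) {
    events.push(`B: failed: ${(err as Error).message}`);
  }
  return events;
}

console.log(simulateConcurrentSelfMutation(false));
console.log(simulateConcurrentSelfMutation(true));
```

In this model the overlapping run fails at the same step as the real log, while the serialized run succeeds, which would be consistent with the pipelines working when they are run again later.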

Expected Behavior

I would expect the pipelines to run normally without errors.

Current Behavior

Example error message, with project, stack name, and AWS account details replaced:

[Container] 2024/07/31 08:57:49.811612 Running command cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose
[08:57:50] CDK toolkit version: 2.150.0 (build 3f93027)
[08:57:50] Command line arguments: {
_: [ 'deploy' ],
a: '.',
app: '.',
'require-approval': 'never',
requireApproval: 'never',
verbose: 1,
v: 1,
lookups: true,
'ignore-errors': false,
ignoreErrors: false,
json: false,
j: false,
debug: false,
ec2creds: undefined,
i: undefined,
'version-reporting': undefined,
versionReporting: undefined,
'path-metadata': undefined,
pathMetadata: undefined,
'asset-metadata': undefined,
assetMetadata: undefined,
'role-arn': undefined,
r: undefined,
roleArn: undefined,
staging: true,
'no-color': false,
noColor: false,
ci: false,
all: false,
'build-exclude': [],
E: [],
buildExclude: [],
force: false,
f: false,
parameters: [ {} ],
'previous-parameters': true,
previousParameters: true,
logs: true,
concurrency: 1,
'asset-prebuild': true,
assetPrebuild: true,
'ignore-no-stacks': false,
ignoreNoStacks: false,
'$0': '/usr/local/bin/cdk',
STACKS: [ 'tanmbz-pipeline' ],
'S-t-a-c-k-s': [ 'tanmbz-pipeline' ]
}
[08:57:50] merged settings: {
versionReporting: true,
assetMetadata: true,
pathMetadata: true,
output: 'cdk.out',
app: '.',
context: {},
debug: false,
requireApproval: 'never',
toolkitBucket: {},
staging: true,
bundlingStacks: [ '**' ],
lookups: true,
assetPrebuild: true,
ignoreNoStacks: false
}
[08:57:50] Toolkit stack: CDKToolkit
[08:57:50] Setting "CDK_DEFAULT_REGION" environment variable to eu-west-1
[08:57:50] Resolving default credentials
[08:57:50] Looking up default account ID from STS
[08:57:50] Notices refreshed
[08:57:50] Failed to store notices in the cache: Error: ENOENT: no such file or directory, open '/root/.cdk/cache/notices.json'
[08:57:50] Default account ID: 123456789012
[08:57:50] Setting "CDK_DEFAULT_ACCOUNT" environment variable to 123456789012
[08:57:50] context: {
'aws:cdk:enable-path-metadata': true,
'aws:cdk:enable-asset-metadata': true,
'aws:cdk:version-reporting': true,
'aws:cdk:bundling-stacks': [ '**' ]
}
[08:57:50] --app points to a cloud assembly, so we bypass synth
Including dependency stacks: cross-region-stack-123456789012:us-east-1
✨ Synthesis time: 0.18s
[08:57:50] Checking for previously published assets
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-file-publishing-role-123456789012-eu-west-1'.
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-file-publishing-role-123456789012-eu-west-1'.
[08:57:50] tanmbz-pipeline: check: Check s3://cdk-hnb659fds-assets-123456789012-eu-west-1/940f7968a86058a112f2a09ea3ca569fde773c6e2f78677bb843922e63bb86f3.zip
[08:57:50] tanmbz-pipeline: check: Check s3://cdk-hnb659fds-assets-123456789012-eu-west-1/2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5.json
[08:57:50] tanmbz-pipeline: found: Found s3://cdk-hnb659fds-assets-123456789012-eu-west-1/940f7968a86058a112f2a09ea3ca569fde773c6e2f78677bb843922e63bb86f3.zip
[08:57:50] 2 total assets, 1 still need to be published
cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1)
cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1): deploying... [1/2]
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Retrieved account ID 123456789012 from disk cache
[08:57:50] Assuming role 'arn:aws:iam::123456789012:role/cdk-hnb659fds-deploy-role-123456789012-eu-west-1'.
tanmbz-pipeline: start: Building 2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5:123456789012-eu-west-1
tanmbz-pipeline: success: Built 2c6391ecd42f493217c6982a937dc0e862184528f45c3d0b6a24ebf71020cbc5:123456789012-eu-west-1
[08:57:51] arz-pipeline-support-us-east-1: checking if we can skip deploy
[08:57:51] arz-pipeline-support-us-east-1: template has changed
[08:57:51] arz-pipeline-support-us-east-1: deploying...
[08:57:51] Removing existing change set with name cdk-deploy-change-set if it exists
[08:57:52] Call failed: deleteChangeSet({"StackName":"arz-pipeline-support-us-east-1","ChangeSetName":"cdk-deploy-change-set"}) => Cannot delete ChangeSet in status CREATE_IN_PROGRESS (code=InvalidChangeSetStatus)
❌ cross-region-stack-123456789012:us-east-1 (arz-pipeline-support-us-east-1) failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945) {
code: 'InvalidChangeSetStatus',
time: 2024-07-31T08:57:52.026Z,
requestId: '51170276-b619-4a84-840d-1311ac045d32',
statusCode: 400,
retryable: false,
retryDelay: 211.6141931879587
}
❌ Deployment failed: InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945) {
code: 'InvalidChangeSetStatus',
time: 2024-07-31T08:57:52.026Z,
requestId: '51170276-b619-4a84-840d-1311ac045d32',
statusCode: 400,
retryable: false,
retryDelay: 211.6141931879587
}
[08:57:52] Notices refreshed
Cannot delete ChangeSet in status CREATE_IN_PROGRESS
[08:57:52] InvalidChangeSetStatus: Cannot delete ChangeSet in status CREATE_IN_PROGRESS
at Request.extractError (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:46723)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91777)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91225)
at Request.emit (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199828)
at Request.transition (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193380)
at AcceptorStateMachine.runTo (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158252)
at /usr/local/lib/node_modules/aws-cdk/lib/index.js:400:158582
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:193672)
at Request.<anonymous> (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:199903)
at Request.callListeners (/usr/local/lib/node_modules/aws-cdk/lib/index.js:400:91945)
[Container] 2024/07/31 08:57:52.051808 Command did not exit successfully cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose exit status 1
[Container] 2024/07/31 08:57:52.056035 Phase complete: BUILD State: FAILED
[Container] 2024/07/31 08:57:52.056051 Phase context status code: COMMAND_EXECUTION_ERROR Message: Error while executing command: cdk -a . deploy tanmbz-pipeline --require-approval=never --verbose. Reason: exit status 1
[Container] 2024/07/31 08:57:52.082089 Entering phase POST_BUILD
[Container] 2024/07/31 08:57:52.083763 Phase complete: POST_BUILD State: SUCCEEDED
[Container] 2024/07/31 08:57:52.083788 Phase context status code: Message:
[Container] 2024/07/31 08:57:52.119967 Set report auto-discover timeout to 5 seconds
[Container] 2024/07/31 08:57:52.120006 Expanding base directory path: .
[Container] 2024/07/31 08:57:52.121634 Assembling file list
[Container] 2024/07/31 08:57:52.121647 Expanding .
[Container] 2024/07/31 08:57:52.123259 Expanding file paths for base directory .
[Container] 2024/07/31 08:57:52.123272 Assembling file list
[Container] 2024/07/31 08:57:52.123276 Expanding **/*
[Container] 2024/07/31 08:57:52.125540 No matching auto-discover report paths found
[Container] 2024/07/31 08:57:52.125561 Report auto-discover file discovery took 0.005594 seconds
[Container] 2024/07/31 08:57:52.125577 Phase complete: UPLOAD_ARTIFACTS State: SUCCEEDED
[Container] 2024/07/31 08:57:52.125587 Phase context status code: Message:

Reproduction Steps

It is tricky to put together a short, simple setup that reliably reproduces this behaviour.

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.150.0

Framework Version

No response

Node.js Version

20

OS

Amazon Linux 2

Language

TypeScript

Language Version

No response

Other information

It is not the end of the world when this happens - if the pipelines are executed again, and presumably do not need to do the self-mutation then, it works fine.
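
Since a second run succeeds, one stopgap could be to retry the failing call automatically. The following is a minimal sketch of that idea, not something CDK Pipelines provides: `retryOnErrorCode` and `CodedError` are hypothetical names, and the `action` callback stands in for whatever performs the deployment. Only the error code InvalidChangeSetStatus comes from the log above.

```typescript
// Hypothetical helper: retries an async action only when it fails with a given error code.
// This is an illustration of the "just run it again" workaround, not a CDK API.
interface CodedError extends Error {
  code?: string;
}

async function retryOnErrorCode<T>(
  action: () => Promise<T>,
  retryableCode: string,
  maxAttempts = 3,
  delayMs = 5000,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await action();
    } catch (err) {
      // Errors with any other code are rethrown immediately.
      if ((err as CodedError).code !== retryableCode) {
        throw err;
      }
      lastError = err;
      if (attempt < maxAttempts) {
        // Wait before the next attempt so the other change set can finish.
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // All attempts failed with the retryable code.
  throw lastError;
}
```

A caller would pass its own deployment function, for example `retryOnErrorCode(deploy, "InvalidChangeSetStatus")`, where `deploy` is assumed to reject with an error carrying that `code`. This only papers over the collision and does not remove its cause.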

@eriklztiqqe eriklztiqqe added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Aug 8, 2024
@github-actions github-actions bot added the @aws-cdk/pipelines CDK Pipelines library label Aug 8, 2024
@pahud
Contributor

pahud commented Aug 12, 2024

Sounds like you have a single pipeline stack with 7 pipelines provisioned in it, and they could all mutate the same pipeline stack at the same time. Is that correct? If so, I'm afraid the best approach is to have 7 separate pipeline stacks, each of which can only mutate itself. Would that work for you?

@pahud pahud added p3 response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. effort/small Small work item – less than a day of effort and removed needs-triage This issue or PR still needs to be triaged. labels Aug 12, 2024
@eriklztiqqe
Author

No, there are 7 separate pipeline stacks, but they are all created by the same CDK code.
The code iterates over a number of configuration directories and creates a pipeline stack for each config directory, and the pipeline in each stack is set up based on the config settings present in each directory.

Most repositories in use only have a single config directory, but this particular one has 7 of them.
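
To make the layout concrete, the enumeration step could look roughly like the sketch below. It covers only the directory scan and naming. The function name `pipelineStackIds` and the `<directory>-pipeline` naming pattern are assumptions for illustration, and the actual construction of each pipeline stack is omitted.

```typescript
import * as fs from "node:fs";

// Hypothetical: derive one pipeline stack ID per subdirectory of a config root.
// The "<directory>-pipeline" pattern is an assumption, not the project's real code.
function pipelineStackIds(configRoot: string): string[] {
  return fs
    .readdirSync(configRoot, { withFileTypes: true })
    .filter((entry) => entry.isDirectory()) // ignore plain files in the config root
    .map((entry) => `${entry.name}-pipeline`)
    .sort();
}
```

Each returned ID would then be used to create a separate pipeline stack within the same CDK app, so all of the stacks come from one synthesized cloud assembly.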

@github-actions github-actions bot removed the response-requested Waiting on additional info and feedback. Will move to "closing-soon" in 7 days. label Aug 13, 2024