Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

CARM ack-role-account-map ConfigMap updates are not propagated to ACK controllers #2088

Closed
itaiatu opened this issue Jun 13, 2024 · 13 comments
Closed
Assignees
Labels
area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.

Comments

@itaiatu
Copy link

itaiatu commented Jun 13, 2024

Describe the bug
CARM ack-role-account-map ConfigMap updates are not propagated to ACK controllers.
Maybe related to #2011.

Steps to reproduce
I performed the following scenario:

  • CARM ack-role-account-map ConfigMap
apiVersion: v1
data:
  "00000000000": arn:aws:iam::00000000000:role/X-ack-iam-controller
  "11111111111": arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrole
kind: ConfigMap
metadata:
  name: ack-role-account-map
  namespace: ack-iam-controller
  • Changed the value of 11111111111 in the data field in the to
"11111111111": arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrole2
  • IAM controller detected the change
{"level":"debug","ts":"2024-06-13T09:56:32.974Z","logger":"cache.account","msg":"updated account config map","name":"ack-role-account-map"}
  • After changing the value of "11111111111" AWS subscription id to a dummy value (adding the 2 at the end of the value), I still managed to create a Role and a Policy in the ns-ack-test namespace (see below the Namespace spec). This is not the desired behaviour, since that role doesn't exist (with the 2 character at the end of the string).

  • After I restarted the IAM controller's pod, got 403s, the expected behaviour, since that role is not a valid one

{"level":"debug","ts":"2024-06-13T10:54:38.873Z","logger":"ackrt","msg":"<<< rm.sdkFind","kind":"Policy","namespace":"ns-ack-test","name":"ack-test-policy","account":"11111111111","role":"arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrolee","region":"us-west-2","is_adopted":false,"generation":1,"error":"AccessDenied: User: arn:aws:sts::00000000000:assumed-role/X-ack-iam-controller/1718276077280567570 is not authorized to perform: sts:AssumeRole on resource: arn:aws:iam::11111111111:role/X-ack-iam-controller-assumedrolee\n\tstatus code: 403, request id: ad4f99b2-5b84-403e-ac16-ee1e56293ef0"}
  • Vice-versa, if I start with a CARM wrong configuration (wrong assumedrole names), I get 403s as expected, and after I fix the role name in ack-role-account-map to match the correct assumedrole, I still receive 403s when trying to create resources.

  • This makes me think that although the log messages say that the runtime cache is updated, the change is not propagated to ACK controllers.

Expected outcome
When ack-role-account-map is edited, ACK controllers will use the updated values from the data field.

Environment

apiVersion: v1
kind: Namespace
metadata:
  annotations:
    services.k8s.aws/owner-account-id: "11111111111"
  name: ns-ack-test
spec:
  finalizers:
    - kubernetes
  • Kubernetes version: 1.28
  • Using EKS (yes/no), if so version? 1.28
  • AWS service targeted (S3, RDS, etc.): Multiple (at least IAM, EC2)
@a-hilaly a-hilaly added kind/bug Categorizes issue or PR as related to a bug. area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels Jun 17, 2024
@a-hilaly a-hilaly self-assigned this Aug 6, 2024
@a-hilaly
Copy link
Member

a-hilaly commented Aug 6, 2024

@itaiatu thank you for opening this issue. I was able to reproduce the problem. It appears that the code we generate to operate "ResourceManagers" uses caching to avoid re-initializing the SDK client at every reconciliation loop - the cache uses $accountID/$roleARN as the key/index, causing the controller to always reuse the same cache for every of accountID and region combination. Will send a fix to introduce a better mechanism here.

/cc @adriananeci @mumlawski @mattzesh @victorvarza

@mumlawski
Copy link

We see this problem fixed now in IAM controller v1.3.10.

@itaiatu
Copy link
Author

itaiatu commented Aug 20, 2024

I tested again with EC2 (v1.2.17) and IAM (v1.3.10) and the problem is still there.
cc: @a-hilaly

@swapnachagam
Copy link

Is there an update on this solution please?

@itaiatu
Copy link
Author

itaiatu commented Sep 4, 2024

Tested it now with the EC2 controller, version 1.2.19 and it works.

@itaiatu
Copy link
Author

itaiatu commented Sep 4, 2024

I am going to keep this issue opened for a while to see if the problem is solved for all controllers, but I assume everything should be ok, because the change was done in the runtime package, so it will be contained in all ACK controller repos.

@a-hilaly
Copy link
Member

a-hilaly commented Sep 4, 2024

@itaiatu Indeed the fix made it to the core runtime and all of the controllers have been patched.
cc @swapnachagam @mumlawski

@swapnachagam
Copy link

swapnachagam commented Sep 5, 2024

Hi @a-hilaly @itaiatu I tested for IAM Controller using version 1.3.12. I have updated the configmap to a different role than the one originally which is required for cross account mapping but controller is still running with the previous role only. I don't see anhy reconcilation happening in the controller end
As soon as i restart, it is picking the new role though. Let me know what logs i should share here

I see that logs being updated for s3 Controller but i don't see the same for IAM/CLOUDWATCHLOGS/LAMBDA CONTROLLERS

{"level":"debug","ts":"2024-09-05T15:16:02.761Z","logger":"cache.account","msg":"updated account config map","name":"ack-role-account-map"}

@a-hilaly
Copy link
Member

a-hilaly commented Sep 5, 2024

@swapnachagam Updating the role in CARM, doesn't necessarily trigger a reconciliation. You'll have to manually trigger the reconciliation of wait for the periodic reconcile.

@swapnachagam
Copy link

Thank you.. It is working by triggering a reconciliation

@swapnachagam
Copy link

Is there a way instead of us manually triggering a reconcilation, can ack-controllers have a second watch for these configmap updates and reconcile itself in a low priority queue?
We have many controllers and also lot of AWS accounts(~100) which have to be onboarded

@a-hilaly
Copy link
Member

@swapnachagam I'm not sure whether we need to implement this, I imagine controllers managing a lot of resource would reconcile everything even for a simple CARM configMap modification. But i'm not opposed to it. Can you please open a Github issue describing this feature? we'll discuss it in our next community meeting https://github.com/aws-controllers-k8s/community/?tab=readme-ov-file#community-meeting

For now i'll close this as it's resolved

@swapnachagam
Copy link

Thank you. I will raise a separate requst for this.

# for free to join this conversation on GitHub. Already have an account? # to comment
Labels
area/runtime Issues or PRs as related to controller runtime, common reconciliation logic, etc kind/bug Categorizes issue or PR as related to a bug. priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now.
Projects
None yet
Development

No branches or pull requests

4 participants