-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
Azure CLI Login Method Does Not Work Properly with AAD Enabled AKS Clusters and Multiple Deployment Service Principals #104
Comments
It looks like the option token-cache-dir should work and do what we need, but kubelogin doesn't accept it as a valid option. In our Azure DevOps pipelines, we could set it to $(Agent.TempDirectory). |
It looks like the token-cache-dir option IS on the kubeconfig get-token command: kubelogin get-token --token-cache-dir=/mnt/src/test/ So we would have to edit the config file kubelogin convert-kubeconfig generates to include the token-cache-dir value |
Oh interesting, we never had |
Yes please! Then we can specify that in our pipelines and not have to manually edit the converted cluster configuration file. When would a new kubelogin be available for me to try out? Asking because I may need to do a temporary hack an update the converted cluster configuration file. We have a lot of deployment pipelines stepping on each other now. |
@kkizer working on it. should have a PR out soon |
* added --token-cache-dir in convert-kubeconfig sub-command. Addresses #104
@kkizer do you want to give it a try before I cut the release? |
Thank you very much for your quick support on this issue. I am working on getting it tested in our environment today. I will post a comment with test results. |
The fix does exactly what we need. In my Azure DevOps pipeline I get the cluster configuration using az aks-getcredentials and convert it using: and the cluster configuration file contains: - name: clusterUser_RG_KUB11614NP01_AZAMERES01KUB11614NP01
user:
exec:
apiVersion: client.authentication.k8s.io/v1beta1
args:
- get-token
- --login
- azurecli
- --server-id
- 6dae42f8-4368-4678-94ff-3960e28e3630
- --token-cache-dir
- /AzDOAgents/Agent1/_work/_temp
command: kubelogin
env: null
provideClusterInfo: false It has the token-cache-dir argument! We still need to test the change with multiple builds on the same build machine and each build uses a different service principal. I don't expect any issues since each build would have its own agent temp directory. |
actually v0.0.16 introduced a bug such that |
@kkizer you should upgrade to v0.0.17 or higher. the |
Thanks, but my day ended at 4 pm EDT. Please cut the release and we can
test tomorrow. I can't just take an executable: I will need to get it
loaded into our corporate Artifactory repository. Please be sure to note
new/changed options in the release notes. Thank you very much for your
support on this issue.
…On Mon, Jul 18, 2022, 4:16 PM Weinong Wang ***@***.***> wrote:
@kkizer <https://github.com/kkizer> do you want to give it a try before I
cut the release?
—
Reply to this email directly, view it on GitHub
<#104 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/APUSKGBWZVJ55QJFUDGVUV3VUW3TDANCNFSM532DEQCA>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Great! So we would be able to specify the token cache dir on the
kubelogin convert-kubeconfig command? If true, that would be a simple
thing for us to define in our pipelines.
When would a new kubelogin be available that has this fix?
Regards,
Kevin Kizer
…On Mon, Jul 18, 2022 at 12:15 PM Weinong Wang ***@***.***> wrote:
@kkizer looks like I missed this arg when I refactor the conversion code from exec to exec.. will fix it today
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you were mentioned.Message ID: ***@***.***>
|
@kkizer the latest v0.0.20 has this change |
Hey @kkizer what is the |
@mbaykara https://github.com/Azure/kubelogin#exec-plugin-format
|
I am not using azure managed cluster, using another provider actually. Will not work right? @weinong |
Another public cloud provider i.e GKE @weinong |
@mbaykara the --server-id should be the AAD application ID you configure your kubernetes cluster with. you can refer to https://github.com/Azure/kubelogin#setup-for-kubernetes-oidc-provider-using-azure-ad |
Azure CLI Login Method Does Not Work Properly with AAD Enabled AKS Clusters and Multiple Deployment Service Principals
Overview
We are deploying applications to Azure Active Directory (AAD) enabled clusters and are required to use kubelogin. Since these are automated deployment pipelines we cannot have interactive logins. We deploy using Azure service principals so we need to use the Azure CLI login method in kubelogin.
The Azure CLI login method does not work properly with AAD enabled AKS clusters when more than one service principal is used by Azure DevOps deployment pipelines on the same build agent. Each pipeline is producing the same token cache file name even though they are for different service principals. This results in helm and kubectl commands failing due to authorization errors.
This problem became an issue when some deployment pipelines were using kubelogin v0.0.14 and others were using v0.0.10.
Scenario
az aks get-credentials
Azure CLI command. The configuration needs to be converted to use Azure CLI logins (non-interactive) as documented here.This causes pipeline 2 to fail with authorization errors since the access token in the cache file is for the wrong service principal.
Source and cluster config info
Source code to generate token file
Cache token file name is generated here:
And you see the name is formed as follows:
The last two tokens in the generated file name are blank, so the ClientID and TenantID must not be set. The user information in the original cluster configuration is:
And after kubelogin convert-kubeconfig -l azurecli it is (Azure DevOps is masking the server-id value using ***):
This would explain how clientID and tenantID are blank in the generated token cache file name: neither the clientId or tenantID are passed to kubelogin as arguments.
Changes Between kubelogin v0.0.10 and v0.0.14
It doesn't look like the code that generates the token cache file name has changed between v.0.0.10 and v.0.0.14. So I'm not sure how this ever worked for us. Perhaps we were lucky and builds ran on different build agents and the name collision wasn't an issue. More likely the token expired and a new one was created (and cache file updated) so the issue was hidden.
However, the contents on the file DID change.
In v0.0.10, the resource property in the token file is blank:
While in v.0.14, it has a value:
It looks like in v0.0.14 it is putting the object id of the service principal in the resource property. That won't work if two different pipelines using different service principals use the same kubelogin token cache file!
We need to update to the latest version of kubelogin especially since Kubernetes 1.24 requires the use of kubelogin with AAD enabled clusters. The current version of kubelogin (v0.0.14) is also required since it handles the new cluster configuration file that is generated starting in Kubernetes 1.24. We cannot stay on older versions of kubelogin (nor should we so that we keep up with bug fixes and security fixes).
Possible Workaround?
The kubelogin docs do not mention this (but
kubelogin convert-kubeconfig --help
does), but the kubelogin executable allows you to specify--client-id <client ID>
. In my test YAML pipeline, theconvert-kubeconfig -l azurecli
is running in an Azure CLI task. I made the following changes:addSpnToEnvironment: true
to the task properties.kubelogin convert-kubeconfig -l azurecli --client-id $env:servicePrincipalId
This results in the following in the converted cluster config file (Azure DevOps masks the service principal id in log output. The *** is the service principal id):
And I get file names like AzurePublicCloud-6dae42f8-4368-4678-94ff-3960e28e3630-***-.json
*** is the service principal id. Azure DevOps masks the service principal id in log output.
So the clientID value is set in the file name.
Kubelogin also supports
-t <tenant ID>
but that did not get written to the converted cluster config file.Is this an appropriate solution to this issue?
kubelogin v.0.0.14 Does not Honor --client-id Flag with Kubernetes 1.24 Cluster Config Files
With Kubernetes 1.24, kubelogin v0.0.14 is also broken. The cluster config file you get is:
and if you convert to to use Azure CLI login like our workaround before:
kubelogin convert-kubeconfig -l azurecli --client-id $env:servicePrincipalId
The converted file no longer contains the client-id argument:
AND YOU GET DUPLICATE FILE NAMES AGAIN!
Conclusion
The kubelogin code needs to generate unique file names for the token cache file name when using the Azure CLI login method. It cannot assume that the same service principal is being used.
Kevin Kizer
Lead Engineer
MetLife
The text was updated successfully, but these errors were encountered: