A user on Slack reported that after upgrading their Flux components, the image-automation-controller (which at the moment still depends on the Git libraries from this controller, and recently switched to using libgit2 only) stopped working with the following error:
{"level":"error","ts":"2021-07-01T17:52:47.736Z","logger":"controller-runtime.manager.controller.imageupdateautomation","msg":"Reconciler error","reconciler group":"image.toolkit.fluxcd.io","reconciler kind":"ImageUpdateAutomation","name":"flux-system","namespace":"flux-system","error":"unable to clone 'ssh://git@example.com/repo.git', error: Certificate"}
While isolating the issue, we discovered that although the known_hosts entry in their Secret did contain an ssh-rsa item that matched the host key of the server, validation resulted in a false mismatch.
Once the user had updated the known_hosts entry in the Secret with the output of ssh-keyscan example.com 2>/dev/null | base64 (containing both an ssh-rsa and an ssh-ed25519 item), the image-automation-controller started working again.
The reason the host keys are not being 'properly agreed' on is that the agreement is based on the preferred algorithms each peer advertises during the handshake. The problem we face can be shown here:
Server Preferred Host Keys: "ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"
Client Preferred Host Keys: "ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"
Known Key type provided: "ssh-ed25519"
This would simply not work. Both peers have "ssh-rsa" at the top of their preference lists, so it will always be chosen as the host key algorithm. However, Flux does not take into account that the key the user provided in the known_hosts content is actually of type "ssh-ed25519", so the server's "ssh-rsa" host key is compared against a known_hosts entry of a different type and the check fails.
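The selection described above can be sketched in Go. This is a simplified illustration, not the controller's or libgit2's code: it assumes the negotiated algorithm is the first entry in the client's list that the server also advertises, and the function name negotiateHostKeyAlgo is hypothetical.

```go
package main

import "fmt"

// negotiateHostKeyAlgo is a hypothetical helper: it returns the first
// algorithm in the client's preference list that the server also
// advertises, or "" when the two lists have nothing in common.
func negotiateHostKeyAlgo(client, server []string) string {
	supported := make(map[string]bool, len(server))
	for _, algo := range server {
		supported[algo] = true
	}
	for _, algo := range client {
		if supported[algo] {
			return algo
		}
	}
	return ""
}

func main() {
	// Preference lists and known key type taken from the example above.
	server := []string{"ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"}
	client := []string{"ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"}
	knownType := "ssh-ed25519"

	negotiated := negotiateHostKeyAlgo(client, server)
	// The server presents a key of the negotiated type, which differs from
	// the type stored in known_hosts, so validation cannot succeed.
	fmt.Printf("negotiated=%s known=%s match=%t\n",
		negotiated, knownType, negotiated == knownType)
}
```

With the lists from the example, the negotiated algorithm is "ssh-rsa" while the only known key is "ssh-ed25519", which is the mismatch described.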
Users will be able to enforce (or prefer) specific algorithms with the new flag --ssh-hostkey-algos, which will make it easier to ensure the intended host key type is used. On those grounds, I think we should close this issue (once the PR merges).
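As a rough illustration of why preferring an algorithm helps, the sketch below moves user-preferred algorithms to the front of the client's list before selecting one. How --ssh-hostkey-algos is actually implemented is not shown here; preferAlgos and firstCommon are hypothetical names, and the first-common-entry selection is a simplifying assumption.

```go
package main

import "fmt"

// preferAlgos is a hypothetical helper: it returns defaults reordered so
// that the entries named in preferred come first, in the given order.
// Entries in preferred that are not in defaults are ignored.
func preferAlgos(defaults, preferred []string) []string {
	known := make(map[string]bool, len(defaults))
	for _, algo := range defaults {
		known[algo] = true
	}
	seen := make(map[string]bool, len(defaults))
	out := make([]string, 0, len(defaults))
	for _, algo := range preferred {
		if known[algo] && !seen[algo] {
			out = append(out, algo)
			seen[algo] = true
		}
	}
	for _, algo := range defaults {
		if !seen[algo] {
			out = append(out, algo)
			seen[algo] = true
		}
	}
	return out
}

// firstCommon returns the first client algorithm the server also lists.
func firstCommon(client, server []string) string {
	for _, c := range client {
		for _, s := range server {
			if c == s {
				return c
			}
		}
	}
	return ""
}

func main() {
	server := []string{"ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"}
	defaults := []string{"ssh-rsa", "ecdsa-sha2-nistp256", "ssh-ed25519"}

	// Assumed user preference, matching the key type in known_hosts.
	client := preferAlgos(defaults, []string{"ssh-ed25519"})
	fmt.Println(client, firstCommon(client, server))
}
```

With "ssh-ed25519" preferred, the selected algorithm matches the key type in the user's known_hosts entry instead of defaulting to "ssh-rsa".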
My educated guess is that something is not working correctly at all times in the custom bit of code we have for validating host keys with libgit2: https://github.com/fluxcd/source-controller/blob/main/pkg/git/libgit2/transport.go#L147-L239, as the error logged by the controller matches the git2go.ErrCertificate returned by the certCallback.
Slack thread reference: https://cloud-native.slack.com/archives/CLAJ40HV3/p1625162540293300