server: gRPC server does not set NextProtocols #136367
Comments
This works around the issue discussed in cockroachdb#136367. Apparently, we have some misconfiguration or bug in gRPC v1.56.3 that makes the gRPC server seem unable to properly support HTTP2. This effectively breaks communication between CRDB nodes at these two different gRPC versions. Switch to a fork that disables the check (there is no other way to disable it other than changing code).
Oh I understand it now. Our TLS config uses `GetConfigForClient` (/pkg/security/certificate_manager.go#L416-L421):

    func (cm *CertificateManager) GetServerTLSConfig() (*tls.Config, error) {
    	if _, err := cm.getEmbeddedServerTLSConfig(nil); err != nil {
    		return nil, err
    	}
    	return &tls.Config{
    		GetConfigForClient: cm.getEmbeddedServerTLSConfig,

In v1.56.3, gRPC does add `h2` to `NextProtos`:

    // NewTLS uses c to construct a TransportCredentials based on TLS.
    func NewTLS(c *tls.Config) TransportCredentials {
    	tc := &tlsCreds{credinternal.CloneTLSConfig(c)}
    	tc.config.NextProtos = credinternal.AppendH2ToNextProtos(tc.config.NextProtos)
    	return tc
    }

In v1.68.0, this has been rectified in a fairly recent change: grpc/grpc-go#7813

https://github.com/grpc/grpc-go/blob/v1.68.0/credentials/tls.go#L201-L215

The long and short of it is that we need to make sure to disable this ALPN check until every CRDB version our binary might need to communicate with contains the v1.68.0 upgrade. That should be straightforward, and thanks to mixed-version testing it'll be pretty obvious if we get it wrong. I'll sprinkle comments in |
Describe the problem
While upgrading gRPC from v1.56.3 to v1.68.0, I noticed that post-bump CRDB nodes were unable to connect to pre-bump CRDB nodes. The nodes using gRPC at v1.68.0 would print
This check is new in v1.68.0 (as in, it's not yet there in v1.56.3). So the underlying problem has likely been present all along without causing any issues. The new check was added to provide a better failure mode. Apparently every HTTP2 server needs to support ALPN, but our HTTP2 server is gRPC, and it is a bit of a mystery how we end up failing this check, especially given that everything works once we disable the check, which we can do via an env var.
See the full initial analysis here: #136278 (comment)
With a fork of grpc-go that turns the check off, #136278 can merge. This issue serves as a reminder to get to the bottom of this problem, as future versions of gRPC are likely to hard-code the check, and we don't want to have to be on a fork forever.
To Reproduce
See #136278 (comment)
Jira issue: CRDB-45001