Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

xds: Fail xDS Server Serve() if called after Stop() or GracefulStop() #6410

Merged
merged 3 commits into from
Jun 27, 2023

Conversation

zasweq
Copy link
Contributor

@zasweq zasweq commented Jun 23, 2023

Fixes #5838.

This PR fails Serve() if called after Stop() or GracefulStop(). This is important because Serve() eventually increments the global singleton xDS Client Ref, and thus if called without a corresponding cleanup, Stop() or GracefulStop(), the singleton xDS Client will eventually leak.

RELEASE NOTES: N/A

@zasweq zasweq requested a review from easwars June 23, 2023 23:58
@zasweq zasweq added this to the 1.57 Release milestone Jun 23, 2023
// Do this in a goroutine, even though should just receive error. This is to
// prevent blocking for (which comes after, so doesn't apply here), but
// mainly to wrap this err Receive operation with a testing timeout.
go func() { serveDone.Send(server.Serve(lis)) }()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any value in doing this in a goroutine and plumbing the returned error value through a channel for the main test goroutine to inspect. I think all we need here is a direct call to server.Serve() and inspecting the error returned from the client.

If you want to test Stop racing with Serve, you should probably do something like this:

func (s) TestXxx(t *testing.T) {
  // Setup a real management server
  // Setup bootstrap file to point to the management server

  for i := 0 ; i < 100; i++ {
    // Start an xDS enabled server.
    // Spawn a goroutine which calls Serve
    // Spawn a goroutine which calls Stop/GracefulStop
  }

  // Wait for all the above goroutines to complete using a sync.WaitGroup

  // Rely on the leakchecker to detect if any goroutines are leaked (which will be the case if not all references to the real xdsClient being created are not released).
}

Copy link
Contributor Author

@zasweq zasweq Jun 27, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I had this same thought. I had option 1 but wanted to wrap the operation with a timeout, since it could possibly hang if it doesn't hit the error and does the while true, and then we have to wait for the test to hit 7 minute timeout. I think the simplicity of just doing err := server.Serve beats the reason you need to wrap in a timeout. Hmmmmm ok I never thought about testing with the leak checker. Oh I didn't want to call Serve before Close in a test because I didn't want to send an LDS response so it can call into server.Server in grpc, but thinking about it you don't need a good LDS response, because it will receive from this quit here to exit: https://github.com/grpc/grpc-go/blob/master/xds/server.go#L270, and will already have refed the client conn up 1, which is the part that is interesting to test. Will switch to err := server.Serve and implement your test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

xds/server_test.go Show resolved Hide resolved
xds/server.go Show resolved Hide resolved
@easwars easwars assigned zasweq and unassigned easwars Jun 27, 2023
@zasweq zasweq assigned easwars and unassigned zasweq Jun 27, 2023
}

// TestServeAndCloseDoNotRace tests that Serve and Close on the xDS Server do
// not race and leak the xDS Client. A race would be found by the leak checker.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: the leak would be found by the leak checker and not the race :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hahahaha fair. Switched.

@easwars easwars assigned zasweq and unassigned easwars Jun 27, 2023
@zasweq zasweq merged commit 575a936 into grpc:master Jun 27, 2023
1 check passed
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 25, 2023
# for free to subscribe to this conversation on GitHub. Already have an account? #.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

xds: server should fail Serve() if called after Close()
2 participants