-
Notifications
You must be signed in to change notification settings - Fork 656
New issue
Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? # to your account
don't deliver events for unregistered fds #341
don't deliver events for unregistered fds #341
Conversation
Sources/NIO/Selector.swift
Outdated
@@ -540,7 +543,8 @@ final class Selector<R: Registration> { | |||
Int(try KQueue.kevent(kq: self.fd, changelist: nil, nchanges: 0, eventlist: events, nevents: Int32(eventsCapacity), timeout: ts)) | |||
} | |||
|
|||
for i in 0..<ready { | |||
self.deregistrationsHappened = false | |||
for i in 0..<ready where !self.deregistrationsHappened { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as noted offline I think while this should work we should treat this as workaround as this depends highly on the "level-triggered" semantics of kqueue / epoll.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. This should be thought of as a temporary fix to avoid the nasty behaviour: we should come up with a better long-term solution for dealing with this in a future release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agreed. Let’s get this in (I actually found another bug (and a missing assertion) whilst fixing it) and then fix properly. I agree we shouldn’t rely on it being level triggered here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me
27b4ff0
to
9d951f1
Compare
Sources/NIO/Selector.swift
Outdated
@@ -259,6 +259,7 @@ final class Selector<R: Registration> { | |||
private var eventsCapacity = 64 | |||
private var events: UnsafeMutablePointer<EventType> | |||
private var registrations = [Int: R]() | |||
private var deregistrationsHappened: Bool = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you put a doc comment on this guy indicating that it's a temporary fix?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
Sources/NIO/Selector.swift
Outdated
@@ -500,7 +502,8 @@ final class Selector<R: Registration> { | |||
ready = Int(try Epoll.epoll_wait(epfd: self.fd, events: events, maxevents: Int32(eventsCapacity), timeout: -1)) | |||
} | |||
|
|||
for i in 0..<ready { | |||
self.deregistrationsHappened = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually don't think we need to do this for epoll. I realised this this morning, but epoll generally only returns each FD once, with a mask of all event flags for that FD.
This means that our epoll code is immune to this bug: the channel is concretely held to dispatch all the events to it. This also suggests a sensible long-term fix: we should coalesce the kqueue notifications so that each channel has events dispatched to it only once.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That’s actually a very good point 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not true because (just like the test does), if you close lots of fds and then open lots of fds, there might very well be events for file descriptors that are no longer valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doh... true :(
9d951f1
to
755bc7a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, mostly looks really good. Some notes inline.
Sources/NIO/BaseSocketChannel.swift
Outdated
@@ -776,6 +781,9 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore { | |||
} | |||
|
|||
final func readEOF() { | |||
print("readEOF \(self)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Print statement left behind.
Sources/NIO/Selector.swift
Outdated
@@ -259,6 +259,7 @@ final class Selector<R: Registration> { | |||
private var eventsCapacity = 64 | |||
private var events: UnsafeMutablePointer<EventType> | |||
private var registrations = [Int: R]() | |||
private var deregistrationsHappened: Bool = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
^
Tests/NIOTests/SelectorTest.swift
Outdated
// impossible to get a read notification in the very same event loop tick that you got registered | ||
XCTAssertTrue(self.hasReConnectEventLoopTickFinished.value) | ||
|
||
// TODO: check the bytebuffer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO
Tests/NIOTests/SelectorTest.swift
Outdated
} | ||
|
||
func channelRead(ctx: ChannelHandlerContext, data: NIOAny) { | ||
// we expect these channels to be get data only a while after the re-connect event loop tick as it's |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/to be get/to get/
Tests/NIOTests/SelectorTest.swift
Outdated
func userInboundEventTriggered(ctx: ChannelHandlerContext, event: Any) { | ||
// this is the `.readEOF` that is triggered by the `ServerHandler`'s `close` calls because our channel | ||
// supports half-closure | ||
if self.allChannels.value.count == SelectorTest.testWeDoNotDeliverEventsForPreviouslyClosedChannels_numberOfChannelsToUse { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be cleaner as guard
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not
Tests/NIOTests/SelectorTest.swift
Outdated
Note: if you changed anything about the pipeline's handler adding/removal | ||
you might also have a bug there. | ||
""") | ||
return () |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't need the return statement.
Sources/NIO/BaseSocketChannel.swift
Outdated
@@ -776,6 +781,9 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore { | |||
} | |||
|
|||
final func readEOF() { | |||
print("readEOF \(self)") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weissi remove
private static let testWeDoNotDeliverEventsForPreviouslyClosedChannels_numberOfChannelsToUse = 10 | ||
func testWeDoNotDeliverEventsForPreviouslyClosedChannels() { | ||
/// We use this class to box mutable values, generally in this test anything boxed should only be read/written | ||
/// on the event loop `el`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@weissi should we use a computed property and check MultiThreadedEventLoopThread.currentEventLoop
returns what we expect ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sure why not
Tests/NIOTests/SelectorTest.swift
Outdated
case didNotReadGotReadComplete | ||
} | ||
|
||
/// This handler is inserted in the channels that are re-connected. So we're closing a bunch of channels and |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... is inserted in the ChannelPipeline
of the channels that are re-connected
Tests/NIOTests/SelectorTest.swift
Outdated
|
||
private let didReadPromise: EventLoopPromise<Void> | ||
private var didRead: Bool = false | ||
private let hasReConnectEventLoopTickFinished: Box<Bool> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we group the let
together ?
Tests/NIOTests/SelectorTest.swift
Outdated
} | ||
let el = MultiThreadedEventLoopGroup(numThreads: 1).next() | ||
defer { | ||
try! el.syncShutdownGracefully() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
XCTAssertNoThrow(...) ?
Tests/NIOTests/TestUtils.swift
Outdated
} | ||
} | ||
templateBytes.removeLast() | ||
let udsTempDir = String(decoding: templateBytes, as: UTF8.self) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: just return.. not net to create the let first.
755bc7a
to
f57010d
Compare
@normanmaurer / @Lukasa all addressed |
7f1cadf
to
c7c8b6f
Compare
assert(eventLoop.inEventLoop) | ||
assert(self.lifecycleManager.isRegistered) | ||
|
||
guard !self.lifecycleManager.hasSeenEOFNotification else { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a guard
or an assert
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Lukasa this will be run if the user does ctx.read()
channel.pipeline.add(handler: countingHandler) | ||
} | ||
.bind(host: "127.0.0.1", port: 0) | ||
.thenIfError { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely bind
isn't going to error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indeed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Motivation: Since forever we had a major bug in the Selector: In this condition: - kqueue/epoll had many events - in one of the earlier events we unregister a Channel whose fd is on of the later events - we subsequently (still in the same event loop tick) register a new channel which gets the same fd as the previously closed one then we would deliver an event that was meant for a previous channel to a newly opened one. Thanks to @mcdappdev for hitting this bug, helping us debug it and also providing a repeatedly working repro. Modifications: if during event delivery any fd gets unregistered, we stop delivering the remaining events and rely on the selector to redeliver them again next time. Result: we don't deliver events for previously closed channels to new ones.
c7c8b6f
to
01b5e71
Compare
did over the shoulder review with @Lukasa who is happy, so pressing the 'dismiss review' button :) |
Motivation:
Since forever we had a major bug in the Selector: In this condition:
the later events
channel which gets the same fd as the previously closed one
then we would deliver an event that was meant for a previous channel to
a newly opened one.
Thanks to @mcdappdev for hitting this bug, helping us debug it and also
providing a repeatedly working repro.
Modifications:
if during event delivery any fd gets unregistered, we stop delivering
the remaining events and rely on the selector to redeliver them
again next time.
Result:
we don't deliver events for previously closed channels to new ones.