make Channel lifecycle statemachine explicit #220
Cool, the general shape of this is really good. Some notes inline.
    self.isActiveAtomic.store(false)
default:
    ()
}
I assume the reason this conditional is a bit complex is to avoid doing too many atomic ops?
@Lukasa indeed, also don't think it's too complex tbh
Sources/NIO/BaseSocketChannel.swift
Outdated
    }
}

@inline(__always) // so we're not actually returning a closure
Would be nice to confirm that the compiler actually gets this right.
Presumably the other option is to pass a collection of non-escaping closures into this.
@normanmaurer / @Lukasa the problem here is that the compiler might change at any point in time and there's nothing that we can do to make sure that it's inlined unfortunately.
And passing the closures in is just so ugly, no?
Sources/NIO/BaseSocketChannel.swift
Outdated
switch (self.currentState, event) {
// origin: .neverRegistered
case (.neverRegistered, .activate):
Do we intend to allow activating an unregistered socket? I think I'd kinda prefer we didn't.
Sources/NIO/BaseSocketChannel.swift
Outdated
    self.badTransition(event: event)

case (.registeredNeverActivated, .register):
    self.badTransition(event: event)
Mild nit, but it may be a bit clearer to read this state machine if you fold all the badTransition cases for a given state together, e.g.:
case (.registeredNeverActivated, .deactivate),
(.registeredNeverActivated, .register):
self.badTransition(event: event)
Sources/NIO/BaseSocketChannel.swift
Outdated
enum State {
    case neverRegistered
    case registeredNeverActivated
    case registered
Hrm, do we want to allow a channel to go registered -> active -> registered -> active? I'd be inclined to want Channels to be single-use constructs.
@Lukasa I think you are right... it should be single use.
agreed
Sources/NIO/BaseSocketChannel.swift
Outdated
    }
}

internal var isOpen: Bool {
This is such a weird property. I wonder if we really need it.
@Lukasa we use it all over the place to check if a channel has not been closed. Sure, I could introduce an isClosed property but then we'd need to guard !...isClosed which gives us the inversion always...
Sources/NIO/BaseSocketChannel.swift
Outdated
self._pipeline = ChannelPipeline(channel: self)
self.lifecycleManager = SocketChannelLifecycleManager(eventLoop: self.eventLoop,
                                                      channelPipeline: ChannelPipeline(channel: self),
                                                      isActiveAtomic: Atomic(value: false))
We have a reference cycle here, right? lifecycleManager holds a reference to self, meaning self holds a cyclic reference to self.
@Lukasa no because ChannelPipeline holds Channel as an unowned reference. It's totally non-obvious but has always been like this.
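A minimal sketch of why an unowned back-reference avoids the cycle discussed here. The `DemoChannel` and `DemoPipeline` types and the `onDeinit` hook are hypothetical stand-ins invented for illustration, not NIO's actual classes:

```swift
// Hypothetical stand-ins; not NIO's Channel or ChannelPipeline.
final class DemoPipeline {
    // `unowned` means the pipeline does not keep its channel alive.
    unowned let channel: DemoChannel

    init(channel: DemoChannel) {
        self.channel = channel
    }
}

final class DemoChannel {
    private(set) var pipeline: DemoPipeline! = nil
    private let onDeinit: () -> Void

    init(onDeinit: @escaping () -> Void) {
        self.onDeinit = onDeinit
        // The channel owns the pipeline strongly; only the back-reference is unowned.
        self.pipeline = DemoPipeline(channel: self)
    }

    deinit {
        self.onDeinit()
    }
}

var channelWasFreed = false
do {
    let channel = DemoChannel(onDeinit: { channelWasFreed = true })
    // The pipeline can still reach its channel while the channel is alive.
    precondition(channel.pipeline.channel === channel)
}
// No strong cycle keeps the channel alive, so its deinit has run by this point.
print(channelWasFreed)
```

If the back-reference were a plain strong `let`, the two objects would keep each other alive and the `deinit` hook would never fire.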
Sources/NIO/BaseSocketChannel.swift
Outdated
@@ -373,6 +561,11 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore {
        return
    }

    guard self.isActive else {
This should probably check the lifecycleManager directly, as it's by definition called on the event loop.
thanks, yes, forgot to record that change
Sources/NIO/BaseSocketChannel.swift
Outdated
@@ -410,7 +603,12 @@ class BaseSocketChannel<T: BaseSocket>: SelectableChannel, ChannelCore {

    self.markFlushPoint(promise: nil)

    guard self.isActive else {
Same note, we're on the event loop and can avoid the atomic.
Sources/NIO/BaseSocketChannel.swift
Outdated
if !self.neverRegistered {
    pipeline.fireChannelUnregistered0()
}
self.lifecycleManager.moveState(event: .close, promise: p)()
Do we want to wrap these in convenience methods that let the states and inputs become private to lifecycleManager? e.g. self.lifecycleManager.close().
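One way the suggestion could look, as a rough sketch: the event enum becomes private and callers only see `register()`, `activate()` and `close()`. The `LifecycleManager` type, its reduced set of states, the transition table and the `Bool` return value are all assumptions made for this illustration; the PR's real manager takes promises and asserts on bad transitions.

```swift
// Hypothetical, simplified sketch; not the PR's SocketChannelLifecycleManager.
struct LifecycleManager {
    enum State { case neverRegistered, registered, activated, closed }

    // Private: callers cannot feed arbitrary events into the machine.
    private enum Event { case register, activate, close }

    private(set) var state = State.neverRegistered

    // The convenience entry points are the only way to drive the machine.
    mutating func register() -> Bool { return self.moveState(event: .register) }
    mutating func activate() -> Bool { return self.moveState(event: .activate) }
    mutating func close() -> Bool { return self.moveState(event: .close) }

    /// Returns false (rather than asserting) for a transition this sketch rejects.
    private mutating func moveState(event: Event) -> Bool {
        switch (self.state, event) {
        case (.neverRegistered, .register):
            self.state = .registered
        case (.registered, .activate):
            self.state = .activated
        case (.neverRegistered, .close), (.registered, .close), (.activated, .close):
            self.state = .closed
        // Rejected transitions, folded together per origin state.
        case (.neverRegistered, .activate):
            return false
        case (.registered, .register):
            return false
        case (.activated, .register), (.activated, .activate):
            return false
        case (.closed, _):
            return false
        }
        return true
    }
}
```

A call site then reads `manager.close()` and never mentions an event or a state by name.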
Sources/NIO/Channel.swift
Outdated
@@ -278,6 +278,9 @@ public enum ChannelError: Error {

    /// A `DatagramChannel` `write` was made with an address that was not reachable and so could not be delivered.
    case writeHostUnreachable

    /// An operation that was inappropriate given the current `Channel` state was attempted.
    case inappropriateOperationForState
Does this need adding to the Equatable conformance?
thanks @ianpartridge indeed
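To illustrate why a new case can need explicit handling, here is a sketch of a hand-written `Equatable` conformance with a `default` branch. The `DemoChannelError` enum and its cases are hypothetical, and the thread does not show how the real conformance is written, so its shape here is an assumption:

```swift
// Hypothetical error enum; only the last case name is taken from the diff above.
enum DemoChannelError: Error {
    case ioOnClosedChannel
    case connectTimeout(Int)
    case inappropriateOperationForState
}

extension DemoChannelError: Equatable {
    static func == (lhs: DemoChannelError, rhs: DemoChannelError) -> Bool {
        switch (lhs, rhs) {
        case (.ioOnClosedChannel, .ioOnClosedChannel):
            return true
        case (.connectTimeout(let l), .connectTimeout(let r)):
            return l == r
        // Without this line the code would still compile because of `default`,
        // but the new case would compare as unequal to itself.
        case (.inappropriateOperationForState, .inappropriateOperationForState):
            return true
        default:
            return false
        }
    }
}
```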
3f920cb to f1c810e
Just a few style changes... Also did you run a benchmark to verify how much more costly this is with all the closures?
Sources/NIO/BaseSocketChannel.swift
Outdated
}

@inline(__always)
internal mutating func register(promise: EventLoopPromise<()>?) -> (() -> Void) {
@weissi Please change to use EventLoopPromise<Void> to be consistent
sorry, will do
Sources/NIO/BaseSocketChannel.swift
Outdated
}

@inline(__always)
internal mutating func close(promise: EventLoopPromise<()>?) -> (() -> Void) {
@weissi Please change to use EventLoopPromise<Void> to be consistent
Sources/NIO/BaseSocketChannel.swift
Outdated
}

@inline(__always)
internal mutating func activate(promise: EventLoopPromise<()>?) -> (() -> Void) {
@weissi Please change to use EventLoopPromise<Void> to be consistent
Sources/NIO/BaseSocketChannel.swift
Outdated
// MARK: private API
@inline(__always) // so we're not actually returning a closure
private mutating func moveState(event: Event, promise: EventLoopPromise<()>?) -> (() -> Void) {
@weissi Please change to use EventLoopPromise<Void> to be consistent
Sources/NIO/BaseSocketChannel.swift
Outdated
}

// MARK: private API
@inline(__always) // so we're not actually returning a closure
So just to clarify, this is to save the heap allocation? If so, can you just add this to the comment?
Sources/NIO/BaseSocketChannel.swift
Outdated
// MARK: API
internal init(eventLoop: EventLoop,
              channelPipeline: ChannelPipeline,
              isActiveAtomic: Atomic<Bool>) {
Is there a reason we're injecting this?
oh thanks, no there isn't anymore
@Lukasa now found the reason why this was injected: the lifecycle manager is of course held mutable by the BaseSocketChannel. But we need to get to the atomic from any thread. Therefore the atomic needs to be stored on the BaseSocketChannel in a let and passed to the lifecycle manager. Sorry, I forgot about that. Patch coming.
--> #294
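The ownership arrangement described above can be sketched roughly as follows. `SharedFlag` is a lock-based stand-in for NIO's `Atomic<Bool>` so that the example has no dependencies, and `FlagLifecycleManager` and `FlagChannel` are hypothetical names; the real types differ:

```swift
import Foundation

// Hypothetical stand-in for an atomic Bool; a lock keeps the sketch dependency-free.
final class SharedFlag {
    private let lock = NSLock()
    private var value: Bool

    init(_ value: Bool) {
        self.value = value
    }

    func load() -> Bool {
        self.lock.lock()
        defer { self.lock.unlock() }
        return self.value
    }

    func store(_ newValue: Bool) {
        self.lock.lock()
        defer { self.lock.unlock() }
        self.value = newValue
    }
}

// Mutable value type: only ever touched from the owning (event loop) thread.
struct FlagLifecycleManager {
    private var active = false
    private let isActiveFlag: SharedFlag

    init(isActiveFlag: SharedFlag) {
        self.isActiveFlag = isActiveFlag
    }

    // On the owning thread a plain read is enough; no atomic operation needed.
    var isActive: Bool { return self.active }

    mutating func activate() {
        self.active = true
        self.isActiveFlag.store(true) // mirror the state for other threads
    }
}

final class FlagChannel {
    // Stored in a `let` on the channel so any thread can reach it safely.
    private let isActiveFlag: SharedFlag
    private var lifecycleManager: FlagLifecycleManager

    init() {
        let flag = SharedFlag(false)
        self.isActiveFlag = flag
        self.lifecycleManager = FlagLifecycleManager(isActiveFlag: flag)
    }

    // Safe from any thread: goes through the shared flag only.
    var isActive: Bool { return self.isActiveFlag.load() }

    // Assumed to be called on the owning thread only.
    func activateOnLoop() {
        self.lifecycleManager.activate()
    }
}
```

The same split explains the earlier review notes: code that already runs on the event loop can ask the manager directly, and only off-loop readers pay for the synchronised read.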
Sources/NIO/BaseSocketChannel.swift
Outdated
// MARK: private API
@inline(__always) // so we're not actually returning a closure
private mutating func moveState(event: Event, promise: EventLoopPromise<()>?) -> (() -> Void) {
Are we still sure we don't heap-allocate a closure here?
let me read some SIL
@Lukasa ok I had a look at this and it seems totally fine. SIL didn't work as the compiler kept crashing, so I looked at the assembly instead. To make it easier I planted two bogus fcntl calls in there just so I know where we are:
and then I traced them back in the assembly which you can find below. I marked the beginning and the end of the interesting section as well as all the CALL instructions (which could lead to allocations):
as you can see all the calls are either retain or release, some optional projections or the call to badTransition which is expected. Do you agree that this is fine?
hang on, something's not quite right! I can't see the calls to pipeline.becomeActive and so on either. Looking
sorry about that @Lukasa but this is the best I could do but there you can see quite well a run through of what happens:
1. the beginning (fcntl marker call)
2. fulfill the promise
3. fire channelActive
4. the end (fcntl marker call)
all the CALL instructions are highlighted and none of them should allocate. So it's all good I think
ah, in step 4 we can also see the atomic store (to mark the channel as active) which is also important but I forgot to annotate.
}

// Call before triggering the close of the Channel.
pipeline.fireChannelReadComplete0()
if self.lifecycleManager.isActive {
Is this right? It's not clear to me that the contract of readComplete requires that it only fire on active channels. If it does, is that contract in tension with the contract that readComplete always fires after channelRead?
I think in general it should only fire when the channel is active.
@Lukasa, chatted to @normanmaurer and added a test which always gets into this situation by doing
- one successful read
- one read that fails with ECONNRESET
- in the errorCaught calls ctx.close to close the channel
- sees channelInactive (from the close)
- does not ever see channelReadComplete.
9f109d8 to 87ba71c
Sources/NIO/BaseSocketChannel.swift
Outdated
// this is called from Channel's deinit, so don't assert we're on the EventLoop!
internal var canBeDestroyed: Bool {
    if case .closed = self.currentState {
I think this can be simplified as:
return self.currentState == .closed
thanks @normanmaurer it can indeed now. Used to have more states :)
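The simplification works because Swift enums without associated values are `Equatable` automatically. A small sketch, using hypothetical `SimpleState` and `PayloadState` enums rather than the PR's own `State`:

```swift
// Hypothetical enums for illustration only.
enum SimpleState { case open, activated, closed }

struct StateHolder {
    var currentState = SimpleState.open

    // No payloads, so `==` and `!=` are available without any extra code.
    var canBeDestroyed: Bool { return self.currentState == .closed }
    var isOpen: Bool { return self.currentState != .closed }
}

// Once a case carries a payload (and no Equatable conformance is declared),
// pattern matching with `if case` is needed again.
enum PayloadState {
    case open
    case closed(reason: String)
}

func isClosed(_ state: PayloadState) -> Bool {
    if case .closed = state {
        return true
    }
    return false
}
```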
Sources/NIO/BaseSocketChannel.swift
Outdated
/// until the Channel is closed.
internal var isOpen: Bool {
    assert(self.eventLoop.inEventLoop)
    if case .closed = self.currentState {
Just do:
return self.currentState != .closed
Sources/NIO/BaseSocketChannel.swift
Outdated
internal let isActiveAtomic = Atomic(value: false)
// these are only to be accessed on the EventLoop
internal let channelPipeline: ChannelPipeline
private let eventLoop: EventLoop
Can't we remove this and just use channelPipeline.eventLoop when needed?
private mutating func doStateTransfer(newState: State, promise: EventLoopPromise<Void>?, _ callouts: @escaping (ChannelPipeline) -> Void) -> (() -> Void) {
    self.currentState = newState

    let pipeline = self.channelPipeline
Why do we store the pipeline here first? So we don't escape self?
@normanmaurer yes so that self isn't captured. I can remove now as the @inline(__always) should make the whole closure go away anyway. Want me to?
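A sketch of the pattern under discussion: the mutating method updates the state first, copies the pipeline reference into a local, and returns a closure that captures only that local. `RecordingPipeline` and `CalloutManager` are hypothetical types invented for this example, and whether the optimiser removes the closure allocation is not something the sketch can show:

```swift
// Hypothetical pipeline that just records which events were fired.
final class RecordingPipeline {
    private(set) var events: [String] = []

    func fire(_ event: String) {
        self.events.append(event)
    }
}

struct CalloutManager {
    private let pipeline: RecordingPipeline
    private(set) var active = false

    init(pipeline: RecordingPipeline) {
        self.pipeline = pipeline
    }

    // State changes immediately; the callouts run when the caller invokes the result.
    mutating func activate() -> (() -> Void) {
        self.active = true

        // Copy the reference so the closure captures `pipeline`, not `self`.
        let pipeline = self.pipeline
        return {
            pipeline.fire("channelActive")
        }
    }
}

let recordingPipeline = RecordingPipeline()
var calloutManager = CalloutManager(pipeline: recordingPipeline)

// Mirrors the `moveState(...)()` call shape seen in the diff above.
calloutManager.activate()()
```

Returning the callouts instead of running them inline means the manager's state is already consistent by the time any handler code executes.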
Sources/NIO/BaseSocketChannel.swift
Outdated
    return false
case .activated:
    return true
}
why not just use?:
return self.currentState == .activated
thanks, same here, used to be more states
985466d to 20ea88f
@weissi please fix the test compile error.
20ea88f to 7abbb03
@normanmaurer sorry, should be fixed
@weissi can you also run some benchmarks with this?
@normanmaurer the same: before:
after:
LGTM, ship it.
Motivation:
We had a lot of problems with the Channel lifecycle statemachine as it wasn't explicit, this fixes this. Additionally, it asserts a lot more.
Modifications:
- made Channel lifecycle statemachine explicit
- lots of asserts
Result:
- hopefully the state machine works better
- the asserts should guide our way to work on in corner cases as well
5cc3082 to b1375ac
@weissi thanks ... merged!
@normanmaurer / @Lukasa this is a preview of what I'm working on. It passes tests, also happy with some monkey channel test which previously used to crash stuff a lot. But we should have a chat what we want to do in which situations. Also CC @vlm .
Motivation:
We had a lot of problems with the Channel lifecycle statemachine as it
wasn't explicit, this fixes this. Additionally, it asserts a lot more.
Modifications:
Result: