Read & write all frames in one pass #4506
Conversation
02fd115 to c9b9e54
Hrm, neat. :)
Yeah we might be able to do better with |
My guess is that going beyond this will have diminishing returns, at least on the scheduler side where we generally have small messages. I could easily be wrong though.
1b7d376 to 7021240
Made a few more changes. I think this captures the same idea as what |
83f3e8b to ff4047f
Ensures that the separate `frames` are freed from memory before proceeding to the next step of sending them out.
Instead of doing multiple reads with `async`, just allocate one big chunk of memory for all of the frames and read into it. Should cut down on the number of passes through Tornado needed to fill these frames.
This should allow us to allocate space for the entirety of the rest of the message, including the size of each frame and all following frames. We can then unpack all of this information once received. By doing this we are able to cut down on additional send/recv calls that would otherwise occur and spend less time in Tornado's IO handling.
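As a rough sketch of that receive path (the wire layout, helper names, and the leading total-size header here are assumptions for illustration, not necessarily the PR's final code):

```python
import struct

async def read_frames(stream):
    # Hypothetical single-pass read; layout assumed to be
    # [total size][frame count][frame lengths...][frame payloads...].
    fmt = "Q"
    fmt_size = struct.calcsize(fmt)

    # First learn how many bytes the rest of the message occupies.
    (msg_nbytes,) = struct.unpack(fmt, await stream.read_bytes(fmt_size))

    # Allocate one buffer for everything that remains and let Tornado
    # fill it in a single pass.
    buf = bytearray(msg_nbytes)
    await stream.read_into(buf)

    # Unpack the frame count and per-frame lengths from the front of the
    # buffer, then slice out the frame payloads without copying.
    view = memoryview(buf)
    (n_frames,) = struct.unpack_from(fmt, view, 0)
    lengths = struct.unpack_from(fmt * n_frames, view, fmt_size)

    frames = []
    offset = fmt_size * (1 + n_frames)
    for length in lengths:
        frames.append(view[offset:offset + length])
        offset += length
    return frames
```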
Simplifies the code in the TCP path by leveraging existing utility functions.
Make it a little easier to follow how the variables relate to each other.
ff4047f to e27dce0
These are just binary serialization steps that do not really depend on communication or on issues that may come from sockets, so go ahead and move them out of the `try` block.
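A minimal sketch of that structure (the `serialize` helper and the error translation are placeholders, not distributed's actual names):

```python
from tornado.iostream import StreamClosedError

async def write_msg(stream, msg, serialize):
    # Serialization is pure in-memory work with no socket involvement,
    # so it happens before the try block.
    frames = serialize(msg)
    try:
        # Only the actual socket write is guarded against the stream closing.
        await stream.write(b"".join(frames))
    except StreamClosedError as exc:
        # Re-raise as whatever comm-level error the caller expects.
        raise ConnectionError(str(exc)) from exc
```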
Simplified this a bit more using On the receiving end, we just get the message size first. Then use that to preallocate a buffer to hand off to Tornado to fill. The remaining unpacking is just handled by |
Also group the `frames` and `frames_nbytes` steps together. Finally, rewrite the code to avoid a hard-coded constant for the size of `"Q"`, which should make it invariant to changes in that size.
Should avoid issues on platforms where this may not be the exact size.
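For example, something along these lines (a sketch, assuming the header fields are packed with `"Q"`):

```python
import struct

# Derive the byte width of "Q" instead of hard-coding 8, so the code stays
# correct on any platform where that width differs.
fmt = "Q"
fmt_size = struct.calcsize(fmt)

header = struct.pack(fmt, 3)        # e.g. a frame count of 3
assert len(header) == fmt_size      # no magic number needed to skip the header
```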
08bddb1 to d12d0d9
To simplify the logic, just concatenate small frames before doing any sends. This way we can use the same code path for all sends.
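A hedged sketch of that idea (the cutoff value and function name are illustrative only, not the PR's exact choices):

```python
CUTOFF = 8192  # bytes; an assumed example threshold

def maybe_coalesce(frames, cutoff=CUTOFF):
    # Merge small frames into one buffer so every message, large or small,
    # flows through the same single-write code path.
    if sum(len(f) for f in frames) <= cutoff:
        return [b"".join(frames)]
    return frames
```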
72e9dde to f9f8602
f9f8602 to 5860ccd
Also figured out how to offload all frames to Tornado. So it now only uses one |
5860ccd to 67be160
Tornado has an internal queue that it uses to hold frames before writing them. To avoid needing to track and wait on various `Future`s and the amount of data sent, we can just enqueue all of the frames we want to send before a send even happens and then start the write. This way Tornado already has all of the data we plan to send once it starts working. In the meantime, we are able to carry on with other tasks while this gets handled in the background. https://github.com/tornadoweb/tornado/blob/6cdf82e927d962290165ba7c4cccb3e974b541c3/tornado/iostream.py#L537-L538
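A minimal sketch of that write path (not the PR's exact code), relying on the fact that each `IOStream.write` call appends its data to that internal queue and returns a `Future` that resolves once the data has been flushed:

```python
async def send_frames(stream, frames):
    # Hand every frame to Tornado up front; nothing is awaited yet, so the
    # frames simply accumulate in IOStream's internal write queue.
    futures = [stream.write(frame) for frame in frames]

    # Writes complete in order, so awaiting the last Future means all of the
    # queued data has gone out. Meanwhile other coroutines can keep running.
    if futures:
        await futures[-1]
```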
67be160 to 3e37aa9
Some profiling details in issue ( quasiben/dask-scheduler-performance#108 ). Most notably this is giving us a |
Please let me know if anything else is needed here 🙂
Thanks Matt! 😄
All I did was press the green button :)
black distributed
/flake8 distributed
Instead of doing multiple reads with `async`, just allocate one big chunk of memory for all of the frames and read into it. Should cut down on the number of passes through Tornado needed to fill these frames.

Also knowing that `IOStream` has an internal queue of buffers for writing, we are able to push all of the frames into that queue beforehand, then ask Tornado to `write` after they are in the queue. This also cuts down on the number of passes through Tornado by simply entering the write handling code once and writing all the buffers.