
[manifold] make partitioner faster. #3

Merged · 19 commits into master · Feb 4, 2018

Conversation

@jhgg jhgg commented Feb 4, 2018

Partitioner Optimizations

  • Instead of Utils.group_by, we now use a new function, Utils.partition_pids, which uses a tuple instead of a map internally. This ends up roughly 2.2x faster than Utils.group_by in the attached benchmarks (a sketch of the idea follows the benchmark output below).
## GroupByBench
benchmark name     iterations   average time
partition_pids 8         5000   340.75 µs/op
partition_pids 24        5000   382.38 µs/op
partition_pids 48        5000   432.45 µs/op
group by 8               5000   600.24 µs/op
group by 24              2000   730.41 µs/op
group by 48              1000   1154.78 µs/op
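
For illustration, here is a minimal sketch of the tuple-based idea. The function and module names match the PR description, but the body is a reconstruction, and the use of :erlang.phash2/2 as the partition hash is an assumption:

```elixir
# Sketch of partition_pids: accumulate into a pre-sized tuple of lists,
# so each insert is a put_elem/3 on a fixed-size tuple rather than a map
# update. Partition choice via :erlang.phash2/2 is assumed here.
defmodule Manifold.Utils do
  @spec partition_pids([pid], pos_integer) :: tuple
  def partition_pids(pids, partitions) do
    do_partition_pids(pids, partitions, Tuple.duplicate([], partitions))
  end

  defp do_partition_pids([pid | rest], partitions, acc) do
    index = :erlang.phash2(pid, partitions)
    do_partition_pids(rest, partitions, put_elem(acc, index, [pid | elem(acc, index)]))
  end

  defp do_partition_pids([], _partitions, acc), do: acc
end
```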
  • Make the Partitioner's state immutable during iteration by pre-spawning the workers. This means we don't have to use Enum.reduce and can instead use a more specialized code path, do_send, which operates on a tuple of lists and sends each list to its respective worker (sketched after the benchmark output below). It turns out this isn't really any faster:
## WorkerSendBenches
benchmark name    iterations   average time
enum reduce send       50000   65.56 µs/op
do_send send           50000   68.85 µs/op
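
A rough sketch of such a do_send loop, assuming workers is a same-sized tuple of pre-spawned worker pids and Worker.send/3 is a helper that hands a pid list and a message to a worker (both assumptions):

```elixir
# Sketch of do_send: walk the tuple returned by partition_pids/2 from the
# last partition down to 0, sending each non-empty list to its worker.
# `workers` (tuple of pre-spawned pids) and Worker.send/3 are assumed.
defp do_send(_message, _pids_by_partition, _workers, -1), do: :ok

defp do_send(message, pids_by_partition, workers, partition) do
  case elem(pids_by_partition, partition) do
    [] -> :ok
    pids -> Worker.send(elem(workers, partition), pids, message)
  end

  do_send(message, pids_by_partition, workers, partition - 1)
end
```

It would be kicked off as do_send(message, pids_by_partition, workers, tuple_size(workers) - 1).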
  • Add a specific case for sending to a single pid, which we do quite often with Manifold sends. There is no practical speedup in avoiding the group-by operation itself, but the resulting send operation ends up faster: it doesn't have to start an Enum.reduce or iterate over the result of Utils.partition_pids, which avoids roughly 0.15 µs/op + 0.70-1.5 µs/op. Additionally, this path consumes fewer reductions because it does less work (which we want!). A sketch follows the benchmark output below.
## GroupByOneBench
benchmark name     iterations   average time
partition_pids 8     10000000   0.11 µs/op
partition_pids 24    10000000   0.14 µs/op
group by 48          10000000   0.14 µs/op
group by 8           10000000   0.14 µs/op
group by 24          10000000   0.16 µs/op
partition_pids 48    10000000   0.20 µs/op

## WorkerSendOneBenches (worker count=48)
benchmark name    iterations   average time
enum reduce send    10000000   0.70 µs/op
do_send send         1000000   1.47 µs/op
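
The single-pid clause in the Partitioner could look roughly like this, assuming the GenServer state is the worker tuple and reusing the assumed Worker.send/3 helper:

```elixir
# Sketch of the single-pid fast path, matched ahead of the general clause:
# hash the one pid, pick its worker, and send without partitioning a list.
def handle_cast({:send, [pid], message}, workers) do
  partition = :erlang.phash2(pid, tuple_size(workers))
  Worker.send(elem(workers, partition), [pid], message)
  {:noreply, workers}
end
```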

The Worker gained some optimizations too:

  • Use a list comprehension instead of Enum.each (it's a little bit faster; a sketch follows the benchmark output). Here's sending to 200 pids:
## SendBench
benchmark name     iterations   average time
send list comp           2000   961.87 µs/op
send fast reducer        2000   982.81 µs/op
send enum each           1000   1153.00 µs/op
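
The comprehension variant is roughly the following (the handle_cast shape is an assumption):

```elixir
# Sketch: fan the message out with a list comprehension instead of
# Enum.each/2, avoiding the anonymous-function call per element.
def handle_cast({:send, pids, message}, state) do
  for pid <- pids, do: send(pid, message)
  {:noreply, state}
end
```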
  • Add a special case for sending to a single pid (sketched after the benchmark output):
## SendBenchOne
benchmark name     iterations   average time
send one             10000000   0.22 µs/op
send fast reducer    10000000   0.27 µs/op
send enum each       10000000   0.34 µs/op
send list comp       10000000   0.34 µs/op
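
And the single-pid clause, placed before the general one so it matches first (again, the clause shape is assumed):

```elixir
# Sketch of the worker's single-pid case: no iteration at all, one send.
def handle_cast({:send, [pid], message}, state) do
  send(pid, message)
  {:noreply, state}
end
```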

Manifold.send

  • Make a specialized code path for sending to a single pid.
  • Use a list comprehension instead of Enum.each (both changes are sketched below).
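
Combining both, Manifold.send could look roughly like this; the registered partitioner name, the Partitioner.send/3 signature, and the grouping of pids by node are all assumptions for illustration:

```elixir
# Rough sketch of Manifold.send. The {Manifold.Partitioner, node} registered
# name, Partitioner.send/3, and per-node grouping are assumed here.
def send(pid, message) when is_pid(pid) do
  # Specialized path: a single pid goes straight to its node's partitioner.
  Manifold.Partitioner.send({Manifold.Partitioner, node(pid)}, [pid], message)
end

def send(pids, message) when is_list(pids) do
  # General path: group by node, then fan out with a list comprehension.
  for {node, node_pids} <- Enum.group_by(pids, &node/1) do
    Manifold.Partitioner.send({Manifold.Partitioner, node}, node_pids, message)
  end

  :ok
end
```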

@jhgg jhgg requested a review from vishnevskiy February 4, 2018 04:32
@@ -31,7 +31,11 @@ defmodule Manifold.Partitioner do
     # Set optimal process flags
     Process.flag(:trap_exit, true)
     Process.flag(:message_queue_data, :off_heap)
-    {:ok, Tuple.duplicate(nil, partitions)}
+    workers = for _ <- 0..partitions do
+      {:ok, pid} = Worker.start_link()
A reviewer (Contributor) commented on the diff:

It is probably a fine change, but it no longer supports respawning a worker if one somehow crashes. If you want to make this change, you should remove :trap_exit and the :EXIT match.

@jhgg (Contributor, Author) replied:

fixed.

@jhgg (Contributor, Author) commented Feb 4, 2018:

This is clearly faster. I'm wanting to merge this and then work on sharding partitions in #4. wdyt?

@jhgg jhgg changed the title from "[manifold] make partitioner's partition operation 2.2x faster and other" to "[manifold] make partitioner faster." Feb 4, 2018
@jhgg jhgg requested a review from ihumanable February 4, 2018 06:17
@vishnevskiy (Contributor) left a review:

Only change that seems worth it is the group by optimization.

@jhgg jhgg merged commit 05c01cd into master Feb 4, 2018
jhgg added a commit that referenced this pull request Feb 4, 2018