Skip to content
New issue

Have a question about this project? # for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “#”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? # to your account

ARROW-15732: [C++] Do not use any CPU threads in execution plan when use_threads is false #15104

Conversation

westonpace
Copy link
Member

@westonpace westonpace commented Dec 28, 2022

This PR gets rid of the "executor==nullptr" style of execution from Acero. Now a CPU executor must always be defined. This also updates the DeclarationToXyz methods to support 5 "standard" modes of operation:

  • sync / threaded - All CPU work is done on the CPU thread pool. The calling thread sleeps until the plan is done.
  • sync / serial - All CPU work is done on the calling thread. The calling thread becomes a serial executor until plan is done.
  • async / threaded - All CPU work is done on the CPU thread pool. The calling thread returns immediately.
  • async / serial - All CPU work is done on a plan-specific 1-thread pool. The calling thread returns immediately.
  • async / custom exec context - All CPU work is done on the provided executor which cannot be nullptr and the caller is responsible for keeping it alive.

For backwards compatibility the "old style" of serial execution (which is still used by R/python) will still be accepted. Under the hood a 1-thread pool will be created and its lifetime will be bound to the exec plan (this is similar to async / serial when using DeclarationToXyz).

Since DeclarationToXyz doesn't support "ordered sinks" we cannot yet migrate the bindings and so they will continue to rely on the deprecated API.

All unit tests are updated to use the new API. Some unit tests were not actually running in parallel (even though they said they were) and now that they are running in parallel a few changes had to be made to accept / handle non-deterministic result order.

As part of this process some unit tests were migrated away from StartAndCollect in favor of DeclarationToXyz. Hopefully we can eventually migrate almost all of the unit tests to DeclarationToXyz but that is a longer term effort.

@github-actions
Copy link

@github-actions
Copy link

⚠️ Ticket has not been started in JIRA, please click 'Start Progress'.

@westonpace westonpace force-pushed the feature/ARROW-15732--proper-exec-plan-serial-mode branch from 9100491 to 7fdf8b3 Compare December 30, 2022 03:46
@westonpace westonpace marked this pull request as ready for review December 30, 2022 04:22
@westonpace westonpace requested a review from lidavidm December 30, 2022 06:23
Copy link
Member

@lidavidm lidavidm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Thanks for cleaning up the tests.

We do have a lot of different modes of execution, though I suppose it's just because of sync/async × threaded/serial.

@westonpace
Copy link
Member Author

Thanks for the quick review.

We do have a lot of different modes of execution, though I suppose it's just because of sync/async × threaded/serial.

Yep. The async-serial mode is newly introduced here and it would be nice to avoid it. However, I can't easily switch R/python over to sync-serial because I need to wait until we fix ordering (or add ordering to the DeclarationToXyz methods which I don't want to do).

Fixing ordering in this PR would make the PR too large so I think I'm stuck with this.

@ursabot
Copy link

ursabot commented Dec 31, 2022

Benchmark runs are scheduled for baseline = 237bc30 and contender = 498b645. 498b645 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Finished ⬇️0.0% ⬆️0.0%] ec2-t3-xlarge-us-east-2
[Failed ⬇️0.0% ⬆️0.0%] test-mac-arm
[Finished ⬇️2.3% ⬆️0.0%] ursa-i9-9960x
[Failed ⬇️0.0% ⬆️0.0%] ursa-thinkcentre-m75q
Buildkite builds:
[Finished] 498b645e ec2-t3-xlarge-us-east-2
[Failed] 498b645e test-mac-arm
[Finished] 498b645e ursa-i9-9960x
[Failed] 498b645e ursa-thinkcentre-m75q
[Finished] 237bc304 ec2-t3-xlarge-us-east-2
[Finished] 237bc304 test-mac-arm
[Finished] 237bc304 ursa-i9-9960x
[Finished] 237bc304 ursa-thinkcentre-m75q
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@ursabot
Copy link

ursabot commented Dec 31, 2022

['Python', 'R'] benchmarks have high level of regressions.
ursa-i9-9960x

EpsilonPrime pushed a commit to EpsilonPrime/arrow that referenced this pull request Jan 5, 2023
…use_threads is false (apache#15104)

This PR gets rid of the "executor==nullptr" style of execution from Acero.  Now a CPU executor must always be defined.  This also updates the DeclarationToXyz methods to support 5 "standard" modes of operation:

* sync / threaded - All CPU work is done on the CPU thread pool.  The calling thread sleeps until the plan is done.
* sync / serial - All CPU work is done on the calling thread.  The calling thread becomes a serial executor until plan is done.
* async / threaded - All CPU work is done on the CPU thread pool.  The calling thread returns immediately.
* async / serial - All CPU work is done on a plan-specific 1-thread pool.  The calling thread returns immediately.
* async / custom exec context - All CPU work is done on the provided executor which cannot be nullptr and the caller is responsible for keeping it alive.

For backwards compatibility the "old style" of serial execution (which is still used by R/python) will still be accepted.  Under the hood a 1-thread pool will be created and its lifetime will be bound to the exec plan (this is similar to async / serial when using DeclarationToXyz).

Since DeclarationToXyz doesn't support "ordered sinks" we cannot yet migrate the bindings and so they will continue to rely on the deprecated API.

All unit tests are updated to use the new API.  Some unit tests were not actually running in parallel (even though they said they were) and now that they are running in parallel a few changes had to be made to accept / handle non-deterministic result order.

As part of this process some unit tests were migrated away from StartAndCollect in favor of DeclarationToXyz.  Hopefully we can eventually migrate almost all of the unit tests to DeclarationToXyz but that is a longer term effort.

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
EpsilonPrime pushed a commit to EpsilonPrime/arrow that referenced this pull request Jan 5, 2023
…use_threads is false (apache#15104)

This PR gets rid of the "executor==nullptr" style of execution from Acero.  Now a CPU executor must always be defined.  This also updates the DeclarationToXyz methods to support 5 "standard" modes of operation:

* sync / threaded - All CPU work is done on the CPU thread pool.  The calling thread sleeps until the plan is done.
* sync / serial - All CPU work is done on the calling thread.  The calling thread becomes a serial executor until plan is done.
* async / threaded - All CPU work is done on the CPU thread pool.  The calling thread returns immediately.
* async / serial - All CPU work is done on a plan-specific 1-thread pool.  The calling thread returns immediately.
* async / custom exec context - All CPU work is done on the provided executor which cannot be nullptr and the caller is responsible for keeping it alive.

For backwards compatibility the "old style" of serial execution (which is still used by R/python) will still be accepted.  Under the hood a 1-thread pool will be created and its lifetime will be bound to the exec plan (this is similar to async / serial when using DeclarationToXyz).

Since DeclarationToXyz doesn't support "ordered sinks" we cannot yet migrate the bindings and so they will continue to rely on the deprecated API.

All unit tests are updated to use the new API.  Some unit tests were not actually running in parallel (even though they said they were) and now that they are running in parallel a few changes had to be made to accept / handle non-deterministic result order.

As part of this process some unit tests were migrated away from StartAndCollect in favor of DeclarationToXyz.  Hopefully we can eventually migrate almost all of the unit tests to DeclarationToXyz but that is a longer term effort.

Authored-by: Weston Pace <weston.pace@gmail.com>
Signed-off-by: Weston Pace <weston.pace@gmail.com>
# for free to join this conversation on GitHub. Already have an account? # to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants