Support affinity/pinning of parallel flows to different CPUs #1738

Open
marcosfsch opened this issue Jul 23, 2024 · 2 comments

Comments

@marcosfsch
Contributor

marcosfsch commented Jul 23, 2024

Enhancement Request

  • Current behavior
    iperf3 supports setting affinity to a single CPU, which made sense when it was single-threaded.

  • Desired behavior
    Support setting a range/list of CPUs for the affinity of multiple flows, e.g. "iperf3 -c localhost -P 4 -A 1-4,1-4"

  • Implementation notes
    There are at least two ways this can be implemented, and I'd suggest the second approach:

  1. Replicate numactl behavior, where "numactl -C 1-4" would bind each flow to the CPU range "1-4" and Linux would be responsible for the scheduling. This allows the flows to be rebalanced dynamically, but generates noise and performance drops.
  2. Statically schedule each flow to a different CPU, e.g. in a round-robin fashion, so that when you have a matching number of parallel threads and CPUs you'd have 1 flow per CPU, minimizing reschedules during the transfer and optimizing performance.
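To make the difference between the two approaches concrete, here is a minimal Python sketch of the affinity set each flow would end up with. The function names `shared_range_affinity` and `round_robin_affinity` are hypothetical and chosen only for illustration; this is not iperf3 code, and it only computes the mapping rather than pinning any threads.

```python
def shared_range_affinity(num_flows, cpus):
    """Approach 1 (numactl-like): every flow may run on any CPU in the list.

    The OS scheduler is free to migrate flows between these CPUs.
    """
    if not cpus:
        raise ValueError("CPU list must not be empty")
    return [set(cpus) for _ in range(num_flows)]


def round_robin_affinity(num_flows, cpus):
    """Approach 2 (static): flow i is pinned to exactly one CPU, cpus[i % len(cpus)].

    With as many CPUs as flows, this yields one flow per CPU.
    """
    if not cpus:
        raise ValueError("CPU list must not be empty")
    return [{cpus[i % len(cpus)]} for i in range(num_flows)]


if __name__ == "__main__":
    cpus = [1, 2, 3, 4]
    print(shared_range_affinity(4, cpus))  # each flow gets the full set
    print(round_robin_affinity(4, cpus))   # each flow gets a single CPU
```

If there are more flows than CPUs, the round-robin mapping wraps around, so some CPUs carry more than one flow.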

Another note: ideally you should be able to define an explicit CPU list, which normally uses a comma as the separator, e.g. "1,3,5,7". But this would require either I) changing the current client/server CPU separator (e.g. "iperf3 -c localhost -P 4 -A 1,3,5,7/2,4,6,8") or II) using a different delimiter character for the list.
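As a rough illustration of option I, the following Python sketch parses an argument where "/" separates the client and server sides and each side is a comma-separated list of CPUs or ranges. The syntax, the `parse_cpu_list` and `parse_affinity_arg` names, and the choice of "/" are assumptions taken from the example above; this is not how iperf3 currently parses -A.

```python
def parse_cpu_list(spec):
    """Expand a string such as "1-4" or "1,3,5,7" or "0-2,8" into a list of CPU ids."""
    cpus = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty entry in CPU list: {spec!r}")
        if "-" in part:
            # A range such as "1-4" is inclusive on both ends.
            low_text, high_text = part.split("-", 1)
            low, high = int(low_text), int(high_text)
            if low > high:
                raise ValueError(f"invalid CPU range: {part!r}")
            cpus.extend(range(low, high + 1))
        else:
            cpus.append(int(part))
    return cpus


def parse_affinity_arg(arg, side_sep="/"):
    """Split a hypothetical "client/server" affinity argument into two CPU lists.

    Returns (client_cpus, server_cpus); server_cpus is None if no server side was given.
    """
    client_spec, sep, server_spec = arg.partition(side_sep)
    client_cpus = parse_cpu_list(client_spec)
    server_cpus = parse_cpu_list(server_spec) if sep else None
    return client_cpus, server_cpus


if __name__ == "__main__":
    print(parse_affinity_arg("1,3,5,7/2,4,6,8"))
    print(parse_affinity_arg("1-4"))
```

Keeping the comma for the list and moving the client/server split to another character matches the usual CPU-list notation, at the cost of changing the meaning of existing -A arguments.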

@bmah888
Contributor

bmah888 commented Jul 29, 2024

You raise a good point in that the current -A behavior doesn't work very well in a multi-threaded iperf3. So far our standard practice within ESnet is just to do numactl, as you suggested in the first approach. How bad are the downsides you mentioned...have you or others observed these problems? (I don't think we have, but multithreaded application performance analysis is not my forte.)

In considering these different implementations, we also want to keep in mind other OS platforms that support -A, such as FreeBSD (a supported platform), and Windows (while not officially supported, an environment that I'd like to avoid gratuitously breaking).

@davidBar-On
Contributor

Submitted PR #1778 with a suggested multi-CPUs Affinity support.
