High pasta and rootlessport CPU load #23686

pwige opened this issue Aug 20, 2024 · 144 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. network Networking related issue or feature pasta pasta(1) bugs or features

Comments

@pwige

pwige commented Aug 20, 2024

Issue Description

I'm encountering unexpectedly high CPU load from pasta and rootlessport when running certain network operations.
Scenario 1 – Downloading large files:
Downloading Hetzner test files (although any large file download should do) from within a rootless container.
Scenario 2 – Wireguard VPN server:
Hosting a Wireguard server in a rootless container using the docker.io/linuxserver/wireguard image.

Steps to reproduce the issue

Steps to reproduce the issue:
Scenario 1:

  1. Run rootless container and download the test file to tmpfs:
    podman container run \
        --interactive \
        --tty \
        --rm \
        --tmpfs /mnt \
        --workdir /mnt \
        debian /bin/bash -c 'apt update && apt install -y wget && wget https://ash-speed.hetzner.com/10GB.bin'
    
  2. Monitor download speed reported by wget and CPU usage reported by htop.
    If Hetzner's Ashburn datacenter is far from you, check their other locations and modify the URL's subdomain as needed. A monitoring one-liner is sketched right after these steps.
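
A minimal monitoring sketch for step 2, assuming the pasta process is visible under its own name (on AVX2-capable hosts it may be reported as pasta.avx2 instead):

    # watch pasta's CPU usage once per second while wget runs
    watch -n 1 'ps -o pid,pcpu,args -C pasta,pasta.avx2'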

Scenario 2:

  1. Create a custom network.
    podman network create testnet
    
  2. Spawn a Wireguard server in a rootless container.
    podman container run \
        --detach \
        --rm \
        --cap-add NET_RAW \
        --cap-add NET_ADMIN \
        --sysctl net.ipv4.conf.all.forwarding=1 \
        --env PEERS=1 \
        --network=testnet \
        --publish 51820:51820/udp \
        docker.io/linuxserver/wireguard
    
  3. Install the client profile on a separate device using either the QR code provided in the server container's log or the peer1.conf file stored within the container itself.
  4. Run a speedtest first with multiple connections and then with only a single connection enabled.
  5. Observe CPU load via htop. (A quick check that the published port is bound is sketched after these steps.)
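
As a quick sanity check for the published mapping from step 2, a hedged one-liner (assuming the default 51820/udp mapping shown above) to confirm the port is actually bound on the host:

    # run as the same rootless user; the owning helper process should be listed
    ss -ulnp | grep 51820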

Describe the results you received

Scenario 1:
When downloading Hetzner test files with wget in a rootless container I reach consistent download speeds of ~43 MiB/s, but htop reports the pasta process at ~40-43% CPU load.

Scenario 2:
Running a network speedtest on the client through the VPN results in extremely slow network speeds reported on the client and very high CPU load on the server. htop shows 90-100% CPU load from the pasta process and 10-17% CPU load from several rootlessport processes.

Wireguard's poor performance renders it essentially unusable. I was told the CPU load is unexpectedly high and that I should raise a bug report here. Please let me know if what I am encountering is actually to be expected from rootless containers.

Describe the results you expected

I expected less severe CPU load from the rootless networking backend and better performance/network speeds via the Wireguard VPN tunnel.

podman info output

host:
  arch: amd64
  buildahVersion: 1.37.1
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-1:2.1.12-1
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: e8896631295ccb0bfdda4284f1751be19b483264'
  cpuUtilization:
    idlePercent: 96.76
    systemPercent: 2.25
    userPercent: 0.99
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: arch
    version: unknown
  eventLogger: journald
  freeLocks: 2046
  hostname: fern
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1003
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    - container_id: 65537
      host_id: 165536
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 100000
      size: 65536
    - container_id: 65537
      host_id: 165536
      size: 65536
  kernel: 6.10.5-arch1-1
  linkmode: dynamic
  logDriver: journald
  memFree: 15708397568
  memTotal: 16651931648
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1
      path: /usr/lib/podman/aardvark-dns
      version: aardvark-dns 1.12.1
    package: netavark-1.12.2-1
    path: /usr/lib/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.16.1-1
    path: /usr/bin/crun
    version: |-
      crun version 1.16.1
      commit: afa829ca0122bd5e1d67f1f38e6cc348027e3c32
      rundir: /run/user/1001/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-2024_08_14.61c0b0d-1
    version: |
      pasta 2024_08_14.61c0b0d
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: false
    path: /run/user/1001/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /etc/containers/seccomp.json
    selinuxEnabled: false
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 6442446848
  swapTotal: 6442446848
  uptime: 0h 31m 3.00s
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries: {}
store:
  configFile: /home/containers/.config/containers/storage.conf
  containerStore:
    number: 1
    paused: 0
    running: 1
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /storage/containers/storage
  graphRootAllocated: 6001156685824
  graphRootUsed: 94199808
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 3
  runRoot: /run/user/1001/containers
  transientStore: false
  volumePath: /storage/containers/storage/volumes
version:
  APIVersion: 5.2.1
  Built: 1723672228
  BuiltTime: Wed Aug 14 17:50:28 2024
  GitCommit: d0582c9e1e6c80cc08c3f042b91993c853ddcbc6
  GoVersion: go1.23.0
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.1

Podman in a container

No

Privileged Or Rootless

Rootless

Upstream Latest Release

Yes

Additional environment details

Running a fresh Arch installation on an Intel NUC5i7RYH, which is connected to my home's router via ethernet. No other services or containers running during testing.

Additional information

The experience I described above is consistent.

@pwige pwige added the kind/bug Categorizes issue or PR as related to a bug. label Aug 20, 2024
@sbrivio-rh sbrivio-rh added pasta pasta(1) bugs or features network Networking related issue or feature labels Aug 20, 2024
@sbrivio-rh
Collaborator

Thanks for reporting this!

Scenario 2: Running a network speedtest on the client using results in extremely slow network speeds reported on the client and very high CPU load on the server. htop shows 90-100% CPU load from the pasta process and 10-17% CPU load from several rootlessport processes.

I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.
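
For reference, a hedged way to gather that information, assuming the helpers keep their default process names (the pattern also matches rootlessport-child):

# count the rootlessport helpers, then show their PIDs and full command lines
pgrep -c rootlessport
pgrep -af rootlessport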

@sbrivio-rh
Collaborator

About Scenario 1: I can reproduce something similar, in my case downloading from Hetzner's fsn1 location. I download at approximately 1 Gbps (110 MB/s) and pasta uses approximately 70% of a CPU thread (wget is at ~50%).

The good news is that with higher transfer rates I get, on the same system, approximately 14 Gbps at 100% CPU thread load, and if I CPU-starve pasta with the test file download, I get approximately the same transfer rates. That is, if we have bigger chunks of data to transfer (because we have less CPU time), the CPU load doesn't increase linearly, so I don't see a practical issue with it.

Anyway, this is what perf:

$ perf record -g ./pasta --config-net
# wget https://fsn1-speed.hetzner.com/1GB.bin -O /dev/null
--2024-08-21 10:07:11--  https://fsn1-speed.hetzner.com/1GB.bin
Resolving fsn1-speed.hetzner.com (fsn1-speed.hetzner.com)... 2a01:4f8:0:a232::2, 78.46.170.2
Connecting to fsn1-speed.hetzner.com (fsn1-speed.hetzner.com)|2a01:4f8:0:a232::2|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1073741824 (1.0G) [application/octet-stream]
Saving to: ‘/dev/null’

/dev/null           100%[===================>]   1.00G   109MB/s    in 9.4s    

2024-08-21 10:07:21 (109 MB/s) - ‘/dev/null’ saved [1073741824/1073741824]

# 
logout
[ perf record: Woken up 18 times to write data ]
[ perf record: Captured and wrote 5.080 MB perf.data (37680 samples) ]
$ perf report

says about it:

Samples: 37K of event 'cycles', Event count (approx.): 30751209003
  Children      Self  Command     Shared Object         Symbol
-   80.23%     0.56%  passt.avx2  [kernel.kallsyms]     [k] entry_SYSCALL_64_af◆
     79.67% entry_SYSCALL_64_after_hwframe                                     ▒
      - do_syscall_64                                                          ▒
         + 20.90% do_writev                                                    ▒
         + 19.63% __sys_recvmsg                                                ▒
         + 13.53% __x64_sys_epoll_wait                                         ▒
         + 5.35% ksys_read                                                     ▒
         + 4.93% __x64_sys_epoll_ctl                                           ▒
         + 4.01% __x64_sys_timerfd_settime                                     ▒
         + 3.86% __x64_sys_recvfrom                                            ▒
         + 3.29% syscall_trace_enter.constprop.0                               ▒
         + 2.75% syscall_exit_to_user_mode                                     ▒

do_writev() is the system call writing to the tap interface of the container (we have to write one frame at a time, because tap gives us a file descriptor, not a socket), and __sys_recvmsg() are the reads from the socket.

There isn't much we can do about that, except for a planned VDUSE ¹ ² back-end. At that point, we'll get rid of those system calls and just "move" data between a shared memory ring and sockets.

The rest is something we could probably try to improve on in the shorter term, for example by trying to get bigger chunks of data at a time and reducing the wakeup frequency (__x64_sys_epoll_wait() and friends). I tried this:

diff --git a/tcp_buf.c b/tcp_buf.c
index c31e9f3..c15de64 100644
--- a/tcp_buf.c
+++ b/tcp_buf.c
@@ -477,6 +477,11 @@ int tcp_buf_data_from_sock(struct ctx *c, struct tcp_tap_conn *conn)
 		len = recvmsg(s, &mh_sock, MSG_PEEK);
 	while (len < 0 && errno == EINTR);
 
+	if (len > (1 << 16)) {
+		int lowat = 1 << 17;
+		setsockopt(s, SOL_SOCKET, SO_RCVLOWAT, &lowat, sizeof(lowat));
+	}
+
 	if (len < 0)
 		goto err;
 

CPU load is 15-18% at the same transfer rate. The transfer hangs at the end (this would need the same adaptive logic we have in tcp_splice.c), and we substantially decrease overhead from wakeups and bookkeeping:

  Children      Self  Command     Shared Object         Symbol
-   91.64%     0.12%  passt.avx2  [kernel.kallsyms]     [k] entry_SYSCALL_64_af◆
     91.53% entry_SYSCALL_64_after_hwframe                                     ▒
      - do_syscall_64                                                          ▒
         + 36.24% __sys_recvmsg                                                ▒
         + 24.94% ksys_read                                                    ▒
         + 20.90% do_writev                                                    ▒
         + 3.35% __x64_sys_epoll_wait                                          ▒
         + 1.93% __x64_sys_recvfrom                                            ▒
         + 1.08% __x64_sys_timerfd_settime                                     ▒
           0.73% syscall_exit_to_user_mode                                     ▒
           0.62% syscall_trace_enter.constprop.0                               ▒
           0.51% __x64_sys_setsockopt                                          ▒

So this is something we could consider, even though, again, that CPU load shouldn't look that scary, because it's quite adaptive.

I didn't look into Scenario 2 yet.

@pwige
Author

pwige commented Aug 21, 2024

I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.

After rebooting my system to start with a clean slate, I started my Wireguard container and observed eight rootlessport and six rootlessport-child processes spawn. I spent some time on the client using the network: Opening webpages, downloading Hetzner test files, etc. After several minutes I noticed that a couple more rootlessport processes had appeared, bringing the total number to ten. All of the rootlessport and rootlessport-child processes were reaped once I stopped the container.

As an additional piece of info, I am using Quadlet (systemctl --user start wireguard) to manage this container.

# wireguard.container
[Unit]
Description=Wireguard server quadlet

[Container]
Image=docker.io/linuxserver/wireguard:latest
AddCapability=NET_ADMIN
AddCapability=NET_RAW
Sysctl=net.ipv4.conf.all.forwarding=1
Environment=PEERS=1
Environment=INTERNAL_SUBNET=10.30.0.0/24
# Volume=wireguard-config.volume:/config
Network=wireguard.network
PublishPort=51820:51820/udp

[Service]
Restart=always

[Install]
WantedBy=multi-user.target default.target

No options aside from the [Network] section header are set in the wireguard.network file; its full content is sketched below.
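
For completeness, the wireguard.network file as described would then contain nothing more than:

# wireguard.network: no options, only the section header
[Network]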

@kalelkenobi

kalelkenobi commented Aug 22, 2024

Jumping in to say that I'm experiencing the exact same issue, running an almost identical configuration to OP's (I'm also using Quadlet to run the VPN container). Feels good not to be alone; this thing was driving me crazy. As a piece of additional information that could help zero in on this, I'll say that I've been running this configuration for more than four months now and the issue started only recently (the last month or so).
Here's my podman info:

host:
  arch: amd64
  buildahVersion: 1.37.1
  cgroupControllers:
  - cpu
  - memory
  - pids
  cgroupManager: systemd
  cgroupVersion: v2
  conmon:
    package: conmon-2.1.12-1.fc40.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.1.12, commit: '
  cpuUtilization:
    idlePercent: 98.31
    systemPercent: 1.14
    userPercent: 0.55
  cpus: 4
  databaseBackend: sqlite
  distribution:
    distribution: fedora
    variant: cloud
    version: "40"
  eventLogger: journald
  freeLocks: 2040
  hostname: hekate
  idMappings:
    gidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 589824
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1001
      size: 1
    - container_id: 1
      host_id: 589824
      size: 65536
  kernel: 6.10.6-200.fc40.x86_64
  linkmode: dynamic
  logDriver: journald
  memFree: 933859328
  memTotal: 6203207680
  networkBackend: netavark
  networkBackendInfo:
    backend: netavark
    dns:
      package: aardvark-dns-1.12.1-1.fc40.x86_64
      path: /usr/libexec/podman/aardvark-dns
      version: aardvark-dns 1.12.1
    package: netavark-1.12.2-1.fc40.x86_64
    path: /usr/libexec/podman/netavark
    version: netavark 1.12.2
  ociRuntime:
    name: crun
    package: crun-1.15-1.fc40.x86_64
    path: /usr/bin/crun
    version: |-
      crun version 1.15
      commit: e6eacaf4034e84185fd8780ac9262bbf57082278
      rundir: /run/user/1001/crun
      spec: 1.0.0
      +SYSTEMD +SELINUX +APPARMOR +CAP +SECCOMP +EBPF +CRIU +LIBKRUN +WASM:wasmedge +YAJL
  os: linux
  pasta:
    executable: /usr/bin/pasta
    package: passt-0^20240814.g61c0b0d-1.fc40.x86_64
    version: |
      pasta 0^20240814.g61c0b0d-1.fc40.x86_64
      Copyright Red Hat
      GNU General Public License, version 2 or later
        <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
      This is free software: you are free to change and redistribute it.
      There is NO WARRANTY, to the extent permitted by law.
  remoteSocket:
    exists: true
    path: /run/user/1001/podman/podman.sock
  rootlessNetworkCmd: pasta
  security:
    apparmorEnabled: false
    capabilities: CAP_CHOWN,CAP_DAC_OVERRIDE,CAP_FOWNER,CAP_FSETID,CAP_KILL,CAP_NET_BIND_SERVICE,CAP_SETFCAP,CAP_SETGID,CAP_SETPCAP,CAP_SETUID,CAP_SYS_CHROOT
    rootless: true
    seccompEnabled: true
    seccompProfilePath: /usr/share/containers/seccomp.json
    selinuxEnabled: true
  serviceIsRemote: false
  slirp4netns:
    executable: ""
    package: ""
    version: ""
  swapFree: 6202322944
  swapTotal: 6202322944
  uptime: 24h 59m 31.00s (Approximately 1.00 days)
  variant: ""
plugins:
  authorization: null
  log:
  - k8s-file
  - none
  - passthrough
  - journald
  network:
  - bridge
  - macvlan
  - ipvlan
  volume:
  - local
registries:
  search:
  - registry.fedoraproject.org
  - registry.access.redhat.com
  - docker.io
store:
  configFile: /home/podman/.config/containers/storage.conf
  containerStore:
    number: 4
    paused: 0
    running: 4
    stopped: 0
  graphDriverName: overlay
  graphOptions: {}
  graphRoot: /home/podman/.local/share/containers/storage
  graphRootAllocated: 428340129792
  graphRootUsed: 24915517440
  graphStatus:
    Backing Filesystem: btrfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Supports shifting: "false"
    Supports volatile: "true"
    Using metacopy: "false"
  imageCopyTmpDir: /var/tmp
  imageStore:
    number: 6
  runRoot: /run/user/1001/containers
  transientStore: false
  volumePath: /home/podman/.local/share/containers/storage/volumes
version:
  APIVersion: 5.2.1
  Built: 1723593600
  BuiltTime: Wed Aug 14 02:00:00 2024
  GitCommit: ""
  GoVersion: go1.22.5
  Os: linux
  OsArch: linux/amd64
  Version: 5.2.1

please let me know if there's anything else I can provide to help with this.

@sbrivio-rh
Collaborator

I'll say that I've been running this configuration for more than four months now and the issue started only recently (the last month or so)

Do you happen to know starting from which version of Podman and pasta (the package is named passt) you are seeing this?

@kalelkenobi

I'll say that I've been running this configuration for more than four months now and the issue started only recently (the last month or so)

Do you happen to know starting from which version of Podman and pasta (the package is named passt) you are seeing this?

I started noticing the issue around the beginning of August. Looking at my DNF history I can make some educated guesses.
It could be either this update (applied on July 25th):

Packages Altered:
    Upgrade  podman-5:5.2.0~rc2-1.fc40.x86_64            @updates-testing
    Upgraded podman-5:5.1.2-1.fc40.x86_64                @@System

or this one (August 4th):

Packages Altered:
    Upgrade  aardvark-dns-2:1.12.1-1.fc40.x86_64        @updates-testing
    Upgraded aardvark-dns-2:1.11.0-3.fc40.x86_64        @@System
    Upgrade  glibc-2.39-22.fc40.x86_64                  @updates-testing
    Upgraded glibc-2.39-17.fc40.x86_64                  @@System
    Upgrade  glibc-common-2.39-22.fc40.x86_64           @updates-testing
    Upgraded glibc-common-2.39-17.fc40.x86_64           @@System
    Upgrade  glibc-gconv-extra-2.39-22.fc40.x86_64      @updates-testing
    Upgraded glibc-gconv-extra-2.39-17.fc40.x86_64      @@System
    Upgrade  glibc-langpack-en-2.39-22.fc40.x86_64      @updates-testing
    Upgraded glibc-langpack-en-2.39-17.fc40.x86_64      @@System
    Upgrade  glibc-locale-source-2.39-22.fc40.x86_64    @updates-testing
    Upgraded glibc-locale-source-2.39-17.fc40.x86_64    @@System
    Upgrade  glibc-minimal-langpack-2.39-22.fc40.x86_64 @updates-testing
    Upgraded glibc-minimal-langpack-2.39-17.fc40.x86_64 @@System
    Upgrade  libgcc-14.2.1-1.fc40.x86_64                @updates-testing
    Upgraded libgcc-14.1.1-7.fc40.x86_64                @@System
    Upgrade  libgomp-14.2.1-1.fc40.x86_64               @updates-testing
    Upgraded libgomp-14.1.1-7.fc40.x86_64               @@System
    Upgrade  libstdc++-14.2.1-1.fc40.x86_64             @updates-testing
    Upgraded libstdc++-14.1.1-7.fc40.x86_64             @@System
    Upgrade  netavark-2:1.12.1-1.fc40.x86_64            @updates-testing
    Upgraded netavark-2:1.11.0-3.fc40.x86_64            @@System
    Upgrade  podman-5:5.2.0-1.fc40.x86_64               @updates-testing
    Upgraded podman-5:5.2.0~rc2-1.fc40.x86_64           @@System

The only update I see for passt seems too recent (August 16th).

@kalelkenobi

kalelkenobi commented Aug 22, 2024

My bad, there was another update on July 28th. Sorry I missed it.

Packages Altered:
    Upgrade       passt-0^20240726.g57a21d2-1.fc40.x86_64                   @updates-testing
    Upgraded      passt-0^20240624.g1ee2eca-1.fc40.x86_64                   @@System
    Upgrade       passt-selinux-0^20240726.g57a21d2-1.fc40.noarch           @updates-testing
    Upgraded      passt-selinux-0^20240624.g1ee2eca-1.fc40.noarch           @@System

The one before that was on June 25th, but it feels too early, and the one after was on August 9th (too late).
Hope this helps.

@sbrivio-rh
Collaborator

Any chance you could try downgrading to passt-0^20240624.g1ee2eca-1.fc40.x86_64 just for a quick test and see if that helps with CPU load?

@kalelkenobi

Any chance you could try downgrading to passt-0^20240624.g1ee2eca-1.fc40.x86_64 just for a quick test and see if that helps with CPU load?

I can't seem to find the old package, not easily at least. Do you know where I can get it?

@kalelkenobi

kalelkenobi commented Aug 22, 2024

I managed to downgrade to passt 0^20240326.g4988e2b-1.fc40.x86_64 and the issue seems in fact mostly resolved. I still see a lot of rootlessport processes (see below), but CPU usage has improved greatly and the server is back to being usable.

podman      1138  0.5  0.4  72424 27420 ?        Ss   14:16   0:01 /usr/bin/pasta --config-net --address 10.0.2.0 --netmask 24 --gateway 10.0.2.2 --dns-forward 10.0.2.3 --pid /run/user/1001/containers/networks/rootless-netns/rootless-netns-conn.pid -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1001/containers/networks/rootless-netns/rootless-netns
podman      1263  0.0  0.0 1745796 5268 ?        Sl   14:16   0:00 rootlessport
podman      1273  0.0  0.0 1598332 4836 ?        Sl   14:16   0:00 rootlessport-child
podman      1377  1.0  0.0 1819784 5696 ?        Sl   14:16   0:02 rootlessport
podman      1382  0.0  0.0 1524344 4988 ?        Sl   14:16   0:00 rootlessport-child

@kalelkenobi

It is notably better, but I can't say for certain that it is back to "normal". I will have to invest some more time checking and testing.

@sbrivio-rh
Collaborator

I can't seem to find the old package, not easily at least. Do you know where I can get it?

I'm not sure how you solved this, or for how long Fedora packages are retained, but we keep a permanent mirror of Copr builds (including Fedora packages) at https://passt.top/builds/copr/.
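
A hedged downgrade sketch using that mirror (browse https://passt.top/builds/copr/ for the matching Fedora 40 build first; the exact directory is not reproduced here):

# after downloading the older RPM locally, force the downgrade with rpm;
# passt-selinux may need the matching version as well
sudo rpm -Uvh --oldpackage ./passt-0^20240624.g1ee2eca-1.fc40.x86_64.rpm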

I managed to downgrade to passt 0^20240326.g4988e2b-1.fc40.x86_64 and the issue seems in fact mostly resolved.

It is notably better, but I can't say for certain that it is back to "normal". I will have to invest some more time checking and testing.

Thanks, this is extremely helpful.

@kalelkenobi

Thank you! I used the mirror and I was able to downgrade to passt-0^20240624.g1ee2eca-1.fc40.x86_64. I can confirm that this version is working fine (as far as I can tell).
I also did a quick test and updated back to the latest version available on Fedora 40 (passt-0^20240814.g61c0b0d-1.fc40.x86_64) and CPU usage is back to being abnormally high (60%-70% with only wireguard running). Pasta (see below) is also back to being the process with the highest CPU usage.

/usr/bin/pasta --config-net --address 10.0.2.0 --netmask 24 --gateway 10.0.2.2 --dns-forward 10.0.2.3 --pid /run/user/1001/containers/networks/rootless-netns/rootless-netns-conn.pid -t none -u none -T none -U none --no-map-gw --quiet --netns /run/user/1001/containers/networks/rootless-netns/rootless-netns

@kalelkenobi

I tested passt-0^20240726.g57a21d2-1.fc40.x86_64 and it looks like this is the first problematic version. I hope this helps narrow down the problem.

@Luap99
Member

Luap99 commented Aug 22, 2024

I wouldn't expect rootlessport processes to be around in this case. How many are running? Can you have a look at their command line? I'm wondering if there's some unintended loop between port mappings or suchlike.

After rebooting my system to start with a clean slate, I started my Wireguard container and observed eight rootlessport and six rootlessport-child processes spawn. I spent some time on the client using the network: Opening webpages, downloading Hetzner test files, etc. After several minutes I noticed that a couple more rootlessport processes had appeared, bringing the total number to ten. All of the rootlessport and rootlessport-child processes were reaped once I stopped the container.

Your original reproducer didn't show you using a custom network; custom networks use a different architecture from the default pasta network mode, see #22943 (comment).

So using rootlessport is normal when using custom networks. And 10-17% CPU doesn't sound particularly high to me: the process must proxy all data, which is not exactly cheap.

But I do agree that the pasta numbers look way too high for the amount of throughput.

@dgibson
Collaborator

dgibson commented Aug 27, 2024

@pwige Hi. I'm looking specifically at Scenario 2. As @sbrivio-rh notes, that looks like a regression caused by the flow table, which I implemented.

Unfortunately, I haven't yet been able to reproduce the problem. I ran the wireguard container as described and ran a speedtest through it. While the transfer speeds were significantly slower than without the tunnel, they were still respectable (6-18 Mbps depending on the exact variant). pasta CPU usage topped out at 10-20%. I'm not seeing any rootlessport processes. There are a lot of possible differences between my setup and yours, so I don't have a quick guess as to which one is making the difference.

So, here's a grab bag of questions, hoping that something will give a clue as to what the triggering circumstances are:

  1. Can you roughly quantify the "extremely slow" speeds you're seeing?
    1.1. Roughly what throughput does speedtest report through wireguard+pasta?
    1.2. Roughly what throughput does it report from the same client without the wireguard tunnel?
    1.3. Roughly what throughput is reported using wireguard+pasta, but with the older pasta version (passt-0^20240624.g1ee2eca-1.fc40.x86_64 or earlier)?
  2. What's the connection between the client machine and the wireguard server? Are they different containers (or VMs) on the same host? Is the client on a physically different machine on the same physical network? On an entirely different network?
  3. Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?
  4. What sort of host is the wireguard container running on? In particular how many cores & threads does it have?

@dgibson
Collaborator

dgibson commented Aug 27, 2024

@pwige Oh, sorry, one more question (5):

Run a speedtest first with multiple connections and then with only a single connection enabled.

I'm not entirely sure what you mean here. I couldn't see any obvious options for multiple vs. single connections.

@dgibson
Collaborator

dgibson commented Aug 27, 2024

@pwige Oh, sorry, one more question (5):

Run a speedtest first with multiple connections and then with only a single connection enabled.

I'm not entirely sure what you mean here. I couldn't see any obvious options for multiple vs. single connections.

Sorry, somehow missed that option on the speedtest page. Found it now.

Did some further testing, and I sometimes see pasta CPU load up to 45-50%, but throughput still seems respectable.

@Luap99
Member

Luap99 commented Aug 27, 2024

Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?

Rootlessport is used when using bridge mode (i.e. custom networks) and is started before the container. The rootlessport processes handle all port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta in this case is only the connections initiated from the container (i.e. the ones that are not a direct reply to the IP/port handled by rootlessport). So in scenario 2 it would very much depend on how wireguard binds/connects its UDP sockets to know how the connection is flowing.
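
A hedged way to see how those sockets look from inside the rootless network namespace that pasta serves (assumes a podman recent enough to have this flag; older releases call it --rootless-cni):

# list the TCP/UDP listening sockets as seen inside the rootless netns
podman unshare --rootless-netns ss -tulnp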

@grtcdr

grtcdr commented Aug 27, 2024

I'm using version 2024_08_21.1d6142f and also experiencing very high load, though my use case is a bit different.

I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default.

passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark.

Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.

EDIT: I should've mentioned that this problem was happening using the default podman bridge network. I say this because @kalelkenobi reportedly (see later comments) only had problems using a custom network.

@dgibson
Collaborator

dgibson commented Aug 28, 2024

Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?

Rootlessport is used when using bridge mode (i.e. custom networks) and is started before the container. The rootlessport processes handle all port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta in this case is only the connections initiated from the container (i.e. the ones that are not a direct reply to the IP/port handled by rootlessport). So in scenario 2 it would very much depend on how wireguard binds/connects its UDP sockets to know how the connection is flowing.

Ok, so I'm a bit confused at this point. The original instructions for reproducing don't seem to be setting up a custom network - just starting a regular container. However the presence of rootlessport seems to indicate that there is a custom network in play. @pwige can you shed some light on what network configuration you're using?

@dgibson
Collaborator

dgibson commented Aug 28, 2024

I'm using version 2024_08_21.1d6142f and also experiencing very high load, my use case is a bit different.

I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default.

passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark.

Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.

Huh, interesting. The good news is this strongly suggests we're doing something Just Plain Wrong (as opposed to a more subtle inefficiency), and once we find it, it should be a relatively easy fix. The bad news is that I don't yet have any good theories as to what's going on.

@kalelkenobi

Did some further testing, thanks to you guys pointing out that rootlessport is only needed in cases where a custom network is used, and I can confirm that the problem is NOT present when using the default network (or the host network). Sorry if I've not mentioned this before, but this was out of sheer ignorance on my part. Please let me know if there's anything else I can test or information I can provide about this.

P.S. This whole thing has been hugely educational for me and has made me realize that my setup (where I have multiple containers using a custom network) is not very efficient CPU-wise. Could you guys give me a pointer as to what would be a better way to have multiple containers communicate with each other? If that needs to be a longer conversation I'd be happy to post a question in Discussions or do some further research myself. Thanks anyway :)

@pwige
Author

pwige commented Aug 28, 2024

@dgibson Hi!

1. Can you roughly quantify the "extremely slow" speeds you're seeing?
   1.1. Roughly what throughput does speedtest report through wireguard+pasta?
   1.2. Roughly what throughput does it report from the same client without the wireguard tunnel?

The extremely slow speeds I had previously seen were around 0.5 MB/s down and 0.1 MB/s up. A "clean" speedtest where I'm not connected to the VPN gives me ~450 MB/s down and ~165 MB/s up.

Here are two videos showing the speedtest on the client and the CPU usage on the server as they happened in real-time. The client was my laptop, which exists on the same physical network as the server.

multi-connection.mp4
single-connection.mp4

Despite the speeds shown here being not nearly as bad as they were previously (no complaints here), the CPU usage isn't noticeably different. I'm not sure what could account for the change in throughput other than general instability from my ISP, as I haven't changed anything on my end. It makes me a bit embarrassed tbh. I haven't had the opportunity today to perform this test when connected to a different physical network, but may have the chance to tomorrow.

   1.3. Roughly what throughput is reported using wireguard+pasta, but with the older pasta version (`passt-0^20240624.g1ee2eca-1.fc40.x86_64` or earlier)?

I'm on Arch, so I can't use any of the Copr builds. I can build from source at tag 2024_06_24.1ee2eca or 2024_06_07.8a83b53. Please let me know if simply copying the resulting passt binary to /usr/local/bin is enough for testing. Thanks.
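
In case it helps, a hedged sketch of building pasta at one of those tags (the passt Makefile builds passt and creates pasta as a link to it; which pasta binary podman actually picks up depends on its PATH and on helper_binaries_dir in containers.conf, so double-check after installing):

git clone https://passt.top/passt
cd passt
git checkout 2024_06_24.1ee2eca
make
# install the freshly built binary somewhere that precedes /usr/bin for podman,
# e.g. /usr/local/bin, then verify what resolves first
sudo install -m 0755 pasta /usr/local/bin/pasta
command -v pasta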

2. What's the connection between the client machine and the wireguard server?  Are they different containers (or VMs) on the same host?  Is the client on a physically different machine on the same physical network?  On an entirely different network?

The server is running in a rootless container directly on the host. The client(s) are other devices connected to the same physical network. As mentioned above, I might have the chance to test from different physical networks tomorrow.

3. Do the rootlessport processes appear immediately after starting the wireguard container?  Or are they only created once traffic starts?

The first eight appear immediately after starting the container. I can get an additional rootlessport process to spawn when I first connect a client to the wireguard vpn. The maximum I've seen is ten.

4. What sort of host is the wireguard container running on?  In particular how many cores & threads does it have?

I am using an Intel NUC5i7RYH, with the containers running directly on the host. I haven't imposed any resource limits on the container, so it has unimpeded access to both cores (4 threads) and all 16 GB of RAM. The device itself is connected to my home LAN via ethernet.

@pwige
Author

pwige commented Aug 28, 2024

Do the rootlessport processes appear immediately after starting the wireguard container? Or are they only created once traffic starts?

Rootlessport is used when using bridge mode (i.e. custom networks) and is started before the container. The rootlessport processes handle all port-forwarding traffic in this case (so no pasta port forward option is involved). The traffic that goes through pasta in this case is only the connections initiated from the container (i.e. the ones that are not a direct reply to the IP/port handled by rootlessport). So in scenario 2 it would very much depend on how wireguard binds/connects its UDP sockets to know how the connection is flowing.

Ok, so I'm a bit confused at this point. The original instructions for reproducing don't seem to be setting up a custom network - just starting a regular container. However the presence of rootlessport seems to indicate that there is a custom network in play. @pwige can you shed some light on what network configuration you're using?

The steps I provided to reproduce the issue don't use a custom network, which is something I am doing in "production." That's an oversight on my part. Sorry! I've updated the issue's original text to reflect this.

My network configuration is defined in a Quadlet .network file containing nothing but an empty [Network] section. Here is the output of podman network inspect:

[
     {
          "name": "systemd-wireguard",
          "id": "41f1cabb2b1a194ec63a730798a6d972cba65b22699be76714cf6259558c207c",
          "driver": "bridge",
          "network_interface": "podman1",
          "created": "2024-08-27T17:36:04.729533184-04:00",
          "subnets": [
               {
                    "subnet": "10.89.0.0/24",
                    "gateway": "10.89.0.1"
               }
          ],
          "ipv6_enabled": false,
          "internal": false,
          "dns_enabled": true,
          "ipam_options": {
               "driver": "host-local"
          },
          "containers": {
               "f9bacc5a3db909b3362249fa39088241d3a842cb3d3868d04d410c6cf3fbe53d": {
                    "name": "systemd-wireguard",
                    "interfaces": {
                         "eth0": {
                              "subnets": [
                                   {
                                        "ipnet": "10.89.0.7/24",
                                        "gateway": "10.89.0.1"
                                   }
                              ],
                              "mac_address": "3a:4a:98:4d:6e:7f"
                         }
                    }
               }
          }
     }
]

@sbrivio-rh
Collaborator

By the way:

I'm running qbittorrentofficial/qbittorrent-nox rootless and I'm observing 100% CPU load for the passt process despite setting the network concurrency settings of the program in question to lower values than default.

passt floats at about 35% CPU load for some time and then after a couple of hours hits the 100% mark.

Pausing the seeding torrents does not have an effect on the load of the process, so despite receiving no substantial traffic, the process continues to saturate its thread.

I tried to reproduce this case, but I couldn't, at least not yet. Perhaps it's because of the torrent content I chose (Linux distribution images): I download very fast (50-100 MB/s, and pasta reaches 40-50% load on a CPU thread), but seeding is not constantly using that much bandwidth. I see spikes in upload rates from time to time, but otherwise I'm constantly seeding at a few dozen KB/s to a handful of peers.

It would be interesting if you could share the output of one second or less of strace -f -p <pasta's PID> when the CPU load is unexpectedly high (that is, when you don't expect any substantial traffic). You'll need to run strace as root, because pasta makes itself un-ptrace()able by regular users.
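
For example, something along these lines (run as root; the pgrep pattern assumes the pasta command line shown earlier in this thread, so adjust it if several pasta instances are running):

# capture roughly one second of syscalls, with relative timestamps (-r)
sudo timeout 1 strace -f -r -p "$(pgrep -f 'pasta --config-net' | head -n1)" -o pasta-idle.strace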

@dgibson
Collaborator

dgibson commented Aug 29, 2024

@pwige, thanks for the additional information. Between your comment and the one from @kalelkenobi it certainly seems like this is triggered by the custom network setup. I suspect maybe some sort of forwarding loop between pasta and rootlesskit.

I'm not really familiar with quadlet and custom network configuration though, @Luap99 any chance you could interpret @pwige's configuration into specific commands I'd need to run to create something similar?

@luckylinux

The new passt version 2025_01_21.4f2c8e7 contains a number of fixes for issues similar to this one. To whoever is still experiencing this issue: can you please give it a try? Thanks.

@sbrivio-rh: I gave it a try now on Proxmox VE, where I build everything Podman-related from source (I just upgraded to Podman 5.4.0, among other things).

I'd say the CPU load related to pasta and the Frigate container is approximately the same (20-30% CPU usage when there really isn't much activity at all, just 5 CCTV cameras and 1-2 clients).

@sbrivio-rh
Collaborator

sbrivio-rh commented Feb 15, 2025

@sbrivio-rh I found 2 commands you can run to reproduce it locally (assuming you have an audiobook to stream)

Thanks! I'll give that a try.

So, I tried a while ago, but I can't reproduce this. I used some sample MP3 files, not actual audiobooks. I streamed them for a while, and yes, I see splice() system calls for rather small quantities of data. However, they're not frequent enough to cause any noticeable CPU load. They come at most every few dozen milliseconds.

@kylem0 I would have three requests:

  • you attached a flame chart, but the names of the system calls are truncated. Is it possible to have one (even in text format) displaying the actual names?
  • would it be possible to have a strace output taken with -r (without grepping for a particular pattern, just an excerpt of some milliseconds) so that I can see relative timestamps and how frequent those system calls are?
  • separately (without strace): podman run ... --net=pasta:--trace,-l,/tmp/pasta.log and share a few hundred lines of /tmp/pasta.log (an example invocation is sketched below)

Thanks.
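
A sketch of that third request (image, container command and log path are placeholders; the pasta options are appended after "pasta:" in the --network value):

podman run --rm --network=pasta:--trace,-l,/tmp/pasta.log debian sleep 600
# then grab a slice of the log to share
head -n 500 /tmp/pasta.log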

@luckylinux

@sbrivio-rh

I tried a bit of strace on pasta where I have the Frigate container with approx. 20-30% CPU usage, even after upgrading to the latest pasta version (pasta 2025_01_21.4f2c8e7-44-g71249ef).

Find the Process ID using:

ps -axw -eo pid,pcpu,cmd,args | grep -i pasta

Trace Command:

strace -f -p 204416 -o strace_pasta_$(date +"%Y%m%d_%Hh%Mm%Ss").log

I took a few seconds of tracing: 30k lines in total, out of which 3k lines (10%) are "Resource temporarily unavailable":

204416 read(14, 0x5c7XX1e8XX4a, 65535)  = -1 EAGAIN (Resource temporarily unavailable)

Not sure how much sensitive information is contained in those trace dumps (I only spotted part of an IPv6 prefix once). Most of the stuff is sort of encoded?

204416 read(14, "\232U\232U...

@sbrivio-rh
Collaborator

sbrivio-rh commented Feb 15, 2025

I tried a bit of strace on pasta where I have the Frigate container with approx. 20-30% CPU usage, even after upgrading to the latest pasta version (pasta 2025_01_21.4f2c8e7-44-g71249ef).

Hey, thanks, but despite being @luckylinux, you don't have this issue, "unfortunately". :) Yours is 1. not high CPU load and 2. not on the splice()d (loopback) path, where we're seeing those new reports.

By the way:

Not sure how much sensitive information is contained in those trace dumps (I only spotted part of an IPv6 prefix once). Most of the stuff is sort of encoded?

204416 read(14, "\232U\232U...

Oh, a lot: I can see the whole data (that's octal encoding of the bytes we read/write). But if it's SSL I can't read much into it.

@luckylinux

@sbrivio-rh: Sad to hear, to some extent 😥. Maybe it's more like what @kylem0 reported, as I'm also using Caddy.

I still do NOT find it normal to be using 20-30% CPU when there are basically fewer than 2 users (plus 5 cameras being streamed by Frigate).

Imagine if this was a website with 1000 user visits per second. It isn't going to fly.

I'll maybe try to run a customized NGINX entrypoint instead of Caddy. Or possibly Traefik (but I might go full NGINX directly, since that probably offers the best performance).

@sbrivio-rh
Collaborator

I still do NOT find it normal to be using 20-30% CPU when there are basically fewer than 2 users (plus 5 cameras being streamed by Frigate).

Imagine if this was a website with 1000 user visits per second. It isn't going to fly.

The CPU load you see is actually very adaptive, as I analysed and explained in: #23686 (comment).

TL;DR: with more requests we get proportionally less frequent wake-ups and we read bigger chunks of data. It actually scales. We run TCP CRR (connection-request-response) tests and they look pretty good: more than a dozen thousand connections per second are not a problem. Improving the situation further than that is something I'd defer to VDUSE support.

@luckylinux

luckylinux commented Feb 15, 2025

TL;DR: with more requests we get proportionally less frequent wake-ups and we read bigger chunks of data. It actually scales. We run TCP CRR (connection-request-response) tests and they look pretty good: more than a dozen thousand connections per second are not a problem. Improving the situation further than that is something I'd defer to VDUSE support.

So you don't think it's even worth it to try replacing Caddy with NGINX, to rule out Caddy as a possible issue/factor?

@sbrivio-rh
Collaborator

So you don't think it's even worth it to try replacing Caddy with NGINX, to rule out Caddy as a possible issue/factor?

In the problematic case, yes, definitely worth a try I would say. @kylem0 ^^.

But in your case (different data path) I don't really see a problem, and I doubt you would see a difference.

@kylem0

kylem0 commented Feb 15, 2025

@sbrivio-rh I was able to reproduce the issue with nginx as well. At first my nginx config didn't work with websockets and the AudiobookShelf UI showed an error establishing a websocket connection. There was no high CPU usage without the websocket connection. I updated my nginx config to work with websockets and now I have two pasta processes at 100% (for audiobookshelf and nginx).

This is a pretty vanilla ARM64 Fedora CoreOS VM, but I initially saw the issue on my x86_64 Arch bare-metal server.

--net pasta:--trace

I feel like I'm losing my mind. If I add --trace,-l,/tmp/pasta.log to the end of --net pasta for the audiobook or proxy container, the pasta processes show normal (almost 0%) CPU usage. If I remove the --trace option I see two pasta processes at 100%...

proxy:

$ tail -5000 pasta.log | head -20
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4580:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4581:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4581:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4581:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4581:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)
38.4581:          pasta: epoll event on connected TCP socket 253 (events: 0x00000001)

audiobookshelf:

$ tail -5000 pasta-abs.log | head -20
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call

Flamegraph

Let me know if I can improve this at all; I don't use/create flamegraphs often:

# perf (flamegraph)
$ perf record -F 99 -p 213348 -g -- sleep 60
$ perf script > out.perf
$ ./FlameGraph/stackcollapse-perf.pl out.perf > out.folded
$ ./FlameGraph/flamegraph.pl out.folded > flamegraph.svg

[flamegraph.svg image attachment]

Strace

# strace
$ strace -p 213348 -tt -r -o strace-output.log
$ head -50 strace-output.log
18:05:44.108715 (+     0.000000) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109266 (+     0.000482) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.109504 (+     0.000232) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109542 (+     0.000035) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109577 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.109612 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109645 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109692 (+     0.000049) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.109736 (+     0.000043) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109770 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109803 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.109838 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109870 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109903 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.109937 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.109970 (+     0.000033) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110003 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110038 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110071 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110105 (+     0.000034) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110141 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110173 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110206 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110241 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110274 (+     0.000033) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110307 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110341 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110374 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110407 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110441 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110474 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110515 (+     0.000041) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110551 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110584 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110617 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110652 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110697 (+     0.000045) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110735 (+     0.000038) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110770 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110804 (+     0.000033) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110837 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110872 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110905 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.110938 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.110973 (+     0.000035) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.111006 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.111038 (+     0.000032) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1
18:05:44.111072 (+     0.000034) splice(105, NULL, 251, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.111105 (+     0.000032) splice(250, NULL, 96, NULL, 8192, SPLICE_F_MOVE|SPLICE_F_NONBLOCK) = -1 EAGAIN (Resource temporarily unavailable)
18:05:44.111139 (+     0.000033) epoll_pwait(3, [{events=EPOLLIN, data=0x500006902}], 8, 1000, NULL, 8) = 1

Let me know what other info I can provide. I can try to reproduce the issue with something like Jellyfin and a proxy if that would help. I think Jellyfin and AudiobookShelf are the only 2 services I've seen this issue with, but they're also the only 2 services I run that use websockets, as far as I can think of.
Because I watch and listen to a lot of long content, I have been using host networking for these containers, which I would prefer not to do. I'd rather use pasta to help keep container traffic isolated. It seems like pasta prevents containers from communicating with each other unless I allow the specific ports, which I really like. I also want to keep using a reverse proxy because it makes HTTPS traffic a lot easier and I can use subdomains (I use a lot of services/subdomains).

@sbrivio-rh
Copy link
Collaborator

@sbrivio-rh I was able to reproduce the issue with nginx as well. At first my nginx config didn't work with websockets and the AudiobookShelf UI showed an error establishing a websocket connection. There was no high CPU usage without the websocket connection. I updated my nginx config to work with websockets and now I have two pasta processes at 100% CPU (for audiobookshelf and nginx).

Thanks a lot for all the tests.

This is a pretty vanilla ARM64 Fedora CoreOS VM, but I initially saw the issue on my x86_64 Arch bare-metal server.

--net pasta:--trace

I feel like I'm losing my mind. If I add --trace,-l,/tmp/pasta.log to the end of --net pasta for the audiobook or proxy container, the pasta processes use normal (almost 0%) CPU.

At least you have a workaround :) By default the log file is limited to one meg, so that won't bother you.

41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)

!

kernel> you got data
pasta> where?
kernel> haha nope

Let me know what other info I can provide.

That's everything I need at the moment I think. This looks like a side effect of https://passt.top/passt/commit/tcp_splice.c?id=7e6a606c32341c81b0889a6791ec12e418a4eeec, which is however correct. I have a number of ideas now, I'll need a bit to try them out.

@sbrivio-rh
Copy link
Collaborator

41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)

kernel> you got data
pasta> where?
kernel> haha nope

...or more subtly:

kernel> you got data
pasta> move it to the pipe
kernel> it's full
pasta> ...move some to the receiver?
kernel> also full
kernel> you got data
...

It looks like the receiver isn't receiving, quite simply, and then we should just wait. Yeah, it sounds like obvious functionality, but I think that this issue was hidden by another, less obvious one (the commit I pointed out, probably) we recently fixed.

Could you please try this (lightly tested) patch:

diff --git a/tcp_splice.c b/tcp_splice.c
index f1a9223..8a39a6f 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -131,8 +131,12 @@ static void tcp_splice_conn_epoll_events(uint16_t events,
 		ev[1].events = EPOLLOUT;
 	}
 
-	flow_foreach_sidei(sidei)
-		ev[sidei].events |= (events & OUT_WAIT(sidei)) ? EPOLLOUT : 0;
+	flow_foreach_sidei(sidei) {
+		if (events & OUT_WAIT(sidei)) {
+			ev[sidei].events |= EPOLLOUT;
+			ev[!sidei].events &= ~EPOLLIN;
+		}
+	}
 }
 
 /**

?
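
For readers following along, here is a rough standalone sketch of the idea behind the patch (illustrative epoll code with made-up names, not pasta's actual data structures): while one side of a spliced connection is blocked waiting for EPOLLOUT, stop polling EPOLLIN on the other side; otherwise level-triggered epoll keeps reporting the same unread data and the loop shown in the log spins at 100% CPU.

/* Sketch only, assumed fd/epoll names: mask EPOLLIN on the reading
 * socket while the receiving side has a full pipe, and re-arm it once
 * EPOLLOUT says the receiver drained some data. */
#include <stdint.h>
#include <sys/epoll.h>

static int ep_mod(int epfd, int fd, uint32_t events)
{
	struct epoll_event ev = { .events = events, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_MOD, fd, &ev);
}

/* splice() towards the receiver returned EAGAIN: receiver is stuck */
static void receiver_blocked(int epfd, int read_fd, int write_fd)
{
	ep_mod(epfd, write_fd, EPOLLOUT);  /* wake up when writable   */
	ep_mod(epfd, read_fd, 0);          /* ...and ignore new input */
}

/* EPOLLOUT arrived on write_fd: receiver is consuming data again */
static void receiver_ready(int epfd, int read_fd, int write_fd)
{
	ep_mod(epfd, write_fd, EPOLLIN);
	ep_mod(epfd, read_fd, EPOLLIN);
}

This mirrors, in simplified form, what the added ev[!sidei].events &= ~EPOLLIN line does within pasta's per-flow event bookkeeping.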

@kylem0
Copy link

kylem0 commented Feb 15, 2025

@sbrivio-rh I think that solves my issue! I played an audio book for ~10 minutes and the pasta processes weren't visible in top. To double-check I removed the built pasta binary from ~/.local/bin and the 100% processes were back. How long does a change like this usually take to test and make it into the next tag / release version? 👀

core@oc0:~/passt$ git status
HEAD detached at 2025_01_21.4f2c8e7
Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   Makefile
        modified:   tcp_splice.c

no changes added to commit (use "git add" and/or "git commit -a")

core@oc0:~/passt$ git diff
diff --git a/Makefile b/Makefile
index 464eef1..d9635b1 100644
--- a/Makefile
+++ b/Makefile
@@ -9,7 +9,7 @@
 # Copyright (c) 2021 Red Hat GmbH
 # Author: Stefano Brivio <sbrivio@redhat.com>

-VERSION ?= $(shell git describe --tags HEAD 2>/dev/null || echo "unknown\ version")
+VERSION ?= kyle

 # Does the target platform allow IPv4 connections to be handled via
 # the IPv6 socket API? (Linux does)
diff --git a/tcp_splice.c b/tcp_splice.c
index 3a000ff..d7ed3b8 100644
--- a/tcp_splice.c
+++ b/tcp_splice.c
@@ -131,8 +131,12 @@ static void tcp_splice_conn_epoll_events(uint16_t events,
                ev[1].events = EPOLLOUT;
        }

-       flow_foreach_sidei(sidei)
-               ev[sidei].events |= (events & OUT_WAIT(sidei)) ? EPOLLOUT : 0;
+       flow_foreach_sidei(sidei) {
+               if (events & OUT_WAIT(sidei)) {
+                       ev[sidei].events |= EPOLLOUT;
+                       ev[!sidei].events &= ~EPOLLIN;
+               }
+       }
 }

 /**

core@oc0:~/passt$ pasta --version
pasta kyle
Copyright Red Hat
GNU General Public License, version 2 or later
  <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.

core@oc0:~/passt$ passt --version
passt kyle
Copyright Red Hat
GNU General Public License, version 2 or later
  <https://www.gnu.org/licenses/old-licenses/gpl-2.0.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law

core@oc0:~/passt$ ps aux | grep "PID\|pasta" | grep "PID\|1234" | grep -v grep
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
core       11954  0.0  0.1 1942956 46984 pts/2   Sl+  22:19   0:00 podman run -it --rm --security-opt label=disable --net pasta -p 1234:80 -v ./:/audiobooks:ro ghcr.io/advplyr/audiobookshelf:latest
core       11972  0.0  0.0  68040 14952 ?        Ss   22:19   0:00 /var/home/core/.local/bin/pasta --config-net -t 1234-1234:80-80 --dns-forward 169.254.1.1 -u none -T none -U none --no-map-gw --quiet --netns /run/user/1000/netns/netns-6932670b-a3c2-7510-6906-d2fb1536b011 --map-guest-addr 169.254.1.2
core       12001  0.0  0.1 1942864 46340 pts/3   Sl+  22:19   0:00 podman run -it --rm --security-opt label=disable --net pasta:-T,1234 -p 1235:1235 docker.io/library/caddy caddy reverse-proxy --from :1235 --to :1234
core       12020  0.0  0.0  68040 14936 ?        Ss   22:19   0:00 /var/home/core/.local/bin/pasta --config-net -T 1234 -t 1235-1235:1235-1235 --dns-forward 169.254.1.1 -u none -U none --no-map-gw --quiet --netns /run/user/1000/netns/netns-f86cba5f-fa85-bf64-4510-e8a285461efa --map-guest-addr 169.254.1.2

It looks like Fedora CoreOS is using an older version of pasta. I built the latest tag without any changes and the issue is present, so I think the small change fixes it.

core@oc0:~$ pasta --version
pasta 2025_01_21.4f2c8e7

@sbrivio-rh
Copy link
Collaborator

@sbrivio-rh I think that solves my issue! I played an audio book for ~10 minutes and the pasta processes weren't visible in top.

Hah, great.

How long does a change like this usually take to test and make it into the next tag / release version? 👀

Not long; I ran the tests just now. I'm pondering writing a test with receivers intermittently blocking, to reproduce something similar to your case. Even if I can't reproduce this, though, the change looks relatively obvious (in hindsight ;)) and safe, so better included than not.

If we could get the OP of https://www.reddit.com/r/podman/comments/1iph50j/pasta_high_cpu_on_podman_rootless_container/ to also test it I would be somewhat more confident, let's see.

I plan to make a new release within a couple of days anyway, as we fixed a few somewhat critical issues in the past week.
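
As a rough illustration of the kind of intermittently blocking receiver mentioned above, a toy endpoint could look roughly like this; the port number, timings, and buffer size are arbitrary and this is not part of passt's actual test suite:

/* Toy receiver that alternates between stalling and draining, so the
 * sender keeps hitting a full pipe on the forwarding side; assumed
 * names and values, not passt's test code. */
#include <stdio.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>

int main(void)
{
	struct sockaddr_in a = { .sin_family = AF_INET,
				 .sin_port = htons(12345),
				 .sin_addr.s_addr = htonl(INADDR_ANY) };
	int s = socket(AF_INET, SOCK_STREAM, 0), c, one = 1;
	char buf[65536];

	setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &one, sizeof(one));
	if (bind(s, (struct sockaddr *)&a, sizeof(a)) || listen(s, 1)) {
		perror("listen");
		return 1;
	}

	c = accept(s, NULL, NULL);
	for (;;) {
		sleep(3);			/* stall: receive nothing */
		for (int i = 0; i < 100; i++)	/* then drain in a burst  */
			if (read(c, buf, sizeof(buf)) <= 0)
				return 0;
	}
}

The point is just that a peer which periodically stops reading forces the pipe towards it to fill up, which is the situation the trace excerpts above show.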

hswong3i pushed a commit to alvistack/passt-top-passt that referenced this issue Feb 17, 2025
If we set the OUT_WAIT_* flag (waiting on EPOLLOUT) for a side of a
given flow, it means that we're blocked, waiting for the receiver to
actually receive data, with a full pipe.

In that case, if we keep EPOLLIN set for the socket on the other side
(our receiving side), we'll get into a loop such as:

  41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
  41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
  41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
  41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577
  41.0230:          pasta: epoll event on connected spliced TCP socket 108 (events: 0x00000001)
  41.0230:          Flow 1 (TCP connection (spliced)): -1 from read-side call
  41.0230:          Flow 1 (TCP connection (spliced)): -1 from write-side call (passed 8192)
  41.0230:          Flow 1 (TCP connection (spliced)): event at tcp_splice_sock_handler:577

leading to 100% CPU usage, of course.

Drop EPOLLIN on our receiving side as long as we're waiting for
output readiness on the other side.

Link: containers/podman#23686 (comment)
Link: https://www.reddit.com/r/podman/comments/1iph50j/pasta_high_cpu_on_podman_rootless_container/
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
@sbrivio-rh
Copy link
Collaborator

Issue on spliced/loopback path fixed in passt 2025_02_17.a1e48a0, matching Fedora Rawhide update, and the Arch maintainer already picked it up: https://gitlab.archlinux.org/archlinux/packaging/packages/passt/-/commit/47f7605d1d88095e49f39577417ea00af6d3b28a.

@kylem0
Copy link

kylem0 commented Feb 17, 2025

@sbrivio-rh That was faster than I thought haha. I've updated my home server and I can't reproduce the 100% CPU usage. Thank you very much for the help!

HarshithaMS005 pushed a commit to HarshithaMS005/kubevirt that referenced this issue Feb 19, 2025
This introduces vhost-user support with --vhost-user.

Notable fixes:

- possible EPOLLRDHUP event storms with half-closed TCP connections,
  leading to periods of high CPU load:
  containers/podman#23686 and
  https://bugs.passt.top/show_bug.cgi?id=94

- possible EPOLLERR event storms with UDP flows:
  https://bugs.passt.top/show_bug.cgi?id=95

- properly handle TCP keep-alive segments:
  containers/podman#24572

- set PSH flag at end of TCP batches:
  https://bugs.passt.top/show_bug.cgi?id=107

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
@arajczy
Copy link

arajczy commented Apr 4, 2025

Hello,

I am having high cpu load with the latest pasta again recently:

❯ pasta --version
pasta 0^20250320.g32f6212-2.fc41.x86_64

Can someone check please?

Thank you

@sbrivio-rh
Copy link
Collaborator

Hello,

I am having high cpu load with the latest pasta again recently:

...meaning? What kind of CPU load? While doing what?

❯ pasta --version pasta 0^20250320.g32f6212-2.fc41.x86_64

...no known issue with this regard at the moment.

@arajczy
Copy link

arajczy commented Apr 4, 2025

Hello,
I am having high cpu load with the latest pasta again recently:

...meaning? What kind of CPU load? While doing what?

❯ pasta --version pasta 0^20250320.g32f6212-2.fc41.x86_64

...no known issue with this regard at the moment.

The pasta process related to a podman container (Nextcloud) went up to 100% CPU:

❯ top -bn1 -p 610499
top - 18:01:31 up 13:39,  4 users,  load average: 0,94, 0,51, 0,29
Tasks:   1 total,   1 running,   0 sleeping,   0 stopped,   0 zombie
%Cpu(s):  3,7 us, 11,1 sy,  0,0 ni, 85,2 id,  0,0 wa,  0,0 hi,  0,0 si,  0,0 st
MiB Mem :  31708,3 total,  22055,1 free,   3112,9 used,   7288,2 buff/cache
MiB Swap:   8192,0 total,   8192,0 free,      0,0 used.  28595,4 avail Mem

    PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 610499 pod       20   0  206260  16956   1180 R 100,0   0,1   4:52.12 pasta.avx2
❯ ps -up 610499
USER         PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
pod       610499  2.6  0.0 206260 16956 ?        Rs   14:49   5:03 /usr/bin/pasta --config

It happened several times in the last 2-3 days without any real heavy usage of the containers. When I restart the container, it goes back to normal CPU usage, but later it suddenly produces the same phenomenon again. I used to have this issue with podman / pasta, but it seemed to be solved until the last 2-3 days, when it came up again.

Thanks

@sbrivio-rh
Copy link
Collaborator

@arajczy could you strace the process (in this case it would be strace -f -p 610499, as root) for a few milliseconds when this happens, and see if it's looping over some system call or similar?

@arajczy
Copy link

arajczy commented Apr 5, 2025

@arajczy could you strace the process (in this case it would be strace -f -p 610499, as root) for a few milliseconds when this happens, and see if it's looping over some system call or similar?

This morning reoccurred:

strace

top+ps

@sbrivio-rh
Copy link
Collaborator

sbrivio-rh commented Apr 5, 2025

strace

Hah, thanks a lot. It looks very similar to what I fixed recently in https://passt.top/passt/commit/?id=667caa09c6d46d937b3076254176eded262b3eca, but there must be something else.

What's particular here is that the writer on pasta's reading side is done (it sent a FIN / shutdown(x, SHUT_WR)), but the receiver on pasta's writing side isn't ready. I guess we should make sure that the same conceptual change from that commit also covers this particular case. Maybe in tcp_splice_sock_handler() we should move:

			conn_event(c, conn, OUT_WAIT(!fromsidei));

before:

			if (conn->read[fromsidei] == conn->written[fromsidei])
				break;

but I didn't really think it through yet, let alone test it.

@dgibson
Copy link
Collaborator

dgibson commented Apr 9, 2025

@sbrivio-rh, @arajczy, I think I've spotted the cause of the latest problem here. Patches coming shortly, with any luck.

@sbrivio-rh
Copy link
Collaborator

@arajczy patches are at https://archives.passt.top/passt-dev/20250409063541.1411177-1-david@gibson.dropbear.id.au/. Would you have a chance to try them out? It takes just a few seconds to build passt / pasta locally:

  • git clone git://passt.top/passt && cd passt
  • apply the series with b4 (you might need to dnf install b4): b4 shazam https://archives.passt.top/passt-dev/20250409063541.1411177-1-david@gibson.dropbear.id.au/
  • make
  • as root, or with sudo: make install will install the new binaries under /usr/local/bin. You can clean it up properly with make uninstall afterwards, and Podman will pick up the new binaries by default

@arajczy
Copy link

arajczy commented Apr 9, 2025

@sbrivio-rh, I have followed your steps and installed the patches. Let me monitor my system for a while; I will let you know the results. Thank you

@arajczy
Copy link

arajczy commented Apr 11, 2025

@sbrivio-rh, @dgibson, looking at the past two days, it seems you have solved the issue. After approximately 12:00 CET on 9 April, when I applied the patch, you can't really see high peaks any more:
Image
Image

@sbrivio-rh
Copy link
Collaborator

@sbrivio-rh, @dgibson, looking at the past two days, it seems you have solved the issue.

Thanks for confirming. Let's ship it then!

By the way those charts look pretty sleek.

@sbrivio-rh
Copy link
Collaborator

At least the part of this issue reported in #23686 (comment) is now fixed in passt 2025_04_15.2340bbf (and matching Fedora 41 update).
