Open
Description
What happened?
This issue affects our production systems and appears consistently during load tests.
We first discovered it with our own gRPC agent, which consumes events from tetragon directly, but it can be easily reproduced using tetra.
In a container that is being monitored, run:
while true; do cat /etc/pam.conf > /dev/null && awk 'BEGIN {system("whoami")}' > /dev/null && sleep 0.25 || break; done
In the tetragon container, run:
tetra getevents --pods test-pod -o compact
This fails after some time (roughly 5-60 minutes) with the following error:
<...>
🚀 process default/test-pod-debian /usr/bin/whoami
💥 exit default/test-pod-debian /usr/bin/whoami 0
💥 exit default/test-pod-debian /bin/sh -c whoami 0
💥 exit default/test-pod-debian /usr/bin/awk "BEGIN {system("whoami")}" 0
🚀 process default/test-pod-debian /usr/bin/sleep 0.25
time="2024-12-26T14:17:58Z" level=fatal msg="Failed to receive events" error="rpc error: code = Internal desc = grpc: error while marshaling: marshaling tetragon.GetEventsResponse: size mismatch (see https://github.com/golang/protobuf/issues/1609): calculated=0, measured=134"
This reproduces even without any Tracing Policy.
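To narrow down the 5-60 minute window mentioned above, the time until the client exits can be measured with a small wrapper. This is only a sketch and not part of the original report: `time_until_exit` is a hypothetical helper name, and it assumes a shell whose `date` supports `+%s`.

```shell
# Hypothetical helper (not part of tetra): run the given command, then
# print its exit status and how many seconds it ran before exiting.
time_until_exit() {
  start=$(date +%s)
  status=0
  "$@" || status=$?          # keep going even if the command fails
  end=$(date +%s)
  echo "exit status ${status} after $((end - start))s"
  return "$status"
}
```

For example, `time_until_exit tetra getevents --pods test-pod -o compact` would print the elapsed time as the last line once the client exits.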
Tetragon Version
v1.1.2
Kernel Version
5.14.0-284.30.1.el9_2.x86_64
Kubernetes Version
v1.27.6