[Flaky Test]: <TestConnInfoConnCloseThenAnotherConn> – failed to start connection credentials listener: listen unix ...: bind: invalid argument #6977

Open
belimawr opened this issue Feb 21, 2025 · 2 comments
Labels
flaky-test Unstable or unreliable test cases. Team:Elastic-Agent Label for the Agent team

Comments

belimawr commented Feb 21, 2025

Failing test case

TestConnInfoConnCloseThenAnotherConn

Error message

conn_info_server_test.go:190: failed to start connection credentials listener: listen unix /var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock: bind: invalid argument

Build

Local run of the tests

OS

Linux, Mac

Stacktrace and notes

This test fails because it creates a Unix socket at a path that can exceed the maximum socket path length the OS allows.

man unix on darwin says:

UNIX-domain addresses are variable-length filesystem pathnames of at most 104 characters.

In the example above, the path is 111 characters long.

The issue can also affect Linux:

$ cat /usr/include/linux/un.h | grep "define UNIX_PATH_MAX"
#define UNIX_PATH_MAX   108

The test usually passes on Linux because the generated path is shorter than 100 characters, e.g. /tmp/TestConnInfoConnCloseThenAnotherConn233963498/001/.teaci.sock
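To make the arithmetic concrete, here is a small sketch that compares the two paths quoted in this issue against the two limits mentioned above. The helper name `fitsSockPath` is hypothetical, and reserving one byte for a terminating NUL is a conservative assumption of this sketch, not something the man page or header states.

```go
package main

import "fmt"

// Limits quoted in this issue: "man unix" on darwin and UNIX_PATH_MAX on Linux.
const (
	darwinSockPathMax = 104
	linuxSockPathMax  = 108
)

// The two socket paths quoted in this issue.
const (
	macPath   = "/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock"
	linuxPath = "/tmp/TestConnInfoConnCloseThenAnotherConn233963498/001/.teaci.sock"
)

// fitsSockPath (hypothetical helper) reports whether path fits in a socket
// address of limit bytes. The strict comparison keeps one byte free for a
// terminating NUL, which is an assumption made here to stay on the safe side.
func fitsSockPath(path string, limit int) bool {
	return len(path) < limit
}

func main() {
	for _, p := range []string{macPath, linuxPath} {
		fmt.Printf("len=%d fits darwin limit=%t fits linux limit=%t\n",
			len(p),
			fitsSockPath(p, darwinSockPathMax),
			fitsSockPath(p, linuxSockPathMax))
	}
}
```

The macOS path from the failing run exceeds both limits, so the same test would also fail on Linux if its temporary directory happened to be that deep.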

tiago@Not-A-Linux~/devel/elastic-agent/pkg/component/runtime % go test -run=TestConnInfoConnCloseThenAnotherConn -v -count=1
=== RUN   TestConnInfoConnCloseThenAnotherConn
=== RUN   TestConnInfoConnCloseThenAnotherConn/port
=== RUN   TestConnInfoConnCloseThenAnotherConn/local
    conn_info_server_test.go:190: failed to start connection credentials listener: listen unix /var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001/.teaci.sock: bind: invalid argument
--- FAIL: TestConnInfoConnCloseThenAnotherConn (0.04s)
    --- PASS: TestConnInfoConnCloseThenAnotherConn/port (0.00s)
    --- FAIL: TestConnInfoConnCloseThenAnotherConn/local (0.04s)
FAIL
exit status 1
FAIL    github.com/elastic/elastic-agent/pkg/component/runtime  0.456s
1:WARN tiago@Not-A-Linux~/devel/elastic-agent/pkg/component/runtime %

The problem comes from the getAddress function, called by runTests, which generates the socket address without validating its length.

func getAddress(dir string, isLocal bool) string {
	if isLocal {
		u := url.URL{}
		u.Path = "/"
		if runtime.GOOS == "windows" {
			u.Scheme = "npipe"
			return u.JoinPath("/", testSock).String()
		}
		u.Scheme = "unix"
		return u.JoinPath(dir, testSock).String()
	}
	return fmt.Sprintf("127.0.0.1:%d", testPort)
}

func runTests(t *testing.T, fn func(*testing.T, string)) {
	sockdir := t.TempDir()
	tests := []struct {
		name    string
		address string
	}{
		{
			name:    "port",
			address: getAddress("", false),
		},
		{
			name:    "local",
			address: getAddress(sockdir, true),
		},
	}
	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			fn(t, tc.address)
		})
	}
}
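As an illustration of the missing validation, the sketch below builds the unix address the same way getAddress does but returns an error when the socket path is too long. The function name `checkedUnixAddress` and the constant `maxSockPath` are hypothetical. The value 103 is an assumed conservative bound derived from the darwin limit of 104 quoted above, and `testSock` is set to the file name seen in the failing path.

```go
package main

import (
	"fmt"
	"net/url"
	"path/filepath"
)

// testSock mirrors the socket file name visible in the failing path.
const testSock = ".teaci.sock"

// maxSockPath is a hypothetical conservative bound: the darwin limit of 104
// quoted in this issue minus one byte, assumed here for a terminating NUL.
const maxSockPath = 103

// checkedUnixAddress (hypothetical) builds a unix:// address for a socket in
// dir and fails early with a descriptive error if the path is too long.
func checkedUnixAddress(dir string) (string, error) {
	p := filepath.Join(dir, testSock)
	if len(p) > maxSockPath {
		return "", fmt.Errorf("socket path %q is %d bytes, limit is %d", p, len(p), maxSockPath)
	}
	u := url.URL{Scheme: "unix", Path: "/"}
	return u.JoinPath(dir, testSock).String(), nil
}

func main() {
	dirs := []string{
		"/tmp/x",
		"/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001",
	}
	for _, dir := range dirs {
		addr, err := checkedUnixAddress(dir)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("address:", addr)
	}
}
```

Failing early like this would at least replace the opaque "bind: invalid argument" with a message that names the path and the limit. It does not make the test pass on its own.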

@belimawr belimawr added flaky-test Unstable or unreliable test cases. Team:Elastic-Agent Label for the Agent team labels Feb 21, 2025
@elasticmachine

Pinging @elastic/elastic-agent (Team:Elastic-Agent)

belimawr commented Feb 21, 2025

We already have some code to handle socket paths that are too long.

Using it in our tests should fix the problem.
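The existing code referred to above is not shown in this issue, so the following is only a generic sketch of one possible approach and not that code: keep the preferred directory when the socket path fits, otherwise fall back to a short directory under /tmp. The names `shortSockDir` and `maxSockPath`, the limit of 103, and the use of /tmp as the fallback root are all assumptions.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// maxSockPath is a hypothetical conservative bound (darwin limit of 104
// quoted in this issue, minus one byte assumed for a terminating NUL).
const maxSockPath = 103

// shortSockDir (hypothetical) returns a directory in which a socket named
// sockName has a path within maxSockPath, plus a cleanup function. It keeps
// preferred when it is short enough and otherwise creates a directory under
// /tmp, which is assumed to exist (unix-like systems only).
func shortSockDir(preferred, sockName string) (string, func(), error) {
	noop := func() {}
	if len(filepath.Join(preferred, sockName)) <= maxSockPath {
		return preferred, noop, nil
	}
	dir, err := os.MkdirTemp("/tmp", "sock")
	if err != nil {
		return "", noop, fmt.Errorf("creating fallback socket dir: %w", err)
	}
	cleanup := func() { _ = os.RemoveAll(dir) }
	if n := len(filepath.Join(dir, sockName)); n > maxSockPath {
		cleanup()
		return "", noop, fmt.Errorf("fallback socket path is still too long: %d bytes", n)
	}
	return dir, cleanup, nil
}

func main() {
	long := "/var/folders/1w/pb98dgl15sd6jdcx2yy6j2500000gn/T/TestConnInfoConnCloseThenAnotherConn3117723250/001"
	dir, cleanup, err := shortSockDir(long, ".teaci.sock")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer cleanup()
	fmt.Println("socket dir:", dir)
}
```

In a test, the preferred directory would be the result of t.TempDir() and the cleanup function could be registered with t.Cleanup.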
