Bad syscall 202 (sys_futex) on instance start with glibc #2044

Closed
s-hamann opened this issue Jul 25, 2020 · 10 comments

@s-hamann

I built firecracker for glibc and found that it fails on instance start with the following error:

[anonymous-instance:ERROR:src/vmm/src/signal_handler.rs:37] Shutting down VM after intercepting a bad syscall (202).

Version: 0.21.1 with #2039 applied

Running strace firecracker --config-file config.json gives (last few lines only):

ioctl(7, KVM_CREATE_VCPU, 0)            = 23
mmap(NULL, 12288, PROT_READ|PROT_WRITE, MAP_SHARED, 23, 0) = 0x7f93e601d000
ioctl(23, KVM_SET_CPUID2, {nent=40, entries=[...]}) = 0
ioctl(23, KVM_SET_MSRS, 0x561b7aef6900) = 10
ioctl(23, KVM_SET_REGS, {rax=0, ..., rsp=0x8ff0, rbp=0x8ff0, ..., rip=0x1000000, rflags=0x2}) = 0
ioctl(23, KVM_SET_FPU, 0x7ffe910b4650)  = 0
ioctl(23, KVM_GET_SREGS, {cs={base=0xffff0000, limit=65535, selector=61440, type=11, present=1, dpl=0, db=0, s=1, l=0, g=0, avl=0}, ...}) = 0
ioctl(23, KVM_SET_SREGS, {cs={base=0, limit=1048575, selector=8, type=11, present=1, dpl=0, db=0, s=1, l=1, g=1, avl=0}, ...}) = 0
ioctl(23, KVM_GET_LAPIC, 0x7ffe910b4650) = 0
ioctl(23, KVM_SET_LAPIC, 0x7ffe910b5660) = 0
ioctl(0, TCGETS, {B38400 opost -isig icanon echo ...}) = 0
ioctl(0, TCGETS, {B38400 opost -isig icanon echo ...}) = 0
ioctl(0, SNDCTL_TMR_START or TCSETS, {B38400 opost -isig -icanon -echo ...}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 14, {EPOLLIN, {u32=5, u64=5}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 0, {EPOLLIN, {u32=0, u64=0}}) = 0
rt_sigaction(SIGRT_2, {sa_handler=0x561b79f374f0, sa_mask=~[RTMIN RT_1], sa_flags=SA_RESTORER|SA_SIGINFO, sa_restorer=0x7f93e5ff2180}, NULL, 8) = 0
futex(0x7f93e6004048, FUTEX_WAKE_PRIVATE, 2147483647) = 0
mmap(NULL, 2101248, PROT_NONE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7f93ddc06000
mprotect(0x7f93ddc07000, 2097152, PROT_READ|PROT_WRITE) = 0
clone(child_stack=0x7f93dde05eb0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tid=[21766], tls=0x7f93dde06700, child_tidptr=0x7f93dde069d0) = 21766
prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)  = 0
prctl(PR_SET_SECCOMP, SECCOMP_MODE_FILTER, {len=359, filter=0x561b7af0e670}) = 0
getpid()                                = 21762
tgkill(21762, 21766, SIGRT_2)           = 0
futex(0x561b7aef2a98, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=305999, tv_nsec=584167774}, FUTEX_BITSET_MATCH_ANY) = 202
--- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, si_call_addr=0x7f93e5fed898, si_syscall=__NR_futex, si_arch=AUDIT_ARCH_X86_64} ---
openat(AT_FDCWD, "/etc/localtime", O_RDONLY|O_CLOEXEC) = 24
fstat(24, {st_mode=S_IFREG|0644, st_size=2298, ...}) = 0
fstat(24, {st_mode=S_IFREG|0644, st_size=2298, ...}) = 0
read(24, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\t\0\0\0\t\0\0\0\0"..., 4096) = 2298
lseek(24, -1449, SEEK_CUR)              = 849
read(24, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\t\0\0\0\t\0\0\0\0"..., 4096) = 1449
close(24)                               = 0
write(2, "2020-07-25T08:07:45.063030326 [a"..., 146) = 146
2020-07-25T08:07:45.063030326 [anonymous-instance:ERROR:src/vmm/src/signal_handler.rs:37] Shutting down VM after intercepting a bad syscall (202).
write(2, "\n", 1)                       = 1
exit_group(148)                         = ?
+++ exited with 148 +++

Note: The official musl build works, and so does my glibc build when run with --seccomp-level 0. So this seems unrelated to my particular rootfs, kernel and config.
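
For anyone triaging a similar trace, the blocked syscall can be pulled out of the strace output mechanically. The following is a minimal sketch, assuming the log contains SIGSYS records in the same format as the one above; the function name `find_seccomp_violations` is made up for illustration and is not part of Firecracker or strace.

```python
import re

# Matches the SIGSYS record that strace prints when a seccomp filter traps
# a syscall, and captures the syscall name that follows "__NR_".
SIGSYS_RE = re.compile(
    r"--- SIGSYS \{.*si_code=SYS_SECCOMP.*si_syscall=__NR_(\w+)"
)


def find_seccomp_violations(strace_lines):
    """Return the names of syscalls that were trapped by seccomp (hypothetical helper)."""
    names = []
    for line in strace_lines:
        match = SIGSYS_RE.search(line)
        if match:
            names.append(match.group(1))
    return names


if __name__ == "__main__":
    sample = [
        "tgkill(21762, 21766, SIGRT_2)           = 0",
        "--- SIGSYS {si_signo=SIGSYS, si_code=SYS_SECCOMP, "
        "si_call_addr=0x7f93e5fed898, si_syscall=__NR_futex, "
        "si_arch=AUDIT_ARCH_X86_64} ---",
    ]
    print(find_seccomp_violations(sample))
```

On x86_64, syscall number 202 is futex, which matches both the Firecracker error message and the `si_syscall=__NR_futex` field in the trace.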

@alindima
Contributor

Hello and thank you for raising this issue.

I have tried to reproduce it using the standard kernel and rootfs we provide in the docs, building with glibc.
However, this doesn't reproduce.

Could you please provide the exact configuration steps you took, along with the kernel and rootfs you used?

@alindima self-assigned this Jul 27, 2020

@s-hamann
Author

Thank you for looking into this.

I can reproduce this with hello-vmlinux.bin and hello-rootfs.ext4 from the quick start guide.
I run firecracker simply as firecracker --config-file hello.json with the following config file:

{
  "boot-source": {
    "kernel_image_path": "hello-vmlinux.bin",
    "boot_args": "console=ttyS0 reboot=k panic=1 pci=off quiet i8042.noaux i8042.nomux i8042.nopnp i8042.dumbkbd"
  },
  "drives": [
    {
      "drive_id": "rootfs",
      "path_on_host": "hello-rootfs.ext4",
      "is_root_device": true,
      "is_read_only": true
    }
  ],
  "machine-config": {
    "vcpu_count": 1,
    "mem_size_mib": 128,
    "ht_enabled": false
  }
}
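
When reproducing with a config like this, it can help to rule out path problems first. The sketch below is only an illustration: it reads the two path fields that appear in the config above (`kernel_image_path` and `path_on_host`) and reports which files are missing. The helper names are hypothetical and are not part of Firecracker.

```python
import json
import os


def load_config(path):
    """Load a JSON config file such as hello.json."""
    with open(path) as handle:
        return json.load(handle)


def referenced_paths(config):
    """Collect the host file paths named in the config (hypothetical helper)."""
    paths = []
    kernel = config.get("boot-source", {}).get("kernel_image_path")
    if kernel:
        paths.append(kernel)
    for drive in config.get("drives", []):
        drive_path = drive.get("path_on_host")
        if drive_path:
            paths.append(drive_path)
    return paths


def missing_paths(config, base_dir="."):
    """Return the referenced paths that do not exist relative to base_dir."""
    return [
        p for p in referenced_paths(config)
        if not os.path.exists(os.path.join(base_dir, p))
    ]
```

For example, `missing_paths(load_config("hello.json"))` returning an empty list would indicate that both the kernel image and the rootfs are present in the current directory.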

I do not use jailer and I run firecracker as root to rule out any permission issues.
I used Gentoo's ebuild to build firecracker, with the aforementioned pull request's patch applied to the source. The build does not run in a container, but directly on the system that runs firecracker.

@andreeaflorescu
Member

The problem with glibc builds is that the seccomp allow list is not reliable as it depends on the version and implementation of libc on the system where Firecracker runs. With that in mind, what libc are you using where you're running Firecracker?

Probably the only way to make this work long term with any libc version is to implement #1366

@s-hamann
Author

I use glibc 2.30, which is also the version firecracker was compiled with.

@s-hamann
Author

I upgraded to glibc 2.31 and rebuilt firecracker with that version. That does not seem to change anything at all.

@alindima
Contributor

Hello, as @andreeaflorescu pointed out, the only reliable way to address these seccomp issues with glibc is to implement #1366, an issue which is currently in research.
This is because the syscalls that glibc issues depend heavily on the glibc version.

@s-hamann
Author

@andreeaflorescu asked for the version, so I thought I'd update the info here ;)

Regarding the issue at hand: Is there anything (less-reliable) you can/will do about this? Maybe generating the list of required syscalls at compile-time, i.e. for the version of glibc that firecracker is built with? Is that possible?

If your course of action is limited to implementing #1366, I do not see the point in keeping this issue open.

Personally I'd prefer if firecracker worked "out-of-the-box", without the user/administrator having to figure out what syscalls to allow. But if that's not feasible for glibc, I'll switch to using the musl builds.

@alindima
Contributor

Thank you for the suggestions.

For now, we have decided that there is nothing feasible for us to do apart from implementing #1366. Regarding your suggestion of generating the whitelist at compile-time, this could actually break the intended behaviour of seccomp.
By whitelisting all of the syscalls Firecracker might need with glibc, we get a very large number of syscalls to whitelist, some of which Firecracker may never need (since this is done at compile-time, there is no way of telling what syscalls are actually used at runtime). We have decided that this is not what we want.
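
The gap between a fixed allow list and what a binary actually calls at runtime can be made concrete with a small sketch. It assumes strace output in the format shown earlier in this thread and compares the observed syscalls against an allow list supplied by the caller. The function names and the allow list in the usage note are hypothetical; this is not Firecracker's real filter or tooling.

```python
import re

# A syscall line starts with the syscall name followed by "(", optionally
# preceded by the "[pid N]" prefix that strace adds with -f.
SYSCALL_RE = re.compile(r"^(?:\[pid\s+\d+\]\s+)?(\w+)\(")


def observed_syscalls(strace_lines):
    """Return the set of syscall names seen in the strace output."""
    names = set()
    for line in strace_lines:
        match = SYSCALL_RE.match(line)
        if match:
            names.add(match.group(1))
    return names


def compare_with_allow_list(observed, allow_list):
    """Report observed syscalls that are not allowed, and allowed ones never observed."""
    allowed = set(allow_list)
    return {
        "missing": sorted(observed - allowed),
        "unused": sorted(allowed - observed),
    }
```

For instance, comparing a trace containing `getpid`, `tgkill` and `futex` against an illustrative allow list of `["getpid", "tgkill", "read"]` reports `futex` as missing and `read` as unused. A single trace only covers the code paths that happened to run, so a result like this is a starting point for investigation rather than a complete allow list.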

I recommend using the musl builds instead of glibc in this case. Are there any reasons you would prefer using glibc?

@s-hamann
Author

Thank you for the information and the reasoning.

I would prefer using the glibc build simply because that is what I can easily install and update using the package manager on Gentoo. I already suggested packaging the official musl-builds (gentoo/gentoo#16219). Using someone else's binaries feels somewhat strange on Gentoo, but that seems to be the easiest solution here.

@alindima
Contributor

Thank you for the explanation, I see. I will close this issue for now.
