System Calls

The system call API to Magenta is well documented in the following pages:

  • System Calls - "man page"-like documentation, one page per system call.
  • Concepts - Overview of concepts, including about Handles and Rights.

The system call API is not stable and not intended for direct use by developers. The kernel only allows system calls from distinguished addresses in a shared library, and the shared library forms the stable API that developers should use. In the future, the mapping of system call numbers may change. In fact, in the future, these numbers may be randomized between different processes running on the same machine!

System Call Stubs

System call stubs in the shared library are generated at compile time. System call numbers are assigned linearly in a dense packing. At the current time, the same specification is used for both Aarch64 and x86_64, so both architectures share the same system call numbers. The stubs are generated by the sysgen tool from the system/public/magenta/syscalls.sysgen specification, with a command such as:

./build-magenta-pc-x86-64/tools/sysgen \
    -kernel-code ./build-magenta-pc-x86-64/gen/include/magenta/syscall-invocation-cases.inc \
    -trace ./build-magenta-pc-x86-64/gen/include/magenta/syscall-ktrace-info.inc \
    -category ./build-magenta-pc-x86-64/gen/include/magenta/syscall-category.inc \
    -kernel-header ./build-magenta-pc-x86-64/gen/include/magenta/syscall-definitions.h \
    -arm-asm ./build-magenta-pc-x86-64/gen/include/magenta/syscalls-arm64.S \
    -x86-asm ./build-magenta-pc-x86-64/gen/include/magenta/syscalls-x86-64.S \
    -vdso-header ./build-magenta-pc-x86-64/gen/include/magenta/syscall-vdso-definitions.h \
    -vdso-wrappers ./build-magenta-pc-x86-64/gen/include/magenta/syscall-vdso-wrappers.inc \
    -numbers ./build-magenta-pc-x86-64/gen/include/magenta/mx-syscall-numbers.h \
    -user-header ./build-magenta-pc-x86-64/gen/include/magenta/syscalls/definitions.h \
    -rust ./build-magenta-pc-x86-64/gen/include/magenta/syscalls/definitions.rs \
    system/public/magenta/syscalls.sysgen

You can view the resulting assembly by running objdump -d on the object files built from the generated assembly. Here are small samples from syscalls-x86-64.S.o (first listing) and syscalls-arm64.S.o (second listing):

00000000000001bd <SYSCALL_mx_socket_read>:
 1bd:   41 52                   push   %r10
 1bf:   41 53                   push   %r11
 1c1:   49 89 ca                mov    %rcx,%r10
 1c4:   b8 16 00 00 00          mov    $0x16,%eax
 1c9:   0f 05                   syscall

00000000000001cb <CODE_SYSRET_mx_socket_read_VIA_mx_socket_read>:
 1cb:   41 5b                   pop    %r11
 1cd:   41 5a                   pop    %r10
 1cf:   c3                      retq

0000000000000108 <SYSCALL_mx_socket_read>:
 108:   d28002d0    mov x16, #0x16                      // #22
 10c:   d4000001    svc #0x0

0000000000000110 <CODE_SYSRET_mx_socket_read_VIA_mx_socket_read>:
 110:   d65f03c0    ret

At the current time (subject to change!), x86_64 passes the system call number in %eax and passes arguments in %rdi, %rsi, %rdx, %r10, %r8, %r9, %r12, %r13. Aarch64 passes the system call number in x16 and arguments in x0 .. x7.
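The register assignments above can be written down as a small lookup, which also makes the x86_64 quirk visible: the fourth argument travels in %r10 rather than %rcx, because the x86_64 syscall instruction overwrites %rcx with the return address (hence the mov %rcx,%r10 in the stub). This is only an illustrative sketch; the names kX86ArgRegs, kArm64ArgRegs and arg_register are hypothetical and do not appear in Magenta.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Argument registers in order, as listed in the text above.
// These tables are illustrative only and subject to the same "may change" caveat.
constexpr std::array<const char*, 8> kX86ArgRegs = {
    "%rdi", "%rsi", "%rdx", "%r10", "%r8", "%r9", "%r12", "%r13"};
constexpr std::array<const char*, 8> kArm64ArgRegs = {
    "x0", "x1", "x2", "x3", "x4", "x5", "x6", "x7"};

// Returns the register that carries argument `index`, or nullptr if the
// index is beyond the eight register-passed arguments.
inline const char* arg_register(bool is_x86_64, std::size_t index) {
    if (index >= kX86ArgRegs.size()) {
        return nullptr;
    }
    return is_x86_64 ? kX86ArgRegs[index] : kArm64ArgRegs[index];
}
```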

System calls are usually invoked through libmagenta.so, a shared library provided to userland processes as a VDSO. This library is special: it cannot be unmapped or mapped over. The kernel only accepts system calls made from specific addresses within this VDSO.

System Call Handling

System calls arrive in the kernel as a fault. The x86_64 handler is found in ./kernel/arch/x86/64/syscall.S at FUNCTION(x86_syscall). It dispatches to unknown_syscall for out-of-range numbers, or jumps through an automatically generated dispatch table. Each entry comes from the syscall_dispatch macro and calls a generated wrapper function. Wrappers can be found in the generated syscall-kernel-wrappers.inc file, and currently look like:

x86_64_syscall_result wrapper_socket_read(mx_handle_t handle, uint32_t options, void* buffer, size_t size, size_t* actual, uint64_t ip) {
    return do_syscall(MX_SYS_socket_read, ip, &VDso::ValidSyscallPC::socket_read, [&]() {
        return static_cast<uint64_t>(sys_socket_read(handle, options, make_user_ptr(buffer), size, make_user_ptr(actual)));
    });
}
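The dispatch step that leads to such a wrapper (range check, then an indexed jump through a dense table) can be sketched in plain C++. This is a simplified model, not the generated code: the real table is emitted by sysgen and entered from assembly, and the names SyscallFn, kDispatchTable, sys_example_add, sys_example_first and kErrBadSyscall are hypothetical placeholders.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical uniform signature; real wrappers have per-syscall signatures.
using SyscallFn = int64_t (*)(uint64_t arg0, uint64_t arg1);

// Placeholder error value, not Magenta's actual status code.
constexpr int64_t kErrBadSyscall = -1;

// Two made-up "system calls" standing in for generated wrappers.
inline int64_t sys_example_add(uint64_t a, uint64_t b) {
    return static_cast<int64_t>(a + b);
}
inline int64_t sys_example_first(uint64_t a, uint64_t) {
    return static_cast<int64_t>(a);
}

// Dense packing: system call number N lives at index N.
constexpr std::array<SyscallFn, 2> kDispatchTable = {sys_example_add,
                                                     sys_example_first};

inline int64_t unknown_syscall() { return kErrBadSyscall; }

// Out-of-range numbers go to unknown_syscall; everything else is an
// indexed call through the table.
inline int64_t dispatch(uint64_t number, uint64_t arg0, uint64_t arg1) {
    if (number >= kDispatchTable.size()) {
        return unknown_syscall();
    }
    return kDispatchTable[number](arg0, arg1);
}
```

Because the numbering is dense, the range check is a single comparison against the table size, and no entry in the table is empty.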

The do_syscall function in kernel/lib/syscalls/syscalls.cpp takes care of common syscall handling, such as implementing kernel tracing. It also performs an unusual check: it calls a function that verifies the caller's instruction pointer. Any call that does not originate from the right address in the shared library is rejected. This prevents developers from making system calls directly without going through the shared library. (Note: Magenta prevents processes from unmapping this library and mapping their own code in its place.)

The check is implemented in generated functions such as VDso::ValidSyscallPC::socket_read, which are passed in by the wrapper. These can be found in the generated vdso-valid-sysret.h header. A typical example is:

    static bool socket_read(uintptr_t offset) {
        switch (offset) {
        case VDSO_CODE_SYSRET_mx_socket_read_VIA_mx_socket_read - VDSO_CODE_START:
            return true;
        }
        return false;
    }

This in turn references a generated offset, such as VDSO_CODE_SYSRET_mx_socket_read_VIA_mx_socket_read, defined in vdso-code.h:

#define VDSO_CODE_SYSRET_mx_socket_read_VIA_mx_socket_read 0x0000000000006671

If the valid_pc check succeeds, the system call handler passed in by the wrapper (i.e. sys_socket_read) is finally called.
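Putting the pieces together, the control flow of the check can be modelled in a few lines of standalone C++. This is a sketch under stated assumptions, not the kernel's do_syscall: it assumes the offset is computed by subtracting the address at which the VDSO code is mapped from the caller's PC, it omits tracing, and kVdsoCodeStart, kErrBadCaller, valid_pc_socket_read and do_syscall_sketch are hypothetical names and values. Only 0x6671 is taken from the sample #define above.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical start of the VDSO code segment; the real value is generated.
constexpr uintptr_t kVdsoCodeStart = 0x6000;
// Offset of the instruction following the syscall, from the sample #define.
constexpr uintptr_t kSysretSocketRead = 0x6671;
// Placeholder error value, not Magenta's actual status code.
constexpr int64_t kErrBadCaller = -1;

// Mirrors the shape of the generated check: only one offset is accepted.
inline bool valid_pc_socket_read(uintptr_t offset) {
    return offset == kSysretSocketRead - kVdsoCodeStart;
}

// Runs `handler` only if the caller's PC maps to an accepted offset within
// the VDSO code. `code_base` is where that code is mapped in the calling
// process (an assumption of this sketch).
template <typename Handler>
int64_t do_syscall_sketch(uintptr_t code_base, uintptr_t pc,
                          bool (*valid_pc)(uintptr_t), Handler handler) {
    const uintptr_t offset = pc - code_base;
    if (!valid_pc(offset)) {
        return kErrBadCaller;  // reject calls made from anywhere else
    }
    return static_cast<int64_t>(handler());
}
```

Since the check compares offsets rather than absolute addresses, the same generated table works regardless of where the VDSO is mapped in a given process.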

System call handlers are written in C++ and can be found under the kernel/lib/syscalls directory. For example, sys_socket_read is found in kernel/lib/syscalls/syscalls_socket.cpp. System calls typically (XXX always?) make use of a dispatcher object derived from the Dispatcher class.

Handles

System calls use handles to reference kernel objects, and these are already well documented. Handles belong to a single process (or the kernel, while in transit), and can be sent between processes. Multiple handles in multiple processes can reference the same underlying kernel object.

Handle values are obfuscated by the kernel before being revealed to userland. The mechanism used is subject to change, but currently makes use of a per-process secret. The secret is set while creating a ProcessDispatcher object: the handle_rand_ member is set to a 29-bit value generated by a cryptographic random number generator, with the high bit and the two low bits zeroed. The map_handle_to_value function maps kernel handle values to the mx_handle_t values used by userland applications by XORing them with the secret (passed in as mixer below):

static mx_handle_t map_handle_to_value(const Handle* handle, mx_handle_t mixer) {
    // Ensure that the last bit of the result is not zero, and make sure
    // we don't lose any base_value bits or make the result negative
    // when shifting.
    DEBUG_ASSERT((mixer & ((1<<31) | 0x1)) == 0);
    DEBUG_ASSERT((handle->base_value() & 0xc0000000) == 0);

    auto handle_id = (handle->base_value() << 1) | 0x1;
    return mixer ^ handle_id;
}
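To see that the mapping is reversible and that the result always has its low bit set and its high bit clear, here is a standalone sketch that works on raw 32-bit values instead of Handle objects. The helper names make_mixer, map_base_to_value and map_value_to_base are hypothetical, as is the use of uint32_t in place of mx_handle_t; the inverse shown is simply the arithmetic inverse of the function above, not necessarily how the kernel performs the reverse lookup.

```cpp
#include <cassert>
#include <cstdint>

// Bits 2..30 set: a 29-bit secret with the high bit and two low bits zeroed.
constexpr uint32_t kMixerMask = 0x7ffffffcu;

// Trims arbitrary random bits down to a valid per-process secret.
inline uint32_t make_mixer(uint32_t random_bits) {
    return random_bits & kMixerMask;
}

// Same arithmetic as map_handle_to_value, on a raw base value.
inline uint32_t map_base_to_value(uint32_t base_value, uint32_t mixer) {
    assert((mixer & ((1u << 31) | 0x1u)) == 0);
    assert((base_value & 0xc0000000u) == 0);
    const uint32_t handle_id = (base_value << 1) | 0x1u;
    return mixer ^ handle_id;
}

// Arithmetic inverse: undo the XOR, then drop the low tag bit.
inline uint32_t map_value_to_base(uint32_t value, uint32_t mixer) {
    const uint32_t handle_id = value ^ mixer;
    return handle_id >> 1;
}
```

Two processes with different secrets see different values for the same base value, which is the point of the per-process mixer.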