From 6b81c5821b6a6f8d3aa1ba809a54765ba3bf3a82 Mon Sep 17 00:00:00 2001
From: Kevin Krakauer
Date: Fri, 13 Sep 2024 14:37:11 -0700
Subject: [PATCH] docs: add a netstack guide to the site

PiperOrigin-RevId: 674438273
---
 g3doc/architecture_guide/BUILD          | 11 ++++
 g3doc/architecture_guide/networking.md  | 69 +++++++++++++++++++++++++
 g3doc/architecture_guide/packetflow.svg |  1 +
 g3doc/user_guide/networking.md          |  8 +--
 website/BUILD                           |  1 +
 5 files changed, 87 insertions(+), 3 deletions(-)
 create mode 100644 g3doc/architecture_guide/networking.md
 create mode 100644 g3doc/architecture_guide/packetflow.svg

diff --git a/g3doc/architecture_guide/BUILD b/g3doc/architecture_guide/BUILD
index 13cb35fd50..0b3ac99465 100644
--- a/g3doc/architecture_guide/BUILD
+++ b/g3doc/architecture_guide/BUILD
@@ -49,3 +49,14 @@ doc(
     permalink = "/docs/architecture_guide/performance/",
     weight = "20",
 )
+
+doc(
+    name = "networking",
+    src = "networking.md",
+    category = "Architecture Guide",
+    data = [
+        "packetflow.svg",
+    ],
+    permalink = "/docs/architecture_guide/networking/",
+    weight = "50",
+)
diff --git a/g3doc/architecture_guide/networking.md b/g3doc/architecture_guide/networking.md
new file mode 100644
index 0000000000..0608f5ff26
--- /dev/null
+++ b/g3doc/architecture_guide/networking.md
@@ -0,0 +1,69 @@
+# Networking Guide
+
+[TOC]
+
+Applications running in gVisor often communicate with the outside world. gVisor
+networking is architected to enforce a strong isolation boundary without
+restricting application behavior. To that end, gVisor implements its own network
+stack called **netstack**.
+
+This document describes how packets move to and from gVisor, the architecture of
+netstack, and how netstack can be used independently as a userspace network
+stack.
+
+## How packets get to and from gVisor
+
+Whether running directly via `runsc` or indirectly through [Docker][docker],
+packets flow between gVisor and the host in largely the same manner.
+
+![Networking](packetflow.svg "Networking examples."){:style="max-width:100%"}
+
+The gVisor sandbox process (called the *sentry*) is started in a network
+namespace with one or more virtual network devices. As with Docker, there is
+typically one loopback device and one [VETH device][veth] present. gVisor
+scrapes addresses, routes, and the like from those devices and configures the
+sentry to use those same addresses and routes. Thus applications in gVisor
+accept and generate packets as though they were running on the host **while
+still maintaining the strong sandbox boundary.**
+
+The sentry, which for security cannot open host sockets of its own, is
+initialized with a single [`AF_PACKET` socket][AF_PACKET]. `AF_PACKET` sockets
+send and receive raw packets, i.e. those that include link, network, and
+transport headers. gVisor ingresses and egresses all non-loopback traffic across
+that socket.
+
+## Netstack architecture
+
+**Threading** in netstack is fairly simple. Link endpoints, the most common of
+which is the [fdbased][fdbased] endpoint, spawn their own goroutine(s) that
+receive incoming packets and pass them up netstack. TCP packets are enqueued and
+asynchronously handled by the TCP implementation's own goroutines. Other
+protocols are handled inline, and the dispatcher goroutine handles all
+processing up to enqueueing packets at the socket where they can be read into
+userspace.
+
+Outgoing packets can be processed on different goroutines -- syscall, TCP, or
+the link endpoint's -- until typically reaching a [queueing discipline][qdisc].
+There, another goroutine writes batches of queued packets out the link endpoint.
+
+**Netstack supports a variety of underlying link layers.** Currently supported
+link layers include `AF_PACKET` sockets, `AF_XDP` sockets, shared memory, and Go
+channels.
+
+**Netstack aims to be usable independent of gVisor.** As a fully-featured
+userspace network stack, it can be (and is) easily reused in other projects.
+Note that, while netstack's API is fairly stable, it doesn't guarantee stability
+and is not published with Go module-style versions.
+
+## Host networking
+
+gVisor can also be run with host networking via the `--network=host` flag. This
+uses the [hostinet][hostinet] package, which trades the security and isolation
+of netstack for the performance of native Linux networking.
+
+[docker]: /docs/user_guide/quick_start/docker/
+[veth]: https://developers.redhat.com/blog/2018/10/22/introduction-to-linux-interfaces-for-virtual-networking#veth
+[AF_PACKET]: https://man7.org/linux/man-pages/man7/packet.7.html
+[fdbased]: https://cs.opensource.google/gvisor/gvisor/+/master:pkg/tcpip/link/fdbased/
+[qdisc]: https://cs.opensource.google/gvisor/gvisor/+/master:pkg/tcpip/link/qdisc/
+[hostinet]: https://cs.opensource.google/gvisor/gvisor/+/master:pkg/sentry/socket/hostinet/
diff --git a/g3doc/architecture_guide/packetflow.svg b/g3doc/architecture_guide/packetflow.svg
new file mode 100644
index 0000000000..b6715828c9
--- /dev/null
+++ b/g3doc/architecture_guide/packetflow.svg
@@ -0,0 +1 @@
+
\ No newline at end of file
diff --git a/g3doc/user_guide/networking.md b/g3doc/user_guide/networking.md
index 803cd589bd..760793181f 100644
--- a/g3doc/user_guide/networking.md
+++ b/g3doc/user_guide/networking.md
@@ -2,9 +2,9 @@
 
 [TOC]
 
-gVisor implements its own network stack called netstack. All aspects of the
-network stack are handled inside the Sentry — including TCP connection state,
-control messages, and packet assembly — keeping it isolated from the host
+gVisor implements its own network stack called [netstack][netstack]. All aspects
+of the network stack are handled inside the Sentry — including TCP connection
+state, control messages, and packet assembly — keeping it isolated from the host
 network stack. Data link layer packets are written directly to the virtual
 device inside the network namespace setup by Docker or Kubernetes.
 
@@ -84,3 +84,5 @@ Offload (GSO) to run with a kernel that is newer than 3.17. Add the
     }
 }
 ```
+
+[netstack]: /docs/architecture_guide/networking/
diff --git a/website/BUILD b/website/BUILD
index 42a656d6b4..9844d9d4fd 100644
--- a/website/BUILD
+++ b/website/BUILD
@@ -147,6 +147,7 @@ docs(
         "//g3doc:index",
         "//g3doc:roadmap",
         "//g3doc:style",
+        "//g3doc/architecture_guide:networking",
         "//g3doc/architecture_guide:performance",
         "//g3doc/architecture_guide:platforms",
         "//g3doc/architecture_guide:resources",