Session
One Layer Deeper: The 10-Layer Cake Under a Linux Network Interface
Speakers
Jesse Brandeburg
Label
Nuts and Bolts
Session Type
Talk
Description
Subtitle
Queues are not Ethernet NICs, Ethernet interfaces are not single Ethernet ports, and having fast hardware doesn’t automatically make software fast.
Abstract
Linux networking presents a clean abstraction: an interface, a socket, a queue. Production systems are less polite. Under that abstraction is a 10-layer cake of NIC hardware, drivers, CPU placement, packet hooks, traffic control, firewalls, socket buffers, application runtimes, microarchitecture, and observability. The frosting on the outside hides the complexity inside.
This talk is a tour of those hidden layers from the perspective of a kernel developer who spent years writing Intel Ethernet drivers and now works on production performance at Cloudflare. The old performance playbook was often “turn everything off and go fast.” Modern production is different: everything is on because it (usually) has to be. Firewalls, observability, and multi-tenant isolation are critical production features. Cloudflare has publicly described mitigating attacks as large as 31.4 Tbps, but the hard problem is no longer making one machine’s performance scream; it is understanding which layer is quietly spending the budget, hiding the drop, or adding the delay. Often these problems are stacked.
A recurring challenge in modern production performance is the observability gap. Linux has counters, but not always the correlation you want from Prometheus at 03:00. Packets can disappear at the NIC, XDP, tc, conntrack, qdisc, softnet backlog, socket receive queue, or application runtime queue. A service can honestly report “I processed everything I received” while the kernel is dropping packets before the application ever sees them.
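As one illustration of a kernel-side drop that an application never sees, the softnet backlog counters are exposed in /proc/net/softnet_stat as rows of hexadecimal fields. The sketch below is not from the talk. It assumes only the first three columns (packets processed, packets dropped, time squeeze events), and the function names and the shortened sample text are hypothetical.

```python
from typing import Dict, List


def parse_softnet_stat(text: str) -> List[Dict[str, int]]:
    """Parse /proc/net/softnet_stat text into one dict per row.

    Only the first three hexadecimal columns are used:
    processed, dropped, and time_squeeze.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        rows.append({
            "processed": int(fields[0], 16),
            "dropped": int(fields[1], 16),
            "time_squeeze": int(fields[2], 16),
        })
    return rows


def rows_with_drops(rows: List[Dict[str, int]]) -> List[int]:
    """Return the indices of rows whose dropped counter is non-zero."""
    return [i for i, row in enumerate(rows) if row["dropped"] > 0]


# Hypothetical sample, shortened to five columns per row for readability.
SAMPLE = (
    "0000a1b2 00000000 00000003 00000000 00000000\n"
    "00000010 00000002 00000000 00000000 00000000\n"
)

rows = parse_softnet_stat(SAMPLE)
drops = rows_with_drops(rows)
```

On a Linux host the same function can be fed the contents of /proc/net/softnet_stat. A row index is not guaranteed to equal a CPU number when some CPUs are offline, so treat the mapping as an assumption to verify on the target kernel.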
We will walk through concrete examples from Linux networking and production systems: why sendmmsg() and recvmmsg() still matter, and why treating the socket receive queue as a packet warehouse causes loss and latency; why UDP is great until you lose automatic segmentation/offload behavior, and why VLAN acceleration is helpful until userspace needs the metadata; and why security hooks such as XDP, tc, netfilter, conntrack, socket filters, and eBPF programs are powerful but not free.
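To make the socket receive queue point concrete: when a UDP socket's receive buffer is full, the kernel drops the datagram and increments the RcvbufErrors counter on the Udp line of /proc/net/snmp. The following is a minimal sketch, not the speaker's tooling. The helper name and the sample counter values are hypothetical, and it assumes the usual header-line and value-line pairing of that file.

```python
from typing import Dict


def parse_snmp(text: str, proto: str = "Udp") -> Dict[str, int]:
    """Return the counters for one protocol from /proc/net/snmp text.

    The file holds pairs of lines per protocol: a header line with
    counter names, then a line with the matching integer values.
    """
    prefix = proto + ":"
    lines = [ln.split() for ln in text.splitlines() if ln.startswith(prefix)]
    if len(lines) < 2:
        return {}  # protocol not present in this text
    names, values = lines[0][1:], lines[1][1:]
    return dict(zip(names, (int(v) for v in values)))


# Hypothetical sample with made-up values.
SAMPLE = (
    "Tcp: ActiveOpens PassiveOpens\n"
    "Tcp: 12 34\n"
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 1000 5 42 900 40 0\n"
)

udp = parse_snmp(SAMPLE)
receive_buffer_drops = udp["RcvbufErrors"]
```

These counters are system-wide, so a rising RcvbufErrors value shows that some socket is overflowing without identifying which one. That is the kind of missing correlation the abstract describes.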
Finally, we will look below the packet path at CPU behavior: cache efficiency, code size, and TLB misses. “Less code is faster” and “bigger assembly instructions are faster” are both true; which one applies depends on whether you are bound by instructions, frontend bandwidth, or address translations.
The goal is not to memorize every knob. The goal is to build the intuition to ask better questions when fast hardware becomes slow software. Knowing one layer deeper than the abstraction you develop on makes you a better network developer, a better production engineer, and a better architect.
The 10 Layers
- NIC hardware: queues, descriptors, DMA, and offloads.
- Driver behavior: rings, NAPI, page recycling, and interrupt moderation.
- CPU placement: IRQ affinity, RSS, XPS/RFS, and NUMA locality.
- Early packet hooks: XDP, AF_XDP, and hardware/software steering.
- Traffic control: tc ingress/egress, qdisc, and pacing.
- Firewall and security layers: nftables, netfilter, conntrack, and policy BPF.
- Kernel socket layer: socket buffers, drops, SO_REUSEPORT, and *mmsg.
- Application runtimes: Go, Rust, async schedulers, queues, and backpressure.
- Microarchitecture: cachelines, code size, iTLB/dTLB misses, and AVX-512 tradeoffs.
- Observability: Prometheus, counters, missing correlation, and “I processed everything I received.”
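For the CPU placement layer in the list above, a useful first step is reading which CPUs an IRQ or transmit queue is allowed to use. Files such as /proc/irq/<n>/smp_affinity and /sys/class/net/<dev>/queues/tx-<n>/xps_cpus hold a hexadecimal CPU bitmask, with commas separating 32-bit groups on larger machines. This sketch decodes such a mask. The function name is hypothetical and the example masks are made up.

```python
from typing import List


def cpus_from_mask(mask: str) -> List[int]:
    """Decode a hexadecimal CPU bitmask such as '00000000,0000000f'.

    Commas separate 32-bit groups, most significant group first,
    so removing them leaves one large hexadecimal number.
    """
    cleaned = mask.strip().replace(",", "")
    if not cleaned:
        return []
    value = int(cleaned, 16)
    # Bit N set means CPU N is included in the mask.
    return [bit for bit in range(value.bit_length()) if (value >> bit) & 1]


low_four = cpus_from_mask("f")                   # CPUs 0 through 3
cpu_32 = cpus_from_mask("00000001,00000000")     # only CPU 32
none_set = cpus_from_mask("00000000\n")          # no CPUs selected
```

Comparing the decoded CPU list against the NUMA node of the NIC is left out here, because that mapping depends on the machine's topology.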
Important Dates
| Closing of CFS | June 1st |
| Notification by | June 10th |
| Conference dates | July 13th-16th |