Rekindling Network Protocol Innovation with User-Level Stacks

M. Honda, F. Huici, C. Raiciu, J. Araujo, L. Rizzo
Appears in: CCR April 2014

Recent studies show that more than 86% of Internet paths allow well-designed TCP extensions, meaning that it is still possible to deploy transport layer improvements despite the existence of middleboxes in the network. Hence, the blame for the slow evolution of protocols (with extensions taking many years to become widely used) should be placed on end systems. In this paper, we revisit the case for moving protocol stacks up into user space in order to ease the deployment of new protocols, extensions, or performance optimizations. We present MultiStack, operating system support for user-level protocol stacks. MultiStack runs within commodity operating systems, can concurrently host a large number of isolated stacks, has a fall-back path to the legacy host stack, and is able to process packets at rates of 10 Gb/s. We validate our design by showing that our mux/demux layer can validate and switch packets at line rate (up to 14.88 Mpps) on a 10 Gbit port using 1-2 cores, and that a proof-of-concept HTTP server running over a basic userspace TCP outperforms by 18–90% both the same server and nginx running over the kernel’s stack.
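
The 14.88 Mpps line-rate figure quoted above follows from standard Ethernet framing rather than from anything specific to MultiStack. The short sketch below reproduces the arithmetic, assuming minimum-size 64-byte frames plus the usual per-frame wire overhead of preamble, start frame delimiter, and inter-frame gap. The function and constant names are illustrative only and do not come from the paper.

```python
# Per-frame overhead on an Ethernet wire, in bytes (standard IEEE 802.3 framing).
PREAMBLE_AND_SFD = 8   # 7-byte preamble + 1-byte start frame delimiter
INTER_FRAME_GAP = 12   # minimum idle gap between frames, in byte times
MIN_FRAME = 64         # minimum Ethernet frame size, including the 4-byte FCS


def line_rate_pps(link_bps, frame_bytes=MIN_FRAME):
    """Return the maximum packets per second a link can carry.

    link_bps    -- link speed in bits per second
    frame_bytes -- Ethernet frame size in bytes (headers and FCS included)
    """
    wire_bits = (frame_bytes + PREAMBLE_AND_SFD + INTER_FRAME_GAP) * 8
    return link_bps / wire_bits


if __name__ == "__main__":
    # 64-byte frames occupy 84 bytes (672 bits) on the wire.
    mpps = line_rate_pps(10e9) / 1e6
    print(f"10 Gbit/s, 64-byte frames: {mpps:.2f} Mpps")  # 14.88 Mpps
```

With larger frames the packet rate drops sharply, which is why minimum-size frames are the usual stress test for a packet switching layer.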

Public Review By: Sharad Agarwal

This paper rests on three central premises. First, recent prior work has shown that, despite middleboxes, 86% of Internet paths allow traffic that uses TCP extensions. Second, there is still active work on new TCP extensions such as TCP Fast Open and Multipath TCP. Third, deployment of TCP extensions is hampered by the wait for OS updates.

The paper presents MultiStack, a system for user-level transport stacks. While prior work has proposed user-level stacks, this paper focuses on supporting existing OSes and allows the legacy OS stack to be used alongside user-level ones. MultiStack is built on FreeBSD on top of the VALE software switch, with some modifications. The design isolates packet buffer access by different applications. The authors drop support for multicast and instead scale out the number of applications that can be supported. The evaluation demonstrates almost line rate on a 10 Gbps NIC using at least two cores of an Intel Core i7, with multiple applications.

All the reviewers found the paper interesting from an engineering point of view. The network performance that the system achieves is impressive. However, this appears to come at the cost of CPU load, which is perhaps too high for use in high-performance server applications. The reviewers also wondered how the performance compares to prior user-space stacks and raw sockets. Has industry forsaken TCP transport improvements by moving to UDP, or can UDP still not traverse enough Internet paths?