RHRQ first looked at the Unikernel Linux (UKL) project—a joint effort involving professors, PhD students, and engineers at the Boston University-based Red Hat Collaboratory—almost two years ago (RHRQ 3:3, November 2021). This previous article covered the background of unikernels in detail, but in brief: an application links directly to a specialized kernel, a lightly modified version of Linux in this case, so that the resulting program can boot and run on its own. Unikernels have demonstrated significant advantages in boot time, security, resource utilization, and I/O performance. They enable those advantages by linking the application and kernel together in the same address space.
UKL’s focus to date has been on minimizing changes both to the Linux kernel and to applications. By reusing Linux, we gain the advantages of Linux for free, especially wide driver support. We also studied the performance and latency characteristics of the final unikernels to see if making small, targeted changes could provide benefits.
The significant progress made by this project was detailed at the Eighteenth European Conference on Computer Systems (EuroSys ’23), May 8–12, 2023, in Rome, Italy, and published in the conference’s proceedings. Here are some of the highlights.
The Unikernel Linux (UKL) project started as an effort to exploit Linux’s configurability to create a new unikernel in a fashion that would avoid forking the kernel. A unikernel taking this approach could support a wide range of Linux applications and hardware while becoming a standard part of the ongoing investment by the Linux community. Our experience has led us to a more general goal: creating a kernel that can be configured to span the spectrum between a general-purpose operating system, amenable to a large class of applications, and a highly optimized, possibly application- and hardware-specialized, unikernel.
Work to date has demonstrated that we can integrate unikernel techniques into a general-purpose operating system in a way that avoids forking it. It has also demonstrated performance gains. We think that most applications would run unchanged under these techniques at parity with, or slightly faster than, standard Linux. With relatively little effort, targeted changes to the kernel can achieve significant gains.
A spectrum of capabilities
If we enable a base model UKL configuration (requiring 550 lines of code changes to Linux) in the kernel, we're starting at the general-purpose end of the spectrum. This simplest configuration of UKL supports most applications, albeit with only modest (5%) performance advantages.
Like many unikernels, UKL is a single application that is statically linked with the kernel and executed in supervisor mode. However, the base model of UKL preserves most of the capabilities of Linux, including a separate pageable application portion of the address space and a pinned kernel portion, distinct execution modes for application and kernel code, and the ability to run multiple processes. The main changes are that system calls are replaced by function calls and application code is linked with kernel code and executes in kernel mode.
As a result, this base model provides an avenue toward supporting all hardware and applications of the original kernel and the entire Linux ecosystem of tools for deployment, debugging, and performance tuning—which has been very useful in the course of this research. It also allows a developer to run “perf” directly inside the unikernel to collect performance information and feed that back into changes they make to the application to improve performance.
For more effort but with potentially more gain, a developer can move along the spectrum toward a specialized unikernel. A larger set of configuration options (1,250 lines of code changes total) may improve performance but will not work for all applications. Once an application is running, a developer can easily explore a number of configuration options that, while not safe for all applications, may be safe and offer performance advantages for their application.
One configuration bypasses the entry/exit code, which usually executes whenever control transitions between application and kernel through system calls, interrupts, and exceptions. Running the entry/exit code can get expensive for applications making many small kernel requests. The developer can also select between two UKL configurations that avoid stack switches, each appropriate for a different class of applications.
Knowledgeable developers can also (or alternatively) improve performance by modifying the application to call internal kernel routines and violating, in a controlled fashion, the standard assumptions and invariants of kernel versus application code. For example, they may be able to assert that only one thread is accessing a file descriptor and avoid costly locking operations.
To understand the implications of UKL’s design for applications, we evaluated it with Redis, a widely used in-memory database. We saw two clear opportunities for performance improvement. First, we saw that we could shorten the execution path by bypassing the entry and exit code for read and write system calls and invoking the underlying functionality directly. We also observed that, for TCP sockets, read and write calls eventually translate into tcp_recvmsg and tcp_sendmsg, respectively. This led us to create a shortcut that enabled an application like Redis, which always uses TCP, to call the underlying routines directly. Only 10 lines of code were needed to implement this shortcut.
By taking advantage of these optimizations, we found that Redis throughput could be increased by up to 26% relative to standard Linux, whereas the UKL base model alone improved throughput by only 1.6%.
In addition to some cleanup work, such as rebasing to the latest kernel, glibc (which also requires code changes), and gcc, near-term work will focus on getting the project into the hands of more developers. The first step is adding the packages to the Fedora COPR service. The lengthy work of splitting up the Linux patches, authoring good commit messages, and checking that they pass Linux standards and tests is currently being done by Eric Munson at Boston University. After this is complete, we will submit them again to the Linux kernel community for comment.
The goal is, over time, to work with the community to add the changes to the Linux kernel as the current work is proven out and determined to be useful. In parallel with working with the kernel community, we need to demonstrate that the patches are useful in practice. To that end, we will work with other companies that have workloads requiring the highest performance and lowest latencies. We’re currently looking for additional partners, both companies and individuals, who would like to try out their applications with UKL. Most plain C/C++ applications with few dependencies that already work on Linux can be ported to UKL in an afternoon.
While we have been working on UKL since around 2018, other technologies occupying a similar space have come along, especially io_uring and eBPF.
io_uring is interesting because it amortizes syscall overhead. eBPF is interesting because it’s another way to run code in kernel space (albeit for a very limited definition of “code”). How do these approaches compare to UKL? We will be talking to developers who use these technologies to explore that question.