Creating a Linux-based unikernel

Is there a way to gain the performance benefits of a unikernel without severing it from an existing general-purpose code base? Boston University professors, BU PhD students, and Red Hat engineers at the Red Hat Collaboratory at Boston University are getting close to finding the answer.

A unikernel is a single bootable image consisting of user code linked with additional components that provide kernel-level functionality, such as opening files. The resulting program can boot and run on its own as a single process, in a single address space, and at an elevated privilege level without the need for a conventional operating system. This fusion of application and kernel components is very lightweight and can have performance, security, and other advantages. An ongoing Red Hat Collaboratory research project is creating a unikernel that builds off Linux with relatively few code changes.

How we came to look (back) at unikernels

The basic unikernel concept is not new. Although unikernels are often traced to research projects like Exokernel and Nemesis in the 1990s, they’re similar in concept to the very first operating systems on early 1950s mainframes.

Early batch-processing programs were statically linked to shared libraries to perform tasks like managing buffers and input/output (I/O) devices. One of these early libraries, the SHARE Operating System, written by the SHARE User Group for the IBM 709 in the late 1950s, was an early example of users sharing code that they wrote among themselves—in effect, early open source. Yes, users wrote some of the earliest operating systems for themselves. All code ran at a single privilege level, from start to end, without multitasking or timesharing; there was no scheduler. In other words, it was a unikernel—even though the term wasn’t coined until much later.

In part due to the cost of early computers, operating systems evolved over time and acquired additional functionality. Timesharing was one particularly important development in the 1960s; eventually it largely supplanted batch processing. One major milestone, released in 1969, was Multics, a collaboration between MIT, Bell Labs, and GE. Although the project was mostly a failure, in part because of its complexity, it influenced a wide range of minicomputer operating systems, like Digital Equipment’s VAX/VMS as well as Unix, which came out of Bell Labs.

Multics introduced innovations like dynamic linking and hardware support for ring-oriented security, which are familiar components of modern operating systems, including Linux.

These modern operating systems are extremely sophisticated and capable. But might we reexamine simpler approaches? For example, with a unikernel, the complex boundaries and permission checking a standard, multiuser operating system requires are no longer needed.

The unikernel program model now makes a lot more sense than it did back in the days of large, expensive computers. Today we have very inexpensive multiprocessor/hyper-threaded CPUs and the potential for virtualized environments running on a host that supports hundreds or even thousands of guests. We no longer have to be concerned about wasting compute cycles if a unikernel image blocks for I/O and does not context switch to another process. However, since a unikernel image can and does support multithreading, context switching between threads is viable.

Can Linux make it simpler?

The usual approach to building a unikernel is either building a specialized operating system from the ground up or forking an existing operating system, removing components, and modifying it as needed. Both approaches require ongoing maintenance of the resulting unikernel, which, among other problems, makes it harder to benefit from continuing enhancements to a general-purpose code base, including support for new devices.

Unikernels can also miss out on performance gains from new types of hardware acceleration, working against a key motivation for developing a unikernel. Furthermore, unikernels can require application changes, and they may not support the POSIX standard. Custom toolchains may be needed.

However, what if we could take an open source operating system with a large community, Linux, and add unikernel capabilities into the same source code tree? After all, Linux already supports a wide range of architectures and can be built in different ways depending upon the target use case.

That was the question a collaboration of professors, PhD students, and engineers at the BU Red Hat Collaboratory set out to answer.

The team set four goals:

Ensure that most applications and user libraries can be integrated into a unikernel without modification. Building the unikernel should just mean choosing a different GNU Compiler Collection (GCC) target.
Avoid any ring-transition overheads. Overhead experienced by any application requesting kernel functionality should be equivalent to a simple procedure call.
Allow cross-layer optimization. The compiler and/or developer should be able to co-optimize the application and kernel code.
Keep changes in Linux source code minimal, so they can be accepted upstream and the unikernel can be an integral part of Linux going forward. This will ensure unikernels are not an outsider but a build target for which anyone can compile their applications.
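
To make the second goal concrete, the following is a minimal sketch of how the gap between a ring transition and a simple procedure call could be measured on stock Linux. It is not taken from the UKL project: the names plain_call, ring_transition_call, now_ns, and avg_ns are hypothetical, and getpid was chosen only because it does very little work inside the kernel.

```c
#define _GNU_SOURCE
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical stand-in for "kernel functionality reached by a plain call".
 * noinline keeps the compiler from removing the call entirely. */
__attribute__((noinline)) long plain_call(void)
{
    static volatile long counter;
    return ++counter;
}

/* On stock Linux this forces a ring 3 -> ring 0 -> ring 3 round trip. */
long ring_transition_call(void)
{
    return syscall(SYS_getpid);
}

/* Monotonic clock reading in nanoseconds. */
uint64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Average cost, in nanoseconds, of calling fn() iters times. */
double avg_ns(long (*fn)(void), unsigned long iters)
{
    volatile long sink = 0;   /* keeps the results observable */
    uint64_t start = now_ns();
    for (unsigned long i = 0; i < iters; i++)
        sink += fn();
    (void)sink;
    return (double)(now_ns() - start) / (double)iters;
}
```

Printing avg_ns(plain_call, 1000000) next to avg_ns(ring_transition_call, 1000000) from a small main() gives a rough per-call comparison. The absolute numbers depend heavily on the CPU, the kernel version, and any security mitigations in effect, so they indicate scale rather than serve as a benchmark.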

They wanted to meet these goals while continuing to support complex applications and a rich hardware compatibility list (HCL), and while preserving the familiar configuration and operations model as well as the debugging and optimization capabilities of the OS. They also wanted to do all this without impacting other build targets, and in a way that enables, over time, the performance optimizations that other unikernel researchers have demonstrated.

The team examined a number of options but decided to avoid approaches that involved significant application rewrites, allowed arbitrary applications to run alongside the kernel in the ring 0 privilege level, or required one or more components running in userspace. They settled on a pure unikernel approach, whereby the kernel is statically linked to run a single application.

A unikernel Linux (UKL)

A prototype came together fairly quickly, with minor changes to the Linux kernel. The team built the Linux kernel with a UKL config option turned on, after slightly modifying the linking stage of the kernel build process to link object files created from the GNU C Library (glibc), the application code, and a UKL library. The prototype served as a proof of concept, and a simple benchmark validated the resulting performance gains. The work was presented in “Unikernels: the next stage of Linux’s dominance” at HotOS ’19 in Bertinoro, Italy. Since that time, the team has continued to build on the initial work.

UKL builds an unmodified application into a Linux-based unikernel; the application runs at the same privilege level as the kernel (ring 0), which allows for many optimizations. UKL consists of a small set of changes to the Linux kernel (fewer than 1,500 lines of code), which lets it use Linux’s well-tested code base and work in concert with the large, established Linux development community rather than proceeding as a standalone project.

UKL largely supports the POSIX interface. Differences are in two specific areas:

First, UKL runs as a single process. Therefore, fork(), which causes a process to make a copy of itself, doesn’t make sense in a UKL context. However, UKL does support clone(), which is used to create new threads. This allows the entire POSIX threading library (libpthread), which is central to concurrent programs, to work.
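
As an illustration of the kind of code this keeps working, here is a minimal sketch of a single-process program that gets its concurrency from POSIX threads rather than fork(). On Linux, glibc implements pthread_create() on top of clone(). The names slice, sum_slice, parallel_sum, and NUM_WORKERS are hypothetical and not taken from UKL; on older toolchains the code needs to be compiled with -pthread.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_WORKERS 4

/* One contiguous range of integers and the sum a worker computes for it. */
struct slice {
    long first;
    long last;
    long sum;
};

/* Thread body: add up first..last inclusive. */
static void *sum_slice(void *p)
{
    struct slice *s = p;
    s->sum = 0;
    for (long v = s->first; v <= s->last; v++)
        s->sum += v;
    return NULL;
}

/* Sum 1..n using NUM_WORKERS threads inside a single process. */
long parallel_sum(long n)
{
    pthread_t tid[NUM_WORKERS];
    struct slice slices[NUM_WORKERS];
    long chunk = n / NUM_WORKERS;
    long total = 0;

    for (int i = 0; i < NUM_WORKERS; i++) {
        slices[i].first = i * chunk + 1;
        /* The last worker also takes any remainder. */
        slices[i].last = (i == NUM_WORKERS - 1) ? n : (i + 1) * chunk;
        int rc = pthread_create(&tid[i], NULL, sum_slice, &slices[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_WORKERS; i++) {
        pthread_join(tid[i], NULL);
        total += slices[i].sum;
    }
    return total;
}
```

Nothing here depends on creating a second process, which is why a threaded program of this shape fits the single-process model while a program that relies on fork() does not.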

Second, the application cannot make an explicit syscall. The far more common case, in which syscalls are used behind the scenes to have the kernel perform some privileged task, is handled by the modified glibc library. Changes to glibc largely mask the fact that the linked application is now running in ring 0 rather than in userspace (ring 3), where it would run on stock Linux. With the modified glibc, an operation such as opening a file with open() simply calls a kernel function rather than transitioning from ring 3 to ring 0 and back, with the associated stack operations.
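
The distinction can be sketched with two versions of the same routine, written for stock Linux. The first uses only glibc wrappers such as open(), read(), and close(), the style a modified glibc can redirect without touching the application. The second names system call numbers itself. The function names are hypothetical, and the article does not say how UKL treats the generic syscall() helper; it is used here only to stand in for the explicit style.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/types.h>

/* Wrapper style: the application never mentions a system call number,
 * so the C library decides how the kernel is actually reached. */
long count_bytes_via_wrappers(const char *path)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = read(fd, buf, sizeof buf)) > 0)
        total += n;
    close(fd);
    return n < 0 ? -1 : total;
}

/* Explicit style: the application picks the system calls itself. */
long count_bytes_via_raw_syscalls(const char *path)
{
    char buf[4096];
    long total = 0;
    long n;

    long fd = syscall(SYS_openat, AT_FDCWD, path, O_RDONLY);
    if (fd < 0)
        return -1;
    while ((n = syscall(SYS_read, fd, buf, sizeof buf)) > 0)
        total += n;
    syscall(SYS_close, fd);
    return n < 0 ? -1 : total;
}
```

Both functions return the same byte count on stock Linux. Only the first leaves the choice of mechanism to the library, which is the property the modified glibc relies on.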

The build step is straightforward, which is often not the case with unikernels. Typically, unmodified applications are rebuilt and linked with the modified glibc. Then UKL is built as you would build Linux normally. Its final linking step takes in the partially linked user binary and creates a vmlinux that can be deployed anywhere. (Vmlinux is a statically linked executable file that contains the Linux kernel in one of the object file formats supported by Linux.)

There is no custom toolchain, although the researchers hope to encapsulate these steps into a single make step in the future.

Running UKL

Because UKL inherits Linux’s large HCL, it can run either in a virtual machine or on bare metal. When UKL boots, it starts running the workload. Optionally, you can build a UKL to have a sidecar. With this sidecar, normal user space applications can run alongside the UKL main workload. This allows you to run a shell or other utilities to manage the system or debug it, for example. All the tools normally used to debug Linux can be used with UKL. These utilities run in user mode, as they would on normal Linux.

Because the UKL workload runs in kernel mode, it has access to all the internal kernel functions. This provides the ability to occasionally bypass kernel code entry/exit and invoke the underlying functionality directly for a performance improvement. The research team has also tested versions of UKL that have no stack switches upon kernel code entry/exit; doing so has also provided performance benefits. Additional performance tweaks came from manually shortening the TCP recvmsg/sendmsg paths in the kernel and calling these from network-based UKL workloads.

To date, the biggest performance boosts have come from a workload containing the Redis database. UKL gave a 23% tail latency improvement and a 33% throughput improvement over the Linux baseline.

The biggest challenges

In keeping with the researchers’ goals, with UKL, syscalls become simple kernel function calls, without involving ring transitions back and forth to kernel space. However, eliminating that ring transition has presented some of the greatest challenges associated with UKL.

The first challenge relates to differences between a normal user stack and a kernel stack. A page fault occurs when a process accesses a page that is mapped in the virtual address space but is not loaded in physical memory. Such faults aren’t normally errors; Linux uses them to increase the amount of memory available to programs that use virtual memory.

However, when running in ring 0, the hardware does not switch stacks on a page fault. If the faulting page belongs to the stack itself, the attempt to push state onto that same stack results in a double fault, and the system crashes. The solution is either to prevent the double fault by pre-faulting the stack before entering the kernel or to switch to a wired kernel stack when a double fault does occur. For now, UKL takes the first approach, ensuring that pages are mapped ahead of any operations that could result in a fault.
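
Pre-faulting is a general technique, and a user-space analogue shows the idea: touch every page of a stack region in advance so that later code running on that stack does not take a page fault. This is a minimal sketch and not UKL’s implementation; the function name prefault_stack and the 64 KiB size are assumptions chosen for illustration.

```c
#include <stddef.h>
#include <unistd.h>

/* Touch one byte in every page of a stack buffer so that those pages
 * are mapped before fault-sensitive code runs on the same stack.
 * Returns the number of page-sized steps that were touched. */
size_t prefault_stack(void)
{
    enum { PREFAULT_BYTES = 64 * 1024 };   /* illustrative size only */
    volatile unsigned char pad[PREFAULT_BYTES];
    size_t touched = 0;

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;                       /* conservative fallback */

    /* volatile forces a real store into each page. */
    for (size_t off = 0; off < PREFAULT_BYTES; off += (size_t)page) {
        pad[off] = 0;
        touched++;
    }
    pad[PREFAULT_BYTES - 1] = 0;           /* cover the final partial page */
    return touched;
}
```

Calling prefault_stack() once per thread, before the sensitive section, takes the page faults early and in a context where they are harmless. How deep the pre-faulted region must be depends on the deepest call chain that will later run on that stack.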

Another challenge is that during the normal transfer back from ring 0 to ring 3, the system does a great deal of post-processing in areas such as I/O, signal handling, and read-copy-update (RCU) synchronization. Not doing so caused a significant performance hit. As a result, UKL simply added calls to the kernel functions that deal with these housekeeping details. Because there is no ring change, the existing system calls are invoked as ordinary subroutine calls rather than through syscall instructions.

Work continues

The BU researchers and Red Hat engineers working on this project see a variety of opportunities to continue improving the performance of complex concurrent workloads. Because the application is running in kernel mode when using a unikernel like UKL, there are many opportunities for synchronizing certain operations in ways that are difficult for user space code to accomplish.

Of equal or even greater importance, however, is working with the Linux community to get UKL code into upstream Linux. The proposed changes are relatively few and non-invasive, which should make inclusion easier. This would allow for a unikernel that both benefits the Linux community and gains the benefits of an open source development model.

The author would like to thank BU PhD candidate Ali Raza and Red Hat Senior Distinguished Engineer Larry Woodman for their invaluable assistance with this article.

Red Hat Research Quarterly

Gordon Haff

November 2021
