Red Hat Research Quarterly

The elastic bare metal cloud is here

About the author

Gagan Kumar

Gagan Kumar is a Senior Technical Product Manager at Red Hat. He has a bachelor’s degree in Computer Science from Visvesvaraya Technological University, India, and a master’s degree in Computer Science from Northeastern University, Boston. He works closely with Red Hat’s academic partners and customers on projects related to bare metal sharing systems and metrics collection for OpenShift instances.

Exclusivity of resources is becoming obsolete. The Elastic Secure Infrastructure Project (ESI) provides a solution for sharing computing resources and getting the most from hardware investments.

Using resources efficiently is an important goal for any organization. If those resources are computers, that goal should theoretically be easy to achieve, because machines don’t get tired or need a break. In the real world, however, job scheduling is complex, and resources will likely sit idle at times. What if an organization could lease out its idle servers and, in return, be compensated with access to external servers at times when it needs them? The Elastic Secure Infrastructure (ESI) project is pursuing this idea to create more options for server efficiency.

Efficient and elastic bare metal clusters

Mass Open Cloud (MOC), one of several research initiatives Red Hat Research supports, is a partnership of universities, research institutions, and industry working to create an open public cloud exchange model. MOC needed to create a bare metal cloud comprising hardware from the multiple organizations involved, while those organizations maintained complete control over the servers they own. When their servers were idle, the organizations would then have the option to lease the hardware to other parties, who could use it to form a cluster on their own private network.

Organizations and departments in the IT industry build dynamically scaled services, and, to maximize efficiency, they want to be able to lease unused IT infrastructure to others, as long as they can claim it back when the need recurs. MOC created projects such as HIL (Hardware Isolation Layer) and M2 (Malleable Metal) to achieve this goal. However, scaling this service required a lot of engineering effort and upstream changes in OpenStack. This effort led to the creation of the Elastic Secure Infrastructure project.

Using ESI, if a bare metal server is not currently in use and isn’t required immediately, the owner can lease the machine to a lessee, who can use it for their own needs for an agreed period of time. ESI enables organizations to build bare metal clusters that are elastic enough to cater to an organization’s changing computing needs.
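
To make the leasing model concrete, here is a minimal Python sketch that models a lease as a time-bounded record tying a node to an owner and a lessee. It is purely illustrative: the Lease class, the node name, and the project names are hypothetical and do not represent ESI’s actual API or data model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Purely illustrative: a lease ties one bare metal node to an owner and a
# lessee for a fixed window. ESI's real leasing service defines its own
# objects and API, which are not shown here.
@dataclass
class Lease:
    node: str        # bare metal node being leased out
    owner: str       # project that owns the hardware
    lessee: str      # project borrowing it
    start: datetime
    end: datetime

    def is_active(self, now: datetime) -> bool:
        # The lessee may use the node only inside the agreed window;
        # afterward the node reverts to its owner's pool.
        return self.start <= now < self.end

now = datetime.now(timezone.utc)
lease = Lease(
    node="node-01",
    owner="hw-owning-dept",
    lessee="research-team",
    start=now,
    end=now + timedelta(days=7),  # the agreed lease duration
)
print(lease.is_active(now))  # True until the lease window closes
```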

Bringing multitenancy to Ironic

The ESI project is built as a solution on top of OpenStack. Ironic, a service in the OpenStack platform, already existed for managing bare metal machines, but it assumed that all computers managed by Ironic were owned by a single administrative organization. The ESI project proposed to add multitenancy to Ironic, and began working with the upstream developers. We helped to develop node multitenancy so that users could manage their own bare metal resources without requiring global administrative privileges.

While researching Ironic, we found an existing Ironic field called Owner that was used only for informational purposes. We proposed using this field for access control as well. In most of our use cases, owners lease nodes to nonowners for a specific period of time. We wanted to keep track of this information, so we also added another field called Lessee. The final step in our work was to create policy rules for owners and lessees: by updating policy files, administrators can expose node API calls to node owners and authorized lessees. These developments were the foundation for a shared bare metal cluster. (Read more about the development of multitenant functionality for Ironic in “Isn’t multitenancy ironic,” RHRQ 2:2, August 2020.)
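
As a concrete illustration, the following sketch uses the openstacksdk bare metal API to record an owner and a lessee on an Ironic node. The cloud name, node name, and project names are assumptions made for the example, and the Lessee field is only available on sufficiently recent Ironic and SDK releases; with matching policy rules in place, members of those projects can then call whichever node APIs administrators have exposed to owners and lessees.

```python
import openstack

# Connect as an operator; "esi-admin" is a hypothetical entry in clouds.yaml.
conn = openstack.connect(cloud="esi-admin")

# Hypothetical node and projects used for illustration.
node = conn.baremetal.get_node("node-01")
owner = conn.identity.find_project("hw-owning-dept")
lessee = conn.identity.find_project("research-team")

# Record the hardware owner and the current lessee on the Ironic node.
# Policy rules keyed on these two fields determine which node API calls
# members of each project are allowed to make.
node = conn.baremetal.update_node(node, owner=owner.id, lessee=lessee.id)
print(node.owner, node.lessee)
```

On recent clients, the same fields can also be set from the command line with the bare metal client’s `openstack baremetal node set --owner` and `--lessee` options.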

ESI, virtualization, and public cloud providers

At this point, you might be wondering, “Why can’t we use virtualization instead of ESI?” Virtualization can address some resource sharing requirements. However, some application performance and isolation requirements can only be addressed by using the capabilities of an entire bare metal machine, without the middle layer of virtualization between hardware and software. Furthermore, some organizations need to share hardware between projects in a more secure way than virtualization will permit. The ESI project can better address both of these requirements. 

Another question that the ESI team frequently gets asked is, “Why should we use ESI when we can rent a server from any public cloud provider?” ESI allows communities to build a more diverse selection of servers than commercial services can support, and it also allows building a bigger community around IT research and development. Currently, much of the software that commercial providers use to manage servers at lower hardware levels is completely opaque to users and developers. With a more diverse community, users may be able to get access to hardware that they wouldn’t otherwise be able to try, such as new chipsets and other hardware under development with partners who participate in the leasing service to gain early user feedback. 

Research IT departments in academic and industry organizations are especially interested in sharing their resources and collaborating with other departments, universities, and research teams. Software that allows groups to build services across silos also increases flexibility for research teams and can reduce an organization’s total cost of ownership compared to purchasing resources by the minute. The ESI project is in the process of making these collaborations easier. If your bare metal machines sit idle for long periods and you want to participate in a cloud marketplace to use them more efficiently, or if you are looking for specific types of bare metal machines you currently can’t access, the ESI project could be your solution.

Importance of open source in ESI

Trustworthiness is essential for any shared service. In the ESI project, open source development has played a major role in developing trust in the community. The code isn’t a black box: a potential owner can go through the entire process and all the code before committing their resource to the ESI leasing platform. We are also continuing to work with upstream groups to add more security options, such as software attestation, to ESI. The open source model likewise facilitated development when we identified multitenancy as a feature gap in Ironic and worked with the upstream community to develop new code. This work wouldn’t be possible if Ironic were a proprietary service.

Pilot deployment

After several years of early research and development, the ESI concepts are no longer theoretical. The ESI engineering team successfully deployed a pilot environment in the Massachusetts Green High Performance Computing Center (MGHPCC) in 2021. Our goal is to support leasing functionality for bare metal machines in the Open Cloud Testbed, Mass Open Cloud, and CloudLab. We encourage other datacenters to create their own leasing federation or join an existing leasing federation and contribute to this project. 

What’s next for ESI?

The ESI engineering team is now planning to support the installation of software that runs on and interacts with the leased bare metal servers. We are in the process of identifying other feature gaps that need to be addressed to run platforms such as Red Hat® OpenShift® on top of bare metal machines leased from the ESI service. Once this feature is supported, it could fundamentally change the way OpenShift is administered, because nodes in the infrastructure could also become elastic based on the platform workload.

The ESI team is also exploring the possibility of building software to support a marketplace for the leasing service and a credit system for administrators who maintain an ESI service. Once ESI starts to scale horizontally, it becomes crucial to let owners post their available hardware in a marketplace and to present lessees with various options for how they want to use the service. A team of Boston University students and Red Hat interns produced a proof of concept for a bare metal marketplace called FLOCX and presented it at DevConf.us 2019. The ESI team is continuing this work.

About the project

The Elastic Secure Infrastructure initiative is an open source project, and we welcome as much community involvement as possible in its development. If you want to be part of the team, learn more about our road map, or just see a demo of the project, you can email the ESI team at esi@lists.massopen.cloud.

To clone the project’s code or submit a feature request, visit our GitHub repository. You can also contact the ESI development team on the OFTC #moc IRC channel.

 
