Red Hat Research Quarterly

Open research clouds get the skills to pay the bills

Tzu-Mainn Chen

Tzu-Mainn Chen is a Principal Software Engineer at Red Hat, currently focusing on bringing Elastic Secure Infrastructure to life for the Massachusetts Open Cloud.

Article featured in

Red Hat Research Quarterly

November 2023

How do you charge for a cloud? Researchers at the New England Research Cloud have developed a stack to make understanding and charging for usage much simpler.

Universities and research institutions are increasingly embracing the cloud as a means to bring down costs and fully utilize the technical resources they have on hand. But creating and maintaining a cloud is not free, which leads to a question that is mundane and unglamorous but still critical: who is paying for all of this? This question is especially important to the New England Research Cloud (NERC), which provides cloud services such as OpenStack and Red Hat OpenShift to multiple member institutions, including Harvard University, the Massachusetts Institute of Technology, and Boston University.

Let’s start with the obvious and seemingly simple answer: the people using the service should be billed for it. But how do we define “use”? After all, no report in OpenShift says, “User A used six pieces of cloud this past week.” Instead, we need to look at lower-level utilization metrics, such as CPU, memory, or storage. Sum that up over a month, and you’ll have a good understanding of the amount of resources a user has consumed.

It’s not quite that easy, however, because resources are not uniform. For example, a server with a GPU is more expensive (and thus more valuable) than one without. In a cloud environment, it’s also important to account for shared usage, with the aforementioned GPU potentially being shared among multiple projects.
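
A minimal sketch of how weighting and sharing might be combined, assuming usage arrives as periodic samples. The sample layout, the weights, and the rule that a shared resource is split evenly among the projects using it are illustrative assumptions, not NERC's actual method.

```python
from collections import defaultdict

# Hypothetical relative weights per resource-hour; real values would be set by the operator.
WEIGHTS = {"cpu": 1.0, "gpu": 10.0}

def weighted_usage(samples, interval_hours):
    """Sum weighted resource-hours per project.

    Each sample is (timestamp, resource, amount, projects), where `projects`
    lists every project using the resource during that interval.
    Shared usage is split evenly among those projects (an assumption).
    """
    totals = defaultdict(float)
    for _timestamp, resource, amount, projects in samples:
        share = amount * interval_hours * WEIGHTS[resource] / len(projects)
        for project in projects:
            totals[project] += share
    return dict(totals)

samples = [
    (0, "cpu", 4, ["alpha"]),          # four cores used by alpha alone
    (0, "gpu", 1, ["alpha", "beta"]),  # one GPU shared by two projects
    (1, "cpu", 2, ["beta"]),           # two cores used by beta alone
]
usage = weighted_usage(samples, interval_hours=1.0)
print(usage)
```

An even split is the simplest sharing rule; an operator could just as well split by measured GPU time per project if that data is available.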

Solutions

Those are some of the difficulties present when considering billing requirements. Fortunately, there are also solutions. OpenShift includes a monitoring stack that stores various utilization metrics in time series form. This stack includes Prometheus, an open source monitoring solution that allows for easy custom queries of the exact metrics we need. NERC uses Python scripts to run those custom queries, generating monthly usage reports to bill consumers.
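
As a rough illustration of what such a script could look like, the sketch below builds a request URL for the Prometheus HTTP API range-query endpoint and turns a response into CPU core-hours per namespace. The host name, the PromQL query, and the helper names are assumptions for illustration, not NERC's actual scripts. The response is hand-written in the shape the API returns so the example runs without a cluster.

```python
from collections import defaultdict
from urllib.parse import urlencode

def build_range_query_url(base_url, query, start, end, step):
    """Build a URL for the Prometheus /api/v1/query_range endpoint."""
    params = urlencode({"query": query, "start": start, "end": end, "step": step})
    return f"{base_url}/api/v1/query_range?{params}"

def cpu_hours_by_namespace(payload, step_seconds):
    """Convert a range-query 'matrix' result into core-hours per namespace.

    Each sample value is treated as the number of cores in use for one step.
    """
    hours = defaultdict(float)
    for series in payload["data"]["result"]:
        namespace = series["metric"].get("namespace", "unknown")
        for _timestamp, value in series["values"]:
            hours[namespace] += float(value) * step_seconds / 3600
    return dict(hours)

# Assumed query: cores in use per namespace, averaged over five-minute windows.
QUERY = "sum by (namespace) (rate(container_cpu_usage_seconds_total[5m]))"
# Hypothetical host; start and end are Unix timestamps, step is in seconds.
url = build_range_query_url(
    "https://prometheus.example.org", QUERY, 1698796800, 1701388800, 3600
)

# Hand-written sample response so the example runs offline.
sample_payload = {
    "status": "success",
    "data": {
        "resultType": "matrix",
        "result": [
            {"metric": {"namespace": "alpha"},
             "values": [[1698796800, "2"], [1698800400, "4"]]},
            {"metric": {"namespace": "beta"},
             "values": [[1698796800, "0.5"]]},
        ],
    },
}
report = cpu_hours_by_namespace(sample_payload, step_seconds=3600)
print(url)
print(report)
```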

NERC takes a different approach with OpenStack, querying for VM creation and deletion events to build a precise view of each project’s resource consumption. Similarly, Loki provides log aggregation for OpenShift, enabling operators to sift through events to create a fine-grained picture of resource usage and availability. This lets operators detect outages, during which users should not incur charges.
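
The event-based approach can be sketched as follows: given a VM's creation and deletion times and a list of outage windows, compute the hours that should be charged within a billing period. The function name, the use of plain hour offsets as timestamps, and the assumption that outage windows do not overlap each other are all illustrative choices.

```python
def billable_hours(created, deleted, outages, period_start, period_end):
    """Hours a VM existed within the billing period, minus outage time.

    All times are hour offsets. `deleted` is None for a VM that is still
    running. `outages` is a list of non-overlapping (start, end) windows.
    """
    start = max(created, period_start)
    end = period_end if deleted is None else min(deleted, period_end)
    if end <= start:
        return 0.0
    hours = end - start
    for outage_start, outage_end in outages:
        # Subtract only the part of the outage that overlaps the VM's lifetime.
        overlap = min(end, outage_end) - max(start, outage_start)
        if overlap > 0:
            hours -= overlap
    return float(hours)

outages = [(250, 260), (290, 310)]
# VM existed from hour 100 to hour 300 of a 720-hour billing period.
finished_vm = billable_hours(100, 300, outages, 0, 720)
# VM created at hour 700 and still running at the end of the period.
running_vm = billable_hours(700, None, outages, 0, 720)
print(finished_vm, running_vm)
```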

Custom platforms under development can also be built with billing requirements in mind. For example, NERC is in the process of deploying a bare metal cloud using ESI, which allows projects to lease bare metal nodes. These leases are modeled so that creating usage reports is as simple as running a single command.

Visibility for users

Generating usage reports is only half of the picture. The other half is breaking this information down for users so they can make informed choices about their usage. In NERC, this starts with ColdFront, an open source resource allocation management system that uses a plugin architecture for easy integration with the APIs provided by OpenShift and OpenStack. Through ColdFront, users can view their quotas and request additional resources.

To ease a user’s understanding of what they are requesting, NERC uses the concept of services: a collection of resources such as CPUs, storage, and external addresses that are bundled and billed together as a unit. Examples include OpenStack CPU, OpenStack GPUA100, and OpenStack GPUV100, the use of which each entitles a user to a different set of OpenStack resources.
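
One way to picture a service is as a small record that ties a bundle of resources to a single rate. In the sketch below, the service names come from the examples above, but the bundle contents, the rates, and the helper functions are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Service:
    name: str
    resources: dict     # resource name -> quantity included per unit of service
    hourly_rate: float  # price per unit-hour (hypothetical)

# Bundle contents and rates are made up for this example.
CATALOG = {
    "OpenStack CPU": Service("OpenStack CPU", {"vcpus": 1, "ram_gib": 4}, 0.01),
    "OpenStack GPUA100": Service(
        "OpenStack GPUA100", {"vcpus": 24, "ram_gib": 96, "gpus": 1}, 2.00
    ),
}

def quota_for(allocation):
    """Expand {service name: units} into total resource quotas."""
    quota = {}
    for name, units in allocation.items():
        for resource, quantity in CATALOG[name].resources.items():
            quota[resource] = quota.get(resource, 0) + quantity * units
    return quota

def charge_for(usage_hours):
    """Price {service name: unit-hours} using each service's single rate."""
    total = sum(CATALOG[name].hourly_rate * hours for name, hours in usage_hours.items())
    return round(total, 2)

quota = quota_for({"OpenStack CPU": 8, "OpenStack GPUA100": 1})
charge = charge_for({"OpenStack CPU": 100, "OpenStack GPUA100": 10})
print(quota, charge)
```

Billing the bundle as a unit means a user only has to reason about one number per service rather than a separate rate for every underlying resource.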

It’s also important to display resource consumption data to the user. NERC’s tool of choice here is Open XDMoD, an open source tool used by NERC in the past and thus familiar to users. XDMoD contains native support for OpenStack; however, integrating with OpenShift requires a bit more finesse, involving a custom script that queries Prometheus to generate an XDMoD-compatible log file. In the long term, NERC may shift to native OpenShift solutions. OpenShift already provides dashboards that allow users to query Prometheus to view the vast number of available metrics easily. For storage reasons, these metrics are only provided in the short term, but a new OpenShift operator called Curator solves this problem with the help of the Koku Metrics operator, which condenses the raw data to a resolution more suitable for long-term reports.
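
The idea of condensing raw data to a coarser resolution can be illustrated with a simple averaging step. This is a generic sketch of downsampling, not a description of how Curator or the Koku Metrics operator work internally; the function name and the sample data are assumptions.

```python
from collections import defaultdict

def downsample(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        # Align each sample to the start of the bucket it falls in.
        buckets[timestamp - timestamp % bucket_seconds].append(value)
    return [
        (start, sum(values) / len(values))
        for start, values in sorted(buckets.items())
    ]

# Samples every 30 minutes, condensed to hourly averages.
raw = [(0, 1.0), (1800, 3.0), (3600, 2.0), (5400, 4.0), (7200, 5.0)]
hourly = downsample(raw, bucket_seconds=3600)
print(hourly)
```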

The final piece of the puzzle is billing: sending out a periodic invoice to users of NERC so they know how much to pay. Currently, NERC uses a custom script to create spreadsheets that are adjusted into invoices before being emailed to each user. This isn’t sustainable in the long term, so NERC will be moving to a new tool to automate much of this workflow. This tool will have a broad set of requirements. The most obvious one is invoicing, but nearly as important is a web dashboard that gathers usage data across all NERC offerings for presentation to the user. The tool also aims to accommodate the notion of credits for new users or contributors.
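
To make the credits requirement concrete, here is a small sketch of applying a credit balance to an invoice. The field names, the amounts, and the rule that credits are applied before any charge is due are assumptions; amounts are kept in whole cents to avoid floating-point rounding.

```python
def build_invoice(project, line_items, credit_cents):
    """Build an invoice from (description, amount in cents) line items.

    Credits are applied up to the subtotal; any unused credit carries over.
    """
    subtotal = sum(amount for _description, amount in line_items)
    applied = min(subtotal, credit_cents)
    return {
        "project": project,
        "line_items": list(line_items),
        "subtotal_cents": subtotal,
        "credit_applied_cents": applied,
        "total_due_cents": subtotal - applied,
        "credit_remaining_cents": credit_cents - applied,
    }

# Hypothetical charges for one month, with a 50.00 credit for a new user.
invoice = build_invoice(
    "alpha",
    [("OpenStack CPU", 4000), ("OpenStack GPUA100", 2550)],
    credit_cents=5000,
)
print(invoice["total_due_cents"])
```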

Continuing development

This view into NERC’s infrastructure highlights the challenges and solutions involved in the seemingly innocuous billing requirement. And there will be more as NERC continues to grow its capabilities. The aforementioned ESI bare metal cloud is being readied; further down the road, NERC may make new types of resources available in the cloud, such as FPGAs or infrastructure suitable for the wireless edge.

There’s also the question of infrastructure where metrics are not collected, for reasons ranging from policy (a requirement not to collect data identifying students) to capability (disconnected infrastructure). All of these will require NERC to evaluate and update its billing practices so that usage charges match its maintenance costs, while also working with engineers to ensure that the data exists to charge users accurately.

A vast number of moving pieces go into this billing stack, but with that complexity comes the opportunity to work on any or all of the technology involved and potentially break new ground in metering and billing for the entire cloud industry. I’ve included links throughout this article so you can investigate further if you’re curious and potentially get involved if you feel the call!
