Red Hat Research Quarterly

Scaling the PEAKS of sustainability with insights from Kepler and machine learning


About the author

Han Dong

Han Dong is a postdoc in the Computer Science department at Boston University. His research interests lie in distributed systems, high-performance computing, and operating systems. He is interested in research addressing the growing energy needs of our modern systems.

About the author

Parul Singh

Parul Singh is a software engineer in the Office of the CTO at Red Hat. She is currently leading Red Hat’s effort on scaling ChRIS using OpenShift. Her research lies in AI, ML, NLP, and big data.


A proposed Kubernetes scheduler plugin aims to introduce energy efficiency as a factor in dynamic scheduling while still meeting performance requirements.

Businesses in many sectors are setting aggressive sustainability goals, from transitioning to renewable energy sources to reducing existing consumption. Nowhere is the pressure to meet these goals more urgent than in the technology sector, where datacenters running AI workloads consume rapidly increasing amounts of electricity. Making those goals achievable requires greater visibility into energy usage and an intelligent way to adapt dynamically in response to changing requirements.

That’s the idea behind the Power Efficiency Aware Kubernetes Scheduler (PEAKS), a collaborative project involving engineers and researchers from Boston University, Red Hat, and IBM. PEAKS is a scheduler that can factor sustainability goals, such as power utilization, into its decisions. It uses observability metrics to place tasks on a Kubernetes or OpenShift cluster more efficiently. Consider, for example, an AI training job: it may run for days to train a model, or even continuously, retraining the model whenever new data arrives. PEAKS can find the nodes best suited for a particular workload, meeting the application's performance goals while reducing overall energy consumption.

PEAKS origins

PEAKS uses insights from projects based in both industrial and academic research. The first is Kepler (Kubernetes-based Efficient Power Level Exporter), an open source project founded by Red Hat’s Office of the CTO, with early contributions from IBM Research and Intel. Kepler offers a way to estimate power consumption at the process, container, and Kubernetes pod levels. The second set of insights comes from a constellation of projects based at the Red Hat Collaboratory at Boston University that apply machine learning to Linux kernel configurations to optimize the tradeoff between performance and energy, such as Automatic Configuration of Complex Hardware and Discovering Opportunities for Optimizing OpenShift Energy Consumption.

Kepler

Kepler uses a BPF program attached in the kernel's scheduling path to extract per-process resource utilization metrics. Kepler also collects real-time power consumption metrics from node components through various APIs or, when no real-time power metrics are available on the system, through regression-based trained power models.
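Once Kepler's exporter is scraped by Prometheus, the per-container energy counters can be read back over Prometheus's standard HTTP API. The sketch below is purely illustrative: it assumes a Prometheus instance reachable at localhost:9090 and a counter named kepler_container_joules_total with a container_name label; check the metric names exposed by your Kepler version.

```python
# Illustrative sketch: query Kepler's per-container energy counters from Prometheus.
# Assumes Prometheus at localhost:9090 and a kepler_container_joules_total counter
# with a container_name label (verify against your Kepler version's exported metrics).
import requests

PROM_URL = "http://localhost:9090/api/v1/query"

# Energy (in joules) consumed per container over the last 5 minutes.
query = 'sum by (container_name) (increase(kepler_container_joules_total[5m]))'

resp = requests.get(PROM_URL, params={"query": query}, timeout=10)
resp.raise_for_status()

for series in resp.json()["data"]["result"]:
    container = series["metric"].get("container_name", "<unknown>")
    joules = float(series["value"][1])
    print(f"{container}: {joules:.1f} J over the last 5 minutes")
```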

Once all the data are collected, Kepler can calculate the energy consumed by each process. It does this by apportioning the power used by a given resource according to the ratio of that process's resource utilization to the system's total utilization. Kepler then aggregates the per-process energy consumption into totals for containers and Kubernetes pods. The data collected and estimated for the containers are stored as time-series data in Prometheus.
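In rough pseudocode terms, the dynamic portion of a resource's power is split across processes in proportion to each process's share of that resource's utilization. The sketch below illustrates the idea with made-up numbers; it is not Kepler's actual implementation.

```python
# Illustrative sketch of ratio-based attribution (not Kepler's actual code).
# A resource's dynamic power is split across processes in proportion to
# each process's share of that resource's total utilization.

def attribute_dynamic_power(total_dynamic_watts, per_process_utilization):
    """per_process_utilization: e.g., CPU time deltas per PID over the sample window."""
    total_utilization = sum(per_process_utilization.values())
    if total_utilization == 0:
        return {pid: 0.0 for pid in per_process_utilization}
    return {
        pid: total_dynamic_watts * (used / total_utilization)
        for pid, used in per_process_utilization.items()
    }

# Example: 40 W of dynamic CPU power split across three processes.
print(attribute_dynamic_power(40.0, {"pid-101": 300, "pid-202": 100, "pid-303": 100}))
# -> {'pid-101': 24.0, 'pid-202': 8.0, 'pid-303': 8.0}
```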

Determining per-container energy consumption in a Kubernetes or OpenShift environment is a more difficult problem than it might seem at first glance. Energy cannot be apportioned based solely on the computation performed on a given node. Many open source projects that report energy consumption give you only the dynamic energy consumption, that is, the energy consumed by running a computation. Kepler follows the Greenhouse Gas (GHG) Protocol, which also requires accounting for idle energy: the energy used to maintain the nodes, such as cooling or simply keeping them powered on. We do this by measuring the idle energy of the whole node and distributing it evenly over all the containers. This is not strictly accurate, but it provides a consistent basis using the best-case scenario.
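Putting the two ideas together, a container's total attributed energy would be its utilization-weighted share of dynamic energy plus an equal slice of the node's idle energy. Again, this is a simplified sketch rather than Kepler's implementation.

```python
# Simplified sketch (not Kepler's implementation): a container's total energy is
# its utilization-weighted share of dynamic energy plus an equal share of the
# node's idle energy.

def container_energy(node_idle_joules, node_dynamic_joules, utilization_by_container):
    total_util = sum(utilization_by_container.values()) or 1
    idle_share = node_idle_joules / len(utilization_by_container)
    return {
        name: idle_share + node_dynamic_joules * (util / total_util)
        for name, util in utilization_by_container.items()
    }

# Example: 100 J idle, 200 J dynamic, two containers with a 3:1 utilization split.
print(container_energy(100.0, 200.0, {"web": 75, "batch": 25}))
# -> {'web': 200.0, 'batch': 100.0}
```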

Kepler also addresses the challenge of calculating energy consumption in a virtualized environment. It’s relatively easy to calculate the energy consumption of processes or containers running on bare metal, where there’s no abstraction. But in a virtualized and cloud environment such as AWS, you don’t know how many VMs are running on a host. That data may not even be available to you because there could be multiple isolated tenants running their VMs on a particular physical machine.

In the multitenant scenario, cloud providers don’t generally provide a solution. You must trust that AWS or Google report your total energy consumption not at the process level, but at the tenant level. In other words, they should report the total consumption of the server or servers hosting your processes, or the total consumption of a cluster that provides all your virtual nodes. There are certain hypervisors that can provide this data from their host machines, but most do not. If you’re running your workloads on the public cloud, most of the time you won’t have this information. 

Kepler tackles this problem by using power models for various CPU architectures. We created these models by running benchmarks on bare-metal machines spanning those architectures. Kepler uses them to estimate the energy consumption of a virtual machine on a specific host based on the benchmarks for machines with the same CPU architecture. (For greater detail, read “Exploring Kepler’s potentials: unveiling cloud application power consumption” on the Cloud Native Computing Foundation blog.)
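The sketch below illustrates the general idea behind such a model: fit a regression from resource utilization counters to measured power on bare metal, then apply the fitted model where no power meter is available. The features, data, and model choice here are hypothetical placeholders, not Kepler's actual trained models.

```python
# Hypothetical sketch of a regression-based power model (not Kepler's actual models):
# fit utilization -> power on a bare-metal machine with a power meter, then reuse
# the fitted model to estimate power where no meter is available (e.g., in a VM).
import numpy as np
from sklearn.linear_model import LinearRegression

# Training data gathered while benchmarking bare metal: CPU utilization (%),
# memory bandwidth (GB/s), and the measured package power (W).
features = np.array([
    [10, 2.0], [30, 5.0], [50, 8.0], [70, 12.0], [90, 15.0],
])
measured_watts = np.array([35.0, 55.0, 75.0, 100.0, 120.0])

model = LinearRegression().fit(features, measured_watts)

# Estimate power for a VM on a host with the same CPU architecture,
# using only the utilization metrics observable for the guest.
vm_utilization = np.array([[60, 10.0]])
print(f"Estimated VM power: {model.predict(vm_utilization)[0]:.1f} W")
```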

A motivational demonstration of applying dynamic tuning to pod scheduling in Kubernetes. We use the Bayesian optimization library provided by Ax.dev to automatically scale the number of pod replicas as well as the node the pods will be deployed on. The workload is the Hotel Reservation benchmark from DeathStarBench. Over time, this online learning approach seeks to improve both the P99 latency and power consumption to support user requests.

Dynamic tuning 

Policies in the Linux kernel can enable it to run in a more energy-efficient manner. With literally thousands of tunable parameters, there are many options for pushing the energy efficiency of Linux, especially in the way it runs workloads. By replacing some of the default Linux hardware policies with policies specifically tuned for energy efficiency, we can save a lot of energy. However, we must also ensure that the policy balances energy efficiency with performance for running applications. Service Level Objectives (SLOs) specify these required performance bounds for a system. For example, an SLO could require that 99% of incoming requests to the kernel are satisfied within one millisecond, which means we can exploit variations within that one-millisecond range to maximize energy savings. As these requests come in, a properly tuned system can respond to them in a way that satisfies the one-millisecond SLO while also saving energy.

To accomplish this, we use a machine learning algorithm, in this case Bayesian optimization. The algorithm learns the behavior of the running application and automatically tunes relevant kernel parameters to move the system toward a more energy-efficient operating point. In testing with an assumed SLO of 99% of responses below 500 µs, we achieved energy savings of up to 50% compared to untuned default Linux. Even at a more stringent SLO of 99% of responses below 200 µs, Bayesian optimization adapted to the tighter performance requirement with up to 30% energy savings.
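As a rough illustration of the approach, the sketch below uses Ax's Service API (ax.dev) to search over a couple of kernel-level knobs while penalizing configurations that violate a latency SLO. The tunable names, bounds, and the run_workload measurement hook are hypothetical placeholders, and exact Ax API details may vary between versions; this is a sketch of the technique, not the project's tuning harness.

```python
# Rough sketch of SLO-aware Bayesian optimization with Ax (ax.dev).
# The tunables, bounds, and run_workload() hook are placeholders; exact Ax API
# details may differ slightly between versions.
from ax.service.ax_client import AxClient

def run_workload(params):
    """Apply the candidate configuration, drive load, and return
    (p99_latency_us, energy_joules). Placeholder for a real harness."""
    raise NotImplementedError

SLO_P99_US = 500  # 99% of requests must finish within 500 microseconds

ax_client = AxClient()
ax_client.create_experiment(
    name="energy_tuning",
    parameters=[
        {"name": "cpu_freq_mhz", "type": "range", "bounds": [800, 3000]},
        {"name": "interrupt_delay_us", "type": "range", "bounds": [0, 200]},
    ],
    objective_name="cost",
    minimize=True,
)

for _ in range(30):
    params, trial_index = ax_client.get_next_trial()
    p99_us, joules = run_workload(params)
    # Penalize configurations that violate the latency SLO so the optimizer
    # learns to save energy only within the allowed performance envelope.
    cost = joules if p99_us <= SLO_P99_US else joules + 1_000.0
    ax_client.complete_trial(trial_index=trial_index, raw_data=cost)

print(ax_client.get_best_parameters())
```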

PEAKS

PEAKS is applying the same type of tuning techniques to scheduling. There are two main areas in scheduling to consider: pod placement and autoscaling.

The result of the experiment: every dot in the figure is a single deployment configuration of the pods. The online learning approach found a solution that improves on the default behavior (proportional scaling) by a factor of 5 in performance while saving 14% in power.

Pod placement involves selecting the physical node where a particular pod will run. Not all applications require the latest and greatest hardware. In some cases, it might even be better to utilize older hardware due to its existing embodied carbon costs (the greenhouse gas emissions associated with the upstream stages of a product’s life). Different generations of hardware (e.g., CPUs, GPUs) can have different Thermal Design Power (TDP) ratings due to their process node technologies and other architectural characteristics. This creates an interesting opportunity for tuning to maintain performance SLOs while gradually migrating functionality to older-generation hardware with drastically lower TDPs than the most recent versions.
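Conceptually, a power-aware placement decision can be expressed as a scoring function over candidate nodes, similar in spirit to a Score plugin in the Kubernetes scheduler framework. The sketch below is a simplified, hypothetical illustration in Python (a real plugin would be written in Go against the scheduler framework); the node attributes and weights are made up.

```python
# Hypothetical, simplified illustration of power-aware node scoring (a real
# Kubernetes scheduler plugin would implement this as a Score plugin in Go).
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    predicted_extra_watts: float  # estimated power increase if the pod lands here
    expected_p99_ms: float        # estimated tail latency for the pod on this node

def score(node, slo_p99_ms, max_watts=500.0):
    # Rule out nodes that would violate the performance SLO outright.
    if node.expected_p99_ms > slo_p99_ms:
        return 0
    # Higher score for nodes with lower marginal power draw, normalized to the
    # 0-100 range used by scheduler Score plugins.
    return int(100 * (1.0 - min(node.predicted_extra_watts, max_watts) / max_watts))

nodes = [
    Node("old-gen-node", predicted_extra_watts=60.0, expected_p99_ms=9.0),
    Node("new-gen-node", predicted_extra_watts=150.0, expected_p99_ms=4.0),
]
best = max(nodes, key=lambda n: score(n, slo_p99_ms=10.0))
print(f"Place pod on {best.name}")  # the older node wins as long as it meets the SLO
```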

Autoscaling involves determining the number of necessary pods to run as application needs change over time. The total number of pods can scale dynamically to meet SLOs as the workload changes. We can also leverage this information to improve energy efficiency. For example, if the load on datacenter servers follows a diurnal pattern, we may take advantage of this pattern to reduce energy consumption during periods of light load while still meeting performance goals.
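For reference, Kubernetes's Horizontal Pod Autoscaler scales on a metric ratio, desiredReplicas = ceil(currentReplicas × currentMetricValue / targetMetricValue); a power-aware variant might fold an energy signal into that decision. The sketch below is a hypothetical illustration of that idea, not the PEAKS algorithm, and the power-budget heuristic is made up.

```python
# Hypothetical sketch of folding a power signal into a replica-count decision.
# The standard HPA rule is: desired = ceil(current * currentMetric / targetMetric).
# This is illustrative only, not the PEAKS algorithm.
import math

def desired_replicas(current_replicas, current_p99_ms, target_p99_ms,
                     watts_per_replica, power_budget_watts):
    # Scale on the latency signal, as the HPA would on any metric.
    desired = math.ceil(current_replicas * current_p99_ms / target_p99_ms)
    # Cap the scale-out at a power budget, e.g., during periods of light load.
    max_by_power = max(1, int(power_budget_watts // watts_per_replica))
    return min(desired, max_by_power)

# Example: latency is 1.5x the target, but the power budget caps us at 5 replicas.
print(desired_replicas(4, current_p99_ms=15.0, target_p99_ms=10.0,
                       watts_per_replica=80.0, power_budget_watts=400.0))  # -> 5
```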


To explore tuning for these scheduling aspects, we take inspiration from the previous kernel tuning work to decompose the above scheduling problems into a set of machine learning problems, using popular techniques including Bayesian Optimization, Bandit Optimization, Gradient Descent, and Simulated Annealing.

Future milestones and challenges

One of the biggest challenges we have had in the Kepler project is validating its metrics. This is a general problem in AI when using models: how can you stand by the accuracy of your estimations? That’s why we are running various benchmarks to validate the metrics on Kepler. If we know the metrics are off by a certain acceptable amount, we report that. One option we are exploring is using the power distribution unit (PDU) data from the Massachusetts Green High Performance Computing Center (MGHPCC) to verify whether the values measured by Kepler are comparable to measured power usage at the rack level. 

In terms of kernel tuning, the next level to explore is how we can further tune the energy of jobs based on a varying abundance of green energy over time or location, or on the basis of cooling requirements. For example, we are investigating how kernel tuning could reduce carbon emissions from a refrigeration cooling system while also considering SLOs. Running jobs at night, when cooling needs are lower than during the day, might be an option for applications such as database backups, which must run daily but allow flexible scheduling. 

You can learn more about PEAKS as a proposed Kubernetes scheduler plugin in the GitHub Kubernetes Enhancement Proposals (KEP) repository and in Han Dong’s PEAKS presentation at DevConf.US.
