Red Hat Research Quarterly

Unleashing the potential of Function as a Service in the cloud continuum

about the author

Luis Tomás Bolivar 

Luis Tomás Bolivar is a Principal Software Engineer at Red Hat, Spain. He is currently working in the Ecosystem Engineering group and on research activities with a focus on cloud computing in general, and automation, networking, and AI in particular. He holds a PhD in Computer Science (University of Castilla-La Mancha, Spain) and has been involved in several EU projects. 

about the author

José Castillo Lema

José Castillo Lema is a Senior Software Engineer at Red Hat working with the Telco 5G performance/scale team. During his MSc and PhD studies, he worked on QoS routing in SDN and on NFV management and orchestration. He has been teaching postgraduate courses for the last six years.

The PHYSICS project demonstrates the value of the FaaS paradigm for application development and data analysis. Here’s how we enhanced the infrastructure layer.

The difficulty of scaling, optimizing, and maintaining infrastructure makes cloud computing too complex or resource-intensive for many developers and data scientists. The Function-as-a-Service (FaaS) model, often referred to generically as serverless computing, allows users to run certain types of applications in a modern, scalable, and cost-effective way without the added complexity of maintaining their own infrastructure. The PHYSICS project (oPtimized Hybrid Space-Time ServIce Continuum in faaS) aims to unlock the potential of the FaaS paradigm for cloud service providers and application developers in a cloud-agnostic way.

PHYSICS brought together a consortium of 14 international partners, including use case leaders in e-health, smart agriculture, and smart manufacturing, supported by a €5 million Horizon 2020 grant from the European Commission. Engineers from Red Hat took a lead role in the infrastructure layer, adapting and enhancing tools in the Kubernetes ecosystem for scaling, energy awareness, and multicluster automation, including automatic cluster onboarding and the configuration of PHYSICS components on top of newly added clusters.

This article provides a brief overview of the PHYSICS project and initial milestones before detailing our most recent work on the infrastructure layer of the project.

PHYSICS 101

PHYSICS facilitates the design, implementation, and deployment of advanced FaaS applications, using new functional flow programming tools that harness established design patterns and existing cloud/FaaS component libraries. One of the key outcomes of the project is a novel Global Continuum Layer (distinct from the infrastructure layer) to facilitate efficient function deployment across diverse clusters. 

The Global Continuum Layer is a set of PHYSICS components that simultaneously optimize key application objectives such as performance, latency, and cost. Use cases developed with industry partners include a smart manufacturing system to optimize production pipelines, healthcare software using machine learning (ML) models to monitor the health of individuals and analyze anonymized collected data, and a solution for near real-time greenhouse management that is responsive to dynamic conditions.

In our midpoint progress report, we described how visual flow programming can enhance function development and how ready-made patterns can support abstract function design. We also presented the Load Generator Metrics tool, which was developed to evaluate the performance of the different functions so that other PHYSICS components (such as the global orchestrator or the colocation engine) could make optimized decisions about function placement across and inside clusters. (For a more detailed view, see the research day recording.)

Advanced PHYSICS

In the second half of the project, we identified opportunities for extensions and additions in the infrastructure layer. Our main working areas were threefold:

Multicluster support

Onboarding new clusters is an important challenge addressed by PHYSICS. Even with Open Cluster Management (OCM), extra PHYSICS components need to be installed and configured when adding a new cluster. Beyond deploying the new services, you need to:

  • Connect them to the relevant PHYSICS components in the hub
  • Create a benchmarking load in the added cluster to gather base performance and energy consumption data

When onboarding clusters into OCM, central and remote cluster configuration is crucial. Before deploying functions on a new cluster, the first step is to configure it properly and obtain enough information (semantics) about it to make wise placement decisions. This involves connecting PHYSICS components in the added clusters with other PHYSICS components and applications at the hub.

The Open Cluster Management API and components serve as the foundation, adding various Kubernetes objects to the OCM ManifestWork, such as pods, services, and ServiceExports (for Submariner support). This also includes specific Custom Resource Definitions (CRDs), for example, the Workflow CRD introduced by PHYSICS to abstract the information related to the functions so that they can be deployed on the relevant FaaS engine (in our case, either OpenWhisk or Knative).
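
To make this concrete, here is a minimal sketch (not the project's actual manifests) of how a hub-side component could wrap a deployment and a Submariner ServiceExport in an OCM ManifestWork using the Kubernetes Python client. The namespaces, names, and image below are hypothetical placeholders.

# Hedged sketch: create an OCM ManifestWork on the hub so the Klusterlet agent
# applies the wrapped resources in a managed cluster. Names, namespaces, and
# the image are placeholders, not the actual PHYSICS manifests.
from kubernetes import client, config

config.load_kube_config()  # hub cluster kubeconfig
api = client.CustomObjectsApi()

manifest_work = {
    "apiVersion": "work.open-cluster-management.io/v1",
    "kind": "ManifestWork",
    # ManifestWorks live in the hub namespace named after the managed cluster
    "metadata": {"name": "physics-semantic", "namespace": "cluster1"},
    "spec": {
        "workload": {
            "manifests": [
                {   # Deployment for a (hypothetical) semantic component
                    "apiVersion": "apps/v1",
                    "kind": "Deployment",
                    "metadata": {"name": "semantic", "namespace": "physics"},
                    "spec": {
                        "replicas": 1,
                        "selector": {"matchLabels": {"app": "semantic"}},
                        "template": {
                            "metadata": {"labels": {"app": "semantic"}},
                            "spec": {"containers": [{
                                "name": "semantic",
                                "image": "quay.io/example/semantic:latest",  # placeholder image
                            }]},
                        },
                    },
                },
                {   # ServiceExport so Submariner exposes the service across clusters
                    # (the matching Service manifest is omitted for brevity)
                    "apiVersion": "multicluster.x-k8s.io/v1alpha1",
                    "kind": "ServiceExport",
                    "metadata": {"name": "semantic", "namespace": "physics"},
                },
            ]
        }
    },
}

api.create_namespaced_custom_object(
    group="work.open-cluster-management.io", version="v1",
    namespace="cluster1", plural="manifestworks", body=manifest_work,
)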

Overview of cluster onboarding components and interactions

PHYSICS uses Knative capabilities for event-driven and serverless operations, which optimizes resource usage. When a remote cluster is added to the hub via OCM, a ManagedCluster object is created. A Knative APIServerSource then invokes the Knative Serverless Service upon receiving the event. The cluster onboarding pod, running as a Knative Serverless Service, processes the request, configures the edge cluster, and generates a benchmarking load. Specifically, the cluster onboarding pod:

  • Obtains the cluster name.
  • Creates an OCM ManifestWork, which includes the definition of the semantic deployment and its associated service. The Klusterlet agent in the remote cluster is in charge of creating the local resources in that cluster.
  • Waits until the deployment is ready and obtains its service IP by using the OCM feedbackRule.
  • Creates another OCM ManifestWork, which includes a Kubernetes Job that will generate some benchmarking load in the managed cluster. Again, the Klusterlet agent creates the Kubernetes Job in the remote cluster.
  • Waits until the job is completed, again using the OCM feedbackRule (see the sketch after this list).
  • Calls the semantic service endpoint, leveraging Submariner to reach the semantic service IP. This returns information about the benchmarking job just executed, providing the energy consumption and performance metrics used to score the cluster.
  • Calls the reasoning framework to provide the semantic service IP, so it can start requesting semantic information from the new cluster.
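
As a hedged illustration of the feedbackRule mechanism, the following sketch (using the Kubernetes Python client) shows how onboarding logic could surface the benchmarking Job's status on the hub and poll it until completion. The Job, ManifestWork, and namespace names are hypothetical.

# Hedged sketch: use an OCM feedbackRule to report the benchmarking Job's
# .status.succeeded field back to the hub, then poll the ManifestWork status.
# All names (Job, ManifestWork, namespaces) are hypothetical placeholders.
import time
from kubernetes import client, config

config.load_kube_config()  # hub cluster kubeconfig
api = client.CustomObjectsApi()

# Fragment set as spec.manifestConfigs on the ManifestWork that wraps the Job,
# asking the work agent to feed .status.succeeded back to the hub:
MANIFEST_CONFIGS = [{
    "resourceIdentifier": {
        "group": "batch", "resource": "jobs",
        "namespace": "physics", "name": "benchmark-load",
    },
    "feedbackRules": [{
        "type": "JSONPaths",
        "jsonPaths": [{"name": "succeeded", "path": ".status.succeeded"}],
    }],
}]

def job_succeeded(work_name: str, cluster_ns: str) -> bool:
    """Check the feedback values the work agent reports back on the hub-side ManifestWork."""
    work = api.get_namespaced_custom_object(
        group="work.open-cluster-management.io", version="v1",
        namespace=cluster_ns, plural="manifestworks", name=work_name,
    )
    manifests = work.get("status", {}).get("resourceStatus", {}).get("manifests", [])
    for manifest in manifests:
        for value in manifest.get("statusFeedbacks", {}).get("values", []):
            if value["name"] == "succeeded" and value.get("fieldValue", {}).get("integer", 0) >= 1:
                return True
    return False

while not job_succeeded("physics-benchmark", "cluster1"):
    time.sleep(10)  # the benchmarking Job has not completed yet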

Extra configurations can be added easily as part of the cluster onboarding logic component or even created as extra Knative Serverless Services that react to the same events and perform other actions in parallel.
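
The APIServerSource-to-Knative-Service wiring described above might look roughly like the following sketch, applied here with the Kubernetes Python client. The namespace, service account, and service names are assumptions for illustration.

# Hedged sketch: a Knative APIServerSource that watches ManagedCluster objects
# and sinks the resulting events into a Knative Service (the onboarding logic).
# Namespace, service account, and service names are hypothetical placeholders.
from kubernetes import client, config

config.load_kube_config()  # hub cluster kubeconfig
api = client.CustomObjectsApi()

api_server_source = {
    "apiVersion": "sources.knative.dev/v1",
    "kind": "APIServerSource",
    "metadata": {"name": "managedcluster-events", "namespace": "physics"},
    "spec": {
        "serviceAccountName": "onboarding-sa",  # needs RBAC to watch ManagedClusters
        "mode": "Resource",  # send the full object in the event payload
        "resources": [{
            "apiVersion": "cluster.open-cluster-management.io/v1",
            "kind": "ManagedCluster",
        }],
        "sink": {"ref": {
            "apiVersion": "serving.knative.dev/v1",
            "kind": "Service",
            "name": "cluster-onboarding",  # the cluster onboarding Knative Service
        }},
    },
}

api.create_namespaced_custom_object(
    group="sources.knative.dev", version="v1",
    namespace="physics", plural="apiserversources", body=api_server_source,
)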

Kepler integration allowed us to estimate energy scores for new clusters, enhancing our understanding of energy consumption.

Energy awareness

Kepler offers accurate energy estimates and detailed reporting of power consumption. It harnesses an extended Berkeley Packet Filter (eBPF) approach to attribute power consumption to specific processes, containers, and Kubernetes pods, running custom code in the Linux kernel to obtain the metrics that fuel the ML models estimating energy consumption.

PHYSICS selected the Kepler project to acquire energy-related information crucial for its components, including the scheduler. We integrated Kepler by incorporating its metrics (via Prometheus, an open source monitoring toolkit and one of the earliest Cloud Native Computing Foundation projects) into cluster onboarding and the PHYSICS semantic component. This integration allowed us to estimate energy scores for new clusters, enhancing our understanding of energy consumption.
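
As a hedged illustration of this kind of integration, a component could derive a simple energy score for a newly onboarded cluster by querying Kepler's counters through the Prometheus HTTP API, along the lines of the sketch below. The Prometheus URL and the exact metric name are assumptions and may differ across deployments and Kepler versions.

# Hedged sketch: score a newly onboarded cluster by summing Kepler's per-container
# energy counters over the benchmarking window, via the Prometheus HTTP API.
# The Prometheus endpoint and metric name are assumptions for illustration.
import requests

PROM_URL = "http://prometheus.monitoring.svc:9090"  # hypothetical in-cluster endpoint

def energy_joules(window: str = "10m") -> float:
    """Total energy (joules) attributed to containers over the given window."""
    query = f"sum(increase(kepler_container_joules_total[{window}]))"
    resp = requests.get(f"{PROM_URL}/api/v1/query", params={"query": query}, timeout=10)
    resp.raise_for_status()
    result = resp.json()["data"]["result"]
    return float(result[0]["value"][1]) if result else 0.0

print(f"Benchmark energy score: {energy_joules():.1f} J")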

Within the scope of the PHYSICS project, we actively engaged in Kepler’s upstream development. Our collaboration involved:

  • Identifying and resolving critical issues hindering use in nested environments, i.e., when operating on top of virtual machines in public cloud providers such as AWS or Azure
  • Making Kepler suitable for FaaS use cases by enabling a higher frequency sampling rate

In addition, a significant facet of our involvement was assessing the accuracy of Kepler’s ML model. We gauged the model’s performance by comparing estimated metrics for power usage per node obtained through Kepler with real metrics gathered on Grid5000, helping to ensure the reliability of energy consumption estimates.
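
As a simple illustration of that kind of comparison (not the project's actual evaluation code), the error between Kepler's per-node power estimates and the reference measurements could be computed as follows; the sample values are made up.

# Hedged sketch: compare Kepler's estimated node power against reference
# measurements (e.g., wattmeter readings such as those available on Grid5000)
# using mean absolute percentage error. The sample values are fabricated.
def mape(estimated: list[float], measured: list[float]) -> float:
    """Mean absolute percentage error between two aligned series."""
    errors = [abs(e - m) / m for e, m in zip(estimated, measured) if m != 0]
    return 100.0 * sum(errors) / len(errors)

kepler_watts   = [95.2, 110.4, 102.8, 98.1]    # estimated node power (W)
measured_watts = [100.0, 115.0, 105.0, 101.0]  # reference node power (W)

print(f"MAPE: {mape(kepler_watts, measured_watts):.1f}%")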

Autoscaling workloads

KEDA (Kubernetes Event-driven Autoscaling) provides event-driven autoscaling for workloads in Kubernetes clusters. With KEDA, container scaling is based on the number of events to be processed rather than on CPU or memory thresholds. In addition, it is lightweight and fully integrated with Kubernetes through CRDs (as with every PHYSICS component). KEDA works alongside standard Kubernetes scaling components such as the Horizontal Pod Autoscaler (HPA), providing a straightforward way of extending their functionality without overwriting or duplication. It also allows managing different types of workloads, such as deployments, jobs, and even custom resources, and supports scaling down to zero (a plus for energy savings).

Mapping between PHYSICS elastic controller and KEDA components

We evaluated the suitability of the scalers already present in the upstream catalog and identified a few options that we implemented and enhanced for PHYSICS/FaaS purposes. Two in particular proved suitable for optimizing function wait time and function performance, and for scaling the cluster's nodes when needed.
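
As an illustration of the mechanism rather than the exact PHYSICS scalers, a Prometheus-based ScaledObject along these lines could scale a function runtime deployment on a queue-depth-style metric while allowing scale-to-zero. The deployment name, metric, and threshold are assumptions.

# Hedged sketch: a KEDA ScaledObject that scales a (hypothetical) function
# runtime deployment on a Prometheus metric, with scale-to-zero enabled.
# Deployment name, metric, and threshold are placeholder assumptions.
from kubernetes import client, config

config.load_kube_config()
api = client.CustomObjectsApi()

scaled_object = {
    "apiVersion": "keda.sh/v1alpha1",
    "kind": "ScaledObject",
    "metadata": {"name": "function-runtime-scaler", "namespace": "physics"},
    "spec": {
        "scaleTargetRef": {"name": "function-runtime"},  # Deployment to scale
        "minReplicaCount": 0,   # scale to zero when idle (energy savings)
        "maxReplicaCount": 20,
        "triggers": [{
            "type": "prometheus",
            "metadata": {
                "serverAddress": "http://prometheus.monitoring.svc:9090",
                "query": "sum(pending_function_invocations)",  # hypothetical metric
                "threshold": "10",
            },
        }],
    },
}

api.create_namespaced_custom_object(
    group="keda.sh", version="v1alpha1",
    namespace="physics", plural="scaledobjects", body=scaled_object,
)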

PHYSICS contributions

PHYSICS earned praise for its technical achievements during the final project review and made substantial contributions to upstream open source communities such as Kepler, Submariner, Knative, Node-RED, and more. Industrial use cases in e-health, smart agriculture, and smart manufacturing have been demonstrated in multiple publications. Those interested in learning more can explore further details on the PHYSICS website and access reusable artifacts in the PHYSICS marketplace. Source code for many components is also available in the public PHYSICS GitHub repository.


The PHYSICS project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No. 101017047.
