Americas Research Interest Group

Red Hat associates across North and South America collaborate on many research projects with researchers based primarily in North and South America. In addition to long-standing formal arrangements with Boston University, the Mass Open Cloud Alliance, and the University of Massachusetts, we support student and faculty research and open source development work for undergraduate, Master’s, and PhD students. We also teach classes, mentor students, deliver technology workshops, and support outreach programs that improve diversity in computer science and engineering.

If you are a student interested in a project opportunity, please contact us. If you are a Red Hatter interested in submitting a project, please copy this template and email your idea to Heidi Dempsey, Research and Innovation Director, North America.

Meetings are currently on pause; see details on prior meetings below.

Catch recordings of meetings on the Americas Research Interest Group YouTube Channel.

Access meeting notes here.

 

Machine Learning Tuning of Kernel Policies Towards Energy Efficiency in Diverse Hardware and Software [Americas Research Interest Group Meeting, July 2023]

Link to access the meeting: http://meet.google.com/gsa-xdpn-nit

Materials from Meeting


Join us for the next Red Hat Research Americas Research Interest Group Meeting on July 18, 2023 at 3PM EDT. The meeting is open to Red Hatters and our research partners.

Meeting Agenda
Machine Learning Tuning of Kernel Policies Towards Energy Efficiency in Diverse Hardware and Software
Han Dong, Boston University

Abstract
As global data center energy use continues to rise, a core goal of operating systems (OS), which is to enable higher efficiency and get work done while consuming fewer resources, is magnified by increasingly constrained energy budgets. Our work focuses on revealing how three fundamental aspects of an OS: 1) interrupt coalescing, 2) processor sleep states, and 3) dynamic voltage and frequency scaling, can impact the performance and energy efficiency of network processing through a diverse hardware and software experimental study. We build upon our previous work, which establishes how a state-of-the-art machine learning technique, Bayesian optimization, can be used by an operator to dynamically adjust service-level agreement (SLA) and energy goals while supporting a real-world in-memory key-value store workload. This was made possible by the insight that being able to externally control interrupt coalescing helps stabilize application latency periods such that it becomes easier to control performance-energy trade-offs and to magnify the benefits with processor frequency scaling while utilizing specialized sleep states.

We theorize that this insight is generally applicable across diverse sets of hardware and SLA-driven network applications. Almost all modern CPU architectures expose a degree of dynamic voltage and frequency scaling, so they can trade instruction execution speed for a reduction in energy use. Furthermore, modern NICs and their device drivers are typically developed to be configured via the ethtool networking utility, which often provides interfaces that enable user-defined interrupt coalescing rates. Adjusting these rates can improve overall software stack efficiency, as system overheads such as interrupt processing, OS book-keeping, and cache misses are amortized or eliminated by the batched handling of packets. To test this theory, we undertook a diverse experimental study showing how Bayesian optimization can be applied across various CPUs and NICs while running a diverse set of SLA-driven network applications. We utilize experimental hardware from both the Massachusetts Open Cloud (MOC) and CloudLab to demonstrate the generality and usability of Bayesian optimization as a mechanism to dynamically target SLA and energy goals.


News

Red Hat Collaboratory at Boston University announces Request for Proposals for 2024 Grants

The Red Hat Collaboratory at Boston University has announced details on the Request for Proposals (RFP) for 2024 Grants. The goal of the program is to enable collaborative research between Boston University researchers and Red Hat engineers. Projects must be open source and should generally focus on problems in distributed, operating, security, or network systems whose solutions show promise for advancing their field and impacting industry.

Affiliated Universities

Boston University

University of Massachusetts, Lowell

University of Massachusetts, Amherst

Northeastern University

People

Events

There are no events scheduled at the moment. Stay tuned for upcoming opportunities to connect with the community!

Related Projects

Title | Summary | Research Area
Curator | Operator Curator is an air-gapped infrastructure consumption analysis tool for the Red Hat OpenShift Container Platform. The curator retrieves infrastructure … |
Improving Cyber Security Operations using Knowledge Graphs | Abstract: The objective of this project is to improve the workflow and performance of security operation centers, including automating several of … | ai-ml, cloud-ds, testing-and-ops
Minimal Mobile Systems via Cloud-based Adaptive Task Processing | The high cost of robots today has hindered their widespread use. Specifically, a limiting factor involves extensive hardware and software … | ai-ml, cloud-ds
Co-Ops: Collaborative Open Source and Privacy-Preserving Training for Learning to Drive | Note: This project is a continuation of OSMOSIS: Open-Source Multi-Organizational Collaborative Training for Societal-Scale AI Systems. Abstract: Current development of autonomous … | ai-ml, cloud-ds
CoDes: A co-design research lab to advance specialized hardware projects | CoDes research lab provides the infrastructure and engineering foundation needed to support co-design based specialized hardware research. The lab is currently located at Boston University, as part of the Red Hat – Boston University collaboratory. | ai-ml, cloud-ds, hardware-and-the-os
Prototyping a Distributed, Asynchronous Workflow for Iterative Near-Term Ecological Forecasting | Abstract: The ongoing data revolution has begun to fuel the growth of near-term iterative ecological forecasts: continually-updated predictions about the future … |
FHELib: Fully Homomorphic Encryption Hardware Library for Privacy-preserving Computing | Note: Please visit the Privacy-Preserving Cloud Computing using Homomorphic Encryption project page for information on a related project. In today’s … | cloud-ds, hardware-and-the-os, security-privacy-cryptography
SECURE-ED: Open-Source Infrastructure for Student Learning Disability Identification and Treatment | The project aims to develop an infrastructure that would enable users to input data about an individual student and receive … |
Relational Memory Controller | Note: See the Near-Data Data Transformation project page for information about the work that led to this project. Abstract: Data movement … |
Learned Cost-Models for Robust Tuning | Note: Please see the Robust Data Systems Tuning project page for earlier results associated with this research. Abstract: Data systems’ performance is … |
Open-Source Toolchain Optimization for FPGA CAD | Additional details to be added soon! | hardware-and-the-os
Characterizing Microservice Architectures | Microservice architectures are the default method for building distributed applications in industry. Though the basic tenets of this architectural style … | cloud-ds, testing-and-ops
Understanding accuracy decay in online image retrieval systems within the context of open-set classification and unsupervised clustering | Image retrieval systems are extremely useful to political scientists and human rights advocates attempting to understand the scope and spread of disinformation in massive datasets. However, in standard image retrieval tasks the corpus of images is unchanging as time moves forward. When considering online disinformation this is clearly not the case. Image retrieval in an online system can essentially be modeled as an open-set problem, where there is no guarantee that the classes of images seen before will have any correspondence to the classes of images seen at present or in the future. | ai-ml, cloud-ds
Automated detection of memory safety vulnerabilities in Rust | In comparison to C, the Rust language provides significant memory safety guarantees through its concept of lifetimes and its borrow-checker. … | testing-and-ops
Tuning the Linux kernel | The Linux kernel is a complicated piece of software with multiple components interacting with each other in complex ways. The … | ai-ml, hardware-and-the-os
Disinformation Detection at Scale | The increased prevalence of fake and manipulated visual media on the Internet has led to social and technical dilemmas in … | ai-ml, security-privacy-cryptography
AI for Cloud Ops | This project aims to address this gap in effective cloud management and operations with a concerted, systematic approach to building and integrating AI-driven software analytics into production systems. We aim to provide a rich selection of heavily-automated “ops” functionality as well as intuitive, easily-accessible analytics to users, developers, and administrators. | ai-ml, cloud-ds, hardware-and-the-os
Creating a global open research platform to better understand social sustainability using data from a real-life smart village | A BU team is working with SmartaByar and the Red Hat Social Innovation Program in order to create a global … | ai-ml, cloud-ds, security-privacy-cryptography
DISL: A Dynamic Infrastructure Services Layer for Reconfigurable Hardware | Open programmable hardware offers tremendous opportunities for increased innovation, lower cost, greater flexibility, and customization in systems we can now … | cloud-ds, hardware-and-the-os
Practical Programming of FPGAs with Open Source Tools | This project has evolved from the Practical programming of FPGAs in the data center and on the edge project. Please see … | cloud-ds, hardware-and-the-os
Near-Data Data Transformation | BU faculty members Manos Athanassoulis and Renato Mancuso will work with Red Hat researchers Uli Drepper and Ahmed Sanaullah to create a hardware-software co-design paradigm for data systems that implements near-memory processing. | cloud-ds, hardware-and-the-os
Towards high performance and energy efficiency in open-source stream processing | BU faculty members Vasiliki Kalavari and Jonathan Appavoo will work with Red Hat researcher Sanjay Arora to create an open-source … | hardware-and-the-os
OSMOSIS: Open-Source Multi-Organizational Collaborative Training for Societal-Scale AI Systems | The goal of our project is to develop a novel framework and cloud-based implementation for facilitating collaboration among highly heterogeneous research, development, and educational settings. | ai-ml, cloud-ds
Privacy-Preserving Cloud Computing using Homomorphic Encryption | Note: Please visit the FHELib: Fully Homomorphic Encryption Hardware Library for Privacy-preserving Computing project page for information on a related … | cloud-ds, hardware-and-the-os, security-privacy-cryptography
Serverless Streaming Graph Analytics | In this project, we will focus on graph streams that can be used to model distributed systems, where workers are represented as nodes connected with edges that denote communication or dependencies. | cloud-ds, testing-and-ops
Enabling Intelligent In-Network Computing for Cloud Systems | With the network infrastructure becoming highly programmable, it is time to rethink the role of networks in the cloud computing … | cloud-ds, testing-and-ops
Linux Computational Caching | In this speculative work we are attempting to explore a biologically motivated conjecture on how memory of past computing can be stored and recalled to automatically improve a system’s behavior. | ai-ml, cloud-ds, hardware-and-the-os
The Open Education Project (OPE) | In this project we are developing an exemplar set of materials for an introductory computer systems class that exploits Jupyter, Jupyter Books, OpenShift, and the Mass Open Cloud to develop and deliver a unique educational experience for learning about how computer systems work. | ai-ml, cloud-ds
Symbiotes: A New step in Linux’s Evolution | This work explores how a new kind of software entity, a symbiote, might bridge this gap. By adding the ability for application software to shed the boundary that separates it from the OS kernel, it is free to integrate, modify, and evolve into a hybrid that is both application and OS. | hardware-and-the-os, security-privacy-cryptography
Intelligent Data Synchronization for Hybrid Clouds | The goal of this project is to design configurable synchronization solutions on a common platform for a wide range of edge computing scenarios relevant to Red Hat. These solutions will be thoroughly validated on a state-of-the-art testbed capable of emulating realistic environments (e.g., smart cities). | ai-ml, cloud-ds, testing-and-ops
Secure cross-site analytics on OpenShift logs | The project aims to explore whether cryptographically secure Multi-Party Computation, or MPC for short, can be used to perform secure cross-site analytics on OpenShift logs with minimum client participation. | cloud-ds, security-privacy-cryptography, testing-and-ops
Robust Data Systems Tuning | Note: Please see the Learned Cost-Models for Robust Tuning project page for research that has grown from this project. See … | ai-ml, hardware-and-the-os
Robust LSM-Trees Under Workload Uncertainty | We introduce a new robust tuning paradigm to aid in the design of data systems with uncertain assumptions by modeling the behavior of the system and then utilizing these models in conjunction with techniques in robust optimization. Our approach is demonstrated through tuning a popular log-structured merge-tree based storage engine, RocksDB. | hardware-and-the-os
Does efficient, private, agnostic learning imply efficient, agnostic online learning? | Users of online services today must trust platforms with their personal data. Platforms can choose to enable privacy by default … |
Are Adversarial Attacks a Viable Solution to Individual Privacy? | Users of online services today must trust platforms with their personal data. Platforms can choose to enable privacy by default … | security-privacy-cryptography
Hybrid Cloud Caching | A fundamental goal of the Hybrid Cloud Cache project is to allow simplified integration into existing data lakes, to enable caching to be transparently introduced into hybrid cloud computation, to support efficient caching of objects widely shared across clusters deployed by different organizations, and to avoid the complexity of managing a separate caching service on top of the data lake. |
Volume Storage Over Object Storage | This project creates a hybrid storage system composed of a high-speed local device (e.g., Optane) to store short term data, along with a write-once object store (e.g., Ceph RGW) to store data blocks permanently. | cloud-ds
Elastic Secure Infrastructure | This project encompasses work in several areas to design, build and evaluate secure bare-metal elastic infrastructure for data centers. | cloud-ds, security-privacy-cryptography, testing-and-ops
Open Cloud Testbed | The Open Cloud Testbed project will build and support a testbed for research and experimentation into new cloud platforms – the underlying software which provides cloud services to applications. Testbeds such as OCT are critical for enabling research into new cloud technologies – research that requires experiments which potentially change the operation of the cloud itself. | ai-ml, cloud-ds, hardware-and-the-os, security-privacy-cryptography, testing-and-ops
Kernel Techniques to Optimize Memory Bandwidth with Predictable Latency | Recent processors have started introducing the first mechanisms to monitor and control memory bandwidth. Can we use these mechanisms to enable machines to be fully used while ensuring that primary workloads have deterministic performance? This project presents early results from using Intel’s Resource Director Technology and some insight into this new hardware support. The project also examines an algorithm using these tools to provide deterministic performance on different workloads. | hardware-and-the-os
Unikernel Linux | This project aims to turn the Linux kernel into a unikernel that 1) is easily compiled for any application, 2) uses battle-tested, production Linux and glibc code, 3) allows the entire upstream Linux developer community to maintain and develop the code, and 4) allows applications that normally run on vanilla Linux to benefit from unikernel performance and security advantages. | hardware-and-the-os
Fuzzing Device Emulation in QEMU | Hypervisors, the software that allows a computer to simulate multiple virtual computers, form the backbone of cloud computing. Because they are both ubiquitous and essential, they are security-critical applications that make attractive targets for potential attackers. | hardware-and-the-os, security-privacy-cryptography, testing-and-ops
Automatic Configuration of Complex Hardware | In this project, we pursue three goals towards this understanding: 1) identify, via a set of microbenchmarks, application characteristics that will illuminate mappings between hardware register values and their corresponding microbenchmark performance impact, 2) use these mappings to frame NIC configuration as a set of learning problems such that an automated system can recommend hardware settings corresponding to each network application, and 3) introduce either new dynamic or application instrumented policy into the device driver in order to better attune dynamic hardware configuration to application runtime behavior. | hardware-and-the-os
Quest-V, a Partitioning Hypervisor for Latency-Sensitive Workloads | Quest-V is a separation kernel that partitions services of different criticality levels across separate virtual machines, or sandboxes. Each sandbox encapsulates a subset of machine physical resources that it manages without requiring intervention from a hypervisor. In Quest-V, a hypervisor is only needed to bootstrap the system, recover from certain faults, and establish communication channels between sandboxes. | hardware-and-the-os
Performance Management for Serverless Computing | Serverless computing provides developers the freedom to build and deploy applications without worrying about infrastructure. Resources (memory, cpu, location) specified … | cloud-ds