Red Hat Research Quarterly

Research perspectives: Focus on AI and machine learning

About the authors

Sanjay Arora leads the AI agenda for Red Hat Research and is mainly interested in the application of machine learning to low-level systems.

Marek Grác is a senior software engineer at Red Hat.

Article featured in

Red Hat Research Quarterly

May 2023



Red Hat Research focuses on accelerating the practical applications for artificial intelligence and machine learning by combining academic approaches and industry use cases. Rather than focusing purely on advancing AI/ML techniques, we identify research collaborations where they can play a central role in solving computing problems. The AI/ML projects we’ve highlighted in past issues drive innovation across multiple disciplines, from optimizing hybrid cloud operations to enhancing education, developing technology for autonomous vehicles, and automating methods for detecting visual disinformation online.

We asked Sanjay Arora, a Red Hat Research data scientist, and Marek Grác, a Red Hat senior software engineer and lecturer in machine learning at Masaryk University (Brno, CZ), to share their perspectives on trends in AI/ML, past, present, and future. Sanjay has contributed to RHRQ in the articles “When good models go bad: minimizing dataset bias in AI” (Feb 2021) and “Yuga: a tool to help Rust developers write unsafe code more safely” (Feb 2022). Marek has contributed to the RHRQ stories “‘When one teaches, two learn’: making the most of technical research mentorship” (Aug 2021) and “Making machine learning accessible across disciplines” (Nov 2021).

One of our tasks at Red Hat Research is persuading engineering managers and business units that it is worth engineers’ time to participate in research projects, internships, and mentoring students working on a thesis. Cooperation with universities has led to collaboration in research and demonstrated its applications for products and the open source community, making persuasion easier. The range of topics is broad, from online learning and community management to cloud technologies and testing.

Open source through the research lens

As RHRQ starts its fifth year, it’s impossible to resist the temptation to look back at all we’ve done so far. The result is this collection of perspectives. Together they paint an inspiring picture of the innovative work that can be accomplished when engineering know-how and bold research questions come together in open source environments.

AI/ML is an excellent example of how that relationship works. What we learn from research and teaching at universities is something we can also share internally. Our efforts to work on AI/ML projects with university partners have led to a broader understanding of the role of AI/ML in industry. The Applied Machine Learning course we teach at Masaryk University has content we want to teach in-house as well, so that software engineers, quality assurance teams, and people in other roles know how to work with machine learning specialists to get the best possible results.

We foresee several different directions for further work in the field of AI/ML itself and in the many domains where it plays a key role:

Secure multiparty computing and differential privacy: The capacity to use sensitive datasets without revealing information about individuals in the dataset is critical for expanding the use of AI/ML systems. A good example is the recently launched Red Hat Collaboratory project “Co-ops: collaborative open source and privacy-preserving training for learning to drive.” The project is building out a privacy-preserving platform for sharing data collected from cars and videos that can be used for distributed, large-scale training of models for self-driving.

The February 2021 issue featured Kate Saenko, Boston University professor and consulting professor for the MIT-IBM Watson AI Lab, on minimizing dataset bias in AI.

Large ML models and their applicability to data, including telemetry, logs, and error messages: Training large language models (LLMs) on this unannotated data could enable better predictions. Generating natural language responses from a knowledge base could also support helpdesk technicians and automate a portion of their work. Current developments in this arena include GPT-4 and ChatGPT, which are closed source. The Red Hat OpenShift Data Science (RHODS) team is providing the infrastructure for IBM efforts to train large models, which could lead to work on applying these models to better search, to querying Red Hat Insights, or to integrated development environments (IDEs).

Replacing heuristics with learned policies in systems software: Systems software, such as operating systems and compilers, relies on many heuristics to guide decision making. These decisions affect performance and resource consumption. Red Hat Research is engaged in several projects exploring the replacement of these heuristics with learned policies using techniques like reinforcement learning and Bayesian optimization. Two projects, “Automatic configuration of complex hardware” and “Toward high performance and energy efficiency in open source stream processing,” involve learning network policies governing packet batching and processor voltage and frequency settings to enable substantial energy savings while maintaining performance guarantees. (See also Han Dong’s article “Tuning Linux kernel policies for energy efficiency with machine learning” in this issue.) The project “Practical programming of FPGAs with open source tools” focuses on searching for the optimal ordering of compiler passes to maximize the performance of the compiled code.

Open source education projects, so universities can freely share study materials or franchise already established courses: The need for data science and ML skills to fill industry positions is great. Red Hat Research maintains an open catalog of course materials for collaboration based on courses at partner universities and institutions. We are preparing materials for two courses: “Applied Machine Learning” and “Employment and the IT job market.” These courses will help not only universities and students but also employers. (See also Danni Shi’s article “Open source education: from philosophy to reality” in this issue.)
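To make the idea of replacing a fixed heuristic with a learned policy more concrete, here is a minimal sketch in Python. It is not the method used in any of the projects named above. It uses an epsilon-greedy multi-armed bandit, a much simpler relative of the reinforcement learning and Bayesian optimization techniques mentioned, and everything in it is an assumption made for illustration: the candidate settings, the latency limit, the penalty, and above all the simulate function, which stands in for real energy and latency measurements.

```python
import random

# Hypothetical candidate settings: (CPU frequency in GHz, packet batching delay in microseconds).
SETTINGS = [(1.2, 0), (1.2, 50), (2.0, 0), (2.0, 50), (2.8, 0), (2.8, 50)]
LATENCY_LIMIT_MS = 1.0   # assumed performance guarantee
PENALTY = 10.0           # assumed cost added when the guarantee is violated


def simulate(setting, rng):
    """Toy stand-in for a real measurement; returns (energy, latency_ms)."""
    freq, delay = setting
    # Invented model: energy grows with frequency, batching saves some energy.
    energy = freq ** 2 * (1.0 - 0.004 * delay) + rng.gauss(0, 0.05)
    # Invented model: latency falls with frequency and rises with batching delay.
    latency = 1.5 / freq + delay / 500.0 + rng.gauss(0, 0.02)
    return energy, latency


def learn_policy(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the lowest-cost setting."""
    rng = random.Random(seed)
    counts = [0] * len(SETTINGS)
    mean_cost = [0.0] * len(SETTINGS)
    for step in range(steps):
        if step < len(SETTINGS):
            arm = step                                # try every setting once
        elif rng.random() < epsilon:
            arm = rng.randrange(len(SETTINGS))        # explore
        else:
            arm = min(range(len(SETTINGS)), key=lambda i: mean_cost[i])  # exploit
        energy, latency = simulate(SETTINGS[arm], rng)
        cost = energy + (PENALTY if latency > LATENCY_LIMIT_MS else 0.0)
        counts[arm] += 1
        # Incremental update of the running average cost for this setting.
        mean_cost[arm] += (cost - mean_cost[arm]) / counts[arm]
    best = min(range(len(SETTINGS)), key=lambda i: mean_cost[i])
    return SETTINGS[best], mean_cost


if __name__ == "__main__":
    best_setting, costs = learn_policy()
    print("chosen setting:", best_setting)
```

With this toy model, the slowest frequency violates the latency limit and the fastest wastes energy, so the sketch should settle on the middle frequency with batching enabled. In a real system the simulated measurement would be replaced by observations from the running workload, and the policy would typically also depend on the current load rather than being a single fixed choice.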
