Red Hat Research Quarterly

Research perspectives: Focus on AI and machine learning

Sanjay Arora

Sanjay Arora leads the AI agenda for Red Hat Research and is mainly interested in the application of machine learning to low-level systems.

Marek Grác

Marek Grác is a senior software engineer at Red Hat.

Article featured in Red Hat Research Quarterly, May 2023

Red Hat Research focuses on accelerating the practical application of artificial intelligence and machine learning by combining academic approaches and industry use cases. Rather than focusing purely on advancing AI/ML techniques, we identify research collaborations where those techniques can play a central role in solving computing problems. The AI/ML projects we’ve highlighted in past issues drive innovation across multiple disciplines, from optimizing hybrid cloud operations to enhancing education, developing technology for autonomous vehicles, and automating methods for detecting visual disinformation online.

We asked Sanjay Arora, a Red Hat Research data scientist, and Marek Grác, a Red Hat senior software engineer and lecturer in machine learning at Masaryk University (Brno, CZ), to share their perspectives on trends in AI/ML, past, present, and future. Sanjay has contributed to RHRQ in the articles “When good models go bad: minimizing dataset bias in AI” (Feb 2021) and “Yuga: a tool to help Rust developers write unsafe code more safely” (Feb 2022). Marek has contributed to the RHRQ stories “‘When one teaches, two learn’: making the most of technical research mentorship” (Aug 2021) and “Making machine learning accessible across disciplines” (Nov 2021).

One of our tasks at Red Hat Research is persuading engineering managers and business units that it is worth engineers’ time to participate in research projects, internships, and mentoring students working on a thesis. Cooperation with universities has led to collaboration in research and demonstrated its applications for products and the open source community, making persuasion easier. The range of topics is broad, from online learning and community management to cloud technologies and testing.

Open source through the research lens

As RHRQ starts its fifth year, it’s impossible to resist the temptation to look back at all we’ve done so far. The result is this collection of perspectives. Together they paint an inspiring picture of the innovative work that can be accomplished when engineering know-how and bold research questions come together in open source environments.

AI/ML is an excellent example of how that relationship works. The learnings we gain from research and teaching at universities are something we can also share internally. Our efforts to work on AI/ML projects with university partners have led to a broader understanding of the role of AI/ML in industry. The Applied Machine Learning course we teach at Masaryk University has content we want to teach in-house as well, so software engineers, quality assurance teams, and other roles know how to work with machine learning people to get the best possible results.

We foresee several different directions for further work in the field of AI/ML itself and in the many domains where it plays a key role:

Secure multiparty computing and differential privacy: The capacity to use sensitive datasets without revealing information about individuals in the dataset is critical for expanding the use of AI/ML systems. A good example is the recently launched Red Hat Collaboratory project “Co-ops: collaborative open source and privacy-preserving training for learning to drive.” The project is building out a privacy-preserving platform for sharing data collected from cars and videos that can be used for distributed, large-scale training of models for self-driving.

The February 2021 issue featured Kate Saenko, Boston University professor and consulting professor for the MIT-IBM Watson AI lab, on minimizing dataset bias in AI.

Large ML models and their applicability to data, including telemetry, logs, and error messages: Training large language models (LLMs) on this unannotated data could enable better predictions. Generating natural language responses from a knowledge base could also support helpdesk technicians and automate a portion of their work. Current developments in this arena include GPT-4 and ChatGPT, which are closed source. The Red Hat OpenShift Data Science (RHODS) team is providing the infrastructure for IBM efforts to train large models, which could lead to work on applying these models to better searching, querying Red Hat Insights, or integrated development environments (IDEs).

Replacing heuristics with learned policies in systems software: Systems software such as operating systems and compilers relies on many heuristics to guide decision making. These decisions affect performance and resource consumption. Red Hat Research is engaged in several projects exploring the replacement of these heuristics with learned policies using techniques like reinforcement learning and Bayesian optimization. Two projects, “Automatic configuration of complex hardware” and “Toward high performance and energy efficiency in open source stream processing,” involve learning network policies governing packet batching and processor voltage and frequency settings to enable substantial energy savings while maintaining performance guarantees. (See also Han Dong’s article “Tuning Linux kernel policies for energy efficiency with machine learning” in this issue.) The project “Practical programming of FPGAs with open source tools” focuses on searching for the optimal ordering of compiler passes to maximize performance for the compiled code.

Open source education projects, so universities can freely share study materials or franchise already established courses: The need for data science and ML skills to fill industry positions is great. Red Hat Research maintains an open catalog of course materials for collaboration based on courses at partner universities and institutions. We are preparing materials for two courses, “Applied Machine Learning” and “Employment and the IT job market.” These courses will help not only universities and students but also employers. (See also Danni Shi’s article “Open source education: from philosophy to reality” in this issue.)
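To make the idea of replacing a fixed heuristic with a learned policy more concrete, here is a minimal sketch in Python. It is not the method used in any of the projects named above. It uses an epsilon-greedy bandit, one of the simplest relatives of reinforcement learning, to pick a packet-batching delay that lowers energy use while staying within a latency budget. The candidate delays, the latency budget, and the cost function are all hypothetical stand-ins for real measurements.

```python
import random

# Hypothetical candidate settings for a packet-batching delay (microseconds).
CANDIDATE_DELAYS_US = [0, 50, 100, 200, 400]
LATENCY_BUDGET_US = 300  # assumed performance guarantee, for illustration only


def simulated_cost(delay_us, rng):
    """Toy stand-in for a real measurement on a live system.

    Energy falls as batching increases; a large penalty is added
    whenever the (noisy) latency exceeds the budget.
    """
    energy = 100.0 / (1.0 + delay_us / 100.0)
    latency = 50.0 + delay_us + rng.gauss(0.0, 5.0)
    penalty = 1000.0 if latency > LATENCY_BUDGET_US else 0.0
    return energy + penalty


def learn_policy(steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy search for the lowest-cost delay setting."""
    rng = random.Random(seed)
    n = len(CANDIDATE_DELAYS_US)
    counts = [0] * n
    mean_cost = [0.0] * n
    for _ in range(steps):
        # Explore at random until every setting has been tried once,
        # and afterwards with probability epsilon; otherwise exploit.
        if 0 in counts or rng.random() < epsilon:
            arm = rng.randrange(n)
        else:
            arm = min(range(n), key=mean_cost.__getitem__)
        cost = simulated_cost(CANDIDATE_DELAYS_US[arm], rng)
        counts[arm] += 1
        # Incremental update of the running mean cost for this setting.
        mean_cost[arm] += (cost - mean_cost[arm]) / counts[arm]
    best = min(range(n), key=mean_cost.__getitem__)
    return CANDIDATE_DELAYS_US[best], mean_cost


if __name__ == "__main__":
    best_delay, costs = learn_policy()
    print("chosen delay (us):", best_delay)
```

In a real system the simulated cost would be replaced by measured energy and latency, and a richer method such as Bayesian optimization or full reinforcement learning could take workload state into account, which this sketch ignores.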
