Test Case Prioritization: Towards High-Reliability Continuous Integration

Nov 4, 2021 | Blog, Israel

In August 2021, a team of graduate students from Reichman University, a leading research university in Israel, started working on the Test Case Prioritization (TCP) project under the guidance of senior engineers from Red Hat. In a nutshell, the goal of the project is to create a novel ML-based tool that solves the TCP problem in software regression testing.

As the size of software increases, testing requires more time and resources

Automatic regression testing is a crucial step of any CI/CD pipeline. Its primary goal is to detect bugs and defects introduced by recent changes as early as possible while keeping verification costs low. The ability to perform regression testing efficiently and effectively (i.e., within a short timeframe while still catching the majority of the bugs) allows developers to rapidly deliver reliable software updates to users.

Unfortunately, the regression testing process in modern large-scale software products tends to be much more complex and cumbersome than desired. As the size of the software increases, the test suite also grows bigger and requires more time and resources to be fully executed. In many cases, the time to run the entire test suite can reach 3-4 days or even a week. Consequently, executing all available tests during the CI/CD regression testing procedure is highly impractical and, in many cases, completely infeasible.

A new approach to software testing

To address this issue, test case prioritization (TCP) methods have emerged. Test case prioritization aims to order a given set of test cases so that the earlier a test appears in the resulting order, the higher the probability that it detects a bug or fault introduced by the given code change. Given such an ordering of the entire test suite, the regression testing procedure can iterate over it from the beginning and continue until a predefined limit on the maximal testing time (say, 1 minute) is reached. Because the most significant test cases are executed first, the chances of early fault detection rise even though only a small fraction (say, 1%) of the entire test suite is executed.
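The procedure described above can be illustrated with a minimal Python sketch. It is not the project's implementation: the `RankedTest` type, the `fail_probability` score, the `est_duration_s` runtime estimate, and the example test names are all hypothetical and introduced here only for illustration. The sketch sorts tests by their predicted probability of failing and then walks the ordering until the time budget would be exceeded.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RankedTest:
    """Hypothetical record for one test case (names are illustrative only)."""
    name: str
    fail_probability: float  # assumed model score in [0, 1]
    est_duration_s: float    # assumed estimated runtime in seconds


def prioritize(tests: List[RankedTest]) -> List[RankedTest]:
    """Order tests so the ones most likely to fail come first."""
    return sorted(tests, key=lambda t: t.fail_probability, reverse=True)


def select_within_budget(ordered: List[RankedTest], budget_s: float) -> List[RankedTest]:
    """Walk the ordering from the start and stop once the budget would be exceeded."""
    selected: List[RankedTest] = []
    elapsed = 0.0
    for test in ordered:
        if elapsed + test.est_duration_s > budget_s:
            break  # time limit reached: stop advancing through the ordering
        selected.append(test)
        elapsed += test.est_duration_s
    return selected


# Made-up example data, not taken from any real test suite.
suite = [
    RankedTest("test_login", 0.10, 20.0),
    RankedTest("test_upgrade", 0.85, 30.0),
    RankedTest("test_network", 0.60, 25.0),
    RankedTest("test_storage", 0.05, 40.0),
]

ordered = prioritize(suite)
to_run = select_within_budget(ordered, budget_s=60.0)
print([t.name for t in to_run])
```

With a 60-second budget, only the two highest-ranked tests in this made-up suite fit. A real pipeline would also have to decide how to handle tests whose runtime estimate is unreliable, which this sketch ignores.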

In recent years, TCP solutions have been adopted by major industry players and have spawned a wave of academic research projects. However, the full potential of TCP in regression testing is yet to be explored. As of now, the most widely used approaches are mainly based on heuristic search strategies and/or code coverage methods. In contrast, methods based on machine learning in general, and deep learning in particular, remain largely unexplored. Closing this gap and devising an ML-based TCP tool is the primary goal of our project.

Preliminary results

As of October 2021, the project team has reached its first major milestone: implementing an open-source proof-of-concept prototype that replicates the state-of-the-art academic results in this area. A large dataset of OpenShift test case execution statistics (successes and failures of tests on a variety of project versions) has been collected, an array of useful high-level features has been extracted, and a learning model based on XGBoost has been trained. The uncalibrated model has reached highly promising results, namely an accuracy of up to 0.95 and a recall of up to 0.98.
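For readers less familiar with the two reported metrics, the following Python sketch shows how accuracy and recall are computed from true and predicted labels. It assumes a binary labeling in which `1` means "the test failed" (the positive class) and `0` means "the test passed"; that labeling convention, the function name, and the sample data are assumptions made for illustration and are not taken from the project.

```python
from typing import Sequence, Tuple


def accuracy_and_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Compute accuracy and recall, treating label 1 (assumed: test failed) as positive."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures correctly predicted
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # passes correctly predicted
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failures the model missed

    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Made-up labels for eight test executions (1 = failed, 0 = passed).
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 0, 0, 0, 1, 0, 1]

acc, rec = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={acc:.2f} recall={rec:.2f}")
```

Under this assumed labeling, recall is the fraction of actual failures that the model flags. Test failures are typically much rarer than passes, so a model can score a high accuracy while still missing many failures. That is why recall is worth reporting next to accuracy.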

As an immediate next step, the team will look to improve on the state of the art. To that end, a variety of machine learning and data mining methods will be considered.

Those interested in finding out more about the TCP project and/or looking for collaboration opportunities are kindly invited to contact Dr. Ilya Kolchinsky (ikolchin@redhat.com) or Gil Klein (gklein@redhat.com).
