Red Hat Research Quarterly

The Open Education Project is ready to scale

About the author

Danni Shi

Danni Shi is a senior software engineer at Red Hat leading the development effort for the Open Education Project (OPE). She is dedicated to advancing open source education and improving the accessibility of technology for all.

Enhancements to the pioneering platform for open source education have made it more reliable, easier to use, and much more affordable for new users.

What would it mean to open source education? For starters, we’d need a way for educators to create and publish their own high-quality open source materials—lectures, presentations, textbooks, and lab manuals—so they aren’t locked into proprietary texts or software. We’d also want a way to deploy these materials in a live and interactive manner, at university scale, that makes it easy to collaborate and share content. To really make it open, we’d need to maximize accessibility, so a student can engage from anywhere simply by opening a web browser.

That’s the aim of the Open Education Project (OPE), which launched in 2022 when it first received support from the Red Hat Collaboratory Research Incubation Award program. In the May 2023 issue of RHRQ (“Open source education: from philosophy to reality”), I wrote about our progress with OPE and future milestones. Since then, we’ve run multiple successful Boston University computer science courses on OPE and made several improvements. We’re now ready to host courses from other departments and other universities from around the world.

OPE on NERC

One of the biggest accomplishments of the past year is moving OPE from AWS to the New England Research Cloud (NERC). The move gives us three useful benefits. First, it reduces the cost per student. NERC provides student access at cost, a fraction of what AWS charges. In 2022, courses running on OpenShift AI on AWS cost roughly $150 per student, even after efforts to minimize expenses. After moving to NERC, the cost is roughly $18 per student. This change makes OPE a potential solution for a much wider range of schools, including colleges and K-12 schools in under-resourced regions.

Danni Shi presents OPE at the 2024 MOC Alliance Workshop.

Second, we have more flexibility, because we are using our own dedicated OpenShift AI cluster for classes. This means we can manage the cluster to support as many classes as we wish while ensuring the scalability and load balancing needed for large class sizes. Third, we have more customized monitoring and management, which helps us manage our resources efficiently. For example, we can detect when students’ notebooks have been idling and shut them down after a specified period, keeping in mind that students in a machine learning class may need a longer window to run training processes. We also have a monitor that garbage-collects notebooks started with the wrong image or container size, ensuring students use the right configuration to start their notebooks.
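As an illustration, the idle-shutdown policy can be sketched as below. The names, thresholds, and class labels are hypothetical; the real monitor works against cluster metrics rather than an in-memory table.

```python
from datetime import datetime, timedelta

# Hypothetical idle thresholds: ML classes get a longer window so that
# long-running training processes are not killed prematurely.
IDLE_LIMITS = {
    "default": timedelta(hours=1),
    "ml-class": timedelta(hours=6),
}

def notebooks_to_stop(notebooks, now):
    """Return names of notebooks idle past their class's threshold.

    `notebooks` maps a notebook name to (class_label, last_activity).
    """
    stale = []
    for name, (label, last_activity) in notebooks.items():
        limit = IDLE_LIMITS.get(label, IDLE_LIMITS["default"])
        if now - last_activity > limit:
            stale.append(name)
    return stale

now = datetime(2024, 5, 1, 12, 0)
notebooks = {
    "alice": ("default", now - timedelta(hours=2)),   # idle too long: stop
    "bob": ("ml-class", now - timedelta(hours=2)),    # within ML window: keep
    "carol": ("ml-class", now - timedelta(hours=7)),  # idle too long: stop
}
print(notebooks_to_stop(notebooks, now))  # ['alice', 'carol']
```

In practice the same per-class-label lookup lets one policy serve both ordinary courses and machine learning courses.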

The monitoring cluster also helps us find anything abnormal and debug issues, and we can capitalize on NERC’s Observability Cluster to track resource usage and better understand students’ usage patterns. This also makes billing cleaner. Until the move to NERC, we could not differentiate between the usage of different classes maintained by a university. Moving forward, we can bill for individual class usage, so departments running less resource-intensive courses won’t be charged the same rate as, say, a computer science department running several machine learning courses.

One significant challenge we faced was maintaining privacy for student notebooks. Each class had its own OpenShift namespace, mapped from a ColdFront project. Previously, when students launched the Jupyter Workbench via the Data Science project in their designated namespace, they could access all notebook instances within that namespace. This level of access allowed them to view, log in to, and stop others’ notebooks, posing a significant privacy concern. To address this issue, we enabled the rhods-notebooks namespace and directed students to use the Jupyter tile that belongs to it. This way, each student has access exclusively to their individual notebooks. The solution also lets classes share resources efficiently within the rhods-notebooks namespace, which matters when accounting for each class’s resource usage.

Functional enhancements

Stability and reliability

One of our goals over the past year was to develop an automated test framework for OPE content that could potentially become part of the supported Red Hat OpenShift AI platform. We now have a GitHub CI/CD workflow that triggers when changes are committed to the container source code and runs whenever we build the containers. The workflow verifies the functionality of the container image, including package versions, Jupyter features, and the user interface. This ensures that changes to the source code do not introduce errors, maintaining the build’s integrity and reliability.
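The package-version check such a workflow performs might look roughly like this sketch. The package names and version pins are illustrative, not OPE’s actual manifest.

```python
# Hypothetical pinned manifest for the container image.
EXPECTED = {"jupyterlab": "4.0", "numpy": "1.26"}

def verify_versions(installed, expected=EXPECTED):
    """Compare installed package versions against the pinned manifest.

    Returns a list of mismatch messages; an empty list means the image passes.
    """
    problems = []
    for pkg, want in expected.items():
        have = installed.get(pkg)
        if have is None:
            problems.append(f"{pkg}: missing")
        elif not have.startswith(want):
            problems.append(f"{pkg}: expected {want}*, found {have}")
    return problems

print(verify_versions({"jupyterlab": "4.0.11", "numpy": "2.0.1"}))
# ['numpy: expected 1.26*, found 2.0.1']
```

In CI, a nonempty result would fail the build before the broken image reaches students.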

We also needed to develop tests for the cluster’s functionality and scalability. For example, if we launch 300 Jupyter notebooks at one time, how much latency will there be? In our previous ROSA cluster on AWS, when a large group of students launched their notebooks at the same time, it could take nearly 20 minutes for each notebook to start as the cluster scaled up. We wrote scalability tests to confirm that we have no latency issues starting large numbers of notebooks on the new cluster.
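A scalability test of this kind can be sketched as follows. Here the notebook launch is a stub standing in for a real request to the cluster API; the structure (launch many notebooks concurrently, record each start-up latency) is the point.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def launch_notebook(i):
    """Stand-in for a real notebook-start request to the cluster API."""
    start = time.monotonic()
    time.sleep(0.01)  # simulate server-side startup work
    return time.monotonic() - start

def measure_launch_latency(n_students=300, max_workers=50):
    """Launch n_students notebooks concurrently; return per-launch latencies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(launch_notebook, range(n_students)))

latencies = measure_launch_latency(n_students=50, max_workers=10)
print(f"max latency: {max(latencies):.3f}s over {len(latencies)} launches")
```

Against a real cluster, the test would assert that the worst-case latency stays within an acceptable bound as the student count grows.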

To make container builds faster and more reliable, we also reduced the container image size by using lightweight base images and multistage builds. Associate Software Engineer Isaiah Stapleton successfully squashed the image, meaning that an image potentially larger than 10 gigabytes—an excessive volume for an ordinary laptop—is reduced to less than two gigabytes per base image. Isaiah also developed a customized image-build process, allowing users to rebuild an existing OPE image with custom add-ons, such as modified user identifiers (UIDs), group identifiers (GIDs), and group settings within the container image.

Usability tools

The ability for users to create textbooks and interactive content is an essential element of OPE. To make it easier for authors to create static content or presentations, we created command-line tools that integrate OPE features. Adding a repository, creating a new OPE project, and building and publishing books can all be accomplished from the command line. The tool can be used inside an OPE container, which gives authors a consistent lab environment while constructing textbooks. The command-line tool was developed in summer 2023 by BU intern Ke Li, and summer 2024 intern Meera Malhotra is working to improve it. Meera’s contributions include enhancing the OPE tools’ usage documentation, simplifying the installation process, and adding tooling to the container environment to ease dependency management.
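For illustration, the skeleton of such a command-line interface might look like this sketch; the subcommand names and arguments are hypothetical, not the actual OPE tool’s.

```python
import argparse

def build_parser():
    """Build a parser with one subcommand per authoring task."""
    parser = argparse.ArgumentParser(prog="ope")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("new", help="create a new OPE project").add_argument("name")
    sub.add_parser("add", help="add a content repository").add_argument("repo_url")
    sub.add_parser("build", help="build the book locally")
    sub.add_parser("publish", help="publish the built book")
    return parser

args = build_parser().parse_args(["new", "my-textbook"])
print(args.command, args.name)  # new my-textbook
```

Running the same tool inside the OPE container is what keeps every author’s build environment identical.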

Intern Meera Malhotra presents her work on OPE at Boston University.

Meera is also working on a new OPE textbook, Beyond the Classroom (still in draft form), with the goal of bridging the knowledge gap between education and writing code in the workforce. Meera gathered input from experienced Red Hat engineers to identify areas where students need more instruction to transition successfully to the workforce. The textbook contains chapters on adjusting to writing code outside the classroom, understanding regular expressions, reading man pages, using the command line, and Git fundamentals. The book also includes interviews with engineers who share their insights with junior programmers. For example, security expert Lily Sturmann gives advice on writing more secure code with clean coding standards, and Principal Software Engineer Sally O’Malley discusses the benefits of learning the command line and on-the-job learning. The interviews offer not only technical advice but also guidance on adjusting to the workforce and combating impostor syndrome.

Autograder

BU student Ross Mikulskis, with mentoring from Isaiah, worked on an OPE Autograder, an application that receives homework submissions via a POST request, runs tests, and sends the results back to the client. The autograder integrates via API with the Gradescope platform, which manages student submissions, runs the autograder at scale, and distributes the results to both student and teacher. By automating the grading process, we can provide immediate, consistent feedback to students while helping these classes scale.
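The grading step can be sketched as follows. The test definitions are hypothetical, and the results dictionary only loosely follows the shape of a Gradescope-style score report; the real autograder runs each course’s own test suites behind its POST endpoint.

```python
def grade(submission, tests):
    """Run each (name, check, points) test against the submission and
    build a score report with per-test and total scores."""
    results = []
    for name, check, points in tests:
        try:
            passed = bool(check(submission))
        except Exception:
            passed = False  # a crashing submission earns no points
        results.append({"name": name,
                        "score": points if passed else 0,
                        "max_score": points})
    return {"score": sum(t["score"] for t in results), "tests": results}

# Hypothetical assignment: submit a function that doubles its input.
tests = [
    ("returns an int", lambda f: isinstance(f(2), int), 5),
    ("doubles input", lambda f: f(2) == 4, 5),
]
report = grade(lambda x: x * 2, tests)
print(report["score"])  # 10
```

Wrapping `grade` behind an HTTP handler that accepts the submission in a POST body and returns this report as JSON gives the request/response loop the article describes.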

OPE in the real world

OPE infrastructure on NERC was used for three classes at Boston University in the Spring 2024 semester: Computer Systems (CS 210), with more than 320 students; Introduction to Operating Systems (EC 440), with more than 70 students; and Tools for Data Science (CS 506), with more than 220 students. Students reported that OPE was straightforward and easy to use, with no onboarding difficulties.

We have been approached by others interested in OPE, including Thomas McKenna from BU Wheelock College of Education and Human Development, the founder and creator of Phenomena for NGSS, an educational website designed to support teachers in learning more about phenomena-based instruction. In a recent OPE meeting, Thomas shared his Phenomena website and educational use cases and expressed his desire to use OPE to improve the teaching experience. Chris Simmons from the Massachusetts Green High-Performance Computing Center (MGHPCC) utilizes the Jupyter platform to train researchers and found that OPE aligns with several aspects of his work. A recent visit from students from Beijing, China, also sparked interest in collaboration with computer science faculty at Tsinghua University. We welcome their interest and encourage others to join us.

Isaiah Stapleton and Danni Shi (left) work with Red Hat summer interns.

Building OPE is a highly collaborative learning process. Working with students and professors generates new and often innovative ideas for improving the books and the platform. Students like Meera, who first encountered OPE in the CS 210 class, are also very active in building the OPE platform. They bring valuable firsthand feedback that helps us identify which parts of OPE need improvement and which we want to maintain. This continuous feedback from both professors and students ensures that OPE effectively meets educational needs.

The OPE project is open to all. See the Open Education Project page for more details and links to repositories to help you start your own textbooks and classes.
