Red Hat Research Quarterly

The Open Education Project is ready to scale

About the author

Danni Shi

Danni Shi is a senior software engineer at Red Hat leading the development effort for the Open Education Project (OPE). She is dedicated to advancing open source education and improving the accessibility of technology for all.

Enhancements to the pioneering platform for open source education have made it more reliable, easier to use, and much more affordable for new users.

What would it mean to open source education? For starters, we’d need a way for educators to create and publish their own high-quality open source materials—lectures, presentations, textbooks, and lab manuals—so they aren’t locked into proprietary texts or software. We’d also want a way to deploy these materials in a live and interactive manner, at university scale, that makes it easy to collaborate and share content. To really make it open, we’d need to maximize accessibility, so a student can engage from anywhere simply by opening a web browser.

That’s the aim of the Open Education Project (OPE), which launched in 2022 when it first received support from the Red Hat Collaboratory Research Incubation Award program. In the May 2023 issue of RHRQ (“Open source education: from philosophy to reality”), I wrote about our progress with OPE and future milestones. Since then, we’ve run multiple successful Boston University computer science courses on OPE and made several improvements. We’re now ready to host courses from other departments and other universities from around the world.

OPE on NERC

One of the biggest accomplishments of the past year is moving OPE from AWS to the New England Research Cloud (NERC). The move gives us three useful benefits. First, it reduces the cost per student. NERC provides student access at cost, a fraction of what AWS charges. In 2022, courses running on OpenShift AI on AWS cost roughly $150 per student, even after efforts to minimize expenses. After moving to NERC, the cost is roughly $18 per student. This change makes OPE a potential solution for a much wider range of schools, including colleges and K-12 schools in under-resourced regions.

Danni Shi presents OPE at the 2024 MOC Alliance Workshop.

Second, we have more flexibility, because we are using our own dedicated OpenShift AI cluster for classes. This means we can manage the cluster to support as many classes as we wish while ensuring the scalability and load balancing needed for large class sizes. Third, we have more customized monitoring and management, which helps us manage our resources efficiently. For example, we can detect when students’ notebooks have been idling and shut them down after a specified period, keeping in mind that students in a machine learning class may need a longer window to run training processes. We also have a monitor that garbage-collects notebooks started with the wrong image or container size, ensuring students use the right configuration to start their notebooks.
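As an illustration, the idle-shutdown policy can be sketched as below. The names, thresholds, and class labels are hypothetical; the real monitor works against cluster metrics rather than an in-memory table.

```python
from datetime import datetime, timedelta

# Hypothetical idle thresholds: ML classes get a longer window so that
# long-running training processes are not killed prematurely.
IDLE_LIMITS = {
    "default": timedelta(hours=1),
    "ml-class": timedelta(hours=6),
}

def notebooks_to_stop(notebooks, now):
    """Return names of notebooks idle past their class's threshold.

    `notebooks` maps a notebook name to (class_label, last_activity).
    """
    stale = []
    for name, (label, last_activity) in notebooks.items():
        limit = IDLE_LIMITS.get(label, IDLE_LIMITS["default"])
        if now - last_activity > limit:
            stale.append(name)
    return stale

now = datetime(2024, 5, 1, 12, 0)
notebooks = {
    "alice": ("default", now - timedelta(hours=2)),   # idle too long: stop
    "bob": ("ml-class", now - timedelta(hours=2)),    # within ML window: keep
    "carol": ("ml-class", now - timedelta(hours=7)),  # idle too long: stop
}
print(notebooks_to_stop(notebooks, now))  # ['alice', 'carol']
```

In practice the same per-class-label lookup lets one policy serve both ordinary courses and machine learning courses.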

The monitoring cluster also helps us find anything abnormal and debug issues, and we can capitalize on NERC’s Observability Cluster to track resource usage and better understand students’ usage patterns. This also makes billing cleaner. Until the move to NERC, we could not differentiate between the usage of different classes maintained by a university. Moving forward, we can bill for individual class usage, so departments running less resource-intensive courses won’t be charged the same rate as, say, a computer science department running several machine learning courses.

One significant challenge we faced was maintaining privacy for student notebooks. Each class had its own OpenShift namespace, mapped from a ColdFront project. Previously, when students launched the Jupyter Workbench via the Data Science project in their designated namespace, they could access all notebook instances within that namespace. This level of access allowed them to view, log in to, and stop others’ notebooks, posing a significant privacy concern. To address this issue, we enabled the rhods-notebooks namespace and directed students to use the Jupyter tile that belongs to it. This way, each student has access exclusively to their individual notebooks. The solution also lets classes share resources efficiently within the rhods-notebooks namespace, which matters when accounting for each class’s resource usage.

Functional enhancements

Stability and reliability

One of our goals over the past year was to develop an automated test framework for OPE content that could potentially become part of the supported Red Hat OpenShift AI platform. We now have a GitHub CI/CD workflow that triggers when changes are committed to the container source code and runs whenever we build the containers. The workflow verifies the functionality of the container image, including package versions, Jupyter features, and the user interface. This ensures that changes to the source code do not introduce errors, maintaining the build’s integrity and reliability.
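The package-version check such a workflow performs might look roughly like this sketch. The package names and version pins are illustrative, not OPE’s actual manifest.

```python
# Hypothetical pinned manifest for the container image.
EXPECTED = {"jupyterlab": "4.0", "numpy": "1.26"}

def verify_versions(installed, expected=EXPECTED):
    """Compare installed package versions against the pinned manifest.

    Returns a list of mismatch messages; an empty list means the image passes.
    """
    problems = []
    for pkg, want in expected.items():
        have = installed.get(pkg)
        if have is None:
            problems.append(f"{pkg}: missing")
        elif not have.startswith(want):
            problems.append(f"{pkg}: expected {want}*, found {have}")
    return problems

print(verify_versions({"jupyterlab": "4.0.11", "numpy": "2.0.1"}))
# ['numpy: expected 1.26*, found 2.0.1']
```

In CI, a nonempty result would fail the build before the broken image reaches students.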

We also needed to develop tests for the cluster’s functionality and scalability. For example, if we launch 300 Jupyter notebooks at one time, how much latency will there be? In our previous ROSA cluster on AWS, when a large group of students launched their notebooks at the same time, it could take nearly 20 minutes for each notebook to start as the cluster scaled up. We wrote scalability tests to confirm that we have no latency issues starting large numbers of notebooks on the new cluster.
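A scalability test of this kind can be sketched as follows. Here the notebook launch is a stub standing in for a real request to the cluster API; the structure (launch many notebooks concurrently, record each start-up latency) is the point.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def launch_notebook(i):
    """Stand-in for a real notebook-start request to the cluster API."""
    start = time.monotonic()
    time.sleep(0.01)  # simulate server-side startup work
    return time.monotonic() - start

def measure_launch_latency(n_students=300, max_workers=50):
    """Launch n_students notebooks concurrently; return per-launch latencies."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(launch_notebook, range(n_students)))

latencies = measure_launch_latency(n_students=50, max_workers=10)
print(f"max latency: {max(latencies):.3f}s over {len(latencies)} launches")
```

Against a real cluster, the test would assert that the worst-case latency stays within an acceptable bound as the student count grows.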

To make container builds faster and more reliable, we also reduced the container image size by using lightweight base images and multistage builds. Associate Software Engineer Isaiah Stapleton successfully squashed the image, meaning that an image potentially larger than 10 gigabytes—an excessive volume for an ordinary laptop—is reduced to less than two gigabytes per base image. Isaiah also developed a customized image-build process, allowing users to rebuild an existing OPE image with custom add-ons, such as modified user identifiers (UIDs), group identifiers (GIDs), and group settings within the container image.

Usability tools

The ability for users to create textbooks and interactive content is an essential element of OPE. To make it easier for authors to create static content or presentations, we created command-line tools that integrate OPE features. Adding a repository, creating a new OPE project, and building and publishing books can all be accomplished from the command line. The tool can be used inside an OPE container, which gives authors a consistent lab environment while constructing textbooks. The command-line tool was developed in summer 2023 by BU intern Ke Li, and summer 2024 intern Meera Malhotra is working to improve it. Meera’s contributions include enhancing the OPE tools’ usage documentation, simplifying the installation process, and adding tooling to the container environment to ease dependency management.
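For illustration, the skeleton of such a command-line interface might look like this sketch; the subcommand names and arguments are hypothetical, not the actual OPE tool’s.

```python
import argparse

def build_parser():
    """Build a parser with one subcommand per authoring task."""
    parser = argparse.ArgumentParser(prog="ope")
    sub = parser.add_subparsers(dest="command", required=True)
    sub.add_parser("new", help="create a new OPE project").add_argument("name")
    sub.add_parser("add", help="add a content repository").add_argument("repo_url")
    sub.add_parser("build", help="build the book locally")
    sub.add_parser("publish", help="publish the built book")
    return parser

args = build_parser().parse_args(["new", "my-textbook"])
print(args.command, args.name)  # new my-textbook
```

Running the same tool inside the OPE container is what keeps every author’s build environment identical.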

Intern Meera Malhotra presents her work on OPE at Boston University.

Meera is also working on a new OPE textbook, Beyond the Classroom (still in draft form), with the goal of bridging the knowledge gap between education and writing code in the workforce. Meera gathered input from experienced Red Hat engineers to identify areas where students need more instruction to transition successfully to the workforce. The textbook contains chapters on adjusting to writing code outside the classroom, understanding regular expressions, reading man pages, using the command line, and Git fundamentals. The book also includes interviews with engineers who share their insights with junior programmers. For example, security expert Lily Sturmann gives advice on writing more secure code with clean coding standards, and Principal Software Engineer Sally O’Malley discusses the benefits of learning the command line and on-the-job learning. The interviews offer not only technical advice but also guidance on adjusting to the workforce and combating impostor syndrome.

Autograder

BU student Ross Mikulskis, with mentoring from Isaiah, worked on an OPE Autograder, an application that receives homework submissions via a POST request, runs tests, and sends the results back to the client. The autograder integrates via API with the Gradescope platform, which manages student submissions, runs the autograder at scale, and distributes the results to both student and teacher. By automating the grading process, we can provide immediate, consistent feedback to students while helping these classes scale.
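The grading step can be sketched as follows. The test definitions are hypothetical, and the results dictionary only loosely follows the shape of a Gradescope-style score report; the real autograder runs each course’s own test suites behind its POST endpoint.

```python
def grade(submission, tests):
    """Run each (name, check, points) test against the submission and
    build a score report with per-test and total scores."""
    results = []
    for name, check, points in tests:
        try:
            passed = bool(check(submission))
        except Exception:
            passed = False  # a crashing submission earns no points
        results.append({"name": name,
                        "score": points if passed else 0,
                        "max_score": points})
    return {"score": sum(t["score"] for t in results), "tests": results}

# Hypothetical assignment: submit a function that doubles its input.
tests = [
    ("returns an int", lambda f: isinstance(f(2), int), 5),
    ("doubles input", lambda f: f(2) == 4, 5),
]
report = grade(lambda x: x * 2, tests)
print(report["score"])  # 10
```

Wrapping `grade` behind an HTTP handler that accepts the submission in a POST body and returns this report as JSON gives the request/response loop the article describes.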

OPE in the real world

OPE infrastructure on NERC was used for three classes at Boston University in the Spring 2024 semester: Computer Systems (CS 210), with more than 320 students; Introduction to Operating Systems (EC 440), with more than 70 students; and Tools for Data Science (CS 506), with more than 220 students. Students reported that OPE was straightforward and easy to use, with no onboarding difficulties.

We have been approached by others interested in OPE, including Thomas McKenna from BU Wheelock College of Education and Human Development, the founder and creator of Phenomena for NGSS, an educational website designed to support teachers in learning more about phenomena-based instruction. In a recent OPE meeting, Thomas shared his Phenomena website and educational use cases and expressed his desire to use OPE to improve the teaching experience. Chris Simmons from the Massachusetts Green High-Performance Computing Center (MGHPCC) utilizes the Jupyter platform to train researchers and found that OPE aligns with several aspects of his work. A recent visit from students from Beijing, China, also sparked interest in collaboration with computer science faculty at Tsinghua University. We welcome their interest and encourage others to join us.

Isaiah Stapleton and Danni Shi (left) work with Red Hat summer interns.

Building OPE is a highly collaborative learning process. Working with students and professors generates new and often innovative ideas for improving the books and the platform. Students like Meera, who first encountered OPE in the CS 210 class, are also very active in building the OPE platform. They bring valuable firsthand feedback that helps us identify which parts of OPE need improvement and which we want to maintain. This continuous feedback from both professors and students ensures that OPE effectively meets educational needs.

The OPE project is open to all. See the Open Education Project page for more details and links to repositories to help you start your own textbooks and classes.
