Towards high performance and energy efficiency in open-source stream processing.

BU faculty members Vasiliki Kalavri and Jonathan Appavoo will work with Red Hat researcher Sanjay Arora to create an open-source Mass Open Cloud (MOC)-hosted stream processing system using Apache Flink software. The researchers will leverage the open nature of the software to build a platform that optimizes trade-offs between energy efficiency and performance while maintaining transparency and the easy sharing of knowledge. “This project aims to demonstrate that energy efficiency and the myriad layers of software that go into an open source streaming platform need not be incompatible,” the team wrote.

Project Team
Principal Investigator: Vasia Kalavri
Co-PI: Jonathan Appavoo
Red Hat Collaborator: Sanjay Arora
PhD Students: Han Dong, Sanskriti Sharma, and Yuanli Wang
Undergraduate Students: Ke Li and Shengyao Luo

Repositories

  • Automated scripts for setting up experiments on our platform and gathering measurements: These automate the start and shutdown of Flink jobs and parse Flink output into comma-delimited files for processing. The repository also includes Jupyter notebooks for graphing and data analysis with Pandas, and our initial dynamic voltage and frequency scaling (DVFS) policy.
  • A sample dataset of energy and performance logs
  • Our web application: An open-source web application for submitting and monitoring experiments and visualizing their results. Users select a Flink query from a drop-down menu, configure experimental parameters such as the input rate (events/s) and experiment duration, and submit the experiment for execution. The backend handles query submission to our experimental infrastructure. Once the experiment is complete, the user can inspect the results by selecting one of the built-in visualizations.
  • https://github.com/EEStrmCmptng/flink-benchmarks: Set of queries we have built to study energy consumption of Flink

Presentations

  • “Towards high performance and energy efficiency in open-source stream processing”. 2023 MOC Alliance Annual workshop by PIs Vasiliki Kalavri and Jonathan Appavoo.
  • “BayOp: Taming and Controlling Performance and Energy Trade-offs Automatically in Network Applications”. 2023 MOC Alliance Annual workshop. Han Dong and Sanjay Arora
  • “Machine Learning Tuning of Kernel Policies Towards Energy Efficiency in Diverse Hardware and Software”. Americas Research Interest Group Meeting, July 2023. Han Dong
  • “Energy Efficient Streaming: A presentation of the platform,” prepared by Ke Li and Jax Luo, 2022

Posters

  • “Towards high performance and energy efficiency in open-source stream processing”. Sanskriti Sharma, 2023 MOC Alliance Annual workshop