Greater New England Research Interest Group Meeting [May 2021]
Date: May 04, 2021
Project Update #1
Creating the new “How Do You Fedora” video series (20 min) – Gabbie Chang
This talk details the road to creating a new video series that profiles Fedora contributors and how they use Fedora.
Research Paper Reading Group – Part 2 (20 min) – Ahmed Sanaullah
Join us for session 2 as we continue our paper reading journey. This time, we will explore one of the most fundamental questions in research paper reading: how can I tell whether I should read a paper without actually reading it? What makes this challenging is that it is not always a question of good vs. bad. Even with two excellent papers in hand, the effort it takes to read a paper properly could mean there isn’t enough time for both, so one paper has to be prioritized over the other.
Like session 1, this session is split into two parts. The first part (5 min) outlines a heuristic approach to evaluating papers that is based on some common, high-level indicators of quality. By spending 5 minutes evaluating a paper against these indicators, we can estimate whether we should spend the next 5 hours reading it in detail. Then, as a group exercise in the second part (15 min), we will look at three research papers and attempt to prioritize them using the indicators discussed in the first part.
Project Update #2
Robust LSM-Trees Tuning for Workload and Resources Uncertainty (20 min) – Andy Huynh
Modern data systems frequently employ tuning strategies that rely on a priori assumptions about the workload and hardware platform. However, data systems are consistently exposed to changing environments. Workloads may shift because application demands are not consistent, and with the prevalence of the cloud, deploying applications on a constant hardware platform is not always guaranteed. Tuning data systems in such uncertain environments may lead to degradation in overall performance.
We introduce a new robust tuning paradigm to aid in the design of data systems under uncertain assumptions by modeling the behavior of the system and then using these models in conjunction with techniques from robust optimization. Our approach is demonstrated by tuning RocksDB, a popular storage engine based on log-structured merge-trees (LSM-trees). We create a detailed cost model for the standard write and read queries, and frame the design decision as a robust optimization problem that chooses the physical layout of the tree by changing the size ratio and the memory allocated to the buffer versus the Bloom filter, based on the available resources and expected workload.
In this talk, I’ll speak about the process of developing the model, creating the system, and verifying the model against the physical system. Additionally, I’ll touch upon the robust optimization framework and the potential benefits this type of design paradigm has for other systems. As this work is ongoing, I will focus mainly on the design and current implementation.
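The robust tuning idea described in this abstract can be illustrated with a small sketch. This is not the cost model or optimizer from the talk: the cost formulas are a deliberately simplified textbook-style approximation of a leveled LSM-tree, the uncertainty set is a plain interval around the expected read fraction, and the search is a brute-force grid. All names and default values (`toy_cost`, `worst_case_cost`, `robust_tuning`, the entry and memory sizes) are hypothetical and chosen only for illustration.

```python
import math
from itertools import product


def toy_cost(read_frac, size_ratio, filter_frac, *,
             num_entries=1_000_000, entry_bits=1024,
             memory_bits=16_000_000, entries_per_page=32):
    """Expected I/O cost per operation under a simplified (assumed) model.

    read_frac   -- fraction of operations that are point reads (rest are writes)
    size_ratio  -- growth factor between adjacent levels of the tree
    filter_frac -- share of memory given to Bloom filters (rest goes to the buffer)
    """
    filter_bits = filter_frac * memory_bits
    buffer_bits = (1.0 - filter_frac) * memory_bits

    # Number of levels needed to hold all data, given the buffer size.
    data_bits = num_entries * entry_bits
    n_levels = max(1, math.ceil(math.log(data_bits / buffer_bits, size_ratio)))

    # Standard Bloom filter false-positive approximation for bits-per-entry.
    fpr = math.exp(-(filter_bits / num_entries) * math.log(2) ** 2)

    # Assumed costs: a read pays one useful I/O plus one wasted I/O per
    # false positive; a write is merged size_ratio times per level,
    # amortized over the entries that fit in a page.
    read_cost = 1.0 + n_levels * fpr
    write_cost = n_levels * size_ratio / entries_per_page
    return read_frac * read_cost + (1.0 - read_frac) * write_cost


def worst_case_cost(expected_read_frac, delta, size_ratio, filter_frac):
    """Worst cost over read fractions within +/- delta of the expectation.

    The cost is linear in read_frac, so the maximum over the interval
    is attained at one of its endpoints.
    """
    lo = max(0.0, expected_read_frac - delta)
    hi = min(1.0, expected_read_frac + delta)
    return max(toy_cost(lo, size_ratio, filter_frac),
               toy_cost(hi, size_ratio, filter_frac))


def robust_tuning(expected_read_frac, delta):
    """Grid-search the (size_ratio, filter_frac) pair with the lowest worst-case cost."""
    size_ratios = range(2, 21)
    filter_fracs = [i / 10 for i in range(1, 10)]
    return min(product(size_ratios, filter_fracs),
               key=lambda cfg: worst_case_cost(expected_read_frac, delta, *cfg))


if __name__ == "__main__":
    nominal = robust_tuning(0.5, 0.0)  # trusts the expected workload exactly
    robust = robust_tuning(0.5, 0.3)   # hedges against a shifted workload
    print("nominal tuning:", nominal)
    print("robust tuning: ", robust)
```

Setting `delta` to zero recovers the usual nominal tuning that trusts the expected workload, while a larger `delta` selects a configuration that minimizes the worst-case cost across the assumed workload interval.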