Serverless Streaming Graph Analytics

Streaming graph analytics is an emerging field of applications that aim to extract knowledge from evolving networks in a timely and efficient manner. Graph streams are (possibly unbounded) sequences of timestamped events that represent relationships between entities: user interactions in social networks, online financial transactions, product purchases, driver and user locations in ride-sharing services.

In this project, we will focus on graph streams that can be used to model distributed systems, where workers are represented as nodes connected with edges that denote communication or dependencies. In this model, monitoring and performance analysis can be expressed as graph streaming queries. For example, if the dynamic topology of an OpenShift cluster fleet is modeled as a graph, a streaming query can continuously detect disconnected regions. We will design a prototype open-source streaming graph analytics system on top of Apache Flink Stateful Functions and develop a temporal graph processing API for expressing continuous and ad-hoc queries on graph streams.

Summary of Progress (as of Jan. 2023)

The major results of year #1 is HesseFun, a prototype system for serveless streaming graph analytics on Flink Stateful Functions. HesseFun ingests two types of input streams: (i) graph updates, i.e. edge additions, deletions, and modifications, and (ii) temporal queries. The Flink cluster acts as a distributed database backed by embedded RocksDB KV stores and maintains the evolving graph up-to-date. As a result, HesseFun can leverage Flink’s state management capabilities and checkpointing mechanism to ensure fault-tolerance and exactly-once semantics on recovery. The runtime layer consists of three core stateful functions. The VertexStorageFn function is responsible for ingesting graph updates and propagating them to the correct Flink partition. The QueryHandlerFn function ingests query streams and is responsible for efficiently extracting relevant graph views from partitioned state. Once the state becomes available, the query handler spawns the respective computation functions (shown in orange color) to serve the query load. Finally, the IterCoordinatorFn manages synchronization and convergence for iterative fixpoint graph queries, such as page rank. HesseFun currently offers a library of four built-in graph algorithms that can be used to compute connectivity queries, shortest path queries, page rank, and Graph Neural Network (GNN) inference queries. Since the state and compute layers are decoupled, adding a new query to HesseFun is as simple as registering a new function endpoint.

Project Team
Principal Investigator: Vasia Kalavri
BU PhD student: Emmanouil Kritharakis
BU Master students: Vivek Unnikrishnan and Shekhar Sharma (Directed Study)
Collaborators at KTH, Sweden: Sihan Chen (Master student), Sonia Horchidan (PhD student), Paris Carbone (faculty)
Collaborators at TU Munich: Nikolai Merkel, Jana Vatter (PhD students)

Temporal graph analytics on Apache Flink Stateful Functions from Research Interest Group Meeting, September 20, 2022

Accompanying slides

This project is supported by the Red Hat Collaboratory at Boston University.