Red Hat Research Quarterly

Anchored keys: scaling of in-memory storage for serverless data analytics


About the author

Tristan Tarrant

Tristan has been leading the Infinispan Engineering Team at Red Hat for the past eight years as well as being Principal Architect for Red Hat Data Grid. He’s been a passionate open source advocate and contributor for nearly three decades.

Featured in Red Hat Research Quarterly, February 2024

The strategy for scaling data capacity varies according to volume, access patterns, and cost-effectiveness. We look at an approach that achieves optimal results in the context of serverless data analytics.

Big data holds great promise for solving complex problems, but data-intensive applications are necessarily limited by the difficulty of supporting and maintaining them. The CloudButton project has created a serverless data analytics platform to democratize big data by simplifying the overall lifecycle and programming model using serverless technologies. Working together, the CloudButton project partners have created a FaaS (Function-as-a-Service) compute runtime for analytics that overcomes the limitations of existing serverless platforms.

A key component of the CloudButton architecture is Infinispan, an open source in-memory data grid providing a key-value data store that distributes data across elastically scalable clusters to guarantee high performance, high availability, and fault tolerance. Infinispan can serve as both a volatile cache and a persistent data store.

The CloudButton runtime leverages Infinispan as a mutable shared-memory store for input and output data and intermediate results used and produced by serverless functions. While Infinispan addressed CloudButton’s functional and performance requirements, we determined that its scalability model was not a good fit for the lifecycle of serverless workloads in a cloud environment. This article illustrates how we enhanced Infinispan to suit our needs.
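As a point of reference, the snippet below is a minimal sketch of how a serverless function could use Infinispan's Hot Rod client as a shared store for intermediate results. The server address and the "results" cache name are illustrative assumptions, not part of the CloudButton codebase.

import org.infinispan.client.hotrod.RemoteCache;
import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class SharedStoreSketch {
   public static void main(String[] args) {
      // Connect to the Infinispan cluster over the Hot Rod protocol
      ConfigurationBuilder builder = new ConfigurationBuilder();
      builder.addServer().host("127.0.0.1").port(11222);
      try (RemoteCacheManager manager = new RemoteCacheManager(builder.build())) {
         // "results" is a hypothetical cache shared by the serverless functions
         RemoteCache<String, byte[]> results = manager.getCache("results");
         // One function writes an intermediate result...
         results.put("job-42/partition-7", new byte[]{1, 2, 3});
         // ...and another function later reads it back
         byte[] partial = results.get("job-42/partition-7");
      }
   }
}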

Infinispan’s distributed storage

Infinispan stores data in memory using the following scheme. Key-value entries are stored in data structures called caches, which can be replicated or distributed. In replicated caches, all nodes hold a copy of every entry. In distributed caches, a fixed number of copies of each entry is stored across the nodes, which allows distributed caches to scale linearly, storing more data as nodes are added to the cluster. Each key is mapped to a node using a consistent-hashing1 algorithm (currently MurmurHash3); that node is known as the key's primary owner.
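For comparison with the anchored-keys approach described later, the following sketch shows how a distributed cache with two owners per entry might be defined with Infinispan's embedded API; the cache name and the owner and segment counts are illustrative values, not recommendations.

import org.infinispan.configuration.cache.CacheMode;
import org.infinispan.configuration.cache.ConfigurationBuilder;
import org.infinispan.configuration.global.GlobalConfigurationBuilder;
import org.infinispan.manager.DefaultCacheManager;

public class DistributedCacheSketch {
   public static void main(String[] args) {
      // A clustered cache manager joins the other nodes via the default transport
      DefaultCacheManager manager =
            new DefaultCacheManager(GlobalConfigurationBuilder.defaultClusteredBuilder().build());
      // Distributed cache: two copies of each entry, spread over 256 hash segments
      ConfigurationBuilder distributed = new ConfigurationBuilder();
      distributed.clustering()
            .cacheMode(CacheMode.DIST_SYNC)
            .hash().numOwners(2).numSegments(256);
      manager.defineConfiguration("analytics", distributed.build());
      manager.getCache("analytics").put("key", "value");
      manager.stop();
   }
}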

Figure 1. Balanced data

Copies are replicated to other nodes, known as backup owners. These are chosen using anti-affinity rules to avoid colocation on the same server, rack, or site. For convenience, the consistent-hash space is divided into equal-sized segments. Remote clients are aware of the consistent-hash and segment allocation, allowing low read/write latency by performing operations directly with the primary owners in a single network hop. When nodes are added or removed, a rebalancing algorithm redistributes segments to nodes. The strategy attempts to minimize the number of ownership changes and, therefore, the amount of data to transfer. The state transfer algorithm is non-blocking: clients continue to read/write entries, which are transparently redirected to the appropriate owner (see Figure 1).
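Conceptually, a hash-aware client routes each operation as in the sketch below. This is not Infinispan's actual code: the real client uses MurmurHash3 and the segment-to-owner tables published by the servers, whereas the hash function and owner lists here are simplified stand-ins.

import java.util.List;

public class SegmentRoutingSketch {
   // Map a key to one of the equal-sized consistent-hash segments
   static int segmentOf(String key, int numSegments) {
      return Math.floorMod(key.hashCode(), numSegments); // stand-in for MurmurHash3
   }

   // The first owner of a segment is its primary; the remaining owners are backups
   static String primaryOwner(String key, List<List<String>> ownersPerSegment) {
      return ownersPerSegment.get(segmentOf(key, ownersPerSegment.size())).get(0);
   }

   public static void main(String[] args) {
      List<List<String>> owners = List.of(
            List.of("node-A", "node-B"),  // segment 0
            List.of("node-B", "node-C"),  // segment 1
            List.of("node-C", "node-A")); // segment 2
      System.out.println(primaryOwner("job-42/partition-7", owners));
   }
}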

This approach works well when the operational capacity is known upfront or changes infrequently. In these scenarios, priority is given to data integrity during node failure and to the performance of a stable cluster. The rebalancing that occurs after adding or removing nodes has a noticeable impact on the network and the CPU, potentially introducing unwanted latency. In a cloud environment, however, consuming resources for capacity that may not always be needed is not economical. Instead, the cluster should be just big enough for the data it contains, and ideally it should scale down to zero nodes when it holds no data.

Anchored keys

Figure 2. Scale-to-fit

To address the above scale-to-fit scenario, we implemented a scaling strategy called anchored keys. Anchored keys relax some of Infinispan’s distribution rules to remove the impact of state transfer by not rebalancing data when nodes are added.

In this implementation, the primary owner of new keys is the last node in the cluster. Writes can be issued against any node, but they will be applied only by the primary owner. The key-to-node mapping is stored in a secondary, replicated cache, which is accessible by all nodes in the cluster. A node is filled with data until it reaches its capacity. When a new node is added, only the key-to-node mapping cache is replicated to the new node, which is much cheaper than transferring the full entries together with their values.
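The write path can be sketched roughly as follows. This is a conceptual illustration of the mechanism described above, not the actual anchored-keys module: the in-memory map and the storeOnNode call stand in for the replicated mapping cache and the remote call to the owning node.

import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

public class AnchoredWriteSketch {
   // Stand-in for the replicated key-to-node mapping cache shared by all nodes
   private final ConcurrentMap<String, String> keyToNode = new ConcurrentHashMap<>();

   String write(String key, byte[] value, List<String> clusterMembers) {
      // New keys are anchored to the last node in the cluster; existing keys keep their anchor
      String anchor = keyToNode.computeIfAbsent(key,
            k -> clusterMembers.get(clusterMembers.size() - 1));
      storeOnNode(anchor, key, value); // only the anchor node stores the value
      return anchor;
   }

   private void storeOnNode(String node, String key, byte[] value) {
      // In the real system this is a remote call to the node that owns the entry
   }
}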

Without a predictable hashing strategy, clients can no longer determine the owner of a particular entry and must operate in a round-robin balancing mode. As the size of the cluster increases, the likelihood of a request hitting the primary owner of an entry decreases, significantly impacting latency. This mainly affects reads, since writes always target the node added last to the cluster (see Figure 2).
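On the client side, hash-aware routing can be disabled by lowering the Hot Rod client intelligence, as in the sketch below; depending on the deployment, BASIC (fixed server list) or TOPOLOGY_AWARE (cluster-tracking, but not hash-aware) may be the appropriate level, and the hostnames are placeholders.

import org.infinispan.client.hotrod.RemoteCacheManager;
import org.infinispan.client.hotrod.configuration.ClientIntelligence;
import org.infinispan.client.hotrod.configuration.ConfigurationBuilder;

public class RoundRobinClientSketch {
   public static void main(String[] args) {
      ConfigurationBuilder builder = new ConfigurationBuilder();
      // Without hash awareness the client simply balances requests across servers
      builder.clientIntelligence(ClientIntelligence.BASIC)
             .addServer().host("node-a.example").port(11222)
             .addServer().host("node-b.example").port(11222);
      RemoteCacheManager manager = new RemoteCacheManager(builder.build());
      // ... use caches as usual ...
      manager.stop();
   }
}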

Figure 3. Anchored keys with backup nodes

Anchored keys and fault tolerance

Fault tolerance with anchored keys requires reintroducing the concept of primary and backup nodes. Each node is therefore associated with one or more backup nodes: writes destined for a node are replicated to all of its backups. The primary and its backups form a replica group. If the primary fails, one of its backups takes over, a new backup node is started, and data is replicated to it from the survivors in the group. Figure 3 illustrates the architecture of this approach.
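The failover behavior can be sketched conceptually as follows; this is an illustration of the replica-group idea rather than Infinispan code, and the class, method names, and re-seeding step are simplified.

import java.util.ArrayDeque;
import java.util.Deque;

public class ReplicaGroupSketch {
   private String primary;
   private final Deque<String> backups = new ArrayDeque<>();

   ReplicaGroupSketch(String primary, String... backupNodes) {
      this.primary = primary;
      for (String node : backupNodes) backups.addLast(node);
   }

   // Invoked when the primary of this group fails
   void onPrimaryFailure(String freshBackup) {
      primary = backups.removeFirst();  // promote a surviving backup to primary
      backups.addLast(freshBackup);     // start a new backup node...
      reseed(freshBackup);              // ...and copy the group's data to it
   }

   private void reseed(String node) {
      // In the real system the surviving group members stream their entries to the new node
   }
}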

A drawback of introducing fault tolerance to anchored keys is that nodes are assigned a role, either primary or backup. This asymmetry makes managing their lifecycle significantly more complex: in the rebalancing architecture, all nodes are symmetric and interchangeable.

Performance comparison

State transfer

The advantage of using anchored keys over rebalancing becomes significant in the context of rapid scaling as desired in a serverless environment. Figure 4 shows that state transfer in the anchored keys mode is nearly 2.5 times faster than in rebalancing mode when dealing with 1 million entries, each storing a 2KB value. As the size of the values increases, the savings become even more significant since anchored keys do not need to transfer them.

Figure 4. A comparison of state transfer times for anchored keys vs. rebalancing

Reads

Because anchored keys trade data-access performance for scaling performance, it's important to show the impact of this storage strategy on regular reads and writes. Infinispan's default consistent-hashing client intelligence ensures that reads and writes maintain near-constant performance, no matter how large the cluster.

Because anchored keys forgo the benefits of consistent hashing, we have to resort to Infinispan's basic intelligence, which applies a round-robin algorithm across the nodes in the cluster. This means that, as the cluster size increases, the probability of a client interacting directly with the primary node decreases, causing an additional network hop between the server node handling the request and the primary owner of the requested entry. While this degradation could be significant for traditional caching workloads, it may be considered negligible for computation-heavy serverless functions. Figure 5 shows the impact of anchored keys compared to rebalancing, based on the same scenario as above.
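A back-of-the-envelope estimate makes the progression concrete: under uniform round-robin routing across N symmetric nodes, a request lands on the primary owner with probability 1/N, so the expected number of server-side hops per read is 1 + (N - 1)/N. The short program below, an illustration rather than a measurement, prints this estimate for a few cluster sizes.

public class ExtraHopEstimate {
   public static void main(String[] args) {
      for (int n : new int[]{1, 2, 4, 8, 16}) {
         double expectedHops = 1 + (n - 1) / (double) n;
         System.out.printf("nodes=%d expected hops=%.2f%n", n, expectedHops);
      }
   }
}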

Figure 5. A comparison of read performance times for anchored keys vs. rebalancing

With a single node, looking up the ownership of an anchored key incurs a 16% penalty, which is negligible. As the size of the cluster increases, however, the need to forward the operation to a remote owner makes the penalty much larger.

Writes

The performance of writes shows the same progression as that of reads: as the size of the cluster increases, the round-robin algorithm has a decreasing chance of hitting the primary owner. The approach used by anchored keys, however, allows us to introduce a write-specific optimization to the load-balancing algorithm. Since entries are only written to the last node in the cluster, and clients receive an ordered list of server nodes, clients can send write requests to the owner directly. Figure 6 compares the performance of all three approaches.
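A write-aware balancing strategy can be sketched as follows; this is a conceptual illustration of the optimization described above, not the shipped Hot Rod balancing strategy, and the class and method names are hypothetical.

import java.util.List;

public class LastNodeWriteBalancerSketch {
   private final List<String> orderedServers; // ordered server list received from the cluster
   private int next = -1;

   LastNodeWriteBalancerSketch(List<String> orderedServers) {
      this.orderedServers = orderedServers;
   }

   // Reads rotate round-robin across all servers
   String serverForRead() {
      next = (next + 1) % orderedServers.size();
      return orderedServers.get(next);
   }

   // Writes always go to the last node, which owns all newly written keys
   String serverForWrite() {
      return orderedServers.get(orderedServers.size() - 1);
   }
}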

Figure 6. A comparison of write performance times for anchored keys vs. rebalancing

As expected, the round-robin algorithm suffers the same penalty as seen for the read scenario. The last-node approach, however, ensures linear performance comparable to the optimal consistent-hash algorithm. The impact of maintaining the additional key-to-node mapping makes writes 17% slower on average, which we believe is acceptable.

Anchored keys and serverless data analytics

The introduction of anchored keys to Infinispan has made it an ideal choice as a high-performance shared data store for serverless analytics, where scale-to-fit capacity is required to balance infrastructure costs with throughput and availability. Combined with the other components developed in the context of the CloudButton project, such as the Lithops multi-cloud framework and the Crucial Distributed Shared Objects library, it provides a compelling platform for a variety of use cases, including scientific scenarios such as metabolomics and geospatial analysis, as well as broader business domains such as diagnostic and predictive analytics.

To learn more about the CloudButton project, please visit the project homepage.


Funded by the European Union under Grant Agreement No. 825184. Views and opinions expressed are those of the author(s) only and do not necessarily reflect those of the European Union or European Research Executive Agency. Neither the European Union nor the granting authority can be held responsible for them.

Footnote

  1. See also D. Karger, E. Lehman, T. Leighton, M. Levine, D. Lewin, and R. Panigrahy. Consistent hashing and random trees: Distributed caching protocols for relieving hot spots on the world wide web. In Proceedings of the Twenty-ninth Annual ACM Symposium on Theory of Computing (STOC), pp. 654–663, 1997.
