D3N: A Multi-Layer Cache for Data Centers

Current caching methods for improving the performance of big-data jobs assume abundant (e.g., full bisection) bandwidth to cache nodes. However, many enterprise data centers and co-location facilities exhibit significant network imbalances due to over-subscription and incremental network upgrades. This project designs and develops D3N, a novel multi-layer cooperative caching architecture that mitigates network imbalances by caching data on the access side of each layer of a hierarchical network topology. A prototype implementation, which incorporates a two-layer cache, is highly performant (it can read cached data at 5 GB/s, the maximum speed of our SSDs) and significantly improves the performance of big-data jobs. To fully utilize bandwidth within each layer under dynamic conditions, we present an algorithm that adaptively adjusts the cache size of each layer based on observed workload patterns and network congestion.
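
To make the two-layer idea concrete, the following Python sketch shows a read path that checks a first-layer cache, then a second-layer cache, then the backing store, together with a simple rule that shifts capacity toward the layer whose misses are currently more expensive. It is an illustration only and not the D3N implementation or its published algorithm: the class names, the LRU eviction policy, the fixed total capacity shared by both layers, and the `rebalance` rule driven by two caller-supplied miss costs are all assumptions made for this example.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache (hypothetical helper, not part of D3N)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()

    def get(self, key):
        # Returns None on a miss, so None itself cannot be cached here.
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        self._evict()

    def resize(self, capacity):
        self.capacity = capacity
        self._evict()

    def _evict(self):
        # Drop least recently used entries until we fit the capacity.
        while len(self.items) > self.capacity:
            self.items.popitem(last=False)


class TwoLayerCache:
    """Illustrative two-layer read-through cache with a shared capacity.

    Assumes total_capacity >= 2 so each layer can keep at least one slot.
    """

    def __init__(self, total_capacity, backend_read):
        self.total = total_capacity
        half = total_capacity // 2
        self.l1 = LRUCache(half)                   # assumed: near the client
        self.l2 = LRUCache(total_capacity - half)  # assumed: one layer up
        self.backend_read = backend_read           # assumed: slow storage read

    def read(self, key):
        value = self.l1.get(key)
        if value is not None:
            return value, "l1"
        value = self.l2.get(key)
        if value is not None:
            self.l1.put(key, value)  # promote into the first layer
            return value, "l2"
        value = self.backend_read(key)
        self.l2.put(key, value)
        self.l1.put(key, value)
        return value, "backend"

    def rebalance(self, l1_miss_cost, l2_miss_cost, step=1):
        # Assumed rule: grow the layer whose misses cost more right now,
        # for example because the link behind it is congested.
        if l1_miss_cost > l2_miss_cost:
            new_l1 = self.l1.capacity + step
        elif l2_miss_cost > l1_miss_cost:
            new_l1 = self.l1.capacity - step
        else:
            return
        new_l1 = max(1, min(self.total - 1, new_l1))
        self.l1.resize(new_l1)
        self.l2.resize(self.total - new_l1)
```

In this sketch a caller would periodically measure how long first-layer and second-layer misses take and pass those measurements to `rebalance`. A real system would also have to account for workload reuse patterns and for the cost of evicting data when a layer shrinks.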

For more information on this project and the unique partnership that produced it, please see the website of the Red Hat Collaboratory at Boston University as well as the article by Emine Ugur Kaynar in Red Hat Research Quarterly, Volume 1, Issue 2.