Transparent Data Transformation: Can I have my Rows and eat my Columns too?
A major dichotomy in data system design is the one between column-stores and row-stores: the former support analytical workloads, and the latter transactional ones. There have been several efforts to bridge the two, especially in light of new hybrid transactional/analytical processing (HTAP) workloads.
For example, SAP HANA [11] and Oracle TimesTen [8] use in-memory column-stores to offer efficient analytical processing and employ a row-wise write-store to support ACID transactions. MemSQL uses an in-memory row-store for data ingestion and writes columns to disk to reduce future I/O [13]. IBM dashDB is a hybrid store that supports mixed workloads [4]. Academic systems [2, 3, 9] also combine columnar and row-wise architectures. All of these systems require conversions between row-wise and columnar formats and, hence, must trade off efficient analytics against data freshness.
What if we had a way to decouple the physical data layout from the data access performance?