Cactus: Cache-Centric Techniques for In-Memory Query Processing
Both database systems and modern memory hierarchies are becoming increasingly complex and diverse. As a result, for many memory-resident data-intensive applications, the CPU caches have become the new bottleneck in the performance. There are two apporaches to cache-centric query processing, one is cache-conscious, and the other cache-oblivious. Both of them are aware of the existence of the memeory hierarchy, and the main difference is in if they assume the knowledge of specific parameters of the hierarchy such as the number of levels and the capacity and block size of each level. Due to this difference, cache-conscious techniques can achieve a great overall performance on a specific memory hiearchy through careful tuning whereas cache-oblivious ones are able to automatically maintain a good cache performance with provable bounds across memory hierarchies.
In this project, we examine the hardware cache behavior of in-memory query processing systems, study both cache-conscious and cache-oblivious approaches, and optimize them for performance as well as for automaticity.
Publications
- Bingsheng He and Qiong Luo. Cache-Oblivious Databases: Limitations and Opportunities. ACM Transactions on Database Systems (TODS), June 2008.
- Bingsheng He, Yinan Li, Qiong Luo, and Dongqing Yang. EaseDB: A Cache-Oblivious In-Memory Query Processor. ACM SIGMOD Conference, June 2007. (Demonstration paper)
- Bingsheng He, Yinan Li, Qiong Luo, and Dongqing Yang. A General Framework for Improving Query Processing Performance on Multi-Level Memory Hierarchies. DaMoN 2007.
- Bingsheng He and Qiong Luo. Cache-Oblivious Query Processing. CIDR 2007.
- Bingsheng He, Qiong Luo, and Byron Choi. Adaptive Index Utilization in Memory-Resident Structural Joins. IEEE Trans. Knowl. Data Eng. 19(6): 772-788 (2007).
- Bingsheng He and Qiong Luo. Cache-Oblivious Nested-Loop Joins. CIKM 2006.
- Bingsheng He, Qiong Luo, and Byron Choi. Cache-Conscious Automata for XML Filtering. ICDE 2005. An extended version appears in IEEE Trans. Knowl. Data Eng. 18(12): 1629-1644 (2006).
Software
- EaseDB_Win32.zip(63MB), the backend engine of EaseDB, including both source and binary files for a Win32 platform. The engine contains a set
of common relational operators such as selection, join, and aggregation. The storage formats supported include both column and row based models. The algorithm implmentations include both cache conscious and cache oblivious ones. (Sept 3, 2007)