Fast & Strong: The Case of Compressed String Dictionaries on Modern CPUs

Proceedings of the 15th International Workshop on Data Management on New Hardware (2019)

Abstract
String dictionaries constitute a large portion of the memory footprint of database applications. While strong string dictionary compression algorithms exist, they come with impractical access and compression times. Therefore, lightweight algorithms such as front coding are favored in practice. This paper endeavors to make strong string dictionary compression practical. We focus on Re-Pair Front Coding (RPFC), a grammar-based compression algorithm, since it consistently offers better compression ratios than other algorithms in the literature. To accelerate compression times, we propose block-based RPFC, which compresses small blocks of the dictionary independently. Moreover, to accelerate access times, we devise a vectorized access method, using Intel® Advanced Vector Extensions 512 (Intel® AVX-512), that is enabled by two specific changes we propose to RPFC. Our experimental evaluation shows that our proposed techniques accelerate compression and access times by up to 24x and 2.9x, respectively. These results move our modified RPFC into a practical range for use in database systems.
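To make the lightweight baseline concrete, the following is a minimal sketch of plain front coding over independently encoded blocks. It is not the paper's RPFC implementation: there is no Re-Pair grammar step and no vectorization, and the names `encode_block`, `encode`, `access`, and the default `block_size` of 4 are assumptions chosen for illustration.

```python
from os.path import commonprefix  # character-wise common prefix of a list of strings


def encode_block(words):
    """Front-code one sorted block: the first string is stored whole,
    each later string as (shared-prefix length, remaining suffix)."""
    out = [(0, words[0])]
    for prev, cur in zip(words, words[1:]):
        lcp = len(commonprefix([prev, cur]))
        out.append((lcp, cur[lcp:]))
    return out


def encode(sorted_words, block_size=4):
    """Split the sorted dictionary into blocks and encode each one independently."""
    return [encode_block(sorted_words[i:i + block_size])
            for i in range(0, len(sorted_words), block_size)]


def access(blocks, index, block_size=4):
    """Rebuild the string at `index` by decoding only the block that holds it."""
    block = blocks[index // block_size]
    word = ""
    for lcp, suffix in block[: index % block_size + 1]:
        word = word[:lcp] + suffix
    return word


words = sorted(["database", "databases", "dataset", "date",
                "dictionary", "dictionaries"])
blocks = encode(words)
decoded = [access(blocks, i) for i in range(len(words))]
```

Because every block starts with an uncompressed string, an access decodes at most `block_size` entries, and the blocks can be built independently of one another. That independence is the property the block-based variant described in the abstract relies on.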