OB-Tree: Accelerating Data Cleaning in Out-of-Core Column-Store Databases

Feng Yu, Brandon J. Latronics, Tyler Matacic, Eric S. Jones

2017 IEEE International Congress on Big Data (BigData Congress) (2017)

Abstract
The column-store database, featuring a column-by-column data layout and fast data retrieval, is representative of next-generation database management systems in the big data era. Optimizing write performance is a well-known challenge in out-of-core (or external-memory) column-store databases. Data cleaning removes redundant data and improves the overall performance of the database. Previously proposed data cleaning methods require long execution times and additional computing resources, which makes them inefficient for column-store databases holding large volumes of data. This work introduces an auxiliary tree index and high-speed data cleaning methods to improve the overall processing speed of columnar data. The proposed index, called OB-tree, comes with a rich set of operations and offers multiple advantages when working with a wide range of column-store databases. We introduce new data cleaning methods that use OB-tree to efficiently identify target records and their locations. Extensive experiments show that the proposed methods yield significant performance improvements for data cleaning on column-store databases.
Keywords
Column-Store Database, Index, B+-Tree, Write Optimization