Data Cleaning in Out-of-Core Column-Store Databases: An Index-Based Approach

semanticscholar(2016)

Abstract
Write optimization in out-of-core (external-memory) column-store databases is a well-known challenge. The Timestamped Binary Association Table (TBAT) and Asynchronous Out-of-Core Update (AOC Update) have shown significant improvements for this problem. However, after a period of AOC updates, selection query performance on the TBAT gradually degrades. Data cleaning methods can merge the update records in the TBAT to speed up ad hoc searches, but cleaning can itself be time-consuming. In this work, we introduce multiple data cleaning methods that use an index structure called the offset B-tree (OB-tree). When the OB-tree and the update records fit into system memory, an eager data cleaning approach provides fast cleaning. In a data-intensive environment, the OB-tree index or the update records might be too large to fit into memory; for this case, we introduce a progressive data cleaning approach that divides the update records into small slips and cleans the data in a memory-economical manner.

Keywords: column-store database, data cleaning, index, B-tree
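The contrast between the eager and progressive approaches can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the OB-tree layout or the TBAT storage format, so the sketch assumes update records of the form (timestamp, oid, value), uses a plain in-memory dictionary as a stand-in for both the table body and the index, and all function names and the `slip_size` parameter are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed record layout (not from the paper): (timestamp, oid, value).
Record = Tuple[int, int, str]
# Stand-in for the cleaned table body: oid -> (timestamp, value).
# A real system would keep this on disk and locate rows through an index.
Body = Dict[int, Tuple[int, str]]


def clean_slip(body: Body, slip: List[Record]) -> None:
    """Merge one slip of update records, keeping the newest value per oid."""
    for ts, oid, value in slip:
        current = body.get(oid)
        # On equal timestamps the record appended later wins.
        if current is None or ts >= current[0]:
            body[oid] = (ts, value)


def eager_clean(body: Body, updates: List[Record]) -> None:
    """Eager variant: treat all update records as a single in-memory batch."""
    clean_slip(body, updates)


def progressive_clean(body: Body, updates: List[Record], slip_size: int) -> None:
    """Progressive variant: merge the updates one fixed-size slip at a time."""
    if slip_size <= 0:
        raise ValueError("slip_size must be positive")
    for start in range(0, len(updates), slip_size):
        clean_slip(body, updates[start:start + slip_size])
```

Under these assumptions both variants yield the same cleaned table; the progressive one only bounds how many update records are handled per step, which is the memory trade-off the abstract describes.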