A Novel Method to Improve Hit Rate for Big Data Quick Reading

Xiaobo Zhang, Xinxin Zhou, Zhaohui Zhang, Lizhi Wang, Pengwei Wang

2019 International Conference on Artificial Intelligence and Advanced Manufacturing (AIAM) (2019)

Abstract

In big data mining and analysis, the data records in a dataset are retrieved at random. Distributed storage systems such as BigTable and HBase provide a cache policy for file blocks during retrieval operations. Because the requested records are scattered across different file blocks, the block cache does not achieve a high hit rate. To address this problem, we propose an LRU-based double-queue K-frequency cache method (DLK). The method uses a double-queue storage structure that applies different storage and eviction rules to data with high and low access frequency. It also divides memory into a data area and a list area and uses a different data structure for each, reducing the time spent on data retrieval and data processing. Experimental results show that the proposed method reduces retrieval time by 30% with the cache mechanism. Compared with existing methods, DLK improves the hit rate by 60.1% and reduces retrieval time by 43.5%. With smaller cache capacities, our method outperforms the other algorithms.

Keywords

Distributed cache, replacement, frequency, double-queue
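
The abstract does not give the details of DLK, so the following is only a minimal sketch of the general idea it names: a record-level cache with two LRU queues, where a record moves from a low-frequency queue to a high-frequency queue once it has been accessed K times. The class name `DoubleQueueKCache`, the parameters `low_capacity`, `high_capacity`, and `k`, and the promotion and eviction rules are all assumptions made for illustration, not the authors' implementation.

```python
from collections import OrderedDict


class DoubleQueueKCache:
    """Illustrative two-queue LRU cache with a K-access promotion rule.

    Hypothetical sketch: the rules below are assumptions, not the
    DLK algorithm as published.
    """

    def __init__(self, low_capacity, high_capacity, k=2):
        self.low_capacity = low_capacity
        self.high_capacity = high_capacity
        self.k = k
        # Low-frequency queue: key -> (value, access_count), oldest first.
        self.low = OrderedDict()
        # High-frequency queue: key -> value, oldest first.
        self.high = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, key):
        """Return the cached value, or None on a miss."""
        if key in self.high:
            self.high.move_to_end(key)  # refresh recency
            self.hits += 1
            return self.high[key]
        if key in self.low:
            value, count = self.low.pop(key)
            self.hits += 1
            self._store(key, value, count + 1)
            return value
        self.misses += 1
        return None

    def put(self, key, value):
        """Insert or update a record; a new record counts as one access."""
        if key in self.high:
            self.high[key] = value
            self.high.move_to_end(key)
        elif key in self.low:
            _, count = self.low.pop(key)
            self._store(key, value, count)
        else:
            self._store(key, value, 1)

    def _store(self, key, value, count):
        # Records with at least k accesses go to the high-frequency queue;
        # each queue evicts its own least recently used entry when full.
        if count >= self.k:
            self.high[key] = value
            if len(self.high) > self.high_capacity:
                self.high.popitem(last=False)
        else:
            self.low[key] = (value, count)
            if len(self.low) > self.low_capacity:
                self.low.popitem(last=False)

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch, records that are read only once can be pushed out of the low-frequency queue only by other rarely read records, so a burst of one-off random reads does not evict records that have already been promoted.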
