Research and Application of Query Optimization Based on HBase

Kai Lei, Xiaoyu Xiong, Zengzhou Wang, Fangzhou Shi, Jian He

2021 7th International Conference on Computer and Communications (ICCC) (2021)

Abstract
With the advent of the big data era, the management and retrieval of massive data face higher technical requirements. As a distributed NoSQL database in the Hadoop ecosystem, HBase has been chosen by many companies as the storage medium for big data because of its strong scalability, large storage capacity, and good read and write performance. However, HBase uses the MapReduce framework to process aggregate queries, which must be computed in real time and whose results cannot be retained. This paper studies query optimization techniques based on HBase. To improve the efficiency of HBase for aggregating and querying time series data, an index structure based on a time split tree is proposed. Because a tree index stored on disk has high query overhead and its query time is strongly affected by the amount of data, the time split tree structure is improved and the query algorithm is optimized at the same time, avoiding the disk I/O overhead of traversing the index tree layer by layer. First, time summary information, that is, aggregation information over fixed-length data, is saved by constructing a time split tree. Then the time split tree index is persisted into HBase at a set time, the corresponding timing information is added, and an HBase primary key is constructed from the sequence number, the timing number, and the time range information. Using this timing information, the optimized index query algorithm effectively narrows the retrieval range in the data table and improves the efficiency of aggregate queries. The effectiveness of the improved method is demonstrated on a large volume of wind turbine operating data from wind farms.
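
The abstract does not give the exact layout of the time split tree or of the primary key, so the following is only a minimal in-memory sketch of the general idea: fixed-length time buckets are summarized, merged pairwise into a tree, and a range query is answered from the few nodes that are fully covered rather than from raw rows. The names `Summary`, `Node`, `build`, `query`, and `row_key`, the choice of count/sum/min/max as the summary, and the key format are all assumptions made for illustration, not the authors' implementation, and no HBase API is used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Summary:
    """Aggregate over one time range (assumed fields: count, sum, min, max)."""
    count: int
    total: float
    minimum: float
    maximum: float

    def merge(self, other: "Summary") -> "Summary":
        return Summary(
            self.count + other.count,
            self.total + other.total,
            min(self.minimum, other.minimum),
            max(self.maximum, other.maximum),
        )


@dataclass
class Node:
    start: int                      # inclusive start of the time range
    end: int                        # exclusive end of the time range
    summary: Summary
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def build(buckets: List[Tuple[int, Summary]], width: int) -> Node:
    """Build a tree bottom-up from sorted, contiguous fixed-width buckets."""
    if not buckets:
        raise ValueError("at least one bucket is required")
    nodes = [Node(start, start + width, s) for start, s in buckets]
    while len(nodes) > 1:
        merged = []
        for i in range(0, len(nodes), 2):
            if i + 1 < len(nodes):
                l, r = nodes[i], nodes[i + 1]
                merged.append(Node(l.start, r.end, l.summary.merge(r.summary), l, r))
            else:
                merged.append(nodes[i])  # odd node is carried up unchanged
        nodes = merged
    return nodes[0]


def query(node: Optional[Node], qs: int, qe: int) -> Optional[Summary]:
    """Aggregate over [qs, qe); the range must align with bucket boundaries."""
    if node is None or qe <= node.start or qs >= node.end:
        return None                  # no overlap with this subtree
    if qs <= node.start and node.end <= qe:
        return node.summary          # fully covered: reuse the stored summary
    if node.left is None:
        # A partly covered bucket would need a scan of raw rows (not sketched).
        raise ValueError("query range is not aligned to bucket boundaries")
    parts = [p for p in (query(node.left, qs, qe), query(node.right, qs, qe)) if p]
    if not parts:
        return None
    result = parts[0]
    for p in parts[1:]:
        result = result.merge(p)
    return result


def row_key(seq_no: int, timing_no: int, start: int, end: int) -> str:
    """Hypothetical key layout: sequence number | timing number | time range."""
    return f"{seq_no:04d}|{timing_no:02d}|{start:010d}|{end:010d}"
```

Zero-padding the fields in `row_key` keeps keys in lexicographic order consistent with numeric order, which is what would let a range scan over such keys narrow the retrieval range; whether the paper pads or encodes its keys this way is not stated.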
Keywords
HBase, secondary index, time series data, aggregation query, time split tree