Cowic: A Column-Wise Independent Compression For Log Stream Analysis

CCGRID '15: Proceedings of the 15th IEEE/ACM International Symposium on Cluster, Cloud, and Grid Computing (2015)

Abstract
Massive log streams are now generated by many Internet and cloud services. Storing these streams consumes a large amount of disk space and incurs high cost. Traditional compression methods can reduce the storage cost, but they are inefficient for log analysis, because fetching relevant log entries from compressed data often requires retrieving and decompressing large blocks of data. We propose a column-wise compression approach for well-formatted log streams, in which each log entry can be compressed or decompressed independently for analysis. Specifically, we separate a log entry into several columns and compress each column with a different model. We have implemented our approach as a library and integrated it into two applications: a log search system and a log joining system. Experimental results show that our compression scheme outperforms traditional compression methods in decompression time and has a competitive compression ratio. For log search, our approach achieves better query times than traditional compression algorithms in both in-core and out-of-core cases. For joining log streams, our approach achieves the same join quality while using only 30% of the memory required by uncompressed streams.
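The idea of splitting an entry into columns and compressing each entry independently can be illustrated with a minimal sketch. This is not the paper's implementation: the class `ColumnDictModel`, the helper functions, the space-delimited log layout, and the use of one simple dictionary model for every column are all assumptions made for illustration (the abstract says only that columns are compressed with different models, without naming them here).

```python
from typing import Dict, List, Tuple

ESCAPE = 255  # hypothetical marker byte for values missing from the dictionary


class ColumnDictModel:
    """Hypothetical per-column model: maps known values to one-byte codes."""

    def __init__(self) -> None:
        self.code_of: Dict[str, int] = {}
        self.value_of: List[str] = []

    def train(self, value: str) -> None:
        # Assign the next free code to a value seen for the first time.
        if value not in self.code_of:
            self.code_of[value] = len(self.value_of)
            self.value_of.append(value)

    def encode(self, value: str) -> bytes:
        code = self.code_of.get(value)
        if code is not None and code < ESCAPE:
            return bytes([code])
        # Fallback: escape byte, 2-byte length, raw UTF-8 bytes.
        # Assumes a single column value is shorter than 65536 bytes.
        raw = value.encode("utf-8")
        return bytes([ESCAPE]) + len(raw).to_bytes(2, "big") + raw

    def decode(self, buf: bytes, pos: int) -> Tuple[str, int]:
        tag = buf[pos]
        if tag != ESCAPE:
            return self.value_of[tag], pos + 1
        n = int.from_bytes(buf[pos + 1:pos + 3], "big")
        start = pos + 3
        return buf[start:start + n].decode("utf-8"), start + n


def train_models(sample: List[str], n_columns: int) -> List[ColumnDictModel]:
    """Build one model per column from a sample of log entries."""
    models = [ColumnDictModel() for _ in range(n_columns)]
    for entry in sample:
        for model, col in zip(models, entry.split(" ", n_columns - 1)):
            model.train(col)
    return models


def compress_entry(entry: str, models: List[ColumnDictModel]) -> bytes:
    """Compress a single entry; no other entry is needed."""
    cols = entry.split(" ", len(models) - 1)
    if len(cols) != len(models):
        raise ValueError("entry does not match the expected column count")
    return b"".join(m.encode(c) for m, c in zip(models, cols))


def decompress_entry(blob: bytes, models: List[ColumnDictModel]) -> str:
    """Decompress a single entry using only the models and its own bytes."""
    pos, cols = 0, []
    for model in models:
        value, pos = model.decode(blob, pos)
        cols.append(value)
    return " ".join(cols)


sample = [
    "10.0.0.1 GET /index.html 200",
    "10.0.0.2 GET /about.html 200",
    "10.0.0.1 POST /login 302",
]
models = train_models(sample, n_columns=4)
blob = compress_entry("10.0.0.2 POST /index.html 200", models)
restored = decompress_entry(blob, models)
```

Because the models are built ahead of time and each entry is encoded on its own, a search or join system could decode exactly the entries it needs, rather than decompressing a whole block as a general-purpose block compressor would require.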
Keywords
Log Stream Compression, Log Search, Log Joining