Efficient Histogram Estimation for Smart Grid Data Processing With the Loglog-Bloom-Filter

IEEE Trans. Smart Grid (2015)

With the emergence of smart grids, a critical challenge for administrators of wide-area measurement systems is analyzing and modeling streaming data with the limited resources of their embedded controllers. Streaming data can usually be modeled as a multiset in which each data item has its own frequency. In this paper, we study the problem of how to generate histograms of data items based on their frequencies, so that issues such as power line tripping or line faults can be identified under resource constraints. The primary challenge with conventional methods is that keeping an individual counter for each unique type of data is too memory-consuming, slow, and costly. We describe a novel data structure and its associated algorithms, called the Loglog-Bloom-Filter, for this purpose. This data structure extends the classical Bloom filter with a recent technique called probabilistic counting, so it can generate histograms for streaming data in one pass with sublinear overhead. The method is therefore suitable for data processing in smart grids, where the controllers have limited computational resources. We analyze the performance, trade-offs, and capacity of this data structure, and evaluate it with real data traces collected by the frequency disturbance recorders deployed for the FNET/GridEye infrastructure. We demonstrate that this method can estimate the frequencies of all unique items with high accuracy and low memory overhead, so that data outliers can be conveniently identified.
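The abstract does not give the internals of the Loglog-Bloom-Filter, so the following is only a rough sketch of the general idea it describes: replacing per-item exact counters with a shared, Bloom-filter-style array of tiny probabilistic counters. It is not the paper's algorithm. As a stand-in for the paper's probabilistic counting, it uses Morris-style approximate counters, and the class name `ApproxCounterFilter`, the parameters `m`, `k`, and `seed`, and the `histogram` helper are all hypothetical.

```python
import hashlib
import random
from collections import Counter


class ApproxCounterFilter:
    """Bloom-filter-like array of Morris approximate counters (illustrative only)."""

    def __init__(self, m=1024, k=3, seed=0):
        self.m = m                      # number of one-byte cells (assumed size)
        self.k = k                      # number of hash functions (assumed)
        self.cells = bytearray(m)       # each cell stores a small exponent c
        self.rng = random.Random(seed)  # seeded so runs are reproducible

    def _positions(self, item):
        # Derive k cell indices by hashing the item with k different prefixes.
        # A set is returned so that each cell is touched at most once per item.
        positions = set()
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            positions.add(int.from_bytes(digest[:8], "big") % self.m)
        return positions

    def add(self, item):
        # Morris update: bump the exponent c with probability 2**-c,
        # so a one-byte cell can represent very large counts.
        for pos in self._positions(item):
            c = self.cells[pos]
            if c < 255 and self.rng.random() < 2.0 ** -c:
                self.cells[pos] = c + 1

    def estimate(self, item):
        # Collisions can only inflate a cell, so take the minimum over
        # the item's cells; 2**c - 1 is the Morris estimate of the count.
        c = min(self.cells[pos] for pos in self._positions(item))
        return 2 ** c - 1

    def histogram(self, candidates):
        # Map each estimated frequency to the number of candidate items
        # that have it. The sketch stores no item identities, so the
        # caller must supply the candidates.
        return Counter(self.estimate(item) for item in candidates)
```

Memory use is fixed at `m` bytes regardless of how many distinct items appear in the stream, and each item is processed once. The estimates are coarse (always one less than a power of two) and have high variance for a single sketch; averaging several independently seeded sketches is one common way to tighten them.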
Radiation detectors, Smart grids, Histograms, Frequency estimation, Estimation, Data structures, Probabilistic logic