M4: A Framework for Per-Flow Quantile Estimation.

Siyuan Dong, Zhuochen Fan, Tianyu Bai, Tong Yang, Hanyu Xue, Peiqing Chen, Yuhan Wu

IEEE International Conference on Data Engineering (2024)

Abstract
The field of quantile estimation has grown in importance due to its myriad practical applications. Recent research trends have evolved from estimating the quantile for a single data stream to developing data structures that can concurrently estimate quantiles for multiple sub-streams, also known as flows. This paper introduces a novel framework, M4, designed to estimate per-flow quantiles in data streams accurately. M4 is a versatile framework that can be integrated with a wide array of single-flow quantile estimation algorithms, thereby enabling them to perform per-flow estimation. The framework employs a sketch-based approach to provide a space-efficient method for recording and extracting distribution information. M4 incorporates two techniques: MINIMUM and SUM. The MINIMUM technique minimizes the noise on a flow from other flows caused by hash collisions, while the SUM technique efficiently categorizes flows based on their sizes and customizes treatment strategies accordingly. We demonstrate the application of M4 on three single-flow quantile estimation algorithms (DDSketch, t-digest, and ReqSketch), detailing the specific implementation of the MINIMUM and SUM techniques. We provide theoretical proof that M4 delivers high accuracy while utilizing limited memory. Additionally, we conduct extensive experiments to evaluate the performance of M4 regarding accuracy and speed. The experimental results indicate that across all three example algorithms, M4 significantly outperforms two comparison frameworks in terms of accuracy for per-flow quantile estimation while maintaining comparable speed.
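
The abstract does not spell out how MINIMUM is implemented, so the following is only a rough sketch of the general idea of suppressing hash-collision noise by taking a minimum. It assumes a Count-Min-style layout in which each flow hashes to one cell per row, each cell keeps a DDSketch-like logarithmic histogram, and a query takes the bucket-wise minimum across the flow's cells. The class name `MinHistogramSketch`, its parameters, and the hashing scheme are all hypothetical and are not taken from the paper; the SUM technique is not modeled here.

```python
import hashlib
import math
from collections import Counter


class MinHistogramSketch:
    """Hypothetical illustration: rows x width cells, each a log-bucketed histogram."""

    def __init__(self, rows=3, width=64, gamma=1.05):
        self.rows = rows
        self.width = width
        self.gamma = gamma  # bucket growth factor (assumed, DDSketch-style)
        self.cells = [[Counter() for _ in range(width)] for _ in range(rows)]

    def _cell(self, row, flow_id):
        # One independent-looking hash per row, derived by salting with the row index.
        digest = hashlib.blake2b(f"{row}:{flow_id}".encode(), digest_size=8).digest()
        return self.cells[row][int.from_bytes(digest, "big") % self.width]

    def _bucket(self, value):
        # Logarithmic bucket index; values are assumed to be positive.
        return math.ceil(math.log(value, self.gamma))

    def insert(self, flow_id, value):
        bucket = self._bucket(value)
        for row in range(self.rows):
            self._cell(row, flow_id)[bucket] += 1

    def quantile(self, flow_id, q):
        cells = [self._cell(row, flow_id) for row in range(self.rows)]
        buckets = set().union(*cells)
        # Each cell over-counts because of colliding flows, so the bucket-wise
        # minimum across rows is the least-contaminated estimate of the flow's counts.
        merged = {b: min(cell[b] for cell in cells) for b in buckets}
        total = sum(merged.values())
        if total == 0:
            return None
        rank = q * (total - 1)
        seen = 0
        for b in sorted(merged):
            seen += merged[b]
            if seen > rank:
                # Representative value of the bucket (gamma^(b-1), gamma^b].
                return 2 * self.gamma ** b / (self.gamma + 1)
        return None
```

In this sketch, a heavy flow that shares a cell with a light flow in one row only distorts the light flow's estimate if it also shares the same histogram bucket in every other row, which is the intuition behind using a minimum.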
Key words
Quantile Estimation, Data Streams, Distribution Information, Recent Research Trends, Histogram, Service Quality, Distribution Of Values, Real-valued, Less Than Or Equal, Horizontal Axis, Real-world Datasets, Hash Function, Anomaly Detection, Higher Layers, Large Flow, Flow Distribution, Error Bounds, Medium Flow, Small Flow, Long-tailed Distribution, Flow Size, Insertion Operator, Aligned Segments, Range Of Algorithms