SIFT: A simple algorithm for tracking elephant flows, and taking advantage of power laws

msra

Abstract
The past ten years have seen the discovery of a number of "power laws" in networking. When applied to network traffic, these laws imply the following 80-20 rule: 80% of the work is brought by 20% of the flows; in particular, 80% of Internet packets are generated by 20% of the flows. Heavy advantage could be taken of such a statistic if we could identify the packets of these dominant flows with minimal overhead. This motivates us to develop SIFT, a simple randomized algorithm for identifying the packets of large flows. SIFT is based on the inspection paradox: a low-bias coin will quite likely choose a packet from a large flow while simultaneously missing the packets of small flows. We describe the basic algorithm and some variations. We then list some uses of SIFT that we are currently exploring, and focus on one particular use in this paper: a mechanism for allowing a router to differentially allocate link bandwidth and buffer space to large and small flows. We compare SIFT's performance with the current practice in Internet routers via ns simulations. The advantages of using SIFT are a significant reduction in end-to-end flow delay, a sizeable reduction in the total buffer size, and an increase in goodput. We comment on the implementability of SIFT and argue that it is feasible to deploy it in today's Internet.

I. INTRODUCTION

Scheduling policies significantly affect the performance of resource allocation systems. When applied to situations where job sizes follow a heavy-tailed (or power law) distribution, the benefits can be particularly large. Consider the preponderance of heavy-tailed distributions in network traffic: Internet flow sizes (19) and web traffic (8) have both been demonstrated to be heavy-tailed. The heavy-tailed distribution of web traffic has been exploited to reduce download latencies by scheduling web sessions using the shortest remaining processing time (SRPT) algorithm.
The benefit over the FIFO and the processor sharing (PS) disciplines is a reduction in mean delay of several orders of magnitude (12). Given the heavy-tailed nature of the Internet flow-size distribution, one expects that incorporating flow-size information in router, switch, and caching algorithms will lead to similar improvements in performance. Before investigating the potential improvements in performance, we must address the following question: how can a router tell if a given packet is from a large flow? The SIFT algorithm gives a practical answer to this question by separating (or sifting) the packets of long and short flows. SIFT randomly samples each arriving packet using a coin of small bias p. For example, with p = 0.01, a flow with 15 packets is going to have at least one of its packets sampled with a probability of just 0.15, while a flow with 100 packets is going to have at least one of its packets sampled with probability roughly equal to 1. Thus, most long flows will be sampled sooner or later, whereas most short flows will not be sampled. Being randomized, the SIFT algorithm will, of course, commit errors: it will likely miss some largish flows and sample some smallish flows. We later describe some simple ways of drastically reducing these errors. Once a packet is sampled, all further packets from that flow can be processed in ways which can be beneficial. We describe one use of SIFT in this paper: scheduling packets of long and short flows differentially. In (18), (14), another use of SIFT is discussed: caching heavy-tail distributed flows and requests at routers and web caches, respectively.
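
The sampling rule described above can be illustrated with a short sketch. It is not the authors' implementation: the names `prob_sampled` and `Sifter`, the in-memory set of sampled flow identifiers, and the choice to still treat the packet that triggers the sample as "short" are all assumptions made for illustration. If each packet is sampled independently with bias p, a flow of n packets has at least one packet sampled with probability 1 - (1 - p)^n. Under that assumption, with p = 0.01 the 15-packet figure comes out near 0.14 and the 100-packet figure near 0.63; the probability only gets close to 1 for flows of several hundred packets.

```python
import random
from typing import Hashable, Optional, Set


def prob_sampled(n_packets: int, p: float) -> float:
    """Probability that at least one of n_packets is sampled,
    assuming independent coin flips of bias p per packet."""
    return 1.0 - (1.0 - p) ** n_packets


class Sifter:
    """Hypothetical sketch of per-packet sampling (not the paper's code).

    Each packet of a not-yet-sampled flow is sampled with probability p.
    Once a flow has been sampled, all of its later packets are labelled
    "long"; everything else is labelled "short".
    """

    def __init__(self, p: float, rng: Optional[random.Random] = None) -> None:
        self.p = p
        self.rng = rng if rng is not None else random.Random()
        self.sampled_flows: Set[Hashable] = set()

    def classify(self, flow_id: Hashable) -> str:
        """Label one arriving packet of the given flow."""
        if flow_id in self.sampled_flows:
            return "long"
        # Flip the low-bias coin for this packet.
        if self.rng.random() < self.p:
            self.sampled_flows.add(flow_id)
        # Assumption: the sampled packet itself is still handled as "short";
        # only *further* packets of the flow are treated as "long".
        return "short"


if __name__ == "__main__":
    for n in (15, 100, 500):
        print(n, round(prob_sampled(n, 0.01), 3))
```

A router-side use would call `classify` once per arriving packet and steer "long" and "short" packets to different queues; how flow identifiers are derived and when entries are evicted from `sampled_flows` are left open here.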