StackPool: A High-Performance Scalable Network Architecture on Multi-core Servers

HPCC/EUC(2013)

Abstract
There are numerous proprietary appliances in operators' networks. These appliances consume a lot of electricity and take up plenty of space, which leads to high operating expense (OPEX) for operators. Network Function Virtualisation (NFV) was introduced to solve this problem. NFV consolidates many network devices into network applications that run on industry commodity servers. These appliances differ from routers: they must handle protocol processing above the network layer and provide socket APIs to various applications, so they need full protocol stack support rather than packet forwarding alone. Unfortunately, despite link speeds of 10 Gbps or even 40 Gbps on commodity multi-core servers, network protocol processing remains a bottleneck: throughput does not scale with the number of cores, and stack processing latency is too long for some applications. In this paper, the reasons for poor software stack performance (especially performance scalability and stack processing latency) are systematically analyzed. Based on this analysis, we propose StackPool, a novel high-performance scalable network architecture on multi-core servers. StackPool consists of multiple isolated virtual lanes. Each virtual lane contains an independent protocol stack instance, several pairs of hardware queues in the NICs, and the socket instances located in that stack instance. Each logical CPU core is responsible for processing the packets of one virtual lane. The flow director in the NIC and the lane selector in StackPool direct packets of different flows to the virtual lanes based on packet headers. We have implemented a StackPool prototype to show that the approach is promising. In a single virtual lane, StackPool achieves approximately 7 times the UDP throughput and 3 times the TCP throughput of the standard Linux protocol stack. Moreover, StackPool performance scales linearly across multiple cores, e.g., 10.7 times and 17.2 times for UDP transmit and receive, respectively, on 6 cores, and 6.5 times for TCP throughput on 6 logical cores. At the same time, packet latency on StackPool is only about one quarter of that on the native Linux stack.
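The lane idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract says only that packets are directed to virtual lanes based on packet headers, so the CRC32 hash over the 5-tuple, the names `Flow`, `VirtualLane`, `select_lane`, and `dispatch`, and the choice of 6 lanes are all assumptions made for illustration.

```python
import zlib
from collections import namedtuple

# Hypothetical flow key built from packet header fields (the 5-tuple).
Flow = namedtuple("Flow", "src_ip dst_ip src_port dst_port proto")


def select_lane(flow, num_lanes):
    """Map a flow to a lane index by hashing its header fields.

    Assumption: a CRC32 hash modulo the lane count; the paper's actual
    selection policy is not given in the abstract.
    """
    key = "|".join(str(field) for field in flow).encode()
    return zlib.crc32(key) % num_lanes


class VirtualLane:
    """One isolated lane: its own socket table, nothing shared with other lanes."""

    def __init__(self, lane_id):
        self.lane_id = lane_id
        self.sockets = {}   # per-lane socket state, keyed by flow
        self.packets = 0    # packets processed by this lane

    def handle(self, flow, payload):
        # Stand-in for protocol processing inside this lane's stack instance.
        self.sockets.setdefault(flow, []).append(payload)
        self.packets += 1


# Assumed setup: one lane per logical core, 6 lanes in total.
lanes = [VirtualLane(i) for i in range(6)]


def dispatch(flow, payload):
    """Send a packet to the lane that owns its flow and return that lane's id."""
    lane = lanes[select_lane(flow, len(lanes))]
    lane.handle(flow, payload)
    return lane.lane_id
```

Because every packet of a given flow hashes to the same lane, each lane's socket state is touched by only one core, which is the property the isolated-lane design relies on to avoid cross-core sharing.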
Keywords
parallelism,parallel processing,opex,protocols,multiple isolated virtual lanes,network function virtualisation,performance scalability,industry commodity servers,socket api,flow director,high-performance scalable network architecture,high operating expense,stack process latency,multiprocessing systems,scalability,virtualisation,multicore servers,multi-core servers,nfv,network protocol stack,logical cpu core,network layer,performance,stackpool prototype,tcp throughput,linux,hardware,throughput,servers