Swing: Providing Long-Range Lossless RDMA via PFC-Relay

IEEE Transactions on Parallel and Distributed Systems (2023)

Abstract
Remote Direct Memory Access (RDMA) has been widely deployed in datacenters for its high performance. Large-scale, high-performance cloud services built on geographically distributed datacenters need long-range RDMA to meet their performance requirements. However, existing RDMA solutions can hardly satisfy the stringent throughput and delay requirements of these emerging services. On the one hand, lossless RDMA requires deep buffers and can suffer suboptimal throughput for inter-datacenter traffic because responses to Priority Flow Control (PFC) messages are delayed over long links. On the other hand, lossy RDMA with selective retransmission performs poorly when multiple flows with different round-trip times (RTTs) coexist in cross-datacenter scenarios. This article proposes Swing, which extends high-performance lossless RDMA to long-distance links through PFC-Relay. Swing preserves the throughput of long-distance links while minimizing the buffer required for long-range RDMA, and it enables long-range RDMA without any modification to existing in-datacenter networks. The evaluation shows that Swing reduces the average flow completion time (FCT) by 14%-66% across a variety of traffic scenarios.
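To make the buffer concern concrete, the following is a rough back-of-the-envelope sketch and not the paper's model. It assumes that after a switch sends a PFC PAUSE, up to one round trip's worth of data can still arrive and must be buffered. The function name `pfc_headroom_bytes`, the fiber propagation speed of about 200,000 km/s, and the example link parameters are illustrative assumptions. The estimate ignores MTU-sized frames in flight, sender response time, and interface delays.

```python
def pfc_headroom_bytes(link_gbps: float, distance_km: float,
                       fiber_km_per_s: float = 200_000.0) -> float:
    """Estimate the bytes still in flight after a PAUSE frame is sent.

    Assumption: the PAUSE takes one propagation delay to reach the sender,
    and data already on the wire takes another to drain, so the receiver
    must absorb roughly one round trip of traffic at line rate.
    """
    one_way_s = distance_km / fiber_km_per_s   # propagation delay, one direction
    rtt_s = 2 * one_way_s                      # round-trip propagation delay
    bytes_per_s = link_gbps * 1e9 / 8          # line rate in bytes per second
    return bytes_per_s * rtt_s


if __name__ == "__main__":
    # Hypothetical links: 100 m inside a datacenter vs. 100 km between datacenters.
    for km in (0.1, 100.0):
        mb = pfc_headroom_bytes(100.0, km) / 1e6
        print(f"{km:>6} km at 100 Gbps: ~{mb:.4f} MB of headroom")
```

Under these assumptions, a 100 Gbps link needs on the order of 12.5 KB of headroom over 100 m but about 12.5 MB over 100 km, per lossless priority and per port. That thousandfold gap is what the abstract refers to as the deep-buffer requirement of long-range lossless RDMA.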
Keywords
Inter-datacenter communication, Datacenter networks, Flow control, PFC, RDMA