Error Recovery of RDMA Packets in Data Center Networks

2019 28th International Conference on Computer Communication and Networks (ICCCN)(2019)

Abstract
Modern data center applications need high throughput (40 Gbps) and ultra-low latency (<10 µs per hop), along with low CPU overhead. Remote Direct Memory Access (RDMA), which can be deployed over commodity Ethernet through the RoCEv2 protocol, has the potential to satisfy these requirements. RoCEv2 needs a lossless environment to achieve high performance, and it uses Priority-based Flow Control (PFC) to prevent packet loss caused by buffer overflow. Packet loss can still happen in today's data centers for other reasons, such as switch configuration errors. Two retransmission algorithms handle packet loss recovery: Go-Back-0 and Go-Back-N. Unfortunately, when Go-Back-N is simply applied to RoCEv2, the relative throughput drops to nearly zero once the packet loss rate exceeds 1%. This is mainly caused by an improper mechanism for triggering NAK generation. This paper proposes an Improved Go-Back-N algorithm, which involves two mechanisms, to solve this problem. The Improved Go-Back-N algorithm is easy to deploy in today's data centers because it requires no changes to switches. It raises the relative throughput to about 60% at a packet loss rate of 1%.
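To make the difference between the two baseline retransmission schemes concrete, the following is a minimal sketch, not the paper's Improved Go-Back-N (the abstract does not describe its two mechanisms). It assumes an idealized model: the network drops the transmissions listed in a hypothetical `lost_attempts` set, the sender pushes out `in_flight` further packets before the NAK returns, and the receiver discards those as out of order. The function name and all parameters are illustrative assumptions.

```python
def count_transmissions(num_packets, lost_attempts, in_flight, restart_from_zero):
    """Count the transmissions needed to deliver one message of num_packets.

    lost_attempts: set of 0-based transmission indices dropped by the network
                   (hypothetical, deterministic loss pattern).
    in_flight: packets sent after a lost one before the NAK reaches the sender;
               the receiver discards them as out of order.
    restart_from_zero: True models Go-Back-0, False models Go-Back-N.
    """
    sent = 0       # total transmissions so far
    next_seq = 0   # next sequence number the receiver expects
    while next_seq < num_packets:
        if sent in lost_attempts:
            # The lost packet plus the packets wasted before the NAK arrives.
            wasted = min(in_flight, num_packets - 1 - next_seq)
            sent += 1 + wasted
            if restart_from_zero:
                next_seq = 0   # Go-Back-0: resend the whole message
            # Go-Back-N: resume from the lost packet (next_seq unchanged)
        else:
            sent += 1
            next_seq += 1
    return sent


go_back_n = count_transmissions(10, {5}, in_flight=2, restart_from_zero=False)
go_back_0 = count_transmissions(10, {5}, in_flight=2, restart_from_zero=True)
print(go_back_n, go_back_0)
```

In this toy model a single loss costs Go-Back-N only the lost packet and the packets already in flight, while Go-Back-0 also repeats everything delivered before the loss. The model ignores timeouts and the NAK-triggering behavior that the paper identifies as the cause of the throughput collapse.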
Keywords
RDMA packets, data center networks, Remote Direct Memory Access, Priority-based Flow Control, data centers, switch configuration error, packet loss recovery, packet loss rate, RoCEv2 protocol, RDMA over commodity Ethernet protocol, Go-Back-N algorithm