Designing Scalable and High-Performance MPI Libraries on Amazon Elastic Fabric Adapter

2019 IEEE Symposium on High-Performance Interconnects (HOTI) (2019)

Abstract
Amazon recently announced Elastic Fabric Adapter (EFA), a new network interface targeting tightly coupled HPC workloads. In this paper, we characterize the features, capabilities, and performance of the adapter. We also explore how its transport models, such as UD (Unreliable Datagram) and SRD (Scalable Reliable Datagram), affect the design of high-performance MPI libraries. Our evaluations show that the hardware-level reliability provided by SRD can significantly improve the performance of MPI communication. We also propose a new zero-copy transfer mechanism over unreliable and unordered channels that can reduce the communication latency of large messages. The proposed design also shows significant improvement in collective and application performance over the vendor-provided MPI library.
Keywords
Elastic Fabric Adapter, EFA, SRD, EC2, MPI, HPC