UDON: A case for offloading to general purpose compute on CXL memory

Jon Hermes, Josh Minor, Minjun Wu, Adarsh Patil, Eric Van Hensbergen

arXiv (2024)

Abstract
Upcoming CXL-based disaggregated memory devices feature special purpose units to offload compute to near-memory. In this paper, we explore opportunities for offloading compute to general purpose cores on CXL memory devices, thereby enabling a greater utility and diversity of offload. We study two classes of popular memory-intensive applications, ML inference and vector databases, as candidates for computational offload. The study uses Arm AArch64-based dual-socket NUMA systems to emulate CXL type-2 devices. Our study shows promising results: with our ML inference model partitioning strategy for compute offload, we can place up to 90% [...] with just 20% [...], and offloading vector search (HNSW) kernels in vector databases can provide up to a 6.87× performance improvement with under 10% [...].
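The emulation setup the abstract describes, using one socket of a dual-socket NUMA machine to stand in for a CXL type-2 device, can be approximated with standard Linux tooling. The commands below are an illustrative sketch under that assumption, not the authors' exact configuration; `./workload` is a placeholder binary.

```shell
# Pin the workload's threads to socket 0 while forcing all of its
# allocations onto socket 1's memory, so every access crosses the
# inter-socket interconnect -- a rough stand-in for the added latency
# of CXL-attached memory.
numactl --cpunodebind=0 --membind=1 ./workload

# Inspect node sizes and distances to sanity-check the topology.
numactl --hardware
```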
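HNSW search, the vector-database kernel considered for offload above, works by greedily walking a proximity graph toward the query. The sketch below shows only that single-layer greedy walk on a toy graph; the function name, graph, and data are illustrative and not taken from the paper.

```python
import numpy as np

def greedy_walk(vectors, neighbors, query, entry):
    """Greedy nearest-neighbor walk over one proximity-graph layer,
    the basic step HNSW repeats across its layers.

    vectors:   (n, d) array of indexed points
    neighbors: dict mapping node id -> list of adjacent node ids
    entry:     node id to start the walk from

    From `entry`, repeatedly move to whichever neighbor of the current
    node is closest to `query`, stopping at a local minimum.
    """
    cur = entry
    cur_dist = np.linalg.norm(vectors[cur] - query)
    improved = True
    while improved:
        improved = False
        for nb in neighbors[cur]:
            d = np.linalg.norm(vectors[nb] - query)
            if d < cur_dist:
                cur, cur_dist, improved = nb, d, True
    return cur, cur_dist

# Toy example: five 1-D points on a line, chained left-to-right.
vecs = np.array([[0.0], [1.0], [2.0], [3.0], [4.0]])
adj = {i: [max(i - 1, 0), min(i + 1, 4)] for i in range(5)}
node, dist = greedy_walk(vecs, adj, query=np.array([3.2]), entry=0)
```

The walk starts at node 0 and hops rightward until no neighbor is closer, ending at node 3 (distance 0.2). The paper's offload idea is to run exactly this graph traversal on cores near the memory holding `vectors`, avoiding bulk data movement to the host.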