UDON: A case for offloading to general purpose compute on CXL memory
arXiv (2024)
Abstract
Upcoming CXL-based disaggregated memory devices feature special-purpose units
that offload compute to near-memory. In this paper, we explore opportunities for
offloading compute to general-purpose cores on CXL memory devices, thereby
enabling greater utility and diversity of offload.
We study two classes of popular memory-intensive applications, ML inference
and vector databases, as candidates for computational offload. The study uses
Arm AArch64-based dual-socket NUMA systems to emulate CXL type-2 devices.
Our study shows promising results. With our ML inference model partitioning
strategy for compute offload, we can place up to 90% [...] with
just 20% [...]. Offloading Hierarchical Navigable Small World
(HNSW) kernels in vector databases can provide up to 6.87× performance
improvement with under 10% [...].