NDSEARCH: Accelerating Graph-Traversal-Based Approximate Nearest Neighbor Search through Near Data Processing
2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA)(2023)
Abstract
Approximate nearest neighbor search (ANNS) is a key retrieval technique for
vector databases and many data center applications, such as person
re-identification and recommendation systems. It is also fundamental to
retrieval-augmented generation (RAG) for large language models (LLMs). Among
all ANNS algorithms, graph-traversal-based ANNS achieves the highest recall
rate. However, as the dataset grows, the graph may require hundreds
of gigabytes of memory, exceeding the main memory capacity of a single
workstation node. Although we can partition the graph and use a solid-state
drive (SSD) as the backing storage, the limited SSD I/O bandwidth severely
degrades the performance of the system. To address this challenge, we present NDSEARCH,
a hardware-software co-designed near-data processing (NDP) solution for ANNS
processing. NDSEARCH consists of a novel in-storage computing architecture,
namely, SEARSSD, that supports the ANNS kernels and leverages logic unit
(LUN)-level parallelism inside the NAND flash chips. NDSEARCH also includes a
processing model that is customized for NDP and cooperates with SEARSSD. The
processing model enables us to apply two-level scheduling to improve data
locality and exploit the internal bandwidth in NDSEARCH, and a speculative
searching mechanism to further accelerate the ANNS workload. Our results show
that NDSEARCH improves throughput by up to 31.7x, 14.6x, 7.4x, and 2.9x over
CPU, GPU, a state-of-the-art SmartSSD-only design, and DeepStore, respectively.
NDSEARCH also achieves two orders of magnitude higher energy efficiency than
CPU and GPU.
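For readers unfamiliar with graph-traversal-based ANNS, the following is a minimal sketch of a generic best-first search over a proximity graph. It is an illustration only, not the NDSEARCH kernel or the SEARSSD design: the brute-force k-NN graph construction, the function names (make_dataset, build_knn_graph, graph_search), and the parameters (degree, ef) are all assumptions introduced here. It shows why the workload is dominated by data-dependent vector reads: each expansion step fetches the vectors of the current node's neighbors, and which vectors are needed is known only after the previous step.

```python
import heapq
import random


def l2_sq(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def make_dataset(n, dim, seed=0):
    """Hypothetical helper: n random vectors in [0, 1)^dim."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(dim)] for _ in range(n)]


def build_knn_graph(vectors, degree):
    """Brute-force k-NN graph (assumed for illustration): each node links
    to its `degree` nearest other nodes."""
    graph = {}
    for i, v in enumerate(vectors):
        others = sorted((j for j in range(len(vectors)) if j != i),
                        key=lambda j: l2_sq(v, vectors[j]))
        graph[i] = others[:degree]
    return graph


def graph_search(vectors, graph, query, entry, k, ef):
    """Best-first traversal that keeps at most `ef` results (ef >= k);
    returns up to k node ids ordered by increasing distance to `query`."""
    d0 = l2_sq(query, vectors[entry])
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap: closest unexpanded node first
    results = [(-d0, entry)]   # max-heap via negation: worst kept result on top
    while frontier:
        dist, node = heapq.heappop(frontier)
        if len(results) >= ef and dist > -results[0][0]:
            break              # closest candidate is worse than every kept result
        for nbr in graph[node]:
            if nbr in visited:
                continue
            visited.add(nbr)
            d = l2_sq(query, vectors[nbr])  # data-dependent vector read
            if len(results) < ef or d < -results[0][0]:
                heapq.heappush(frontier, (d, nbr))
                heapq.heappush(results, (-d, nbr))
                if len(results) > ef:
                    heapq.heappop(results)  # evict the current worst result
    ordered = sorted((-neg_d, n) for neg_d, n in results)
    return [n for _, n in ordered[:k]]
```

A call such as graph_search(vectors, graph, query, entry=0, k=10, ef=64) returns approximate neighbors; a larger ef generally trades more vector reads for higher recall.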
Keywords
Near Data Processing, Approximate Nearest Neighbor Search, Hardware/Software Co-Design