Accelerating range minimum queries with ray tracing cores

Enzo Meneses, Cristóbal A. Navarro, Héctor Ferrada, Felipe A. Quezada

Future Generation Computer Systems (2024)


Abstract

Over the past decade, GPU technology has undergone a notable transformation, evolving from pure general-purpose computation to the integration of application-specific integrated circuits (ASICs), including Tensor Cores and Ray Tracing (RT) cores. While these specialized GPU cores were initially developed to enhance specific domains such as AI and real-time rendering, recent research has successfully harnessed their capabilities to expedite other tasks traditionally reliant on conventional GPU computing. One GPU task that has yet to find its way onto RT cores is the parallel processing of range minimum queries (RMQs), which is fundamental in fields such as information retrieval and pattern matching, among others. In this context, accelerating RMQs with RT cores would impact many of the applications that rely heavily on this task. In this work we present RTXRMQ, a new approach that can compute RMQs with RT cores. The main contribution is the proposal of a geometric solution for RMQ, where elements become triangles that are placed and shaped according to the element’s value and position in the array, respectively, such that the closest hit of a ray launched from a point given by the query parameters corresponds to the result of that query. Experimental results show that RTXRMQ is currently best suited for small query ranges relative to the input size, achieving speedups of up to 5× and 2.3× over parallel state-of-the-art CPU and GPU approaches, respectively. For medium and large query ranges, RTXRMQ is slower than the state-of-the-art GPU approach but remains competitive, being 2.5× and 4× faster than a state-of-the-art CPU method that also runs in parallel. Furthermore, performance scaling experiments across the latest RTX GPU architectures show that if the current RT core scaling trend continues, RTXRMQ’s performance would scale at a higher rate than that of the other compared approaches, making it an attractive tool for future high-performance applications that employ many batches of RMQs.
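To make the geometric idea in the abstract more concrete, the following is a minimal CPU-only sketch, not the RTXRMQ implementation. It assumes a simplified coverage rule in which the shape for element i spans every query point (l, r) with l <= i <= r and sits at a depth equal to the element's value, so the nearest shape along a ray from (l, r) is the range minimum. The names `ElementShape`, `build_scene`, `covers`, and `closest_hit` are hypothetical, the actual triangle layout of the paper is not reproduced, and a real implementation would let RT cores traverse a bounding volume hierarchy instead of looping over all shapes.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class ElementShape:
    """Hypothetical stand-in for the triangle built from one array element."""
    index: int    # position in the array; decides which queries the shape covers
    depth: float  # the element's value; decides how far along the ray it sits


def build_scene(values: Sequence[float]) -> List[ElementShape]:
    """Create one shape per array element."""
    return [ElementShape(i, float(v)) for i, v in enumerate(values)]


def covers(shape: ElementShape, l: int, r: int) -> bool:
    """Assumed coverage rule: the shape spans all query points with l <= index <= r."""
    return l <= shape.index <= r


def closest_hit(scene: Sequence[ElementShape], l: int, r: int) -> Optional[int]:
    """Brute-force emulation of a ray cast from query point (l, r).

    Returns the array index of the nearest covering shape (the leftmost one
    on ties), or None if no shape covers the query point.
    """
    best: Optional[ElementShape] = None
    for shape in scene:
        if covers(shape, l, r) and (best is None or shape.depth < best.depth):
            best = shape
    return None if best is None else best.index


if __name__ == "__main__":
    values = [5, 2, 8, 1, 9, 3]
    scene = build_scene(values)
    for l, r in [(0, 2), (2, 5), (4, 5)]:
        i = closest_hit(scene, l, r)
        print(f"RMQ({l}, {r}) -> index {i}, value {values[i]}")
```

The loop in `closest_hit` costs O(n) per query; in this simplified picture, the hardware-accelerated hierarchy traversal is what would replace it.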


Keywords

Ray tracing, RT cores, Bounding volume hierarchy, GPU computing, Range minimum query, Energy efficiency
