Compiler-Based Data Prefetching and Streaming Non-temporal Store Generation for the Intel(R) Xeon Phi(TM) Coprocessor

Parallel and Distributed Processing Symposium Workshops & PhD Forum (2013)

Abstract
The Intel(R) Xeon Phi(TM) coprocessor has software prefetching instructions to hide memory latencies and special store instructions to save bandwidth on streaming non-temporal store operations. In this work, we provide details on compiler-based generation of these instructions and evaluate their impact on the performance of the Intel(R) Xeon Phi(TM) coprocessor using a wide range of parallel applications with different characteristics. Our results show that the Intel(R) Composer XE 2013 compiler can make effective use of these mechanisms to achieve significant performance improvements.
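
As a rough illustration of the two mechanisms named in the abstract, the sketch below hand-writes what a compiler might otherwise generate automatically: a software prefetch issued some distance ahead of the current loop iteration, and a streaming (non-temporal) store for output that is written once and not reused. It is not the coprocessor's instruction set or the Composer XE code generator. It uses the generic GCC/Clang builtin `__builtin_prefetch` and the SSE2 intrinsic `_mm_stream_si32` as stand-ins, and falls back to ordinary stores where SSE2 is unavailable. The function name `stream_add`, the constants `PREFETCH_DISTANCE` and `ELEMS_PER_LINE`, and the 64-byte cache line are assumptions made for the example.

```c
#include <stddef.h>
#include <stdint.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_stream_si32; also provides _mm_sfence */
#endif

/* Hypothetical tuning values, not taken from the paper. */
#define PREFETCH_DISTANCE 64  /* elements ahead of the current index */
#define ELEMS_PER_LINE    16  /* assumes 64-byte lines, 4-byte elements */

/* Computes dst[i] = a[i] + b[i]. dst is written once and not read
   again in this loop, which makes it a candidate for streaming stores. */
void stream_add(int32_t *dst, const int32_t *a, const int32_t *b, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
#if defined(__GNUC__) || defined(__clang__)
        /* Issue one read prefetch per assumed cache line, far enough
           ahead that the data may arrive before it is needed. */
        if (i % ELEMS_PER_LINE == 0 && i + PREFETCH_DISTANCE < n) {
            __builtin_prefetch(&a[i + PREFETCH_DISTANCE], 0, 0);
            __builtin_prefetch(&b[i + PREFETCH_DISTANCE], 0, 0);
        }
#endif
        /* Add in unsigned arithmetic to avoid signed-overflow UB. */
        int32_t sum = (int32_t)((uint32_t)a[i] + (uint32_t)b[i]);
#if defined(__SSE2__)
        /* Non-temporal store: hints that dst should bypass the caches. */
        _mm_stream_si32((int *)&dst[i], sum);
#else
        dst[i] = sum;  /* portable fallback: ordinary store */
#endif
    }
#if defined(__SSE2__)
    _mm_sfence();  /* order the streaming stores before later accesses */
#endif
}
```

The prefetch distance and the choice of which arrays to stream are the kind of decisions the abstract attributes to the compiler; in hand-written code they would need tuning per loop and per machine.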
Keywords
composer xe,memory latency,parallel application,compiler-based generation,xeon phi,streaming non-temporal store generation,special store instruction,different characteristic,significant performance improvement,non-temporal store operation,compiler-based data prefetching,effective use,intel xeon phi,vectors,parallel processing,performance,coprocessor,coprocessors,hardware,compiler,bandwidth