
Dynamic data prefetching in home-based software DSMs

J. Comput. Sci. Technol., no. 3 (2001): 231-241


Abstract

A major overhead in software DSM (Distributed Shared Memory) is the cost of remote memory accesses necessitated by the protocol as well as induced by false sharing. This paper introduces a dynamic prefetching method implemented in the JIAJIA software DSM to reduce system overhead caused by remote accesses. The prefetching method records t...


Introduction
  • Software Distributed Shared Memory (DSM) provides the illusion of shared memory on the top of distributed memory hardware.
  • Most software DSM systems are page-based, using virtual memory protection to trap accesses to shared memory
  • These systems suffer from the high communication and coherence-induced overheads caused by the high level of implementation and large granularity of coherence.
  • Many techniques, such as the multiple-writer protocol [1], lazy release consistency [2], and data migration [3], have been proposed to reduce false sharing and remote communication.
  • The page faulting processor continues on receiving the page acknowledgement message
Highlights
  • Software Distributed Shared Memory (DSM) provides the illusion of shared memory on the top of distributed memory hardware
  • The number of remote access messages is reduced because multiple pages may be prefetched in the same message
  • The average extra traffic caused by useless prefetch is 7%-13% in the evaluation
  • The prefetching scheme proposed in this paper predicts prefetches by analyzing the periodicity of the access history string of remote writes and local accesses
  • The average extra traffic caused by useless prefetch is only 7% in the evaluation when the periodicity threshold is 3, which is much less than that of other prefetching methods such as that introduced in [11]
  • In the proposed prefetching scheme, the remote access latency can be overlapped with other operations and multiple pages may be prefetched in the same message
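The point about prefetching multiple pages in the same message can be illustrated with a minimal sketch. It assumes that each page has a fixed home node and that pages predicted for prefetch are grouped by home, so one request goes to each home. The function names and the round-robin home mapping are hypothetical and are not taken from JIAJIA.

```python
from collections import defaultdict


def home_of(page, num_nodes=4):
    # Hypothetical home assignment: pages are spread round-robin over nodes.
    return page % num_nodes


def batch_prefetch_requests(predicted_pages, home_fn=home_of):
    """Group predicted pages by home node, one request message per home."""
    batches = defaultdict(list)
    for page in predicted_pages:
        batches[home_fn(page)].append(page)
    # Sort page numbers inside each message for a deterministic layout.
    return {home: sorted(pages) for home, pages in batches.items()}


# Five predicted pages that live on two different home nodes.
requests = batch_prefetch_requests([3, 7, 8, 11, 12])
print(requests)
print(len(requests), "messages instead of", 5)
```

In this toy case five pages need only two request messages, which is the kind of message reduction the highlight describes.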
Results
  • In Water, JIAp2 issues more than 30% useless remote accesses and traffic, while JIAp3.
Conclusion
  • The prefetching scheme proposed in this paper predicts prefetches by analyzing the periodicity of the access history string of remote writes and local accesses.
  • The periodicity analysis method can predict prefetches rather precisely.
  • The average extra traffic caused by useless prefetch is only 7% in the evaluation when the periodicity threshold is 3, which is much less than that of other prefetching methods such as that introduced in [11].
  • In the proposed prefetching scheme, the remote access latency can be overlapped with other operations and multiple pages may be prefetched in the same message.
  • Among eight benchmarks, the prefetching scheme achieves a performance increment of 15%-20% in three benchmarks and around 8%-10% in another three.
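The role of the periodicity threshold can be sketched as follows. This is only an illustration of the general idea, not the paper's algorithm, whose details are not given in this summary. It assumes the per-page history is a string in which "I" stands for an invalidation (INV) and "G" for a remote page fetch (GETP), and that a prefetch is issued only when the tail of the history has repeated some pattern at least `threshold` times and the pattern predicts "G" next. All function names are hypothetical.

```python
def detect_period(history, threshold):
    """Return the smallest p such that the last p symbols of `history`
    occur `threshold` times in a row at its tail, or None if there is none."""
    n = len(history)
    for p in range(1, n // threshold + 1):
        unit = history[n - p:]
        if history[n - p * threshold:] == unit * threshold:
            return p
    return None


def predict_next(history, threshold=3):
    """Predict the next symbol by extending the detected periodic pattern."""
    p = detect_period(history, threshold)
    if p is None:
        return None
    return history[len(history) - p]


def should_prefetch(history, threshold=3):
    # Prefetch only when the periodic pattern says a remote fetch comes next.
    return predict_next(history, threshold) == "G"


# A page that is invalidated and refetched in strict alternation.
print(should_prefetch("IGIGIGI", threshold=3))
# A shorter, noisier history: accepted at threshold 2, rejected at 3.
print(should_prefetch("GGIGI", threshold=2), should_prefetch("GGIGI", threshold=3))
```

A larger threshold demands more repetitions before predicting, so it issues fewer prefetches. That matches the lower useless-prefetch traffic reported for a threshold of 3.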
Tables
  • Table2: Run Time Statistics of Parallel Execution
  • Table3: Relative Runtime Statistics
  • Table4: FFT Results with Different Plimit
Related work
  • There is some previous work on data prefetching in software DSMs. Similar work was proposed in [11] by Karlsson et al. Their approach is also based on the previous access history in software DSMs and also issues prefetching messages after synchronization. However, their approach is based on a homeless software DSM (TreadMarks), while ours is based on a home-based software DSM. Our prefetching algorithm also differs from theirs: their algorithm decides prefetching according to remote and local accesses during the last two intervals, while ours analyzes the periodicity of the previous INV (invalidation) and GETP (fetching a remote page) interleaving string.
Funding
  • This work is supported by the National Natural Science Foundation of China (No.60073018)
Reference
  • Carter J, Bennett J, Zwaenepoel W. Implementation and performance of Munin. In Proc. the 13th Symp. Operating Systems Principles, Oct., 1991, pp.152-164.
  • Keleher P, Dwarkadas S, Cox A, Zwaenepoel W. TreadMarks: Distributed shared memory on standard workstations and operating systems. In Proc. the 1994 Winter Usenix Conf., Jan., 1994, pp.115-131.
  • Hu Weiwu, Shi Weisong, Tang Zhimin. Optimizing home-based software DSM protocols. Cluster Computing, to appear in 2001.
  • Hu Weiwu, Shi Weisong, Tang Zhimin, Li Ming. A lock-based cache coherence protocol for scope consistency. Journal of Computer Science and Technology, Mar., 1998, 13(2): 97-109.
  • Woo S, Ohara M, Torrie E et al. The SPLASH-2 programs: Characterization and methodological considerations. In Proc. ISCA'95, 1995, pp.24-36.
  • Bailey D, Barton J, Lasinski T, Simon H. The NAS parallel benchmarks. Technical Report No.103863, NASA, Jul., 1993.
  • Lu H, Dwarkadas S, Cox A, Zwaenepoel W. Quantifying the performance differences between PVM and TreadMarks. Journal of Parallel and Distributed Computing, Jun., 1997, 43(2): 65-78.
  • Iftode L. Home-based shared virtual memory [dissertation]. Princeton University, Aug., 1998.
  • Hu Weiwu, Shi Weisong, Tang Zhimin. Reducing system overhead in home-based software DSMs. In Proc. the 13th Int. Parallel Processing Symp., Apr., 1999, pp.167-173.
  • Hu Weiwu, Zhang Fuxin, Liu Haiming. A new home-based software DSM protocol for SMP clusters. In Proc. the 6th Euro-Par Conference, Aug., 2000, pp.1132-1142.
  • Karlsson M, Stenstrom P. Effectiveness of dynamic prefetching in multiple-writer distributed virtual shared memory system. Journal of Parallel and Distributed Computing, Jun., 1997, 43(2): 79-93.
  • Bianchini R, Kontothanassis L, Pinto R et al. Hiding communication latency and coherence overhead in software DSMs. In Proc. 7th Int. Conf. Architectural Support for Programming Languages and Operating Systems, 1996, pp.198-209.
  • Mowry T, Gupta A. Tolerating latency through software-controlled prefetching in shared-memory multiprocessors. Journal of Parallel and Distributed Computing, Jun., 1991, 12(2): 87-106.
  • Dwarkadas S, Lu H, Cox A et al. Combining compile-time and runtime support for efficient software distributed shared memory. In Proc. IEEE, Special Issue on Distributed Shared Memory, Mar., 1999, pp.476-486.
  • Keleher P, Tseng C. Enhancing software DSM for compiler-parallelized applications. In Proc. the 11th Int. Parallel Processing Symposium, Apr., 1997.
  • Chandra S, Larus J. Optimizing communication in HPF programs for fine-grained distributed shared memory. In Proc. the 6th Symp. Principles and Practice of Parallel Programming, Jun., 1997.
  • Amza C, Cox A, Dwarkadas S et al. Adaptive protocols for software distributed shared memory. In Proc. IEEE, Special Issue on Distributed Shared Memory, Mar., 1999, pp.467-475.
  • Bershad B, Zekauskas M, Sawdon W. The Midway Distributed Shared Memory System. In Proc. the 38th IEEE Int. CompCon Conf., Feb., 1993, pp.528-537.
  • Dwarkadas S, Schaffer A, Cottingham R et al. Parallelization of general linkage analysis problems. Human Heredity, 1994, 44: 127-141.
  • Lathrop G, Lalouel J, Julier C, Ott J. Strategies for multilocus analysis in humans. PNAS, 1994, 81: 3443-3446.
  • Li K. IVY: A shared virtual memory system for parallel computing. In Proc. the 1988 Int. Conf. Parallel Processing, Aug., 1988, 2: 94-101.
  • Schaffer A, Gupta S, Shriram K, Cottingham R. Avoiding recomputation in genetic linkage analysis. Human Heredity, 1994, 44: 225-237.
Author biographies
  • HU Weiwu received his B.S. degree from the University of Science and Technology of China in 1991 and his Ph.D. degree from the Institute of Computing Technology, The Chinese Academy of Sciences in 1996, both in computer science. He is currently a professor in the Institute of Computing Technology. His research interests include high performance computer architecture, parallel processing, and SOC design.
  • ZHANG Fuxin received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture, cluster computing, and LINUX.
  • LIU Haiming received his B.S. degree in computing technology from the University of Science and Technology of China in 1999. He is currently an M.S. candidate in the Institute of Computing Technology, The Chinese Academy of Sciences. His research interests include high performance computer architecture and cluster computing.