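Performance Optimization of Software Distributed Shared Memory Systems
Author: Shi Weisong (施巍松)
Publication date: 2004-07-19
Publisher: Higher Education Press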
- 9787040146158
- 1
- 248901
- Paperback
- 32mo
- 210
- 251
- Engineering
- Software Engineering
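Software distributed shared memory (DSM) systems, also known as virtual shared memory systems, combine the programmability of shared-memory systems with the scalability of distributed-memory systems, and have therefore been an important research direction for more than a decade. The primary design goal of a software DSM system is that applications run on it with little or no modification and still achieve satisfactory performance. However, the system overhead introduced to maintain the coherence of shared data and the transparency of communication keeps many existing systems from reaching this goal.
This book focuses on how to improve the performance of software DSM systems and proposes optimization techniques in six areas: cache coherence protocols, memory organization, system overhead, loop scheduling, task migration, and communication optimization.
This book is suitable as a teaching reference for graduate courses in distributed systems and in distributed and concurrent programming, and can also be used by technical practitioners in related fields.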
Chapter 1 Introduction
1.1 Basic Idea of Software DSM
1.2 Memory Consistency Model
1.3 Cache Coherence Protocol
1.4 Application Programming Interface
1.5 Memory Organization
1.6 Implementation Method
1.6.1 Implementation Levels
1.6.2 Granularity of the System
1.7 Some Representative Software DSMs
1.8 Recent Progress on Software DSM and Open Questions
1.8.1 Software DSM-oriented Application Research
1.8.2 Fine-grain vs. Coarse-grain Software DSM Systems
1.8.3 Hardware Support for Software DSM System
1.8.4 More Relaxed Memory Consistency Model
1.8.5 SMP-Based Hierarchical Software DSM System
1.9 Summary of Dissertation Contributions
1.10 Organization of the Dissertation
Chapter 2 Lock-Based Cache Coherence Protocol
2.1 Cache Coherence Protocol
2.1.1 Write-Invalidate vs. Write-Update
2.1.2 Multiple Writer Protocol
2.1.3 Delayed Propagation Protocol
2.2 Snoopy Protocols
2.3 Directory-Based Protocols
2.3.1 Full Bit Vector Directory
2.3.2 Limited Pointer Directory
2.3.3 Linked List Directory
2.3.4 Probowner Directory
2.4 Lock-Based Cache Coherence Protocol
2.4.1 Design Consideration
2.4.2 Supporting Scope Consistency
2.4.3 The Basic Protocol
2.4.4 Correctness of the Protocol
2.4.5 Advantages and Disadvantages
2.5 Summary
Chapter 3 JIAJIA Software DSM System
3.1 Introduction
3.2 Memory Organization
3.3 Lock-Based Cache Coherence Protocol
3.4 Programming Interface
3.5 Implementation
3.5.1 Starting Multiple Processes
3.5.2 Shared Memory Management
3.5.3 Synchronization
3.5.4 Communication
3.5.5 Deadlock Free of Communication Scheme
3.6 Performance Evaluation and Analysis
3.6.1 Applications
3.6.2 Performance of JIAJIA and CVM
3.6.3 Confidence-Interval Based Summarizing Technique
3.6.4 Paired Confidence Interval Method
3.6.5 Real World Application: Em3d
3.6.6 Scalability of JIAJIA
3.7 Summary
Chapter 4 System Overhead Analysis and Reducing
4.1 Introduction
4.2 Analysis of Software DSM System Overhead
4.3 Performance Measurement and Analysis
4.3.1 Experiment Platform
4.3.2 Overview of Applications
4.3.3 Analysis
4.3.4 The CPU Effect
4.4 Reducing System Overhead
4.4.1 Reducing False Sharing
4.4.2 Reducing Write Detection Overhead
4.4.3 Tree Structured Propagation of Barrier Messages
4.4.4 Performance Evaluation and Analysis
4.5 Summary
Chapter 5 Affinity-Based Self Scheduling
5.1 Background
5.2 Related Work
5.2.1 Static Scheduling (Static)
5.2.2 Self Scheduling (SS)
5.2.3 Block Self Scheduling (BSS)
5.2.4 Guided Self Scheduling (GSS)
5.2.5 Factoring Scheduling (FS)
5.2.6 Trapezoid Self Scheduling (TSS)
5.2.7 Affinity Scheduling (AFS)
5.2.8 Safe Self Scheduling (SSS)
5.2.9 Adaptive Affinity Scheduling (AAFS)
5.3 Design and Implementation of ABS
5.3.1 Target System
5.3.2 Affinity-Based Self Scheduling Algorithm
5.4 Analytic Evaluation
5.5 Experiment Platform and Performance Evaluation
5.5.1 Experiment Platform
5.5.2 Application Description
5.5.3 Performance Evaluation and Analysis
5.6 Summary
Chapter 6 Dynamic Task Migration Scheme
6.1 Introduction
6.2 Rationale of Dynamic Task Migration
6.3 Implementation
6.3.1 Computation Migration
6.3.2 Data Migration
6.4 Home Migration
6.5 Experimental Results and Analysis
6.5.1 Experiment Platform
6.5.2 Applications
6.5.3 Performance Evaluation and Analysis
6.6 Related Work
6.7 Summary
Chapter 7 Communication Optimization for Home-Based Software DSMs
7.1 Introduction
7.2 Key Issues of ULN
7.2.1 Communication Model
7.2.2 Data Transfer
7.2.3 Protection
7.2.4 Address Translation
7.2.5 Message Pipelining
7.2.6 Arrival Notification
7.2.7 Reliability
7.2.8 Multicast
7.3 Communication Requirements of Software DSMs
7.4 Design of JMCL
7.4.1 JMCL API
7.4.2 Message Flow of JMCL
7.5 Current State and Future Work
7.6 Conclusion
Chapter 8 Conclusions and Future Directions
8.1 Conclusions
8.2 Future of Software DSM
Bibliography
List of Figures
1.1 Basic idea of software DSM
1.2 Illustration of simple software DSM system
2.1 Write merging in eager release consistency
2.2 Comparison of communication amount in eager and lazy RC
2.3 State transition diagram of the lock-based cache protocol
3.1 Memory organization of CC-NUMA
3.2 Memory organization of COMA
3.3 Memory organization of JIAJIA
3.4 Memory Allocation Example
3.5 Flow chart of threads creating procedure jiacreat()
3.6 Flow chart of memory allocation jia_alloc(size)
3.7 Examples of nested critical sections
3.8 Communication between two processors
3.9 $(\bar{x}-\mu)/\sqrt{s^2/n}$ follows a $t(n-1)$ distribution
3.10 Speedups of 8 applications under 2, 4, 8 processors
4.1 (a) General prototype of software DSM system. (b) Basic communication framework of JIAJIA
4.2 Time partition of SIGSEGV handler and synchronization operation
4.3 (a) Speedups of applications on 8 processors. (b) Time statistics of applications.
4.4 (a) Comparison of speedups of fast CPU and slow CPU. (b) Effects of CPU speed to system overhead
4.5 Breakdown of execution time
5.1 Basic framework of application
5.2 Execution time of different scheduling schemes in dedicated environment: (a) SOR. (b) JI. (c) TC. (d) MAT. (e) AC
5.3 Execution time of different scheduling schemes in metacomputing environment: (a) SOR. (b) JI. (c) TC. (d) MAT. (e) AC
5.4 Execution time with different chunk size under ABS scheduling scheme in metacomputing environment: (a) SOR. (b) JI. (c) TC. (d) MAT. (e) AC
6.1 Basic framework of dynamic task migration scheme
6.2 Performance comparison: (a) execution time. (b) system overhead
7.1 Comparison of different communication substrate: (a) unreliable. (b) reliable
7.2 Interface description of JMCL
7.3 Message transfer flow in JMCL
7.4 Message transfer flow in UDP/IP
List of Tables
1.1 Some Representative Software DSM Systems
2.1 Some Notations
2.2 Message Costs of Shared Memory Operations
2.3 Comparison of Different Coherence Protocols
3.1 Characteristics of Benchmarks and Execution Results
3.2 Eight-Processor Execution Statistics
3.3 Execution Time, Fixed Speedup (Sf) and Memory Requirement for Different Scales
3.4 Execution Time, Scaled Speedup (Ss) for Problem Scale 120 × 60 × 208
4.1 Description of Time Statistical Variables
4.2 Characteristics of Applications
4.3 Breakdown of Execution Time of These Applications
4.4 Characteristics of the Benchmarks
4.5 Eight-Way Parallel Execution Results
5.1 Chunk Sizes of Different Scheduling Schemes
5.2 The Number of Messages and Synchronization Operations Associated with Loop Allocation
5.3 Description of the Symbols
5.4 The Effects of Locality and Load Imbalance (unit: second)
5.5 The Number of Synchronization Operations of Different Scheduling Algorithms in Dedicated Environment
5.6 The Number of Getpages of Different Scheduling Algorithms in Dedicated Environment
5.7 System Overhead of Different Scheduling Algorithms in Dedicated Environment (Ⅰ) (second)
5.8 System Overhead of Different Scheduling Algorithms in Dedicated Environment (Ⅱ)
5.9 The Number of Synchronization Operations of Different Scheduling Algorithms in Metacomputing Environment
5.10 The Number of Getpages of Different Scheduling Algorithms in Metacomputing Environment
5.11 System Overhead of Different Scheduling Algorithms in Metacomputing Environment (Ⅰ)
5.12 System Overhead of Different Scheduling Algorithms in Metacomputing Environment (Ⅱ)
5.13 The Number of Synchronization Operations with Different Chunk Sizes in Metacomputing Environment
5.14 The Number of Getpages with Different Chunk Sizes in Metacomputing Environment
5.15 System Overhead of Different Chunk Sizes in Metacomputing Environment
6.1 Definition of the Symbols
6.2 System Overheads in Unbalanced Environment
6.3 System Overheads in Unbalanced Environment with Task Migration
7.1 Descriptions of JMCL Application Programming Interface