中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD Parallel algorithms status and prospects Guo-Liang chen Dept of computer Science and Technology National High Performance Computing Center at Hefei Univ. of Science and Technology of China Hefei, Anhui, 230027, P.R. China glchen@ustc.edu.cn http:/www.nhpcc.ustcedu.cn NHPCC at Hef
NHPCC at Hefei Parallel Algorithms: status and prospects Guo-Liang CHEN Dept.of computer Science and Technology National High Performance Computing Center at Hefei Univ. of Science and Technology of China Hefei,Anhui,230027,P.R.China glchen@ustc.edu.cn http://www.nhpcc.ustc.edu.cn
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD Abstract The parallel algorithm is very important problem to parallel processing. In this talk, we first briefly introduce to the parallel algorithm. and then we focus on discussion issues and direction of the parallel algorithm research. Lastly, the existent problems and faced new challenges for parallel gorithm research are given. We argue that the parallel lgorithm research should establish a systematic approach to Theory-Design-Implementation-Application, should form an integrated methodology of " Architecture- Algorithm-Programming. Only in this way, parallel algorithm research becomes continuous development and more realistic NHPCC at Hefei 2021/2/6
NHPCC at Hefei 2 2021/2/6 Abstract The parallel algorithm is very important problem to parallel processing. In this talk, we first briefly introduce to the parallel algorithm. And then, we focus on discussion issues and direction of the parallel algorithm research. Lastly, the existent problems and faced new challenges for parallel algorithm research are given. We argue that the parallel algorithm research should establish a systematic approach to “Theory-Design-Implementation-Application”, should form an integrated methodology of “ArchitectureAlgorithm-Programming”. Only in this way, parallel algorithm research becomes continuous development and more realistic
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD Outline Introduction Research issues Parallel computation models Design techniques Parallel complexity theory Research directions Parallel computing Solving problem from applied domains Non-Tradition computation modes Existent problems and faced new challenges NHPCC at Hefei 2021/2/6
NHPCC at Hefei 3 2021/2/6 Outline ▪ Introduction ▪ Research Issues ▪ Parallel computation models ▪ Design techniques ▪ Parallel complexity theory ▪ Research directions ▪ Parallel computing ▪ Solving problem from applied domains ▪ Non-Tradition computation modes ▪ Existent problems and faced new challenges
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD Introduction(1) What is a parallel algorithm? Algorithm------a method and procedure to solve a given problem Parallel algorithm------an algorithm in which multiple operations are performed simultaneously Why parallelism has been an interesting topic? The real world is inherently parallel: it is natural and straightforward to express something about the real world in a parallel way There are limits to sequential computing performance physical limits such as the speed of light Parallel computation is still likely to be more cost- effective for many applications than using uniprecessors which are very expensive NHPCC at Hefei 2021/2/6
NHPCC at Hefei 4 2021/2/6 Introduction (1) ▪ What is a parallel algorithm? ▪ Algorithm------ a method and procedure to solve a given problem ▪ Parallel algorithm------ an algorithm in which multiple operations are performed simultaneously. ▪ Why parallelism has been an interesting topic? ▪ The real world is inherently parallel: it is natural and straightforward to express something about the real world in a parallel way. ▪ There are limits to sequential computing performance: physical limits such as the speed of light. ▪ Parallel computation is still likely to be more costeffective for many applications than using uniprecessors which are very expensive
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT DF COMPUTE三巴 ENCE AND ECHNOLDD Introduction(2) Why parallelism has not led to widespread use? Conscious human thinking appears to be sequential to us The theory required for parallel algorithm in immature and was developed after the technology The hardware platform required for parallel algorithm are very expensive. Portability is much more serious issue in parallel programming than in sequential Why we need parallel algorithm? To increase computational speed To increase computational precision (to generate the fine mesh) To meet requirement of real time computation(weather forecasting NHPCC at Hefei 2021/2/6
NHPCC at Hefei 5 2021/2/6 Introduction (2) ▪ Why parallelism has not led to widespread use? ▪ Conscious human thinking appears to be sequential to us. ▪ The theory required for parallel algorithm in immature and was developed after the technology. ▪ The hardware platform required for parallel algorithm are very expensive. ▪ Portability is much more serious issue in parallel programming than in sequential. ▪ Why we need parallel algorithm? ▪ To increase computational speed. ▪ To increase computational precision (to generate the fine mesh). ▪ To meet requirement of real time computation (weather forecasting)
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD Introduction 3) Classif ication of parallel algorithms Numerical parallel algorithms(algebraic operation: matrix operations, solving a system of linear equations etc.) Non-numerical parallel algorithms(symbolic operation sorting, searching graph algorithms etc.) Research hierarchy of parallel algorithms Parallel complexity theory (parallelizable problem, NC class problem, P-complete problem, lower bound etc.) Design and analysis of parallel algorithms(efficient parallel algorithms) Implementation of parallel algorithms (hardware platform software supporting) NHPCC at Hefei 2021/2/6
NHPCC at Hefei 6 2021/2/6 Introduction (3) ▪ Classification of parallel algorithms ▪ Numerical parallel algorithms (algebraic operation: matrix operations, solving a system of linear equations etc.). ▪ Non-numerical parallel algorithms (symbolic operation: sorting, searching, graph algorithms etc.). ▪ Research hierarchy of parallel algorithms ▪ Parallel complexity theory (parallelizable problem, NC class problem, P-complete problem, lower bound etc.) ▪ Design and analysis of parallel algorithms (efficient parallel algorithms). ▪ Implementation of parallel algorithms (hardware platform, software supporting)
中图种学学计算机科学与术系 University of Science and Technology of China DEPARTMENT。 F COMPUTE三巴 ENCE AND ECHNOLDD 工 ntroduction(4) The history of parallel algorithm research From 70's to 80's two decades, parallel algorithm research were very hot, many excellent papers, textbooks monographs of parallel algorithms were published From the middle of 90s, parallel algorithm had shifted to parallel computing New opportunity for parallel algorithm research The dramatically decrease of computer price and the rapid development of communication technology, make it possible to build PC-cluster by ourselves It is easy to get free software to support cluster from internet NHPCC at Hefei 2021/2/6
NHPCC at Hefei 7 2021/2/6 Introduction (4) ▪ The history of parallel algorithm research ▪ From 70’s to 80’s two decades, parallel algorithm research were very hot, many excellent papers, textbooks, monographs of parallel algorithms were published. ▪ From the middle of 90’s, parallel algorithm had shifted to parallel computing. ▪ New opportunity for parallel algorithm research ▪ The dramatically decrease of computer price and the rapid development of communication technology, make it possible to build PC-cluster by ourselves. ▪ It is easy to get free software to support cluster from internet
中图种学学计算机科学与术系 University of Science and Technology ef D三 PARTMENT OF C Researc chInses Parallel computation models PRAM APRAM BSP logp MH and UMH M emory-LO Design techniques Partitioning Principle Divide-and-Conquer strategy Balanced Trees Method Doubling techniques Pipelining Techniques Parallel complexity theory Nc class P-complete NHPCC at Hefei 2021/2/6 8
NHPCC at Hefei 8 2021/2/6 Research Issues ▪ Parallel computation models ▪ PRAM ▪ APRAM ▪ BSP ▪ logP ▪ MH and UMH ▪ Memory-LogP ▪ Design techniques ▪ Partitioning Principle ▪ Divide-and-Conquer Strategy ▪ Balanced Trees Method ▪ Doubling Techniques ▪ Pipelining Techniques ▪ Parallel complexity theory ▪ NC class ▪ P-complete
中种学林水学计算机科学与术系 versity ResearchhIssuesna parallel computation models(1) PRAM(Parallel Random Access Memory SIMD-SM, Used for fine grain parallel computing Centralized shared memory, Globally synchronized Advantages Suitable for representing and analyzing complexity of parallel algorithms, Simple to use, hiding the most of low level details of parallel computer(communication, ynchronizationetc) Disadvantages Unsuitable for mimd computers Unrealistic to neglect the issues of memory contention communication delay etc NHPCC at Hefei 2021/2/6
NHPCC at Hefei 9 2021/2/6 Research Issues – Parallel computation models(1) ▪ PRAM(Parallel Random Access Memory) ▪ SIMD-SM, Used for fine grain parallel computing, Centralized shared memory, Globally synchronized. ▪ Advantages ▪ Suitable for representing and analyzing complexity of parallel algorithms, Simple to use , hiding the most of lowlevel details of parallel computer (communication, synchronization etc. ). ▪ Disadvantages ▪ Unsuitable for MIMD computers, Unrealistic to neglect the issues of memory contention, communication delay etc
中种学林水学计算机科学与术系 versity ResearchhIssuesna parallel computation models(2) Asynchronous PRAM APRAM or MIMD-SM, Used for medium grain parallel computation centralized shared memory asynchronous operation, Read/write shared variable communication, Explicit synchronous(barrier, etc. Computation in APRAM Computation Consists of global phases separated by barriers: in a phase all processors execute operation asynchronized the last instruction must be a synchronization instruction Advantages: Preserving much of simplicity of PRAM, better programmability, program must be correctness, easy to analyze complexity Disadvantages: Unsuitable for MIMD computers with distributed memory NHPCC at Hefei 2021/2/6 10
NHPCC at Hefei 10 2021/2/6 Research Issues – Parallel computation models(2) ▪ Asynchronous PRAM ▪ APRAM or MIMD-SM, Used for medium grain parallel computation, Centralized shared memory, Asynchronous operation, Read/write shared variable communication, Explicit synchronous (barrier, etc.). ▪ Computation in APRAM ▪ Computation Consists of global phases separated by barriers: in a phase all processors execute operation asynchronized, the last instruction must be a synchronization instruction. ▪ Advantages: Preserving much of simplicity of PRAM, better programmability, program must be correctness, easy to analyze complexity. ▪ Disadvantages: Unsuitable for MIMD computers with distributed memory