Contents ix

    Step 3. Using Hardware Trigonometry Functions ..... 163
    Step 4. Experimental Performance Tuning ..... 166
8.4 Final Evaluation ..... 167
8.5 Exercises ..... 170

CHAPTER 9 APPLICATION CASE STUDY: MOLECULAR VISUALIZATION AND ANALYSIS ..... 173
9.1 Application Background ..... 174
9.2 A Simple Kernel Implementation ..... 176
9.3 Instruction Execution Efficiency ..... 180
9.4 Memory Coalescing ..... 182
9.5 Additional Performance Comparisons ..... 185
9.6 Using Multiple GPUs ..... 187
9.7 Exercises ..... 188

CHAPTER 10 PARALLEL PROGRAMMING AND COMPUTATIONAL THINKING ..... 191
10.1 Goals of Parallel Programming ..... 192
10.2 Problem Decomposition ..... 193
10.3 Algorithm Selection ..... 196
10.4 Computational Thinking ..... 202
10.5 Exercises ..... 204

CHAPTER 11 A BRIEF INTRODUCTION TO OPENCL ..... 205
11.1 Background ..... 205
11.2 Data Parallelism Model ..... 207
11.3 Device Architecture ..... 209
11.4 Kernel Functions ..... 211
11.5 Device Management and Kernel Launch ..... 212
11.6 Electrostatic Potential Map in OpenCL ..... 214
11.7 Summary ..... 219
11.8 Exercises ..... 220

CHAPTER 12 CONCLUSION AND FUTURE OUTLOOK ..... 221
12.1 Goals Revisited ..... 221
12.2 Memory Architecture Evolution ..... 223
    12.2.1 Large Virtual and Physical Address Spaces ..... 223
    12.2.2 Unified Device Memory Space ..... 224
    12.2.3 Configurable Caching and Scratch Pad ..... 225
    12.2.4 Enhanced Atomic Operations ..... 226
    12.2.5 Enhanced Global Memory Access ..... 226