Because the same program is executed for each data element, there is a lower requirement for sophisticated flow control, and because it is executed on many data elements and has high arithmetic intensity, the memory access latency can be hidden with calculations instead of big data caches.

Data-parallel processing maps data elements to parallel processing threads. Many applications that process large data sets can use a data-parallel programming model to speed up the computations. In 3D rendering, large sets of pixels and vertices are mapped to parallel threads. Similarly, image and media processing applications such as post-processing of rendered images, video encoding and decoding, image scaling, stereo vision, and pattern recognition can map image blocks and pixels to parallel processing threads. In fact, many algorithms outside the field of image rendering and processing are accelerated by data-parallel processing, from general signal processing or physics simulation to computational finance or computational biology.

1.2. CUDA®: A General-Purpose Parallel Computing Platform and Programming Model

In November 2006, NVIDIA introduced CUDA®, a general-purpose parallel computing platform and programming model that leverages the parallel compute engine in NVIDIA GPUs to solve many complex computational problems in a more efficient way than on a CPU.

CUDA comes with a software environment that allows developers to use C as a high-level programming language. As illustrated by Figure 4, other languages, application programming interfaces, or directives-based approaches are supported, such as FORTRAN, DirectCompute, and OpenACC.
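The data-parallel mapping described above is what a CUDA C kernel expresses directly: each thread is assigned one data element through its block and thread indices. The following is a minimal sketch, not taken from the guide itself; the scale() kernel, the scaling operation, and the launch configuration are illustrative assumptions.

    // Minimal sketch (hypothetical kernel): each thread processes exactly one
    // element, so flow control stays simple and the arithmetic of many threads
    // can hide memory access latency.
    __global__ void scale(float *data, float factor, int n)
    {
        int i = blockIdx.x * blockDim.x + threadIdx.x;  // one global index per thread
        if (i < n)                                      // guard for the last partial block
            data[i] *= factor;
    }

    int main(void)
    {
        const int n = 1 << 20;                          // about one million elements
        float *d_data;
        cudaMalloc(&d_data, n * sizeof(float));
        // ... copy input data to d_data with cudaMemcpy ...
        scale<<<(n + 255) / 256, 256>>>(d_data, 2.0f, n);  // one thread per element
        cudaDeviceSynchronize();                        // wait for the kernel to finish
        cudaFree(d_data);
        return 0;
    }

The single line computing i from blockIdx, blockDim, and threadIdx is the programming-model expression of the data-to-thread mapping described in the text; the launch configuration only chooses how many such threads to create.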