www.nvidia.com CUDA C Programming Gui_China University Courseware Download Center

Click to download: "GPU Parallel Programming" course teaching resources (references): NVIDIA CUDA C Programming Guide (Design Guide, June 2017)

TABLE OF CONTENTS

Chapter 1. Introduction
1.1. From Graphics Processing to General Purpose Parallel Computing
1.2. CUDA®: A General-Purpose Parallel Computing Platform and Programming Model
1.3. A Scalable Programming Model
1.4. Document Structure

Chapter 2. Programming Model
2.1. Kernels
2.2. Thread Hierarchy
2.3. Memory Hierarchy
2.4. Heterogeneous Programming
2.5. Compute Capability

Chapter 3. Programming Interface
3.1. Compilation with NVCC
3.1.1. Compilation Workflow
3.1.1.1. Offline Compilation
3.1.1.2. Just-in-Time Compilation
3.1.2. Binary Compatibility
3.1.3. PTX Compatibility
3.1.4. Application Compatibility
3.1.5. C/C++ Compatibility
3.1.6. 64-Bit Compatibility
3.2. CUDA C Runtime
3.2.1. Initialization
3.2.2. Device Memory
3.2.3. Shared Memory
3.2.4. Page-Locked Host Memory
3.2.4.1. Portable Memory
3.2.4.2. Write-Combining Memory
3.2.4.3. Mapped Memory
3.2.5. Asynchronous Concurrent Execution
3.2.5.1. Concurrent Execution between Host and Device
3.2.5.2. Concurrent Kernel Execution
3.2.5.3. Overlap of Data Transfer and Kernel Execution
3.2.5.4. Concurrent Data Transfers
3.2.5.5. Streams
3.2.5.6. Events
3.2.5.7. Synchronous Calls
3.2.6. Multi-Device System
3.2.6.1. Device Enumeration
3.2.6.2. Device Selection

www.nvidia.com CUDA C Programming Guide PG-02829-001_v8.0
