TABLE OF CONTENTS

Chapter 1. Introduction .... 1
  1.1. From Graphics Processing to General Purpose Parallel Computing .... 1
  1.2. CUDA®: A General-Purpose Parallel Computing Platform and Programming Model .... 3
  1.3. A Scalable Programming Model .... 4
  1.4. Document Structure .... 6
Chapter 2. Programming Model .... 8
  2.1. Kernels .... 8
  2.2. Thread Hierarchy .... 9
  2.3. Memory Hierarchy .... 11
  2.4. Heterogeneous Programming .... 13
  2.5. Compute Capability .... 15
Chapter 3. Programming Interface .... 16
  3.1. Compilation with NVCC .... 16
    3.1.1. Compilation Workflow .... 17
      3.1.1.1. Offline Compilation .... 17
      3.1.1.2. Just-in-Time Compilation .... 17
    3.1.2. Binary Compatibility .... 17
    3.1.3. PTX Compatibility .... 18
    3.1.4. Application Compatibility .... 18
    3.1.5. C/C++ Compatibility .... 19
    3.1.6. 64-Bit Compatibility .... 19
  3.2. CUDA C Runtime .... 19
    3.2.1. Initialization .... 20
    3.2.2. Device Memory .... 20
    3.2.3. Shared Memory .... 23
    3.2.4. Page-Locked Host Memory .... 28
      3.2.4.1. Portable Memory .... 29
      3.2.4.2. Write-Combining Memory .... 29
      3.2.4.3. Mapped Memory .... 29
    3.2.5. Asynchronous Concurrent Execution .... 30
      3.2.5.1. Concurrent Execution between Host and Device .... 31
      3.2.5.2. Concurrent Kernel Execution .... 31
      3.2.5.3. Overlap of Data Transfer and Kernel Execution .... 31
      3.2.5.4. Concurrent Data Transfers .... 32
      3.2.5.5. Streams .... 32
      3.2.5.6. Events .... 36
      3.2.5.7. Synchronous Calls .... 36
    3.2.6. Multi-Device System .... 37
      3.2.6.1. Device Enumeration .... 37
      3.2.6.2. Device Selection .... 37

www.nvidia.com CUDA C Programming Guide PG-02829-001_v8.0 | iii