www.nvidia.com CUDA C Programming Guide PG-02829-001_v8.0 | xi

G.5. Compute Capability 5.x ..... 229
G.5.1. Architecture ..... 229
G.5.2. Global Memory ..... 230
G.5.3. Shared Memory ..... 230
G.6. Compute Capability 6.x ..... 234
G.6.1. Architecture ..... 234
G.6.2. Global Memory ..... 234
G.6.3. Shared Memory ..... 234
Appendix H. Driver API ..... 235
H.1. Context ..... 238
H.2. Module ..... 239
H.3. Kernel Execution ..... 240
H.4. Interoperability between Runtime and Driver APIs ..... 242
Appendix I. CUDA Environment Variables ..... 243
Appendix J. Unified Memory Programming ..... 246
J.1. Unified Memory Introduction ..... 246
J.1.1. Simplifying GPU Programming ..... 247
J.1.2. Data Migration and Coherency ..... 248
J.1.3. GPU Memory Oversubscription ..... 249
J.1.4. Multi-GPU Support ..... 249
J.1.5. System Requirements ..... 250
J.2. Programming Model ..... 250
J.2.1. Managed Memory Opt In ..... 250
J.2.1.1. Explicit Allocation Using cudaMallocManaged() ..... 250
J.2.1.2. Global-Scope Managed Variables Using __managed__ ..... 251
J.2.2. Coherency and Concurrency ..... 252
J.2.2.1. GPU Exclusive Access To Managed Memory ..... 252
J.2.2.2. Explicit Synchronization and Logical GPU Activity ..... 253
J.2.2.3. Managing Data Visibility and Concurrent CPU + GPU Access with Streams ..... 254
J.2.2.4. Stream Association Examples ..... 255
J.2.2.5. Stream Attach With Multithreaded Host Programs ..... 256
J.2.2.6. Advanced Topic: Modular Programs and Data Access Constraints ..... 257
J.2.2.7. Memcpy()/Memset() Behavior With Managed Memory ..... 258
J.2.3. Language Integration ..... 258
J.2.3.1. Host Program Errors with __managed__ Variables ..... 259
J.2.4. Querying Unified Memory Support ..... 260
J.2.4.1. Device Properties ..... 260
J.2.4.2. Pointer Attributes ..... 260
J.2.5. Advanced Topics ..... 260
J.2.5.1. Managed Memory with Multi-GPU Programs on pre-6.x Architectures ..... 260
J.2.5.2. Using fork() with Managed Memory ..... 261
J.3. Performance Tuning ..... 261
J.3.1. Data Prefetching ..... 262