Join the discussion @ p2p.wrox.com

Professional CUDA C Programming

Foreword by Dr. Barbara Chapman, Center for Advanced Computing and Data Systems, University of Houston

John Cheng, Max Grossman, Ty McKercher

NVIDIA

Wrox, A Wiley Brand
Professional CUDA® C Programming

Published by
John Wiley & Sons, Inc.
10475 Crosspoint Boulevard
Indianapolis, IN 46256
www.wiley.com

Copyright © 2014 by John Wiley & Sons, Inc., Indianapolis, Indiana

Published simultaneously in Canada

ISBN: 978-1-118-73932-7
ISBN: 978-1-118-73927-3 (ebk)
ISBN: 978-1-118-73931-0 (ebk)

Manufactured in the United States of America

10 9 8 7 6 5 4 3 2 1

No part of this publication may be reproduced, stored in a retrieval system or transmitted in any form or by any means, electronic, mechanical, photocopying, recording, scanning or otherwise, except as permitted under Sections 107 or 108 of the 1976 United States Copyright Act, without either the prior written permission of the Publisher, or authorization through payment of the appropriate per-copy fee to the Copyright Clearance Center, 222 Rosewood Drive, Danvers, MA 01923, (978) 750-8400, fax (978) 646-8600. Requests to the Publisher for permission should be addressed to the Permissions Department, John Wiley & Sons, Inc., 111 River Street, Hoboken, NJ 07030, (201) 748-6011, fax (201) 748-6008, or online at http://www.wiley.com/go/permissions.

Limit of Liability/Disclaimer of Warranty: The publisher and the author make no representations or warranties with respect to the accuracy or completeness of the contents of this work and specifically disclaim all warranties, including without limitation warranties of fitness for a particular purpose. No warranty may be created or extended by sales or promotional materials. The advice and strategies contained herein may not be suitable for every situation. This work is sold with the understanding that the publisher is not engaged in rendering legal, accounting, or other professional services. If professional assistance is required, the services of a competent professional person should be sought. Neither the publisher nor the author shall be liable for damages arising herefrom. The fact that an organization or Web site is referred to in this work as a citation and/or a potential source of further information does not mean that the author or the publisher endorses the information the organization or Web site may provide or recommendations it may make. Further, readers should be aware that Internet Web sites listed in this work may have changed or disappeared between when this work was written and when it is read.

For general information on our other products and services please contact our Customer Care Department within the United States at (877) 762-2974, outside the United States at (317) 572-3993 or fax (317) 572-4002.

Wiley publishes in a variety of print and electronic formats and by print-on-demand. Some material included with standard print versions of this book may not be included in e-books or in print-on-demand. If this book refers to media such as a CD or DVD that is not included in the version you purchased, you may download this material at http://booksupport.wiley.com. For more information about Wiley products, visit www.wiley.com.

Library of Congress Control Number: 2014937184

Trademarks: Wiley, Wrox, the Wrox logo, Programmer to Programmer, and related trade dress are trademarks or registered trademarks of John Wiley & Sons, Inc. and/or its affiliates, in the United States and other countries, and may not be used without written permission. CUDA is a registered trademark of NVIDIA Corporation. All other trademarks are the property of their respective owners. John Wiley & Sons, Inc., is not associated with any product or vendor mentioned in this book.
CREDITS

ACQUISITIONS EDITOR
Mary James

PROJECT EDITOR
Martin V. Minner

TECHNICAL EDITORS
Wei Zhang
Chao Zhao

PRODUCTION MANAGER
Kathleen Wisor

COPY EDITOR
Katherine Burt

MANAGER OF CONTENT DEVELOPMENT AND ASSEMBLY
Mary Beth Wakefield

DIRECTOR OF COMMUNITY MARKETING
David Mayhew

MARKETING MANAGER
Carrie Sherrill

BUSINESS MANAGER
Amy Knies

VICE PRESIDENT AND EXECUTIVE GROUP PUBLISHER
Richard Swadley

ASSOCIATE PUBLISHER
Jim Minatel

PROJECT COORDINATOR, COVER
Patrick Redmond

PROOFREADER
Nancy Carrasco

INDEXER
Johnna VanHoose Dinse

COVER DESIGNER
Wiley

COVER IMAGE
© iStock.com/fatido
ABOUT THE AUTHORS

JOHN (RUNWEI) CHENG is a research scientist with extensive industry experience in high-performance computing on heterogeneous computing platforms. Before joining the oil and gas industry, John worked in the finance industry for more than ten years as an expert in computational intelligence, providing advanced solutions based on genetic algorithms hybridized with data mining and statistical learning to solve real world business challenges. As an internationally recognized researcher in the field of genetic algorithms and their application to industrial engineering, John has co-authored three books. John's first book, Genetic Algorithms and Engineering Design, published by John Wiley and Sons in 1997, is still used as a textbook in universities worldwide. John has a wide range of experience in both academic research and industry development, and is gifted in making complex subjects accessible to readers with a concise, illustrative, and edifying approach. John earned his doctoral degree in computational intelligence from the Tokyo Institute of Technology.

MAX GROSSMAN has been working as a developer with various GPU programming models for nearly a decade. His experience is focused on developing novel GPU programming models and implementing scientific algorithms on GPU hardware. Max has applied GPUs to a wide range of domains, including geoscience, plasma physics, medical imaging, and machine learning, and enjoys understanding the computational patterns of new domains and finding new and unusual ways to apply GPUs to them. Lessons learned from these domains help to guide Max's work in programming models and frameworks. Max earned his degree in computer science from Rice University with a focus on parallel computing.

TY MCKERCHER is a Principal Solution Architect with NVIDIA, leading a team that specializes in visual computing systems architecture across multiple industries. He often serves as a liaison between customer and product engineering teams during emerging technology evaluations. He has been engaged in CUDA-based projects since he participated in the first CUDA kitchen training session held at NVIDIA headquarters in 2006. Since then, Ty has helped architect GPU-based supercomputing environments at some of the largest and most demanding production datacenters in the world. Ty earned his mathematics degree with emphasis in geophysics and computer science from the Colorado School of Mines.
ABOUT THE TECHNICAL EDITORS

WEI ZHANG is a scientific programmer and has been working in the high-performance computing area for 15 years. He has developed or co-developed many scientific software packages for molecular simulation, computer-aided drug design, EM structure reconstruction, and seismic depth imaging. He is now focusing his effort on improving the performance of seismic data processing using new technologies such as CUDA.

CHAO ZHAO joined Chevron in 2008 and currently serves as Geophysical Application Software Development Specialist. In this role, Chao is responsible for designing and developing software products for geoscientists. Prior to joining Chevron, Chao was a software developer for Knowledge Systems Inc. and Seismic Micro Technology Inc. With more than 13 years of software development experience in the exploration and production industry, Chao has gained rich knowledge in the fields of geology and geophysics. Having a broad education in science, Chao likes to see CUDA programming used widely in scientific research and enjoys contributing to it as much as he can. He holds a Bachelor of Science degree in chemistry from Peking University and a Master of Science in computer science from the University of Rhode Island.
ACKNOWLEDGMENTS

IT WOULD BE HARD TO IMAGINE this project making it to the finish line without the suggestions, constructive criticisms, help, and resources of our colleagues and friends.

We would like to express our thanks to NVIDIA for granting access to many GTC conference presentations and CUDA technical documents that add both great value and authority to this book. In particular, we owe much gratitude to Dr. Paulius Micikevicius and Dr. Peng Wang, Developer Technology Engineers at NVIDIA, for their kind advice and help during the writing of this book. Special thanks to Mark Ebersole, NVIDIA Chief CUDA Educator, for his guidance and feedback during the review process.

We would like to thank Mr. Will Ramey, Sr. Product Manager at NVIDIA, and Mr. Nadeem Mohammad, Product Marketing at NVIDIA, for their support and encouragement during the entire project.

We would like to thank Mr. Paul Holzhauer, Director of Oil & Gas at NVIDIA, for his support during the initial phase of this project.

Especially, we owe an enormous debt of gratitude to many presenters and speakers in past GTC conferences for their inspiring and creative work on GPU computing technologies. We have recorded all your credits in our suggested reading lists.

After years of work using GPUs in real production projects, John is very grateful to the people who helped him become a GPU computing enthusiast. Especially, John would like to thank Dr. Nanxun Dai and Dr. Bao Zhao for their encouragement, support, and guidance on seismic imaging projects at BGP. John also would like to thank his colleagues Dr. Zhengzhen Zhou, Dr. Wei Zhang, Mrs. Grace Zhang, and Mr. Kai Yang. They are truly brilliant and very pleasant to work with. John loves the team and feels very privileged to be one of them. John would like to extend a special thanks to Dr. Mitsuo Gen, an internationally well-known professor and the supervisor of John's doctoral program, for giving John the opportunity to teach at universities in Japan and co-author academic books, and especially for fully supporting John during the years when John was running a startup based on evolutionary computation technologies in Tokyo. John is very happy working on this project with Ty and Max as a team and learned a lot from them during the process of book writing.

John owes a debt of gratitude to his wife, Joly, and his son, Rick, for their love, support, and considerable patience during evenings and weekends over the past year while Dad was yet again "doing his own book work."

For over 25 years, Ty has been helping software developers solve HPC grand challenges. Ty is delighted to work at NVIDIA to help clients extend their current knowledge to unlock the potential from massively parallel GPUs. There are so many NVIDIANs to thank, but Ty would like to specifically recognize Dr. Paulius Micikevicius for his gifted insights and strong desire to always improve while doing the heavy lifting for numerous projects. When John asked Ty to help share
CUDA knowledge in a book project, he welcomed the challenge. Dave Jones, a senior director at NVIDIA, approved Ty's participation in this project; sadly, last year Dave lost his courageous battle against cancer. Our hearts go out to Dave and his family; his memory serves to inspire, to press on, and to pursue your passions. The encouragements from Shanker Trivedi and Marc Hamilton have been especially helpful. Yearning to maintain his life/work balance, Ty recruited Max to join this project. It was truly a pleasure to learn from John and Max as they developed the book content that Ty helped review. Finally, Ty's wife, Judy, and his four children deserve recognition for their unconditional support and love; it is a blessing to receive encouragement and motivation while pursuing those things that bring joy to your life.

Max has been fortunate to collaborate with and be guided by a number of brilliant and talented engineers, researchers, and mentors. First, thanks have to go to Professor Vivek Sarkar and the whole Habanero Research Group at Rice University. There, Max got his first taste of HPC and CUDA. The mentorship of Vivek and others in the group was invaluable in enabling him to explore the exciting world of research. Max would also like to thank Mauricio Araya-Polo and Gladys Gonzalez at Repsol. The experience gained under their mentorship was incredibly valuable in writing a book that would be truly useful to real-world work in science and engineering. Finally, Max would like to thank John and Ty for inviting him along on this writing adventure in CUDA and for the lessons this experience has provided in CUDA, writing, and life.

It would not be possible to make a quality professional book without input from technical editors, development editors, and reviewers. We would like to extend our sincere appreciation to Mary E. James, our acquisitions editor; Martin V. Minner, our project editor; Katherine Burt, our copy editor; and Wei Zhang and Chao Zhao, our technical editors. You are an insightful and professional editorial team and this book would not be what it is without you. It was a great pleasure to work with you on this project.
CONTENTS

FOREWORD xvii
PREFACE xix
INTRODUCTION xxi

CHAPTER 1: HETEROGENEOUS PARALLEL COMPUTING WITH CUDA 1
Parallel Computing 2
Sequential and Parallel Programming 3
Parallelism 4
Computer Architecture 6
Heterogeneous Computing 8
Heterogeneous Architecture 9
Paradigm of Heterogeneous Computing 12
CUDA: A Platform for Heterogeneous Computing 14
Hello World from GPU 17
Is CUDA C Programming Difficult? 20
Summary 21

CHAPTER 2: CUDA PROGRAMMING MODEL 23
Introducing the CUDA Programming Model 23
CUDA Programming Structure 25
Managing Memory 26
Organizing Threads 30
Launching a CUDA Kernel 36
Writing Your Kernel 37
Verifying Your Kernel 39
Handling Errors 40
Compiling and Executing 40
Timing Your Kernel 43
Timing with CPU Timer 44
Timing with nvprof 47
Organizing Parallel Threads 49
Indexing Matrices with Blocks and Threads 49
Summing Matrices with a 2D Grid and 2D Blocks 53
Summing Matrices with a 1D Grid and 1D Blocks 57
Summing Matrices with a 2D Grid and 1D Blocks 58
Managing Devices 60
Using the Runtime API to Query GPU Information 61
Determining the Best GPU 63
Using nvidia-smi to Query GPU Information 63
Setting Devices at Runtime 64
Summary 65

CHAPTER 3: CUDA EXECUTION MODEL 67
Introducing the CUDA Execution Model 67
GPU Architecture Overview 68
The Fermi Architecture 71
The Kepler Architecture 73
Profile-Driven Optimization 78
Understanding the Nature of Warp Execution 80
Warps and Thread Blocks 80
Warp Divergence 82
Resource Partitioning 87
Latency Hiding 90
Occupancy 93
Synchronization 97
Scalability 98
Exposing Parallelism 98
Checking Active Warps with nvprof 100
Checking Memory Operations with nvprof 100
Exposing More Parallelism 101
Avoiding Branch Divergence 104
The Parallel Reduction Problem 104
Divergence in Parallel Reduction 106
Improving Divergence in Parallel Reduction 110
Reducing with Interleaved Pairs 112
Unrolling Loops 114
Reducing with Unrolling 115
Reducing with Unrolled Warps 117
Reducing with Complete Unrolling 119
Reducing with Template Functions 120
Dynamic Parallelism 122
Nested Execution 123
Nested Hello World on the GPU 124
Nested Reduction 128
Summary 132
CHAPTER 4: GLOBAL MEMORY 135
Introducing the CUDA Memory Model 136
Benefits of a Memory Hierarchy 136
CUDA Memory Model 137
Memory Management 145
Memory Allocation and Deallocation 146
Memory Transfer 146
Pinned Memory 148
Zero-Copy Memory 150
Unified Virtual Addressing 156
Unified Memory 157
Memory Access Patterns 158
Aligned and Coalesced Access 158
Global Memory Reads 160
Global Memory Writes 169
Array of Structures versus Structure of Arrays 171
Performance Tuning 176
What Bandwidth Can a Kernel Achieve? 179
Memory Bandwidth 179
Matrix Transpose Problem 180
Matrix Addition with Unified Memory 195
Summary 199

CHAPTER 5: SHARED MEMORY AND CONSTANT MEMORY 203
Introducing CUDA Shared Memory 204
Shared Memory 204
Shared Memory Allocation 206
Shared Memory Banks and Access Mode 206
Configuring the Amount of Shared Memory 212
Synchronization 214
Checking the Data Layout of Shared Memory 216
Square Shared Memory 217
Rectangular Shared Memory 225
Reducing Global Memory Access 232
Parallel Reduction with Shared Memory 232
Parallel Reduction with Unrolling 236
Parallel Reduction with Dynamic Shared Memory 238
Effective Bandwidth 239