Parallel Prefix Sum (Scan) with CUDA
April 2007

for d := 0 to log2(n) - 1 do
    for k from 0 to n - 1 by 2^(d+1) in parallel do
        x[k + 2^(d+1) - 1] := x[k + 2^d - 1] + x[k + 2^(d+1) - 1]

Algorithm 3: The up-sweep (reduce) phase of a work-efficient sum scan algorithm (after Blelloch [1]).

Figure 2: An illustration of the up-sweep, or reduce, phase of a work-efficient sum scan algorithm.
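The up-sweep loop of Algorithm 3 can be sketched as a sequential C++ routine to make the index arithmetic concrete. This is only an illustrative CPU emulation, not the CUDA kernel from the paper: the function name `upSweep`, the `int` element type, and the power-of-two length check are assumptions introduced here. The inner loop runs one iteration at a time, whereas in the parallel algorithm all iterations for a given d are independent and execute concurrently.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sequential sketch of the up-sweep (reduce) phase of Algorithm 3.
// Hypothetical helper: the name and signature are not from the paper.
// Assumes x.size() is a nonzero power of two.
void upSweep(std::vector<int>& x) {
    const std::size_t n = x.size();
    assert(n != 0 && (n & (n - 1)) == 0);

    // stride plays the role of 2^(d+1) for d = 0 .. log2(n) - 1.
    for (std::size_t stride = 2; stride <= n; stride *= 2) {
        const std::size_t half = stride / 2;  // 2^d

        // For a fixed d these iterations touch disjoint elements,
        // so a parallel version could run them all at once.
        for (std::size_t k = 0; k < n; k += stride) {
            // x[k + 2^(d+1) - 1] := x[k + 2^d - 1] + x[k + 2^(d+1) - 1]
            x[k + stride - 1] += x[k + half - 1];
        }
    }
}
```

For the input {1, 2, 3, 4, 5, 6, 7, 8}, the array after the up-sweep is {1, 3, 3, 10, 5, 11, 7, 36}: the last element holds the sum of all inputs, and the other odd-indexed positions hold the partial sums of the reduction tree.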