A Better Parallel Scan Algorithm

1. Read input from device to shared memory.
2. Iterate log(n) times; stride from 1 to n-1: double stride each iteration.

ITERATION = 2, STRIDE = 2

XY (input):          3  1  7  0  4  1  6  3
XY (after STRIDE 1): 3  4  8  7  4  5  7  9
XY (after STRIDE 2): 3  4 11 11 12 12 11 14
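The stride-doubling step can be checked with a small sequential simulation. This is only a sketch, not the kernel from the slides: the names `stride_doubling_scan` and `history` are made up for illustration, the list `xy` stands in for the shared-memory array XY, and the per-iteration copy `prev` is an assumed stand-in for the synchronization a GPU kernel would need so that every thread reads the values from before the iteration.

```python
def stride_doubling_scan(values):
    """Inclusive scan by stride doubling, simulated sequentially."""
    xy = list(values)   # stands in for the shared-memory array XY
    history = []        # (stride, copy of XY) recorded after each iteration
    stride = 1
    while stride < len(xy):
        # Snapshot so every "thread" reads pre-iteration values
        # (assumption: models a barrier between reads and writes).
        prev = xy[:]
        # Positions below `stride` already hold their final value.
        for i in range(stride, len(xy)):
            xy[i] = prev[i] + prev[i - stride]
        history.append((stride, xy[:]))
        stride *= 2     # double the stride each iteration
    return xy, history


result, history = stride_doubling_scan([3, 1, 7, 0, 4, 1, 6, 3])
for stride, snapshot in history:
    print(f"STRIDE {stride}: {snapshot}")
# STRIDE 1: [3, 4, 8, 7, 4, 5, 7, 9]
# STRIDE 2: [3, 4, 11, 11, 12, 12, 11, 14]
# STRIDE 4: [3, 4, 11, 11, 15, 16, 22, 25]
```

The first two printed rows match the XY rows on the slide. For n = 8 a third iteration with stride 4 completes the inclusive scan, which gives log(n) = 3 iterations in total.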