7.7 Quasi-(that is,Sub-)Random Sequences 309 CITED REFERENCES AND FURTHER READING: Hammersley,J.M.,and Handscomb,D.C.1964,Monte Carlo Methods (London:Methuen). Shreider,Yu.A.(ed.)1966,The Monte Carlo Method (Oxford:Pergamon). Sobol',I.M.1974,The Monte Carlo Method(Chicago:University of Chicago Press). Kalos,M.H.,and Whitlock,P.A.1986,Monte Carlo Methods(New York:Wiley). 7.7 Quasi-(that is,Sub-)Random Sequences 三 8=g We have just seen that choosing N points uniformly randomly in an n- dimensional space leads to an error term in Monte Carlo integration that decreases g as 1/vN.In essence,each new point sampled adds linearly to an accumulated sum that will become the function average,and also linearly to an accumulated sum of squares that will become the variance (equation 7.6.2).The estimated error comes from the square root of this variance,hence the power N-1/2. Just because this square root convergence is familiar does not,however,mean 9 that it is inevitable.A simple counterexample is to choose sample points that lie on a Cartesian grid,and to sample each grid point exactly once (in whatever order). The Monte Carlo method thus becomes a deterministic quadrature scheme-albeit a simple one-whose fractional error decreases at least as fast as N-1(even faster if the function goes to zero smoothly at the boundaries of the sampled region,or is periodic in the region). 、屋号 The trouble with a grid is that one has to decide in advance how fine it should be.One is then committed to completing all of its sample points.With a grid,it is not convenient to "sample until some convergence or termination criterion is met A 6 One might ask if there is not some intermediate scheme,some way to pick sample points"at random,"yet spread out in some self-avoiding way,avoiding the chance clustering that occurs with uniformly random points. A similar question arises for tasks other than Monte Carlo integration.We might want to search an n-dimensional space for a point where some (locally computable) Numerica 10621 condition holds.Of course,for the task to be computationally meaningful,there had better be continuity,so that the desired condition will hold in some finite n- 431 dimensional neighborhood.We may not know a priori how large that neighborhood Recipes is,however.We want to "sample until the desired point is found,moving smoothly to finer scales with increasing samples.Is there any way to do this that is better (outside than uncorrelated,random samples? North The answer to the above question is "yes."Sequences of n-tuples that fill n-space more uniformly than uncorrelated random points are called guasi-random sequences.That term is somewhat of a misnomer,since there is nothing"random' about quasi-random sequences:They are cleverly crafted to be,in fact,sub-random. The sample points in a quasi-random sequence are,in a precise sense,"maximally avoiding"of each other. A conceptually simple example is Halton's sequence [11.In one dimension,the jth number H;in the sequence is obtained by the following steps:(i)Write j as a number in base b,where b is some prime.(For examplej=17 in base b=3 is 122.) (ii)Reverse the digits and put a radix point(ie.,a decimal point base b)in front of
7.7 Quasi- (that is, Sub-) Random Sequences 309 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). CITED REFERENCES AND FURTHER READING: Hammersley, J.M., and Handscomb, D.C. 1964, Monte Carlo Methods (London: Methuen). Shreider, Yu. A. (ed.) 1966, The Monte Carlo Method (Oxford: Pergamon). Sobol’, I.M. 1974, The Monte Carlo Method (Chicago: University of Chicago Press). Kalos, M.H., and Whitlock, P.A. 1986, Monte Carlo Methods (New York: Wiley). 7.7 Quasi- (that is, Sub-) Random Sequences We have just seen that choosing N points uniformly randomly in an ndimensional space leads to an error term in Monte Carlo integration that decreases as 1/ √ N. In essence, each new point sampled adds linearly to an accumulated sum that will become the function average, and also linearly to an accumulated sum of squares that will become the variance (equation 7.6.2). The estimated error comes from the square root of this variance, hence the power N −1/2. Just because this square root convergence is familiar does not, however, mean that it is inevitable. A simple counterexample is to choose sample points that lie on a Cartesian grid, and to sample each grid point exactly once (in whatever order). The Monte Carlo method thus becomes a deterministic quadrature scheme — albeit a simple one — whose fractional error decreases at least as fast as N −1 (even faster if the function goes to zero smoothly at the boundaries of the sampled region, or is periodic in the region). The trouble with a grid is that one has to decide in advance how fine it should be. One is then committed to completing all of its sample points. With a grid, it is not convenient to “sample until” some convergence or termination criterion is met. One might ask if there is not some intermediate scheme, some way to pick sample points “at random,” yet spread out in some self-avoiding way, avoiding the chance clustering that occurs with uniformly random points. A similar question arises for tasks other than Monte Carlo integration. We might want to search an n-dimensional space for a point where some (locally computable) condition holds. Of course, for the task to be computationally meaningful, there had better be continuity, so that the desired condition will hold in some finite ndimensional neighborhood. We may not know a priori how large that neighborhood is, however. We want to “sample until” the desired point is found, moving smoothly to finer scales with increasing samples. Is there any way to do this that is better than uncorrelated, random samples? The answer to the above question is “yes.” Sequences of n-tuples that fill n-space more uniformly than uncorrelated random points are called quasi-random sequences. That term is somewhat of a misnomer, since there is nothing “random” about quasi-random sequences: They are cleverly crafted to be, in fact, sub-random. The sample points in a quasi-random sequence are, in a precise sense, “maximally avoiding” of each other. A conceptually simple example is Halton’s sequence [1]. In one dimension, the jth number Hj in the sequence is obtained by the following steps: (i) Write j as a number in base b, where b is some prime. (For example j = 17 in base b = 3 is 122.) (ii) Reverse the digits and put a radix point (i.e., a decimal point base b) in front of
310 Chapter 7. Random Numbers .8 6 .6 .4 .4 2 83 旦 granted for 19881992 0 0 0 .2 4 .6 8 0 .2 .4 .6 8 11800 points I to 128 points 129 to 512 1T, Cambridge University Press. users to make one paper from NUMERICAL RECIPES IN C: f 8 server computer, (North America THE 6 .4C Programs 2 2 email strictly prohibited. Copyright (C) 0 to dir 0 .2 .4 .6 .8 1 0 .6 ART OF SCIENTIFIC COMPUTING(ISBN points 513 to 1024 points 1 to 1024 1881992 Figure 7.7.1.First 1024 points of a two-dimensional Sobol'sequence.The sequence is generated number-theoretically,rather than randomly,so successive points at any stage "know"how to fill in the gaps in the previously generated distribution. Numerical Recipes 10-521 43108 the sequence.(In the example,we get 0.221 base 3.)The result is Hj.To get a sequence of n-tuples in n-space,you make each component a Halton sequence with a different prime base b.Typically,the first n primes are used. (outside It is not hard to see how Halton's sequence works:Every time the number of North Software. digits in j increases by one place,j's digit-reversed fraction becomes a factor of b finer-meshed.Thus the process is one of filling in all the points on a sequence of finer and finer Cartesian grids-and in a kind of maximally spread-out order visit website on each grid (since,e.g.,the most rapidly changing digit in i controls the most machine significant digit of the fraction). Other ways of generating quasi-random sequences have been suggested by Faure,Sobol',Niederreiter,and others.Bratley and Fox [2]provide a good review and references,and discuss a particularly efficient variant of the Sobol'[3]sequence suggested by Antonov and Saleev[41.It is this Antonov-Saleev variant whose implementation we now discuss
310 Chapter 7. Random Numbers Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). . . . . . . . . .. . . . . . . . . . . . . . . . . .. . . . . .. . . . . . . . . . . . . . . . . .. . . . . . . . . . .. . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . 0 .2 .4 .6 .8 1 points 1 to 128 0 .2 .4 .6 .8 1 . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . ... .. . . .. . . . . .. . . . . . . . . . . . . . . . ... .. .. . . . . .. .. . . . .. . . . .. . . . . . .. . . . . . . . . . .. . . . . . . . . .. . . . . . . . . .. . . . . .... .. .. .. .. .. ... ... ... . .. . .. .... . .. .. . .. . .. . . .. . . .. ... . . ..... . . . . . ... .. .. . ... .. .. ... . . . . .. .... . . . . . ... . . . . .. . . . . ... .. . . . .. .. .. .. .. .... .. . . . .. . .. .... .. . . ......... . .. . .... .. .. ... ....... ... . . .. .. ... . ... . . . ..... .. .. ... . . . .. .. .. . .... .. ... . . .. . . . .. . .. .. . . ... .. 0 .2 .4 .6 .8 1 points 129 to 512 0 .2 .4 .6 .8 1 ... .. . .. . ...... .. . . ....... ...... ....... ..... . . ... .. ...... ......... .. . ...... .. . . . . .. .. . . .. ... .... .. ....... ... . . ... .. .. . . .. . .. ... . .. .. ..... .. .... . .... .... ... .. ... ...... .. .. .. ... . . . .. .. ........ .. ..... .. .. .... .. .... .. ... ... . . .. .. ..... .. .. ....... . ... . . . .. .. . . .. ... .. .. .. ..... ... . ... .. . . . . . . . ....... . . .. ..... ... . .... . .... .. .... .. ... ... .. ... .. ..... .. ... .. ... ... .. ... . ... ... ..... ... .. . . .. .. .. .. . ... . .. . . ... ... . ........ ...... ...... .... ... . ... . ... ... . ...... .. . ..... .... .... ..... .. . . .... . . . . . . 0 .2 .4 .6 .8 1 points 513 to 1024 0 .2 .4 .6 .8 1 ........ ... . .... ... . .... ..... . .. .. ........ ... . .. .. . .. ..... .. . ... ........ ...... . ... .. .. .... . .... .... . . . . ... . ... ... .. .. .. ...... .. .. ... ....... ... .. . ... ..... . .. . ... . .. .... ........ ... . .. .. .. . .. .. ... ......... .. ... ... . . . . .... . . ... ... ..... .. .. . . ..... . .. .. . . .. . ..... .. .. ... .... ....... . . ... .... . ..... .. .. ... ...... .... .. ... . ... . . ................ ... . ..... . ... .. .. .. . . . ..... .. ... .... ... ..... . ...... .. .. ....... .... . ... . . .... . . .. . .. .. . .. .. . ... . ... ... . .. ... .. ... . ....... .. . .... .. ..... . ..... .. . .... .... ... .. .. .. .. .... . ..... ... ...... . ..... ..... . .. . ... ..... . ... ... . .. ... . .. . .... .. .. ... ... . . .... .. ... . .. ..... .. . ... ....... ... ... . ........ . .. .... . .. .. .. .. .. ... ... ..... . .. .... . .. ... .... . .. .... . .... .. .. .. .. . .. .. .. . ... .. ... .. .. .. .. ... . .. . . .. ... . .. ... . .. .. . ....... . .... .. ........... ... . .. ... ... ... .. .. ... . .. . .. ....... . ....... ....... . ... . ... . . ... ..... .. .... ...... .. ... .. .. . . . . ... .. . ..... .... . ... .... . ...... ..... ... ... .. . .. .. .. .. ... .. .. ... ... . . . . .. . .. .. . ... . . .. . . . . . . . ... . .. .. . .. .. ....... ..... .... .. .. ..... ... .. .. 0 .2 .4 .6 .8 1 points 1 to 1024 0 .2 .4 .6 .8 1 Figure 7.7.1. First 1024 points of a two-dimensional Sobol’ sequence. The sequence is generated number-theoretically, rather than randomly, so successive points at any stage “know” how to fill in the gaps in the previously generated distribution. the sequence. (In the example, we get 0.221 base 3.) The result is Hj . To get a sequence of n-tuples in n-space, you make each component a Halton sequence with a different prime base b. Typically, the first n primes are used. It is not hard to see how Halton’s sequence works: Every time the number of digits in j increases by one place, j’s digit-reversed fraction becomes a factor of b finer-meshed. Thus the process is one of filling in all the points on a sequence of finer and finer Cartesian grids — and in a kind of maximally spread-out order on each grid (since, e.g., the most rapidly changing digit in j controls the most significant digit of the fraction). Other ways of generating quasi-random sequences have been suggested by Faure, Sobol’, Niederreiter, and others. Bratley and Fox [2] provide a good review and references, and discuss a particularly efficient variant of the Sobol’ [3] sequence suggested by Antonov and Saleev [4]. It is this Antonov-Saleev variant whose implementation we now discuss
7.7 Quasi-(that is,Sub-)Random Sequences 311 Degree Primitive Polynomials Modulo 2* 0(i.e.,x+1) 2 1(i.e,x2+x+1) 3 l,2(.e,x3+x+1andx3+x2+1) 1,4(i.e,x4+x+1andx4+x3+1) 2,4,711,13,14 6 1,13,16.19,22,25 1,4.7,8.14,19,21,28.31,32,37,41,42,50,55,56.59,62 8 14,21,22,38,47,49,50,52,56,67,70,84,97,103,115,122 (including this one) 9 8,13.16,22,25,44,47,52,55,59,62,67,7481,82,87,91,94,103,104,109,122 124.137,138,143,145,152,157,167,173,176,181,182.,185,191,194.199,218,220 11-800872 227,229,230,234,236,241,244,253 from NUMERICAL RECIPES IN 公 4,13,19.22,50,55,64,69,98.107,115.121,127,134.140,145,152,158.161,171 (North 181,194,199,203,208,227,242,251,253,265,266,274,283,289,295,301,316 319,324,346.352,361,367,382,395,398,400,412,419,422,426,428.433.446 America computer, 1988-1992 by Cambridge University Press. THE 454,457,472,493.505,508 ART *Expressed as a decimal integer representing the interior bits (that is,omitting the Programs high-order bit and the unit bit). The Sobol'sequence generates numbers between zero and one directly as binary fractions of length w bits,from a set of w special binary fractions,V.i=1.2.....w,called direction numbers.In Sobol's original method,the ith number X;is generated by XORing (bitwise exclusive or)together the set ofVi's satisfying the criterion on i,"the ith bit ofj is nonzero." As j increments,in other words,different ones of the Vi's flash in and out of X;on different time scales.Vi alternates between being present and absent most quickly,while V goes from present to absent(or vice versa)only every 2-1 steps. Antonov and Saleev's contribution was to show that instead of using the bits of the 10-521 integer j to select direction numbers,one could just as well use the bits of the Gray code of j,G(j).(For a quick review of Gray codes,look at $20.2.) Fuurggoglrion 43108 Now G(j)and G(j+1)differ in exactly one bit position,namely in the position of the Numerical Recipes rightmost zero bit in the binary representation ofj(adding a leading zero to j if necessary).A consequence is that thej+1st Sobol'-Antonov-Saleev number can be obtained from the jth (outside by XORing it with a single Vi,namely with i the position of the rightmost zero bit in j.This North Software. makes the calculation of the sequence very efficient,as we shall see. Figure 7.7.1 plots the first 1024 points generated by a two-dimensional Sobol'sequence. One sees that successive points do"know"about the gaps left previously,and keep filling them in,hierarchically. We have deferred to this point a discussion ofhow the direction numbers Vi are generated. Some nontrivial mathematics is involved in that,so we will content ourself with a cookbook summary only:Each different Sobol'sequence(or component of an n-dimensional sequence) is based on a different primitive polynomial over the integers modulo 2,that is,a polynomial whose coefficients are either 0 or 1,and which generates a maximal length shift register sequence.(Primitive polynomials modulo 2 were used in 87.4,and are further discussed in $20.3.)Suppose P is such a polynomial,of degree g, P=x9+a1x9-1+a2x9-2+…+ag-1x+1 (7.7.1)
7.7 Quasi- (that is, Sub-) Random Sequences 311 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Degree Primitive Polynomials Modulo 2* 1 0 (i.e., x + 1) 2 1 (i.e., x2 + x + 1) 3 1, 2 (i.e., x3 + x + 1 and x3 + x2 + 1) 4 1, 4 (i.e., x4 + x + 1 and x4 + x3 + 1) 5 2, 4, 7, 11, 13, 14 6 1, 13, 16, 19, 22, 25 7 1, 4, 7, 8, 14, 19, 21, 28, 31, 32, 37, 41, 42, 50, 55, 56, 59, 62 8 14, 21, 22, 38, 47, 49, 50, 52, 56, 67, 70, 84, 97, 103, 115, 122 9 8, 13, 16, 22, 25, 44, 47, 52, 55, 59, 62, 67, 74, 81, 82, 87, 91, 94, 103, 104, 109, 122, 124, 137, 138, 143, 145, 152, 157, 167, 173, 176, 181, 182, 185, 191, 194, 199, 218, 220, 227, 229, 230, 234, 236, 241, 244, 253 10 4, 13, 19, 22, 50, 55, 64, 69, 98, 107, 115, 121, 127, 134, 140, 145, 152, 158, 161, 171, 181, 194, 199, 203, 208, 227, 242, 251, 253, 265, 266, 274, 283, 289, 295, 301, 316, 319, 324, 346, 352, 361, 367, 382, 395, 398, 400, 412, 419, 422, 426, 428, 433, 446, 454, 457, 472, 493, 505, 508 *Expressed as a decimal integer representing the interior bits (that is, omitting the high-order bit and the unit bit). The Sobol’sequence generates numbers between zero and one directly as binary fractions of length w bits, from a set of w special binary fractions, Vi, i = 1, 2,...,w, called direction numbers. In Sobol’s original method, the jth number Xj is generated by XORing (bitwise exclusive or) together the set of Vi’s satisfying the criterion on i, “the ith bit of j is nonzero.” As j increments, in other words, different ones of the Vi’s flash in and out of Xj on different time scales. V1 alternates between being present and absent most quickly, while Vk goes from present to absent (or vice versa) only every 2k−1 steps. Antonov and Saleev’s contribution was to show that instead of using the bits of the integer j to select direction numbers, one could just as well use the bits of the Gray code of j, G(j). (For a quick review of Gray codes, look at §20.2.) Now G(j) and G(j + 1) differ in exactly one bit position, namely in the position of the rightmost zero bit in the binary representation of j (adding a leading zero to j if necessary). A consequence is that the j + 1st Sobol’-Antonov-Saleev number can be obtained from the jth by XORing it with a single Vi, namely with i the position of the rightmost zero bit in j. This makes the calculation of the sequence very efficient, as we shall see. Figure 7.7.1 plots the first 1024 points generated by a two-dimensional Sobol’ sequence. One sees that successive points do “know” about the gaps left previously, and keep filling them in, hierarchically. We have deferred to this point a discussion of how the direction numbers Vi are generated. Some nontrivial mathematics is involved in that, so we will content ourself with a cookbook summary only: Each different Sobol’ sequence (or component of an n-dimensional sequence) is based on a different primitive polynomial over the integers modulo 2, that is, a polynomial whose coefficients are either 0 or 1, and which generates a maximal length shift register sequence. (Primitive polynomials modulo 2 were used in §7.4, and are further discussed in §20.3.) Suppose P is such a polynomial, of degree q, P = xq + a1xq−1 + a2xq−2 + ··· + aq−1x +1 (7.7.1)
312 Chapter 7.Random Numbers Initializing Values Used in sobseq Degree Polynomial Starting Values 0 1 (3) (5) (15) 2 1 (7) (11) 3 1 3 7 (5) 44。 3 2 3 3 (15) 3 http://www.nr read able files Permission is 4 1 5 9 83 Parenthesized values are not freely specifiable,but are forced by the required recurrence for this degree. 11-600 (including this one) -872 Define a sequence of integers Mi by the g-term recurrence relation, to any server from NUMERICAL RECIPES IN 19881992020ammm09e M4=2a1M-1⊕22a2M-2⊕…⊕29-1M-g+1ag-1⊕(29M1-g⊕Mi-g)(7.7.2) Here bitwise XOR is denoted by The starting values for this recurrence are that Mi,...,Mg (North can be arbitrary odd integers less than 2,...24,respectively.Then,the direction numbers Vi are given by America computer make one paper e University Press. THE V=M/2i=1,,w ART (7.7.3) The accompanying table lists all primitive polynomials modulo 2 with degree g 10. Programs Since the coefficients are either 0 or 1,and since the coefficients of r9 and of 1 are predictably 1,it is convenient to denote a polynomial by its middle coefficients taken as the bits of a binary number(higher powers of being more significant bits).The table uses this convention. Turn now to the implementation of the Sobol'sequence.Successive calls to the function sobseq(after a preliminary initializing call)return successive points in an n-dimensional Sobol'sequence based on the first n primitive polynomials in the table.As given,the routine is initialized for maximum n of 6 dimensions,and for a word length w of 30 bits.These OF SCIENTIFIC COMPUTING(ISBN parameters can be altered by changing MAXBIT (=w)and MAXDIM,and by adding more initializing data to the arrays ip (the primitive polynomials from the table),mdeg (their 1788-1982 degrees),and iv(the starting values for the recurrence,equation 7.7.2).A second table, above,elucidates the initializing data in the routine #include "nrutil.h" #define MAXBIT 30 Numerical Recipes 10-621 43108 #define MAXDIM 6 void sobseq(int *n,float x[]) (outside When n is negative,internally initializes a set of MAXBIT direction numbers for each of MAXDIM North Software. different Sobol'sequences.When n is positive (but <MAXDIM),returns as the vector x[1..n] the next values from n of these sequences.(n must not be changed between initializations. Ame int j,k,l; unsigned long i,im,ipp; static float fac; static unsigned long in,ix [MAXDIM+1],*iu[MAXBIT+1]; static unsigned long mdeg[MAXDIM+1]={0,1,2,3,3,4,4}; static unsigned long ip[MAXDIM+1]={0,0,1,1,2,1,4}; static unsigned long iv [MAXDIM*MAXBIT+1]={ 0,1,1,1,1,1,1,3,1,3,3,1,1,5,7,7,3,3,5,15,11,5,15,13,9]: if(*n<0){ Initialize,don't return a vector. for (k=1;k<=MAXDIM;k++)ix[k]=0;
312 Chapter 7. Random Numbers Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Initializing Values Used in sobseq Degree Polynomial Starting Values 1 0 1 (3) (5) (15) ... 2 1 1 1 (7) (11) ... 3 1 1 3 7 (5) ... 3 2 1 3 3 (15) ... 4 1 1 1 3 13 ... 4 4 1 1 5 9 ... Parenthesized values are not freely specifiable, but are forced by the required recurrence for this degree. Define a sequence of integers Mi by the q-term recurrence relation, Mi = 2a1Mi−1 ⊕ 22 a2Mi−2 ⊕···⊕ 2q−1 Mi−q+1aq−1 ⊕ (2q Mi−q ⊕ Mi−q) (7.7.2) Here bitwise XOR is denoted by ⊕. The starting values for this recurrence are that M1,...,Mq can be arbitrary odd integers less than 2,..., 2q, respectively. Then, the direction numbers Vi are given by Vi = Mi/2i i = 1,...,w (7.7.3) The accompanying table lists all primitive polynomials modulo 2 with degree q ≤ 10. Since the coefficients are either 0 or 1, and since the coefficients of xq and of 1 are predictably 1, it is convenient to denote a polynomial by its middle coefficients taken as the bits of a binary number (higher powers of x being more significant bits). The table uses this convention. Turn now to the implementation of the Sobol’ sequence. Successive calls to the function sobseq (after a preliminary initializing call) return successive points in an n-dimensional Sobol’ sequence based on the first n primitive polynomials in the table. As given, the routine is initialized for maximum n of 6 dimensions, and for a word length w of 30 bits. These parameters can be altered by changing MAXBIT (≡ w) and MAXDIM, and by adding more initializing data to the arrays ip (the primitive polynomials from the table), mdeg (their degrees), and iv (the starting values for the recurrence, equation 7.7.2). A second table, above, elucidates the initializing data in the routine. #include "nrutil.h" #define MAXBIT 30 #define MAXDIM 6 void sobseq(int *n, float x[]) When n is negative, internally initializes a set of MAXBIT direction numbers for each of MAXDIM different Sobol’ sequences. When n is positive (but ≤MAXDIM), returns as the vector x[1..n] the next values from n of these sequences. (n must not be changed between initializations.) { int j,k,l; unsigned long i,im,ipp; static float fac; static unsigned long in,ix[MAXDIM+1],*iu[MAXBIT+1]; static unsigned long mdeg[MAXDIM+1]={0,1,2,3,3,4,4}; static unsigned long ip[MAXDIM+1]={0,0,1,1,2,1,4}; static unsigned long iv[MAXDIM*MAXBIT+1]={ 0,1,1,1,1,1,1,3,1,3,3,1,1,5,7,7,3,3,5,15,11,5,15,13,9}; if (*n < 0) { Initialize, don’t return a vector. for (k=1;k<=MAXDIM;k++) ix[k]=0;
7.7 Quasi-(that is,Sub-)Random Sequences 313 in=0; if (iv[1]!1)return; fac=1.0/(1L >mdeg[k]); for(1=mdeg[k]-1;1>=1;1--){ if (ipp 1)i iu[j-1][k] 1pp>=1; 鱼 18881892 iu[j][k]=i; 2 g NUMERICAL else Calculate the next vector in the se- imsin++; quence. for (j=1;j>=1; 令 if (j MAXBIT)nrerror("MAXBIT too small in sobseg"); im=(j-1)*MAXDIM: Americ computer Press. for (k=1;k<=IMIN(*n,MAXDIM);k++){ XOR the appropriate direction num- ix[k]“=1v[im+k]; ber into each component of the x[k]=ix[k]*fac; vector and convert to a floating number. SCIENTIFIC How good is a Sobol'sequence,anyway?For Monte Carlo integration of a smooth 6 function in n dimensions,the answer is that the fractional error will decrease with N,the number of samples,as(In N)"/N,i.e.,almost as fast as 1/N.As an example,let us integrate a function that is nonzero inside a torus (doughnut)in three-dimensional space.If the major radius of the torus is Ro,the minor radial coordinate r is defined by 1920 COMPUTING (ISBN r=(【e2+-P+2e (7.7.4) 10621 Let us try the function Numerica uctio 431086 f(x,y,z) cos r<To Recipes (7.7.5) 0 r20 (outside which can be integrated analytically in cylindrical coordinates,giving North Software. dr dy dz f(,y,z)=2n2a2Ro (7.7.6) With parameters Ro =0.6,ro =0.3,we did 100 successive Monte Carlo integrations of equation (7.7.4),sampling uniformly in the region-1 <y,z<1,for the two cases of uncorrelated random points and the Sobol'sequence generated by the routine sobseq.Figure 7.7.2 shows the results,plotting the r.m.s.average error of the 100 integrations as a function of the number of points sampled.(For any single integration,the error of course wanders from positive to negative,or vice versa,so a logarithmic plot of fractional error is not very informative.)The thin,dashed curve corresponds to uncorrelated random points and shows the familiar N-1/2 asymptotics.The thin,solid gray curve shows the result for the Sobol
7.7 Quasi- (that is, Sub-) Random Sequences 313 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). in=0; if (iv[1] != 1) return; fac=1.0/(1L > mdeg[k]); for (l=mdeg[k]-1;l>=1;l--) { if (ipp & 1) i ^= iu[j-l][k]; ipp >>= 1; } iu[j][k]=i; } } } else { Calculate the next vector in the seim=in++; quence. for (j=1;j>= 1; } if (j > MAXBIT) nrerror("MAXBIT too small in sobseq"); im=(j-1)*MAXDIM; for (k=1;k<=IMIN(*n,MAXDIM);k++) { XOR the appropriate direction number into each component of the vector and convert to a floating number. ix[k] ^= iv[im+k]; x[k]=ix[k]*fac; } } } How good is a Sobol’ sequence, anyway? For Monte Carlo integration of a smooth function in n dimensions, the answer is that the fractional error will decrease with N, the number of samples, as (ln N) n/N, i.e., almost as fast as 1/N. As an example, let us integrate a function that is nonzero inside a torus (doughnut) in three-dimensional space. If the major radius of the torus is R0, the minor radial coordinate r is defined by r = [(x2 + y2 ) 1/2 − R0] 2 + z2 1/2 (7.7.4) Let us try the function f(x, y, z) = 1 + cos πr2 a2 r<r0 0 r ≥ r0 (7.7.5) which can be integrated analytically in cylindrical coordinates, giving dx dy dz f(x, y, z)=2π2 a2 R0 (7.7.6) With parameters R0 = 0.6, r0 = 0.3, we did 100 successive Monte Carlo integrations of equation (7.7.4), sampling uniformly in the region −1 < x, y, z < 1, for the two cases of uncorrelated random points and the Sobol’ sequence generated by the routine sobseq. Figure 7.7.2 shows the results, plotting the r.m.s. average error of the 100 integrations as a function of the number of points sampled. (For any single integration, the error of course wanders from positive to negative, or vice versa, so a logarithmic plot of fractional error is not very informative.) The thin, dashed curve corresponds to uncorrelated random points and shows the familiar N−1/2 asymptotics. The thin, solid gray curve shows the result for the Sobol’
314 Chapter 7.Random Numbers * N-1/2 .01 oc N-2/3 83 pseudo-random,hard boundary... 11-800-872 (including this one) pseudo-random,soft boundary N-1 quasi-random,hard boundary 7423 .001 to any server computer, tusers to make one paper 1988-1992 by Cambridge University Press. from NUMERICAL RECIPES IN C: quasi-random,soft boundary (North America THE 100 1000 10000 105 是 number of points N Programs Figure 7.7.2.Fractional accuracy of Monte Carlo integrations as a function of number of points sampled for two different integrands and two different methods of choosing random points.The quasi-random Sobol'sequence converges much more rapidly than a conventional pseudo-random sequence.Quasi- random sampling does better when the integrand is smooth(soft boundary")than when it has step discontinuities ("hard boundary").The curves shown are the r.m.s.average of 100 trials. to dir sequence.The logarithmic term in the expected(In N)/N is readily apparent as curvature ART OF SCIENTIFIC COMPUTING (ISBN in the curve,but the asymptotic N-is unmistakable. ectcustser 1788-1982 To understand the importance of Figure 7.7.2,suppose that a Monte Carlo integration of f with 1%accuracy is desired.The Sobol'sequence achieves this accuracy in a few thousand v@cam samples,while pseudorandom sampling requires nearly 100,000 samples.The ratio would be even greater for higher desired accuracies. A different,not quite so favorable,case occurs when the function being integrated has hard (discontinuous)boundaries inside the sampling region,for example the function that is Numerical Recipes 10-621 43108 one inside the torus,zero outside, fe,={8 (outside r≥ro (7.7.7) Software. where r is defined in equation(7.7.4).Not by coincidence,this function has the same analytic North integral as the function of equation (7.7.5),namely 2aRo. The carefully hierarchical Sobol'sequence is based on a set of Cartesian grids,but the boundary of the torus has no particular relation to those grids.The result is that it is essentially random whether sampled points in a thin layer at the surface of the torus,containing on the order of N2/3 points,come out to be inside,or outside,the torus.The square root law,applied to this thin layer,gives N/3 fluctuations in the sum,or N-2/3 fractional error in the Monte Carlo integral.One sees this behavior verified in Figure 7.7.2 by the thicker gray curve.The thicker dashed curve in Figure 7.7.2 is the result of integrating the function ofequation(7.7.7) using independent random points.While the advantage of the Sobol'sequence is not quite so dramatic as in the case of a smooth function,it can nonetheless be a significant factor (~5) even at modest accuracies like 1%,and greater at higher accuracies
314 Chapter 7. Random Numbers Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). .001 .01 .1 fractional accuracy of integral number of points N 100 1000 10000 105 ∝N−1/2 ∝N−2/3 ∝N−1 quasi-random, hard boundary pseudo-random, soft boundary pseudo-random, hard boundary quasi-random, soft boundary Figure 7.7.2. Fractional accuracy of Monte Carlo integrations as a function of number of points sampled, for two different integrands and two different methods of choosing random points. The quasi-random Sobol’ sequence converges much more rapidly than a conventional pseudo-random sequence. Quasirandom sampling does better when the integrand is smooth (“soft boundary”) than when it has step discontinuities (“hard boundary”). The curves shown are the r.m.s. average of 100 trials. sequence. The logarithmic term in the expected (ln N) 3/N is readily apparent as curvature in the curve, but the asymptotic N−1 is unmistakable. To understand the importance of Figure 7.7.2, suppose that a Monte Carlo integration of f with 1% accuracy is desired. The Sobol’ sequence achieves this accuracy in a few thousand samples, while pseudorandom sampling requires nearly 100,000 samples. The ratio would be even greater for higher desired accuracies. A different, not quite so favorable, case occurs when the function being integrated has hard (discontinuous) boundaries inside the sampling region, for example the function that is one inside the torus, zero outside, f(x, y, z) = 1 r<r0 0 r ≥ r0 (7.7.7) where r is defined in equation (7.7.4). Not by coincidence, this function has the same analytic integral as the function of equation (7.7.5), namely 2π2a2R0. The carefully hierarchical Sobol’ sequence is based on a set of Cartesian grids, but the boundary of the torus has no particular relation to those grids. The result is that it is essentially random whether sampled points in a thin layer at the surface of the torus, containing on the order of N2/3 points, come out to be inside, or outside, the torus. The square root law, applied to this thin layer, gives N1/3 fluctuations in the sum, or N−2/3 fractional error in the Monte Carlo integral. One sees this behavior verified in Figure 7.7.2 by the thicker gray curve. The thicker dashed curve in Figure 7.7.2 is the result of integrating the function of equation (7.7.7) using independent random points. While the advantage of the Sobol’ sequence is not quite so dramatic as in the case of a smooth function, it can nonetheless be a significant factor (∼5) even at modest accuracies like 1%, and greater at higher accuracies
7.7 Quasi-(that is,Sub-)Random Sequences 315 Note that we have not provided the routine sobseg with a means of starting the sequence at a point other than the beginning,but this feature would be easy to add.Once the initialization of the direction numbers iv has been done,the jth point can be obtained directly by XORing together those direction numbers corresponding to nonzero bits in the Gray code of j,as described above. The Latin Hypercube We might here give passing mention the unrelated technique of Latin square or Latin hypercube sampling,which is useful when you must sample an N-dimensional space exceedingly sparsely,at M points.For example,you may want to test the crashworthiness of cars as a simultaneous function of 4 different design parameters, but with a budget of only three expendable cars.(The issue is not whether this is a good plan-it isn't-but rather how to make the best of the situation!) The idea is to partition each design parameter(dimension)into M segments,so that the whole space is partitioned into MN cells.(You can choose the segments in each dimension to be equal or unequal,according to taste.)With 4 parameters and 3 cars,for example,you end up with 3 x 3 x 3 x 3 =81 cells. Next,choose M cells to contain the sample points by the following algorithm: 9 Randomly choose one of the MN cells for the first point.Now eliminate all cells that agree with this point on any of its parameters (that is,cross out all cells in the same row,column,etc.),leaving (M-1)N candidates.Randomly choose one of these,eliminate new rows and columns,and continue the process until there is only one cell left,which then contains the final sample point. 旦 The result of this construction is that each design parameter will have been tested in every one of its subranges.If the response of the system under test is dominated by one of the design parameters,that parameter will be found with this 61 sampling technique.On the other hand,if there is an important interaction among different design parameters,then the Latin hypercube gives no particular advantage. Use with care. CITED REFERENCES AND FURTHER READING: Numerica 10621 Halton,J.H.1960,Numerische Mathematik,vol.2,pp.84-90.[1] Bratley P.,and Fox,B.L.1988,ACM Transactions on Mathematical Software,vol.14,pp.88- 431 100.[2) Recipes Lambert,J.P.1988,in Numerical Mathematics-Singapore 1988,ISNM vol.86,R.P.Agarwal, Y.M.Chow,and S.J.Wilson,eds.(Basel:Birkhauser),pp.273-284. 腿 Niederreiter,H.1988,in Numerical Integration Ill,ISNM vol.85,H.Brass and G.Hammerlin, North eds.(Basel:Birkhauser),pp.157-171. Sobol',I.M.1967,USSR Computational Mathematics and Mathematical Physics,vol.7,no.4, pp.86-112.[3] Antonov,I.A.,and Saleev,V.M 1979,USSR Computational Mathematics and Mathematical Physics,vol.19,no.1,pp.252-256.[4] Dunn,O.J.,and Clark,V.A.1974,Applied Statistics:Analysis of Variance and Regression(New York,Wiley)[discusses Latin Square]
7.7 Quasi- (that is, Sub-) Random Sequences 315 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). Note that we have not provided the routine sobseq with a means of starting the sequence at a point other than the beginning, but this feature would be easy to add. Once the initialization of the direction numbers iv has been done, the jth point can be obtained directly by XORing together those direction numbers corresponding to nonzero bits in the Gray code of j, as described above. The Latin Hypercube We might here give passing mention the unrelated technique of Latin square or Latin hypercube sampling, which is useful when you must sample an N-dimensional space exceedingly sparsely, at M points. For example, you may want to test the crashworthiness of cars as a simultaneous function of 4 different design parameters, but with a budget of only three expendable cars. (The issue is not whether this is a good plan — it isn’t — but rather how to make the best of the situation!) The idea is to partition each design parameter (dimension) into M segments, so that the whole space is partitioned into M N cells. (You can choose the segments in each dimension to be equal or unequal, according to taste.) With 4 parameters and 3 cars, for example, you end up with 3 × 3 × 3 × 3 = 81 cells. Next, choose M cells to contain the sample points by the following algorithm: Randomly choose one of the M N cells for the first point. Now eliminate all cells that agree with this point on any of its parameters (that is, cross out all cells in the same row, column, etc.), leaving (M − 1)N candidates. Randomly choose one of these, eliminate new rows and columns, and continue the process until there is only one cell left, which then contains the final sample point. The result of this construction is that each design parameter will have been tested in every one of its subranges. If the response of the system under test is dominated by one of the design parameters, that parameter will be found with this sampling technique. On the other hand, if there is an important interaction among different design parameters, then the Latin hypercube gives no particular advantage. Use with care. CITED REFERENCES AND FURTHER READING: Halton, J.H. 1960, Numerische Mathematik, vol. 2, pp. 84–90. [1] Bratley P., and Fox, B.L. 1988, ACM Transactions on Mathematical Software, vol. 14, pp. 88– 100. [2] Lambert, J.P. 1988, in Numerical Mathematics – Singapore 1988, ISNM vol. 86, R.P. Agarwal, Y.M. Chow, and S.J. Wilson, eds. (Basel: Birkha¨user), pp. 273–284. Niederreiter, H. 1988, in Numerical Integration III, ISNM vol. 85, H. Brass and G. H¨ammerlin, eds. (Basel: Birkha¨user), pp. 157–171. Sobol’, I.M. 1967, USSR Computational Mathematics and Mathematical Physics, vol. 7, no. 4, pp. 86–112. [3] Antonov, I.A., and Saleev, V.M 1979, USSR Computational Mathematics and Mathematical Physics, vol. 19, no. 1, pp. 252–256. [4] Dunn, O.J., and Clark, V.A. 1974, Applied Statistics: Analysis of Variance and Regression (New York, Wiley) [discusses Latin Square]
316 Chapter 7.Random Numbers 7.8 Adaptive and Recursive Monte Carlo Methods This section discusses more advanced techniques of Monte Carlo integration.As examples of the use of these techniques,we include two rather different,fairly sophisticated, multidimensional Monte Carlo codes:vegas[1,2],and miser [3].The techniques that we discuss all fall under the general rubric of reduction of variance ($7.6),but are otherwise quite distinct. 三 Importance Sampling The use of importance sampling was already implicit in equations (7.6.6)and (7.6.7). We now return to it in a slightly more formal way.Suppose that an integrand f can be written as the product of a function h that is almost constant times another,positive,function g.Then its integral over a multidimensional volume V is fdv =(f/g)gdv hgdv (7.8.1) In equation(7.6.7)we interpreted equation (7.8.1)as suggesting a change of variable to G,the indefinite integral of g.That made gdV a perfect differential.We then proceeded to use the basic theorem of Monte Carlo integration,equation (7.6.1).A more general 0车d Press. THE interpretation of equation (7.8.1)is that we can integrate f by instead sampling h not, however,with uniform probability density dV,but rather with nonuniform density gdV.In this second interpretation,the first interpretation follows as the special case,where the means of generating the nonuniform sampling of gdv is via the transformation method,using the Programs indefinite integral G(see 87.2). More directly,one can go back and generalize the basic theorem (7.6.1)to the case of nonuniform sampling:Suppose that points z are chosen within the volume V with a probability density p satisfying 6 pdV=1 (7.8.2) 1C% The generalized fundamental theorem is that the integral of any function f is estimated,using N sample points 1,...,N,by f2/p2)-U/p)2 Numerical Recipes 10-621 I=fav= (7.8.3) uction. 43108 where angle brackets denote arithmetic means over the N points,exactly as in equation (7.6.2).As in equation (7.6.1),the "plus-or-minus"term is a one standard deviation error estimate.Notice that equation (7.6.1)is in fact the special case of equation (7.8.3),with (outside p constant 1/V. What is the best choice for the sampling density p?Intuitively,we have already seen Software. that the idea is to make h=f/p as close to constant as possible.We can be more rigorous by focusing on the numerator inside the square root in equation (7.8.3),which is the variance per sample point.Both angle brackets are themselves Monte Carlo estimators of integrals, so we can write s=(〉-(}≈∫sw-V5'=5w-ra (7.84) We now find the optimal p subject to the constraint equation (7.8.2)by the functiona ariation o-(Ew-ffrav]'+xfoa (7.8.5)
316 Chapter 7. Random Numbers Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). 7.8 Adaptive and Recursive Monte Carlo Methods This section discusses more advanced techniques of Monte Carlo integration. As examples of the use of these techniques, we include two rather different, fairly sophisticated, multidimensional Monte Carlo codes: vegas [1,2], and miser [3]. The techniques that we discuss all fall under the general rubric of reduction of variance (§7.6), but are otherwise quite distinct. Importance Sampling The use of importance sampling was already implicit in equations (7.6.6) and (7.6.7). We now return to it in a slightly more formal way. Suppose that an integrand f can be written as the product of a function h that is almost constant times another, positive, function g. Then its integral over a multidimensional volume V is f dV = (f /g) gdV = h gdV (7.8.1) In equation (7.6.7) we interpreted equation (7.8.1) as suggesting a change of variable to G, the indefinite integral of g. That made gdV a perfect differential. We then proceeded to use the basic theorem of Monte Carlo integration, equation (7.6.1). A more general interpretation of equation (7.8.1) is that we can integrate f by instead sampling h — not, however, with uniform probability density dV , but rather with nonuniform density gdV . In this second interpretation, the first interpretation follows as the special case, where the means of generating the nonuniform sampling of gdV is via the transformation method, using the indefinite integral G (see §7.2). More directly, one can go back and generalize the basic theorem (7.6.1) to the case of nonuniform sampling: Suppose that points xi are chosen within the volume V with a probability density p satisfying p dV =1 (7.8.2) The generalized fundamental theorem is that the integral of any function f is estimated, using N sample points x1,...,xN , by I ≡ f dV = f p pdV ≈ f p ± f 2/p2−f /p 2 N (7.8.3) where angle brackets denote arithmetic means over the N points, exactly as in equation (7.6.2). As in equation (7.6.1), the “plus-or-minus” term is a one standard deviation error estimate. Notice that equation (7.6.1) is in fact the special case of equation (7.8.3), with p = constant = 1/V . What is the best choice for the sampling density p? Intuitively, we have already seen that the idea is to make h = f /p as close to constant as possible. We can be more rigorous by focusing on the numerator inside the square root in equation (7.8.3), which is the variance per sample point. Both angle brackets are themselves Monte Carlo estimators of integrals, so we can write S ≡ f 2 p2 − f p 2 ≈ f 2 p2 pdV − f p pdV 2 = f 2 p dV − f dV 2 (7.8.4) We now find the optimal p subject to the constraint equation (7.8.2) by the functional variation 0 = δ δp f 2 p dV − f dV 2 + λ p dV (7.8.5)