338 Chapter 8.Sorting 1=j; j<<=1; else break; Found rra's level.Terminate the sift-down. ra[i]=rra; Put rra into its slot. 2 CITED REFERENCES AND FURTHER READING: Knuth,D.E.1973,Sorting and Searching,vol.3 of The Art of Computer Programming(Reading. MA:Addison-Wesley),$5.2.3.[1] Sedgewick,R.1988,A/lgorithms,2nd ed.(Reading,MA:Addison-Wesley),Chapter 11.[2] nted for 8.4 Indexing and Ranking The concept of keys plays a prominent role in the management of data files.A data record in such a file may contain several items,or fields.For example,a record in a file of weather observations may have fields recording time,temperature,and 三¥2 令 wind velocity.When we sort the records,we must decide which of these fields we want to be brought into sorted order.The other fields in a record just come along for the ride,and will not,in general,end up in any particular order.The field on which the sort is performed is called the key field. For a data file with many records and many fields,the actual movement of N records into the sorted order of their keys Ki,i=1,...,N,can be a daunting task. Instead,one can construct an index table Ij,j=1,...,N,such that the smallest Ki has i=11,the second smallest has i=12,and so on up to the largest Ki with i=IN.In other words,the array K1,j=1,2,,W (8.4.1) is in sorted order when indexed by j.When an index table is available,one need not move records from their original order.Further,different index tables can be made from the same set of records,indexing them to different keys. Numerica 10.621 The algorithm for constructing an index table is straightforward:Initialize the index array with the integers from I to N,then perform the Quicksort algorithm, 431 moving the elements around as ifone were sorting the keys.The integer that initially numbered the smallest key thus ends up in the number one position,and so on. (outside Recipes Software. #include "nrutil.h" #define SWAP(a,b)itemp=(a);(a)=(b);(b)=itemp; #define M 7 #define NSTACK 50 void indexx(unsigned long n,float arr[],unsigned long indx[]) Indexes an array arr[1..n],i.e.,outputs the array indx[1..n]such that arr [indx[j]]is in ascending order for j=1,2,...,N.The input quantities n and arr are not changed. unsigned long i,indxt,ir=n,itemp,j,k,1=1; int jstack=0,*istack; float a; istack=ivector(1,NSTACK);
338 Chapter 8. Sorting Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). i=j; j <<= 1; } else break; Found rra’s level. Terminate the sift-down. } ra[i]=rra; Put rra into its slot. } } CITED REFERENCES AND FURTHER READING: Knuth, D.E. 1973, Sorting and Searching, vol. 3 of The Art of Computer Programming (Reading, MA: Addison-Wesley), §5.2.3. [1] Sedgewick, R. 1988, Algorithms, 2nd ed. (Reading, MA: Addison-Wesley), Chapter 11. [2] 8.4 Indexing and Ranking The concept of keys plays a prominent role in the management of data files. A data record in such a file may contain several items, or fields. For example, a record in a file of weather observations may have fields recording time, temperature, and wind velocity. When we sort the records, we must decide which of these fields we want to be brought into sorted order. The other fields in a record just come along for the ride, and will not, in general, end up in any particular order. The field on which the sort is performed is called the key field. For a data file with many records and many fields, the actual movement of N records into the sorted order of their keys Ki, i = 1,...,N, can be a daunting task. Instead, one can construct an index table I j , j = 1,...,N, such that the smallest Ki has i = I1, the second smallest has i = I2, and so on up to the largest Ki with i = IN . In other words, the array KIj j = 1, 2,...,N (8.4.1) is in sorted order when indexed by j. When an index table is available, one need not move records from their original order. Further, different index tables can be made from the same set of records, indexing them to different keys. The algorithm for constructing an index table is straightforward: Initialize the index array with the integers from 1 to N, then perform the Quicksort algorithm, moving the elements around as if one were sorting the keys. The integer that initially numbered the smallest key thus ends up in the number one position, and so on. #include "nrutil.h" #define SWAP(a,b) itemp=(a);(a)=(b);(b)=itemp; #define M 7 #define NSTACK 50 void indexx(unsigned long n, float arr[], unsigned long indx[]) Indexes an array arr[1..n], i.e., outputs the array indx[1..n] such that arr[indx[j]] is in ascending order for j = 1, 2,...,N. The input quantities n and arr are not changed. { unsigned long i,indxt,ir=n,itemp,j,k,l=1; int jstack=0,*istack; float a; istack=ivector(1,NSTACK);
8.4 Indexing and Ranking 339 original index rank sorted array table table array 14 ⑤ ① ① ① 8 4 个 ② ② ② 32 ② 6 Permission is 3 ③ ③ 7 ① 0 4 ④ 3 6 15 ⑤ ⑤ ⑤ 15 3 http://www.nr.com or call 1-800-872-7423 (North America (6 (6 (a) (b) (c) (d) Figure 8.4.1.(a)An unsorted array of six numbers.(b)Index table,whose entries are pointers to the elements of(a)in ascending order.(c)Rank table,whose entries are the ranks of the onty),or granted for internet users to make one paper copy for their corresponding elements of(a).(d)Sorted array of the elements in (a). for (j=1;j=1;i-)[ Copyright (C)1988-1992 by Cambridge University Press.Programs Copyright(C)1988-1992 by Numerical Recipes Sample page from NUMERICAL RECIPES IN C:THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) if (arr[indx[i]]>1; SWAP(indx [k],indx [1+1]) rsend email to directcustserv@cambridge.org(outside North America) if (arr[indx[1]]arr[indx[ir]]){ Software. SWAP(indx[1],indx[ir]) if (arr[indx[1+1]]arr[indx[ir]]){ SWAP(indx[1+1],indx [ir]) ying of machine if (arr[indx[1]]arr[indx[1+1]]){ SWAP(indx [1],indx[1+1]) =1+1; j=ir; indxt=indx[1+1]; a=arr [indxt]; for(;;)[
8.4 Indexing and Ranking 339 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). 15 6 3 5 7 4 32 3 8 2 1 14 3 6 6 5 1 4 2 3 4 2 1 5 5 6 1 5 2 4 6 3 3 2 1 4 32 6 15 5 14 4 8 3 7 2 1 3 original array index table rank table sorted array (a) (b) (c) (d) Figure 8.4.1. (a) An unsorted array of six numbers. (b) Index table, whose entries are pointers to the elements of (a) in ascending order. (c) Rank table, whose entries are the ranks of the corresponding elements of (a). (d) Sorted array of the elements in (a). for (j=1;j=l;i--) { if (arr[indx[i]] > 1; SWAP(indx[k],indx[l+1]); if (arr[indx[l]] > arr[indx[ir]]) { SWAP(indx[l],indx[ir]) } if (arr[indx[l+1]] > arr[indx[ir]]) { SWAP(indx[l+1],indx[ir]) } if (arr[indx[l]] > arr[indx[l+1]]) { SWAP(indx[l],indx[l+1]) } i=l+1; j=ir; indxt=indx[l+1]; a=arr[indxt]; for (;;) {
340 Chapter 8. Sorting do i++;while (arr[indx[i]]a); do i--;while (arr[indx[i]]a); if (i=j-1)[ istack[jstack]=ir; istack[jstack-1]=i; http://www.nr. read able files 1r=j-1; else istack[jstack]=j-1; 839 istack[jstack-1]=l; granted for 1=1; 11-600 (including this one) free_ivector(istack,1,NSTACK); /Cambridge users to make from NUMERICAL RECIPES IN 19881992 (Nort If you want to sort an array while making the corresponding rearrangement of several or many other arrays,you should first make an index table,then use it to America server computer, UnN电.t rearrange each array in turn.This requires two arrays of working space:one to one paper THE ART hold the index,and another into which an array is temporarily moved,and from which it is redeposited back on itself in the rearranged order.For 3 arrays,the Programs procedure looks like this: #include "nrutil.h" void sort3(unsigned long n,float ran],float rb[],float rc[]) to dir Sorts an array ra[1..n]into ascending numerical order while making the corresponding re- arrangements of the arrays rb[1..n]and rc [1..n].An index table is constructed via the 19881982 OF SCIENTIFIC COMPUTING(ISBN routine indexx void indexx(unsigned long n,float arr[],unsigned long indx[]); unsigned long j,*iwksp; float *wksp; .Further reproduction, Numerical Recipes 10-621 iwksp=lvector(1,n); 43108 wksp=vector(1,n); indexx(n,ra,iwksp); Make the index table for (j=1;j<=n;j++)wksp[j]=ra[j]; Save the array ra. (outside for (j=1;j<=n;j++)ra[j]=wksp[iwksp[j]]; Copy it back in rearranged order. for (j=1;j<=n;j++) wksp[j]=rb[j]; Ditto rb. Software. for (j=1;j<=n;j++)rb[j]=wksp[iwksp[j]]; for (j=1;j<=n;j++)wksp[j]=rc[j]; Ditto rc. ying of for (j=1;j<=n;j++)rc[j]=wksp[iwksp[j]]; free_vector(wksp,1,n); free_lvector(iwksp,1,n); The generalization to any other number of arrays is obviously straightforward. A rank table is different from an index table.A rank table's jth entry gives the rank of the jth element of the original array of keys,ranging from 1(if that element
340 Chapter 8. Sorting Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). do i++; while (arr[indx[i]] a); if (j NSTACK) nrerror("NSTACK too small in indexx."); if (ir-i+1 >= j-l) { istack[jstack]=ir; istack[jstack-1]=i; ir=j-1; } else { istack[jstack]=j-1; istack[jstack-1]=l; l=i; } } } free_ivector(istack,1,NSTACK); } If you want to sort an array while making the corresponding rearrangement of several or many other arrays, you should first make an index table, then use it to rearrange each array in turn. This requires two arrays of working space: one to hold the index, and another into which an array is temporarily moved, and from which it is redeposited back on itself in the rearranged order. For 3 arrays, the procedure looks like this: #include "nrutil.h" void sort3(unsigned long n, float ra[], float rb[], float rc[]) Sorts an array ra[1..n] into ascending numerical order while making the corresponding rearrangements of the arrays rb[1..n] and rc[1..n]. An index table is constructed via the routine indexx. { void indexx(unsigned long n, float arr[], unsigned long indx[]); unsigned long j,*iwksp; float *wksp; iwksp=lvector(1,n); wksp=vector(1,n); indexx(n,ra,iwksp); Make the index table. for (j=1;j<=n;j++) wksp[j]=ra[j]; Save the array ra. for (j=1;j<=n;j++) ra[j]=wksp[iwksp[j]]; Copy it back in rearranged order. for (j=1;j<=n;j++) wksp[j]=rb[j]; Ditto rb. for (j=1;j<=n;j++) rb[j]=wksp[iwksp[j]]; for (j=1;j<=n;j++) wksp[j]=rc[j]; Ditto rc. for (j=1;j<=n;j++) rc[j]=wksp[iwksp[j]]; free_vector(wksp,1,n); free_lvector(iwksp,1,n); } The generalization to any other number of arrays is obviously straightforward. A rank table is different from an index table. A rank table’s jth entry gives the rank of the jth element of the original array of keys, ranging from 1 (if that element
8.5 Selecting the Mth Largest 341 was the smallest)to N (if that element was the largest).One can easily construct a rank table from an index table,however: void rank(unsigned long n,unsigned long indx[],unsigned long irank[]) Given indx[1..n]as output from the routine indexx,returns an array irank[1..n],the corresponding table of ranks. f unsigned long j; for (j=1;j100 we usually define k =N/2 to be the median element,pedants be damned. North The fastest general method for selection,allowing rearrangement,is partition- ing,exactly as was done in the Quicksort algorithm (88.2).Selecting a"random" partition element,one marches through the array,forcing smaller elements to the eft,larger elements to the right.As in Quicksort,it is important to optimize the inner loop,using "sentinels"(88.2)to minimize the number of comparisons.For sorting,one would then proceed to further partition both subsets.For selection, we can ignore one subset and attend only to the one that contains our desired kth element.Selection by partitioning thus does not need a stack of pending operations, and its operations count scales as N rather than as Nlog N (see [11).Comparison with sort in 88.2 should make the following routine obvious:
8.5 Selecting the Mth Largest 341 Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copyin Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5) g of machinereadable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America). was the smallest) to N (if that element was the largest). One can easily construct a rank table from an index table, however: void rank(unsigned long n, unsigned long indx[], unsigned long irank[]) Given indx[1..n] as output from the routine indexx, returns an array irank[1..n], the corresponding table of ranks. { unsigned long j; for (j=1;j 100 we usually define k = N/2 to be the median element, pedants be damned. The fastest general method for selection, allowing rearrangement, is partitioning, exactly as was done in the Quicksort algorithm (§8.2). Selecting a “random” partition element, one marches through the array, forcing smaller elements to the left, larger elements to the right. As in Quicksort, it is important to optimize the inner loop, using “sentinels” (§8.2) to minimize the number of comparisons. For sorting, one would then proceed to further partition both subsets. For selection, we can ignore one subset and attend only to the one that contains our desired kth element. Selection by partitioning thus does not need a stack of pending operations, and its operations count scales as N rather than as N log N (see [1]). Comparison with sort in §8.2 should make the following routine obvious: