Sample page from NUMERICAL RECIPES IN C: THE ART OF SCIENTIFIC COMPUTING (ISBN 0-521-43108-5). Copyright (C) 1988-1992 by Cambridge University Press. Programs Copyright (C) 1988-1992 by Numerical Recipes Software. Permission is granted for internet users to make one paper copy for their own personal use. Further reproduction, or any copying of machine-readable files (including this one) to any server computer, is strictly prohibited. To order Numerical Recipes books or CDROMs, visit website http://www.nr.com or call 1-800-872-7423 (North America only), or send email to directcustserv@cambridge.org (outside North America).

Chapter 12. Fast Fourier Transform

12.0 Introduction

A very large class of important computational problems falls under the general rubric of “Fourier transform methods” or “spectral methods.” For some of these problems, the Fourier transform is simply an efficient computational tool for accomplishing certain common manipulations of data. In other cases, we have problems for which the Fourier transform (or the related “power spectrum”) is itself of intrinsic interest. These two kinds of problems share a common methodology.

Largely for historical reasons the literature on Fourier and spectral methods has been disjoint from the literature on “classical” numerical analysis. Nowadays there is no justification for such a split. Fourier methods are commonplace in research and we shall not treat them as specialized or arcane. At the same time, we realize that many computer users have had relatively less experience with this field than with, say, differential equations or numerical integration. Therefore our summary of analytical results will be more complete. Numerical algorithms, per se, begin in §12.2. Various applications of Fourier transform methods are discussed in Chapter 13.
A physical process can be described either in the time domain, by the values of some quantity h as a function of time t, e.g., h(t), or else in the frequency domain, where the process is specified by giving its amplitude H (generally a complex number indicating phase also) as a function of frequency f, that is H(f), with −∞ < f < ∞. For many purposes it is useful to think of h(t) and H(f) as being two different representations of the same function. One goes back and forth between these two representations by means of the Fourier transform equations,

    H(f) = ∫_{−∞}^{∞} h(t) e^{2πift} dt
                                                              (12.0.1)
    h(t) = ∫_{−∞}^{∞} H(f) e^{−2πift} df

If t is measured in seconds, then f in equation (12.0.1) is in cycles per second, or Hertz (the unit of frequency). However, the equations work with other units too. If h is a function of position x (in meters), H will be a function of inverse wavelength (cycles per meter), and so on. If you are trained as a physicist or mathematician, you are probably more used to using angular frequency ω, which is given in radians per sec. The relation between ω and f, H(ω) and H(f) is

    ω ≡ 2πf        H(ω) ≡ [H(f)]_{f=ω/2π}                     (12.0.2)
and equation (12.0.1) looks like this

    H(ω) = ∫_{−∞}^{∞} h(t) e^{iωt} dt
                                                              (12.0.3)
    h(t) = (1/2π) ∫_{−∞}^{∞} H(ω) e^{−iωt} dω

We were raised on the ω-convention, but we changed! There are fewer factors of 2π to remember if you use the f-convention, especially when we get to discretely sampled data in §12.1.

From equation (12.0.1) it is evident at once that Fourier transformation is a linear operation. The transform of the sum of two functions is equal to the sum of the transforms. The transform of a constant times a function is that same constant times the transform of the function.

In the time domain, function h(t) may happen to have one or more special symmetries. It might be purely real or purely imaginary, or it might be even, h(t) = h(−t), or odd, h(t) = −h(−t). In the frequency domain, these symmetries lead to relationships between H(f) and H(−f). The following table gives the correspondence between symmetries in the two domains:

    If ...                          then ...
    h(t) is real                    H(−f) = [H(f)]*
    h(t) is imaginary               H(−f) = −[H(f)]*
    h(t) is even                    H(−f) = H(f)   [i.e., H(f) is even]
    h(t) is odd                     H(−f) = −H(f)  [i.e., H(f) is odd]
    h(t) is real and even           H(f) is real and even
    h(t) is real and odd            H(f) is imaginary and odd
    h(t) is imaginary and even      H(f) is imaginary and even
    h(t) is imaginary and odd       H(f) is real and odd

In subsequent sections we shall see how to use these symmetries to increase computational efficiency.

Here are some other elementary properties of the Fourier transform. (We’ll use the “⇐⇒” symbol to indicate transform pairs.) If

    h(t) ⇐⇒ H(f)

is such a pair, then other transform pairs are

    h(at) ⇐⇒ (1/|a|) H(f/a)              “time scaling”        (12.0.4)
    (1/|b|) h(t/b) ⇐⇒ H(bf)              “frequency scaling”   (12.0.5)
    h(t − t0) ⇐⇒ H(f) e^{2πift0}         “time shifting”       (12.0.6)
    h(t) e^{−2πif0t} ⇐⇒ H(f − f0)        “frequency shifting”  (12.0.7)
With two functions h(t) and g(t), and their corresponding Fourier transforms H(f) and G(f), we can form two combinations of special interest. The convolution of the two functions, denoted g ∗ h, is defined by

    g ∗ h ≡ ∫_{−∞}^{∞} g(τ) h(t − τ) dτ                       (12.0.8)

Note that g ∗ h is a function in the time domain and that g ∗ h = h ∗ g. It turns out that the function g ∗ h is one member of a simple transform pair

    g ∗ h ⇐⇒ G(f) H(f)         “Convolution Theorem”          (12.0.9)

In other words, the Fourier transform of the convolution is just the product of the individual Fourier transforms.

The correlation of two functions, denoted Corr(g, h), is defined by

    Corr(g, h) ≡ ∫_{−∞}^{∞} g(τ + t) h(τ) dτ                  (12.0.10)

The correlation is a function of t, which is called the lag. It therefore lies in the time domain, and it turns out to be one member of the transform pair:

    Corr(g, h) ⇐⇒ G(f) H*(f)   “Correlation Theorem”          (12.0.11)

[More generally, the second member of the pair is G(f)H(−f), but we are restricting ourselves to the usual case in which g and h are real functions, so we take the liberty of setting H(−f) = H*(f).] This result shows that multiplying the Fourier transform of one function by the complex conjugate of the Fourier transform of the other gives the Fourier transform of their correlation.
The correlation of a function with itself is called its autocorrelation. In this case (12.0.11) becomes the transform pair

    Corr(g, g) ⇐⇒ |G(f)|²     “Wiener-Khinchin Theorem”       (12.0.12)

The total power in a signal is the same whether we compute it in the time domain or in the frequency domain. This result is known as Parseval’s theorem:

    Total Power ≡ ∫_{−∞}^{∞} |h(t)|² dt = ∫_{−∞}^{∞} |H(f)|² df    (12.0.13)

Frequently one wants to know “how much power” is contained in the frequency interval between f and f + df. In such circumstances one does not usually distinguish between positive and negative f, but rather regards f as varying from 0 (“zero frequency” or D.C.) to +∞. In such cases, one defines the one-sided power spectral density (PSD) of the function h as

    P_h(f) ≡ |H(f)|² + |H(−f)|²        0 ≤ f < ∞              (12.0.14)

so that the total power is just the integral of P_h(f) from f = 0 to f = ∞. When the function h(t) is real, then the two terms in (12.0.14) are equal, so P_h(f) = 2|H(f)|².
[Figure 12.0.1 appears here: panel (a) plots h(t)² against t, panel (b) the one-sided P_h(f) against f ≥ 0, panel (c) the two-sided P_h(f) against −f to f.]

Figure 12.0.1. Normalizations of one- and two-sided power spectra. The area under the square of the function, (a), equals the area under its one-sided power spectrum at positive frequencies, (b), and also equals the area under its two-sided power spectrum at positive and negative frequencies, (c).

Be warned that one occasionally sees PSDs defined without this factor two. These, strictly speaking, are called two-sided power spectral densities, but some books are not careful about stating whether one- or two-sided is to be assumed. We will always use the one-sided density given by equation (12.0.14). Figure 12.0.1 contrasts the two conventions.

If the function h(t) goes endlessly from −∞ < t < ∞, then its total power and power spectral density will, in general, be infinite. Of interest then is the (one- or two-sided) power spectral density per unit time. This is computed by taking a long, but finite, stretch of the function h(t), computing its PSD [that is, the PSD of a function that equals h(t) in the finite stretch but is zero everywhere else], and then dividing the resulting PSD by the length of the stretch used.
Parseval’s theorem in this case states that the integral of the one-sided PSD-per-unit-time over positive frequency is equal to the mean square amplitude of the signal h(t). You might well worry about how the PSD-per-unit-time, which is a function of frequency f, converges as one evaluates it using longer and longer stretches of data. This interesting question is the content of the subject of “power spectrum estimation,” and will be considered below in §13.4–§13.7. A crude answer for now is: The
PSD-per-unit-time converges to finite values at all frequencies except those where h(t) has a discrete sine-wave (or cosine-wave) component of finite amplitude. At those frequencies, it becomes a delta-function, i.e., a sharp spike, whose width gets narrower and narrower, but whose area converges to be the mean square amplitude of the discrete sine or cosine component at that frequency.

We have by now stated all of the analytical formalism that we will need in this chapter with one exception: In computational work, especially with experimental data, we are almost never given a continuous function h(t) to work with, but are given, rather, a list of measurements of h(t_i) for a discrete set of t_i’s. The profound implications of this seemingly unimportant fact are the subject of the next section.

CITED REFERENCES AND FURTHER READING:
Champeney, D.C. 1973, Fourier Transforms and Their Physical Applications (New York: Academic Press).
Elliott, D.F., and Rao, K.R. 1982, Fast Transforms: Algorithms, Analyses, Applications (New York: Academic Press).

12.1 Fourier Transform of Discretely Sampled Data

In the most common situations, function h(t) is sampled (i.e., its value is recorded) at evenly spaced intervals in time.
Let ∆ denote the time interval between consecutive samples, so that the sequence of sampled values is

    h_n = h(n∆)        n = ..., −3, −2, −1, 0, 1, 2, 3, ...   (12.1.1)

The reciprocal of the time interval ∆ is called the sampling rate; if ∆ is measured in seconds, for example, then the sampling rate is the number of samples recorded per second.

Sampling Theorem and Aliasing

For any sampling interval ∆, there is also a special frequency f_c, called the Nyquist critical frequency, given by

    f_c ≡ 1/(2∆)                                              (12.1.2)

If a sine wave of the Nyquist critical frequency is sampled at its positive peak value, then the next sample will be at its negative trough value, the sample after that at the positive peak again, and so on. Expressed otherwise: Critical sampling of a sine wave is two sample points per cycle. One frequently chooses to measure time in units of the sampling interval ∆. In this case the Nyquist critical frequency is just the constant 1/2.

The Nyquist critical frequency is important for two related, but distinct, reasons. One is good news, and the other bad news. First the good news. It is the remarkable