16.322 Stochastic Estimation and Control, Fall 2004
Prof. Vander Velde

Lecture 10

Last time: Random Processes

With $f(x,t)$ we can compute all of the usual statistics.

Mean value:
$$\overline{x}(t_1) = \int_{-\infty}^{\infty} x\, f(x,t_1)\, dx$$

Mean squared value:
$$\overline{x^2}(t_1) = \int_{-\infty}^{\infty} x^2 f(x,t_1)\, dx$$

Higher order distribution and density functions. You can define these distributions of any order:
$$F(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n) = P\left[x(t_1) \le x_1,\; x(t_2) \le x_2,\; \ldots,\; x(t_n) \le x_n\right]$$
$$f(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n) = \frac{\partial^n}{\partial x_1\, \partial x_2 \cdots \partial x_n}\, F(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n)$$

$F$ is the probability that one member of the ensemble $x$ satisfies each of these constraints at the times $t_i$. But we rarely work with distributions higher than second order.

A very important statistic of a random process for the study of random processes in linear systems is the autocorrelation function $R_{xx}$, the correlation of $x(t_1)$ and $x(t_2)$:
$$R_{xx}(t_1,t_2) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x_1 x_2\, f(x_1,t_1;\, x_2,t_2)\, dx_1\, dx_2 = E\left[x(t_1)\, x(t_2)\right]$$

This could be computed as a moment of the second order density function (as above), but we usually just specify or measure the autocorrelation function directly.

Notice that the autocorrelation function and the first order probability density function express different kinds of information about the random process. Two different processes might have the same pdfs but quite different $R_{xx}(\tau)$'s. Conversely, they might have the same $R_{xx}(\tau)$ but completely different pdfs.
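The last point can be checked numerically. The sketch below (not from the lecture; the name `R_hat` and the AR(1) construction are mine) compares white Gaussian noise with a unit-variance first-order autoregressive sequence: both have the same first-order pdf, $N(0,1)$, but very different autocorrelations.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Process 1: white Gaussian noise -- first-order pdf N(0,1), R(tau) = 0 for tau != 0.
x1 = rng.standard_normal(n)

# Process 2: AR(1), x[k] = a*x[k-1] + w[k], with the innovation variance chosen
# so its first-order pdf is also N(0,1), but with R(tau) = a**|tau|.
a = 0.9
w = rng.standard_normal(n) * np.sqrt(1 - a**2)
x2 = np.empty(n)
x2[0] = rng.standard_normal()
for k in range(1, n):
    x2[k] = a * x2[k - 1] + w[k]

def R_hat(x, tau):
    """Sample estimate of R(tau) = E[x(t) x(t+tau)] for a stationary sequence."""
    return np.mean(x[:len(x) - tau] * x[tau:])

print(round(np.var(x1), 2), round(np.var(x2), 2))      # same marginal variance, ~1.0
print(round(R_hat(x1, 5), 2), round(R_hat(x2, 5), 2))  # ~0.0 vs ~0.9**5 ~ 0.59
```

Both sequences would produce the same amplitude histogram, yet at lag 5 one is essentially uncorrelated while the other retains most of its correlation.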
This is called the autocorrelation function for the random process $\{x(t)\}$. Note some useful properties:
$$R_{xx}(t,t) = E\left[x(t)\,x(t)\right] = E\left[x(t)^2\right] = \overline{x^2}(t)$$
$$R_{xx}(t_2,t_1) = E\left[x(t_2)\,x(t_1)\right] = E\left[x(t_1)\,x(t_2)\right] = R_{xx}(t_1,t_2)$$

Also note that $x(t_2)$ is likely to be independent of $x(t_1)$ if $|t_2 - t_1|$ is large. For that case:
$$\lim_{|t_2 - t_1| \to \infty} R_{xx}(t_1,t_2) = E\left[x(t_1)\right] E\left[x(t_2)\right] = \overline{x}(t_1)\, \overline{x}(t_2)$$

The members of the processes $\{x(t)\}$ and $\{y(t)\}$ must be associated as corresponding pairs of functions. There is a particular $y$ which goes with an $x$.

To study the statistical interrelation among more than one random process we need to consider their joint distributions.

The general joint distribution function for two processes $\{x(t)\}$ and $\{y(t)\}$, defined over the sample space of the same experiment, is
$$F_{xy}^{(m,n)}(x_1,t_1;\, \ldots;\, x_m,t_m;\; y_1,t_1';\, \ldots;\, y_n,t_n') = P\left[x(t_1) \le x_1,\, \ldots,\, x(t_m) \le x_m,\; y(t_1') \le y_1,\, \ldots,\, y(t_n') \le y_n\right]$$

Examples include: elevation and azimuth of a radar tracker.

The general joint density function is
$$f_{xy}^{(m,n)}(x_1,t_1;\, \ldots;\, x_m,t_m;\; y_1,t_1';\, \ldots;\, y_n,t_n') = \frac{\partial^{m+n}}{\partial x_1 \cdots \partial x_m\; \partial y_1 \cdots \partial y_n}\, F_{xy}^{(m,n)}(\ldots)$$
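The limiting behavior of $R_{xx}$ at large separation can be illustrated with a simple simulation. This is my own sketch (the mean-$\mu$ AR(1) sequence stands in for $x(t)$): for small lags the autocorrelation carries correlation information beyond the means, while for large lags it settles to $\overline{x}(t_1)\,\overline{x}(t_2) = \mu^2$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
a, mu = 0.9, 2.0

# Stationary AR(1) fluctuation of unit variance around a nonzero mean mu,
# so R(tau) = mu**2 + a**|tau|; samples far apart are nearly independent.
w = rng.standard_normal(n) * np.sqrt(1 - a**2)
x = np.empty(n)
x[0] = mu + rng.standard_normal()
for k in range(1, n):
    x[k] = mu + a * (x[k - 1] - mu) + w[k]

def R_hat(x, tau):
    """Sample estimate of E[x(t) x(t+tau)]."""
    return np.mean(x[:len(x) - tau] * x[tau:])

print(round(R_hat(x, 1), 1))    # ~ mu**2 + a = 4.9: correlation still present
print(round(R_hat(x, 200), 1))  # ~ mu**2 = 4.0: only the product of means survives
```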
Any of the lower ordered joint or single distributions can be derived from this by integration over the variables to be eliminated.

The mathematical expectation of a function of these random variables is
$$E\left\{g\left[x(t_1),\ldots,x(t_m),\, y(t_1'),\ldots,y(t_n')\right]\right\} = \int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty} dx_1 \cdots dy_n\; g(x_1,\ldots,y_n)\, f_{xy}^{(m,n)}(\ldots)$$

By far the most important statistical parameter involved in the joint consideration of more than one random process is the cross correlation function:
$$R_{xy}(t_1,t_2) = E\left[x(t_1)\,y(t_2)\right] = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} x\, y\, f_{xy}^{(1,1)}(x,t_1;\, y,t_2)\, dx\, dy$$
$$R_{xy}(t_2,t_1) = E\left[x(t_2)\,y(t_1)\right] = E\left[y(t_1)\,x(t_2)\right] = R_{yx}(t_1,t_2)$$

If $x(t)$, $y(t)$ are statistically independent,
$$R_{xy}(t_1,t_2) = E\left[x(t_1)\,y(t_2)\right] = E\left[x(t_1)\right] E\left[y(t_2)\right] = \overline{x}(t_1)\,\overline{y}(t_2) = 0$$
if either $x(t)$ or $y(t)$ or both is zero mean.

If $w(t) = x(t) + y(t) + z(t)$, then
$$\begin{aligned}
R_{ww}(t_1,t_2) &= E\left[w(t_1)\,w(t_2)\right] \\
&= R_{xx}(t_1,t_2) + R_{xy}(t_1,t_2) + R_{xz}(t_1,t_2) \\
&\quad + R_{yx}(t_1,t_2) + R_{yy}(t_1,t_2) + R_{yz}(t_1,t_2) \\
&\quad + R_{zx}(t_1,t_2) + R_{zy}(t_1,t_2) + R_{zz}(t_1,t_2)
\end{aligned}$$
such that if any two of $x(t)$, $y(t)$, $z(t)$ have zero mean and if they are all mutually independent, all the cross correlation terms vanish:
$$R_{ww}(t_1,t_2) = R_{xx}(t_1,t_2) + R_{yy}(t_1,t_2) + R_{zz}(t_1,t_2)$$

Thus for independent processes with zero mean, the autocorrelation of the sum is the sum of the autocorrelations. This has special relevance since it implies that for independent processes with zero mean, the mean square of a sum is the sum of the mean squares. This simplifies the problem of minimizing the mean squared error when more than one random process is to be considered.
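The mean-square additivity claimed above is easy to verify numerically. A minimal sketch (my own example; any three mutually independent, zero-mean samples will do):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Three mutually independent, zero-mean processes sampled at one instant t.
x = rng.standard_normal(n)            # variance 1
y = rng.uniform(-1.0, 1.0, n)         # variance 1/3
z = rng.standard_normal(n) * 2.0      # variance 4

w = x + y + z

# Mean square of the sum vs. sum of the mean squares:
print(round(np.mean(w**2), 1))        # ~ 1 + 1/3 + 4 ~ 5.3
print(round(np.mean(x**2) + np.mean(y**2) + np.mean(z**2), 1))
```

With a nonzero mean or with correlated components, the cross terms $2\overline{xy}$, etc., would not vanish and the two printed values would differ.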
$$\begin{aligned}
\overline{\left[x(t_1) \pm y(t_2)\right]^2} &= \overline{x(t_1)^2} \pm 2\,\overline{x(t_1)\,y(t_2)} + \overline{y(t_2)^2} \\
&= R_{xx}(t_1,t_1) \pm 2 R_{xy}(t_1,t_2) + R_{yy}(t_2,t_2) \ge 0
\end{aligned}$$
$$\mp R_{xy}(t_1,t_2) \le \frac{1}{2}\left[R_{xx}(t_1,t_1) + R_{yy}(t_2,t_2)\right]$$
$$\left|R_{xy}(t_1,t_2)\right| \le \frac{1}{2}\left[R_{xx}(t_1,t_1) + R_{yy}(t_2,t_2)\right] = \frac{1}{2}\left[\overline{x(t_1)^2} + \overline{y(t_2)^2}\right]$$
$$\left|R_{xx}(t_1,t_2)\right| \le \frac{1}{2}\left[\overline{x(t_1)^2} + \overline{x(t_2)^2}\right]$$

Intuitively we feel that if the conditions under which an experiment is performed are time independent, then the statistical quantities associated with a random process resulting from the experiment should be independent of time. Analytically, we say that a process is stationary if every translation in time transforms members of the process into other members of the process in such a way that probability is preserved.

This could also be stated by saying that all distribution functions associated with the process satisfy
$$F^{(n)}(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n) = F^{(n)}(x_1,t_1+\tau;\, x_2,t_2+\tau;\, \ldots;\, x_n,t_n+\tau)$$
for every $\tau$; that is, they are functions of the differences in the $t_i$ only and independent of the actual values of the $t_i$. Define the $n-1$ differences $\tau_i = t_i - t_1$. Then $f^{(n)}$ is independent of $t_1$:
$$f(x,t) \Rightarrow f(x), \qquad \overline{x}(t) \Rightarrow \overline{x}, \qquad \overline{x^2}(t) \Rightarrow \overline{x^2}$$
These still depend on the differences between time samples, but do not depend on the time at which sampling starts:
$$f(x_1,t_1;\, x_2,t_2;\, \ldots;\, x_n,t_n) = f(x_1;\, x_2,\tau_2;\, x_3,\tau_3;\, \ldots;\, x_n,\tau_n), \qquad t_i = t_1 + \tau_i$$
$$f(x_1,t_1;\, x_2,t_2) \Rightarrow f(x_1, x_2, t_2 - t_1)$$
$$R_{xx}(t_1,t_2) \Rightarrow R_{xx}(\tau), \qquad \tau = t_2 - t_1$$
$$R_{xx}(\tau) = \overline{x(t)\,x(t+\tau)} = \overline{x(t-\tau)\,x(t)} = \overline{x(t)\,x(t-\tau)} = R_{xx}(-\tau)$$

A stationary random process is further said to have the ergodic property if the time average of any function over a randomly selected member of the ensemble is equal to the ensemble average of the function with probability 1.

This means there must be probability 1 of picking a function which represents the ensemble. Note that a "representative" member of the ensemble excludes any with special properties which belong to a set of zero probability. A representative member must display at various points in time the same distributions of amplitude and rates of change of amplitude as are displayed in the entire ensemble at any one point in time.
This is why no member of the ensemble of constant functions is representative of the ensemble.

The autocorrelation of a stationary process is an even function of its time argument, and it satisfies the following bounds:
$$\overline{x^2} = R_{xx}(0)$$
$$\left|R_{xy}(t_1,t_2)\right| \le \frac{1}{2}\left[R_{xx}(t_1,t_1) + R_{yy}(t_2,t_2)\right]$$
$$\left|R_{xy}(\tau)\right| \le \frac{1}{2}\left[R_{xx}(0) + R_{yy}(0)\right] = \frac{1}{2}\left[\overline{x^2} + \overline{y^2}\right]$$
$$\left|R_{xx}(\tau)\right| \le R_{xx}(0)$$

$$R_{xy}(\tau) = \overline{x(t)\,y(t+\tau)} = \overline{x(t-\tau)\,y(t)} = \overline{y(t)\,x(t-\tau)} = R_{yx}(-\tau)$$
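These bounds can be spot-checked on simulated data. The sketch below is my own (a stationary AR(1) sequence `x` and a correlated companion `y`; the estimator `corr` is an assumption, not from the lecture): the sample correlations respect both $|R_{xx}(\tau)| \le R_{xx}(0)$ and the arithmetic-mean bound on $R_{xy}$.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200_000
a = 0.8

# Stationary AR(1) sequence x with unit variance, plus a correlated companion y.
w = rng.standard_normal(n) * np.sqrt(1 - a**2)
x = np.empty(n)
x[0] = rng.standard_normal()
for k in range(1, n):
    x[k] = a * x[k - 1] + w[k]
y = 0.5 * x + rng.standard_normal(n)

def corr(u, v, tau):
    """Sample estimate of E[u(t) v(t+tau)] for tau >= 0."""
    return np.mean(u[:len(u) - tau] * v[tau:])

# |Rxx(tau)| never exceeds Rxx(0) = mean square of x:
print(all(abs(corr(x, x, tau)) <= corr(x, x, 0) for tau in range(1, 50)))   # True

# |Rxy(tau)| never exceeds (x^2bar + y^2bar) / 2:
bound = 0.5 * (np.mean(x**2) + np.mean(y**2))
print(all(abs(corr(x, y, tau)) <= bound for tau in range(50)))              # True
```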
Ergodic property

(Figure: sketches of several member functions $x(t)$ of the ensemble.)

Example: Constants

(Figure: an ensemble of constant member functions $x(t)$.)
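The constants example is simple enough to simulate directly. In this sketch (my own construction) each member is $x(t) = c$ with $c \sim N(0,1)$: the ensemble average at any fixed $t$ is near zero, but the time average over any one member is that member's constant, so time and ensemble averages disagree.

```python
import numpy as np

rng = np.random.default_rng(4)

# Ensemble of constant functions: member i is x(t) = c[i] for all t, c ~ N(0, 1).
m, n = 100_000, 50                     # m members, n time samples each
c = rng.standard_normal(m)
x = np.tile(c[:, None], (1, n))        # row i is constant over time

# Ensemble average at any fixed t is ~0 (the mean of c):
print(round(np.mean(x[:, 7]), 2))      # ~ 0.0

# Time average over one randomly selected member is that member's constant,
# which is generally NOT 0 -- the process is stationary but not ergodic:
print(round(np.mean(x[0, :]), 2))      # = c[0], whatever it happens to be
```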
The ensemble of constants is stationary, but not ergodic: the time average over any one member is that member's constant value, which in general does not equal the ensemble expectation. Ergodicity requires time average = ensemble expectation, with probability 1.

Example: Sinusoids
$$x(t) = A\sin(\omega t + \theta), \qquad A,\, \omega \text{ fixed}, \quad \theta \text{ random}$$

If the phase distribution $f(\theta)$ is not uniform, the process is not stationary; if $f(\theta)$ is uniform over $[0, 2\pi)$, the process is stationary and, as shown next, ergodic.

Ensemble expectation of $g(x(t))$:
$$E\left[g\left(A\sin(\omega t + \theta)\right)\right] = \int_0^{2\pi} g\left[A\sin(\omega t + \theta)\right] \frac{1}{2\pi}\, d\theta$$

Time average of $g(x(t))$:
$$\begin{aligned}
\operatorname{Ave}\left\{g\left[A\sin(\omega t + \theta)\right]\right\}
&= \frac{1}{T}\int_0^T g\left[A\sin(\omega t + \theta)\right] dt \\
&= \frac{1}{\omega T}\int_0^{\omega T} g\left[A\sin(\phi + \theta)\right] d\phi \qquad (\phi = \omega t) \\
&= \frac{1}{2\pi}\int_0^{2\pi} g\left[A\sin(\phi + \theta)\right] d\phi \qquad \left(T = \frac{2\pi}{\omega}, \text{ one period}\right) \\
&= \text{ensemble expectation of } g\left(x(t)\right)
\end{aligned}$$

So when $f(\theta)$ is uniformly distributed, the process is stationary and ergodic.
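This equality of time and ensemble averages can be confirmed numerically for a particular $g$. The sketch below (my own choice of $g(x) = x^2$, $A$, and $\omega$) compares the ensemble average of $x(t)^2$ at a fixed $t$, with $\theta$ uniform, against the time average over one period of a single member; both should come out near $A^2/2$.

```python
import numpy as np

rng = np.random.default_rng(5)
A, omega = 2.0, 3.0

# Ensemble average of g(x(t)) = x(t)**2 at a fixed t, over theta ~ Uniform[0, 2pi):
t = 1.234
theta = rng.uniform(0.0, 2.0 * np.pi, 200_000)
ensemble_avg = np.mean((A * np.sin(omega * t + theta)) ** 2)

# Time average of the same g over one member (one fixed theta), one full period:
theta0 = rng.uniform(0.0, 2.0 * np.pi)
T = 2.0 * np.pi / omega
ts = np.linspace(0.0, T, 100_000, endpoint=False)
time_avg = np.mean((A * np.sin(omega * ts + theta0)) ** 2)

print(round(ensemble_avg, 1), round(time_avg, 1))   # both ~ A**2 / 2 = 2.0
```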