12.540 Principles of the Global Positioning System
Lecture 10
Prof. Thomas Herring
03/12/03

Estimation: Introduction
• Homework review
• Overview:
  – Basic concepts in estimation
  – Models: mathematical and statistical
  – Statistical concepts
Basic concepts
• Basic problem: We measure range and phase data that are related to the positions of the ground receiver, the satellites, and other quantities. How do we determine the "best" position for the receiver and the other quantities?
• What do we mean by "best" estimate?
• Inferring parameters from measurements is estimation.

Basic estimation
• Two styles of estimation (appropriate for geodetic-type measurements):
  – Parametric estimation, where the quantities to be estimated are the unknown variables in the equations that express the observables.
  – Condition estimation, where conditions can be formulated among the observations. Rarely used; the most common application is leveling, where the sum of the height differences around a closed circuit must be zero.
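The leveling example of condition estimation can be sketched numerically. This is a toy illustration with made-up height differences, using the simplest adjustment (equal distribution of the misclosure among equally weighted legs), not the course's formal treatment:

```python
# Toy condition-adjustment sketch for a leveling loop: the measured height
# differences around a closed circuit must sum to zero, so the misclosure
# is distributed equally among the (equally weighted) legs.
dh = [1.234, -0.567, -0.665]          # measured height differences (made up)
misclosure = sum(dh)                  # would be zero for perfect measurements
adjusted = [d - misclosure / len(dh) for d in dh]
print(sum(adjusted))                  # essentially zero after adjustment
```

With unequal observation weights the misclosure would instead be distributed in proportion to each leg's variance.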
Basics of parametric estimation
• All parametric estimation methods can be broken into a few main steps:
  – Observation equations: equations that relate the parameters to be estimated to the observed quantities (observables). This is the mathematical model. Example: the relationship between pseudorange, receiver position, satellite position (implicit in ρ), clocks, and atmospheric and ionospheric delays.
  – Stochastic model: a statistical description of the random fluctuations in the measurements and maybe in the parameters.
  – Inversion: determining the parameter values from the mathematical model consistent with the statistical model.

Observation model
• The observation model is the set of equations relating observables to the parameters of the model:
  Observable = function(parameters)
• Observables should not appear on the right-hand side of the equation.
• Often the function is non-linear, and the most common method is linearization of the function using a Taylor series expansion.
• Sometimes log linearization is used for f = a·b·c, i.e., products of parameters.
Taylor series expansion
• In the most common Taylor series approach:
  y = f(x1, x2, x3, x4)
  y0 + Δy = f(x0) + (∂f(x)/∂x)|x0 Δx
  x = (x1, x2, x3, x4)
• The estimation is made using the difference between the observations and the expected values based on the apriori values for the parameters.
• The estimation returns adjustments to the apriori parameter values.

Linearization
• Since the linearization is only an approximation, the estimation should be iterated until the adjustments to the parameter values are zero.
• For GPS estimation: the convergence rate is typically 100-1000:1 (i.e., a 1 meter error in apriori coordinates could result in 1-10 mm of non-linearity error).
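The linearize-and-iterate scheme above can be sketched on a toy 2-D problem: estimating a receiver position from noise-free ranges to three known beacons (a stand-in for the GPS pseudorange geometry; the beacon positions and starting point are made up):

```python
import numpy as np

beacons = np.array([[0.0, 10.0], [10.0, 0.0], [10.0, 10.0]])
x_true = np.array([3.0, 4.0])
ranges = np.linalg.norm(beacons - x_true, axis=1)   # observed ranges

x = np.array([0.0, 0.0])                 # apriori position, ~5 m in error
for _ in range(10):
    rho = np.linalg.norm(beacons - x, axis=1)       # modeled observables f(x0)
    A = (x - beacons) / rho[:, None]                # partials d(rho)/dx at x0
    dy = ranges - rho                               # observed minus computed
    dx = np.linalg.solve(A.T @ A, A.T @ dy)         # adjustment to apriori
    x = x + dx
    if np.linalg.norm(dx) < 1e-12:                  # iterate until dx ~ 0
        break

print(x)   # converges to the true position (3, 4)
```

Each pass re-linearizes about the updated position, so the quadratic non-linearity error shrinks rapidly, consistent with the 100-1000:1 convergence rate quoted above.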
Estimation
• (Will return to the statistical model shortly.)
• The most common estimation method is "least squares," in which the parameter estimates are the values that minimize the sum of the squares of the differences between the observations and the modeled values based on the parameter estimates.
• For linear estimation problems, there is a direct matrix formulation for the solution.
• For non-linear problems: linearization, or a search technique in which parameter space is searched for the minimum value.
• Care is needed with search methods that a local minimum is not found (will not treat in this course).

Least squares estimation
• Originally formulated by Gauss.
• Basic equations: Δy is the vector of observations; A is the linear matrix relating parameters to observables; Δx is the vector of parameters; v is the residual vector.
  Δy = A Δx + v
  minimize (vᵀ v); superscript T means transpose
  Δx = (AᵀA)⁻¹ Aᵀ Δy
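A minimal sketch of the direct matrix solution Δx = (AᵀA)⁻¹AᵀΔy, fitting an offset and rate to noise-free synthetic data (the parameter values are made up for illustration):

```python
import numpy as np

t = np.arange(10.0)
A = np.column_stack([np.ones_like(t), t])   # partials wrt offset and rate
dy = 2.0 + 0.5 * t                          # synthetic observations

dx = np.linalg.inv(A.T @ A) @ A.T @ dy      # normal-equation solution
v = dy - A @ dx                             # residuals (here ~ 0)
print(dx)                                   # recovers offset 2.0, rate 0.5
```

In practice one would use `np.linalg.solve` or `np.linalg.lstsq` rather than forming the explicit inverse, which is numerically poorer for ill-conditioned normal matrices.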
Weighted least squares
• In standard least squares, nothing is assumed about the residuals v except that they have zero mean.
• One often sees weighted least squares, in which a weight matrix W is assigned to the residuals. Residuals with larger elements in W are given more weight.
  minimize (vᵀ W v)
  Δx = (AᵀWA)⁻¹ AᵀW Δy

Statistical approach to least squares
• If the weight matrix used in weighted least squares is the inverse of the covariance matrix of the residuals, then weighted least squares is a maximum likelihood estimator for Gaussian distributed random errors.
• This latter form of least squares is the most statistically rigorous version.
• Sometimes the weights are chosen empirically.
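The effect of weighting can be sketched with made-up numbers: two precise observations and one very noisy observation carrying a gross error. With W set to the inverse covariance, the noisy observation is effectively ignored:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = A @ np.array([1.0, 3.0])            # true observables for x = (1, 3)
y[2] += 50.0                            # gross error on the weak measurement
sigmas = np.array([0.01, 0.01, 100.0])  # third observation is very noisy

W = np.diag(1.0 / sigmas**2)            # inverse covariance (diagonal here)
dx = np.linalg.solve(A.T @ W @ A, A.T @ W @ y)
print(dx)    # close to (1, 3); an unweighted fit would be pulled far off
```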
Review of statistics
• Random errors in measurements are expressed with probability density functions that give the probability of values falling between x and x+dx.
• Integrating the probability density function gives the probability of a value falling within a finite interval.
• Given a large enough sample of the random variable, the density function can be deduced from a histogram of residuals.

Example of random variables
[Figure: time series of samples of a uniform and a Gaussian random variable; x-axis: sample number (0 to 800), y-axis: random variable value (-4.0 to 4.0)]
Histograms of random variables
[Figure: histograms of the Gaussian and uniform samples, with the curve 490/sqrt(2π)·exp(-x²/2) overlaid on the Gaussian histogram; x-axis: random variable x (-3.75 to 3.25), y-axis: number of samples (0 to 200)]

Characterization of random variables
• When the probability distribution is known, the following statistical descriptions are used for a random variable x with density function f(x):
  Expected value of a function h(x): ⟨h(x)⟩ = ∫ h(x) f(x) dx
  Expectation: ∫ x f(x) dx = μ
  Variance: ∫ (x - μ)² f(x) dx
• The square root of the variance is called the standard deviation.
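The moment integrals above can be checked numerically for a Gaussian density (μ = 2, σ = 3 chosen arbitrarily), approximating the integrals by a sum over a fine grid:

```python
import numpy as np

mu, sigma = 2.0, 3.0
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200001)
f = np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))
dx = x[1] - x[0]

expectation = np.sum(x * f) * dx            # integral of x f(x) dx  -> mu
variance = np.sum((x - mu)**2 * f) * dx     # integral of (x-mu)^2 f(x) dx -> sigma^2
print(expectation, variance)                # approximately 2.0 and 9.0
```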
Theorems for expectations
• For linear operations, the following theorems are used:
  – For a constant: ⟨c⟩ = c
  – Linear operator: ⟨c h(x)⟩ = c ⟨h(x)⟩
  – Summation: ⟨x + y⟩ = ⟨x⟩ + ⟨y⟩
• Covariance: the relationship between random variables, where f_xy(x,y) is the joint probability distribution:
  σ_xy ≡ ⟨xy⟩ = ∫∫ (x - μ_x)(y - μ_y) f_xy(x,y) dx dy
  Correlation: ρ_xy = σ_xy / (σ_x σ_y)

Estimation of moments
• Expectation and variance are the first and second moments of a probability distribution:
  μ̂_x ≈ Σ(n=1 to N) x_n / N ≈ (1/T) ∫ x(t) dt
  σ̂_x² ≈ Σ(n=1 to N) (x_n - μ_x)² / N ≈ Σ(n=1 to N) (x_n - μ̂_x)² / (N - 1)
• As N goes to infinity, these expressions approach their expectations. (Note the N-1 in the form which uses the estimated mean.)
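The sample-moment formulas can be sketched directly; the divisor is N when the true mean μ_x is known, and N-1 when the mean itself is estimated from the same data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=0.0, scale=1.0, size=100_000)   # unit-variance Gaussian
N = x.size

mu_hat = x.sum() / N                                # estimated mean
var_known_mean = np.sum((x - 0.0)**2) / N           # true mean known: divide by N
var_est_mean = np.sum((x - mu_hat)**2) / (N - 1)    # estimated mean: divide by N-1
print(mu_hat, var_known_mean, var_est_mean)         # all near 0, 1, 1
```

The N-1 divisor compensates for the fact that the residuals about the estimated mean are slightly smaller, on average, than those about the true mean.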
Probability distributions
• While there are many probability distributions, only a couple are commonly used:
  Gaussian: f(x) = (1/(σ√(2π))) exp(-(x - μ)²/(2σ²))
  Multivariate Gaussian: f(x) = (1/√((2π)ⁿ |V|)) exp(-(1/2)(x - μ)ᵀ V⁻¹ (x - μ))
  Chi-squared: χ²_r(x) = x^(r/2 - 1) e^(-x/2) / (Γ(r/2) 2^(r/2))

Probability distributions (continued)
• The chi-squared distribution is the sum of the squares of r Gaussian random variables with expectation 0 and variance 1.
• With the probability density function known, the probability of events occurring can be determined. For a Gaussian distribution in 1-D: P(|x| < 1σ) = 0.68; P(|x| < 2σ) = 0.955; P(|x| < 3σ) = 0.9974.
• Conceptually, people think of standard deviations in terms of the probability of events occurring (i.e., 68% of values should be within 1-sigma).
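The quoted 1σ/2σ/3σ probabilities follow directly from the Gaussian cumulative distribution: for a zero-mean Gaussian, P(|x| < kσ) = erf(k/√2), which the Python standard library can evaluate:

```python
import math

for k in (1, 2, 3):
    p = math.erf(k / math.sqrt(2))          # P(|x| < k sigma), zero-mean Gaussian
    print(f"P(|x| < {k} sigma) = {p:.4f}")
# prints 0.6827, 0.9545, 0.9973 -- the values quoted on the slide
```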