Real-time Parameter Estimation
1. Introduction
2. Least squares and regression
3. Dynamical systems
4. Experimental conditions
5. Examples
6. Conclusions

© K. J. Åström and B. Wittenmark

System Identification
How to get models?
- Physics (white-box)
- Experiments (black-box)
- Combination (grey-box)
Experiment planning
Choice of model structure
- Transfer functions
- Impulse response
- State models
Parameter estimation
- Statistics
- Inverse problems
Validation

Identification Techniques
Nonparametric methods
- Frequency response
- Transient response
- Correlation analysis
- Spectral analysis
Parametric methods
- Least squares (LS)
- Maximum likelihood (ML)
Identification for control
Related areas
- Statistics
- Numerical analysis
- Econometrics
- Many applications
Least Squares and Regression
- Introduction
- The LS problem
- Interpretation
  - Geometric
  - Statistical
- Recursive calculations
- Continuous-time models

Good Methods are Adopted by Everybody
Mathematics, Statistics, Numerical analysis, Physics, Economics, Biology, Medicine, Control, Signal processing

The Least Squares Method
The problem: the orbit of Ceres
The problem solver: Karl Friedrich Gauss
(Image: title page of Gauss's Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium.)
The principle: "Therefore, that will be the most probable system of values of the unknown quantities, in which the sum of the squares of the differences between the observed and computed values, multiplied by numbers that measure the degree of precision, is a minimum."
"In conclusion, the principle that the sum of the squares of the differences between the observed and computed quantities must be a minimum, may be considered independently of the calculus of probabilities."
An observation: other criteria could be used. "But of all these principles ours is the most simple; by the others we should be led into the most complicated calculations."
Mathematical Formulation
The regression model
  y(t) = \varphi_1(t)\theta_1 + \varphi_2(t)\theta_2 + \dots + \varphi_n(t)\theta_n = \varphi^T(t)\theta
y - observed data
\theta_i - unknown parameters
\varphi_i - known functions (regression variables)
Some notations
  \varphi^T(t) = [\varphi_1(t) \ \varphi_2(t) \ \dots \ \varphi_n(t)]
  \theta^T = [\theta_1 \ \theta_2 \ \dots \ \theta_n]
  Y(t) = [y(1) \ y(2) \ \dots \ y(t)]^T
  E(t) = [\varepsilon(1) \ \varepsilon(2) \ \dots \ \varepsilon(t)]^T
  \Phi(t) = [\varphi^T(1) \ \dots \ \varphi^T(t)]^T  (rows \varphi^T(i))
  P(t) = \left(\sum_{i=1}^{t} \varphi(i)\varphi^T(i)\right)^{-1} = \left(\Phi^T(t)\Phi(t)\right)^{-1}
  \varepsilon(i) = y(i) - \hat{y}(i) = y(i) - \varphi^T(i)\hat{\theta}

Solving the LS Problem
Minimize with respect to \theta
  V(\theta, t) = \frac{1}{2}\sum_{i=1}^{t} \varepsilon(i)^2 = \frac{1}{2}\sum_{i=1}^{t}\left(y(i) - \varphi^T(i)\theta\right)^2 = \frac{1}{2}\theta^T A\theta - b^T\theta + \frac{1}{2}c
where
  A = \sum_{i=1}^{t} \varphi(i)\varphi^T(i), \quad b = \sum_{i=1}^{t} \varphi(i)y(i), \quad c = \sum_{i=1}^{t} y^2(i)
The parameter \hat{\theta} that minimizes the loss function is given by the normal equations
  A\hat{\theta} = b
If the matrix A is nonsingular, the minimum is unique and given by
  \hat{\theta} = A^{-1}b = Pb
How to construct the equations!

An Example
  y(t) = b_0 + b_1 u(t) + b_2 u^2(t) + e(t), \quad \sigma = 0.1
  \varphi^T(t) = [1 \ u(t) \ u^2(t)], \quad \theta^T = [b_0 \ b_1 \ b_2]
Estimated models
  Model 1: y(t) = b_0
  Model 2: y(t) = b_0 + b_1 u
  Model 3: y(t) = b_0 + b_1 u + b_2 u^2
  Model 4: y(t) = b_0 + b_1 u + b_2 u^2 + b_3 u^3

Example Continued
  Model | \hat{b}_0 | \hat{b}_1 | \hat{b}_2 | \hat{b}_3 | V
  1     | 3.85      |           |           |           | 34.46
  2     | 0.57      | 1.09      |           |           | 1.01
  3     | 1.11      | 0.45      | 0.11      |           | 0.031
  4     | 1.13      | 0.37      | 0.14      | -0.003    | 0.027
(Figure: data and the four fitted models, output versus input.)
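As a concrete illustration of the normal equations, the sketch below fits the four polynomial models to simulated data with numpy. The input grid, the true coefficients, and the noise realization are assumptions chosen for illustration, so the numbers will not reproduce the table above, which was computed from the slide's own data set.

# Batch least squares via the normal equations A*theta = b (illustrative sketch).
# The true coefficients and input grid below are assumptions, not the slide's data.
import numpy as np

rng = np.random.default_rng(0)
u = np.linspace(0, 6, 50)                                            # assumed input values
y = 1.0 + 0.5 * u + 0.1 * u**2 + 0.1 * rng.standard_normal(u.size)   # assumed true model, sigma = 0.1

for order in range(4):                              # Models 1..4: y = b0 + ... + b_order * u^order
    Phi = np.column_stack([u**k for k in range(order + 1)])   # regressor matrix, rows phi^T(i)
    A = Phi.T @ Phi                                 # A = sum phi(i) phi(i)^T
    b = Phi.T @ y                                   # b = sum phi(i) y(i)
    theta = np.linalg.solve(A, b)                   # normal equations A theta = b
    V = 0.5 * np.sum((y - Phi @ theta) ** 2)        # loss V(theta, t)
    print(f"Model {order + 1}: theta = {np.round(theta, 3)}, V = {V:.3f}")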
Geometric Interpretation
(Figure: the residual E is the component of Y orthogonal to the plane spanned by \varphi_1 and \varphi_2; \hat{Y} = \theta_1\varphi_1 + \theta_2\varphi_2 is the projection of Y.)
  E = Y - \theta_1\varphi_1 - \theta_2\varphi_2 - \dots - \theta_n\varphi_n
When is E as small as possible? When E is orthogonal to every regressor:
  \varphi_i^T (Y - \theta_1\varphi_1 - \theta_2\varphi_2 - \dots - \theta_n\varphi_n) = 0
The normal equations!

Statistical Interpretation
  y(t) = \varphi^T(t)\theta^0 + e(t)
\theta^0 - "true" parameters
e(t) - independent random variables with zero mean and variance \sigma^2
If \Phi^T\Phi is nonsingular, then
  E\,\hat{\theta} = \theta^0
  \mathrm{cov}\,\hat{\theta} = \sigma^2 (\Phi^T\Phi)^{-1} = \sigma^2 P
  s^2 = 2V(\hat{\theta}, t)/(t - n) is an unbiased estimate of \sigma^2
n - number of parameters in \theta^0
t - number of data

Recursive Least Squares
Idea: avoid repeating all calculations when new data arrive.
Does there exist a recursive formula that expresses \hat{\theta}(t) in terms of \hat{\theta}(t-1)?

Recursive Least Squares
The LS estimate is given by
  \hat{\theta}(t) = P(t)\left(\sum_{i=1}^{t-1} \varphi(i)y(i) + \varphi(t)y(t)\right)
  P(t) = \left(\sum_{i=1}^{t} \varphi(i)\varphi^T(i)\right)^{-1}
  P(t)^{-1} = P(t-1)^{-1} + \varphi(t)\varphi^T(t)
But
  \sum_{i=1}^{t-1} \varphi(i)y(i) = P(t-1)^{-1}\hat{\theta}(t-1) = P(t)^{-1}\hat{\theta}(t-1) - \varphi(t)\varphi^T(t)\hat{\theta}(t-1)
The estimate at time t can now be written as
  \hat{\theta}(t) = \hat{\theta}(t-1) - P(t)\varphi(t)\varphi^T(t)\hat{\theta}(t-1) + P(t)\varphi(t)y(t)
                  = \hat{\theta}(t-1) + P(t)\varphi(t)\left(y(t) - \varphi^T(t)\hat{\theta}(t-1)\right)
                  = \hat{\theta}(t-1) + K(t)\varepsilon(t)
Want a recursive equation for P(t), not for P(t)^{-1}.
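The recursion just derived can be checked numerically before the matrix inversion lemma is introduced: update P(t)^{-1} directly, invert at every step, and compare with the batch LS estimate. A minimal sketch with assumed data (the small initial regularization is an assumption so that early inverses exist); inverting at every step is exactly the cost the next slide avoids.

# Check that theta(t) = theta(t-1) + P(t) phi(t) (y(t) - phi^T(t) theta(t-1))
# with P(t)^-1 = P(t-1)^-1 + phi(t) phi^T(t) reproduces the batch LS estimate.
# Data and true parameters are assumptions; P^-1 is inverted at every step on purpose.
import numpy as np

rng = np.random.default_rng(7)
theta_true = np.array([1.0, -2.0, 0.5])
Phi = rng.standard_normal((200, 3))
y = Phi @ theta_true + 0.1 * rng.standard_normal(200)

Pinv = 1e-6 * np.eye(3)                      # small regularization so P(t)^-1 is invertible early on
theta = np.zeros(3)
for phi_t, y_t in zip(Phi, y):
    Pinv = Pinv + np.outer(phi_t, phi_t)     # P(t)^-1 = P(t-1)^-1 + phi phi^T
    P = np.linalg.inv(Pinv)                  # costly inversion at every step
    theta = theta + P @ phi_t * (y_t - phi_t @ theta)

theta_batch = np.linalg.lstsq(Phi, y, rcond=None)[0]
print("recursive:", theta)
print("batch LS :", theta_batch)             # essentially identical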
The Matrix Inversion Lemma
Let A, C, and (C^{-1} + DA^{-1}B) be nonsingular square matrices. Then
  (A + BCD)^{-1} = A^{-1} - A^{-1}B(C^{-1} + DA^{-1}B)^{-1}DA^{-1}
Prove by direct substitution.
Given A^{-1} we can get the LHS inverse. What about the inverse on the RHS?

Recursion for P(t)
The matrix inversion lemma gives
  P(t) = \left(\sum_{i=1}^{t} \varphi(i)\varphi^T(i)\right)^{-1}
       = \left(\sum_{i=1}^{t-1} \varphi(i)\varphi^T(i) + \varphi(t)\varphi^T(t)\right)^{-1}
       = \left(P(t-1)^{-1} + \varphi(t)\varphi^T(t)\right)^{-1}
       = P(t-1) - P(t-1)\varphi(t)\left(I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}\varphi^T(t)P(t-1)
Hence
  K(t) = P(t)\varphi(t) = P(t-1)\varphi(t)\left(I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}

Recursive Least Squares (RLS)
  \hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\left(y(t) - \varphi^T(t)\hat{\theta}(t-1)\right)
  K(t) = P(t)\varphi(t) = P(t-1)\varphi(t)\left(I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}
  P(t) = P(t-1) - P(t-1)\varphi(t)\left(I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}\varphi^T(t)P(t-1)
       = \left(I - K(t)\varphi^T(t)\right)P(t-1)
Intuitive interpretation
Kalman filter
Interpretation of \theta and P
Initial values (P(0) = r I)

Time-varying Parameters
Loss function with discounting
  V(\theta, t) = \frac{1}{2}\sum_{i=1}^{t} \lambda^{t-i}\left(y(i) - \varphi^T(i)\theta\right)^2
The LS estimate then becomes
  \hat{\theta}(t) = \hat{\theta}(t-1) + K(t)\left(y(t) - \varphi^T(t)\hat{\theta}(t-1)\right)
  K(t) = P(t)\varphi(t) = P(t-1)\varphi(t)\left(\lambda I + \varphi^T(t)P(t-1)\varphi(t)\right)^{-1}
  P(t) = \left(I - K(t)\varphi^T(t)\right)P(t-1)/\lambda
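The recursions above translate almost line by line into code. The sketch below is a minimal scalar-output RLS with forgetting factor lambda (lambda = 1 gives ordinary RLS); variable names mirror the slide, and the demo signal and true parameters are assumptions.

# Recursive least squares with forgetting factor lambda, as on the slide
# (lambda = 1 gives ordinary RLS).  The demo signal at the bottom is an assumption.
import numpy as np

def rls_update(theta, P, phi, y, lam=1.0):
    """One RLS step for a scalar measurement y with regressor phi (1-D array)."""
    denom = lam + phi @ P @ phi                 # scalar: lambda + phi^T P phi
    K = P @ phi / denom                         # gain K(t)
    eps = y - phi @ theta                       # prediction error epsilon(t)
    theta = theta + K * eps                     # theta(t) = theta(t-1) + K(t) eps(t)
    P = (P - np.outer(K, phi @ P)) / lam        # P(t) = (I - K phi^T) P(t-1) / lambda
    return theta, P

# Demo: estimate theta in y = phi^T theta + noise (assumed setup)
rng = np.random.default_rng(2)
theta_true = np.array([0.9, 0.5])
theta, P = np.zeros(2), 100.0 * np.eye(2)       # theta_hat(0) = 0, P(0) = r*I
for t in range(500):
    phi = rng.standard_normal(2)
    y = phi @ theta_true + 0.1 * rng.standard_normal()
    theta, P = rls_update(theta, P, phi, y, lam=0.99)
print("final estimate:", theta)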
Forgetting Factor
  0 < \lambda \le 1
Equivalent time constant: e^{-h/T} = \lambda, hence
  T = h / \log(1/\lambda) \approx h/(1 - \lambda) for \lambda close to 1.
Rule of thumb: memory decays to 10% after
  N \approx 2/(1 - \lambda)
+ Forgets old information
+ Adapts quickly when the process changes
- The estimates get noise sensitive
- The P matrix may grow (wind-up)

Continuous-Time Models
Regression model
  y(t) = \varphi_1(t)\theta_1 + \dots + \varphi_n(t)\theta_n = \varphi^T(t)\theta
Loss function with forgetting
  V(\theta) = \int_0^t e^{-\alpha(t-s)} \left(y(s) - \varphi^T(s)\theta\right)^2 ds
The normal equations
  \int_0^t e^{-\alpha(t-s)} \varphi(s)\varphi^T(s)\, ds \; \hat{\theta}(t) = \int_0^t e^{-\alpha(t-s)} \varphi(s)y(s)\, ds
The estimate is unique if the matrix
  R(t) = \int_0^t e^{-\alpha(t-s)} \varphi(s)\varphi^T(s)\, ds
is positive definite.

Recursive Equations for Continuous-Time Models
Regression model
  y(t) = \varphi^T(t)\theta
Recursive least-squares equations
  d\hat{\theta}/dt = P(t)\varphi(t)e(t)
  e(t) = y(t) - \varphi^T(t)\hat{\theta}(t)
  dP(t)/dt = \alpha P(t) - P(t)\varphi(t)\varphi^T(t)P(t)
  dR/dt = -\alpha R + \varphi\varphi^T
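The continuous-time recursions can be simulated by simple forward-Euler integration of the differential equations above; the sketch below does this for a two-parameter regression. The step size, the forgetting rate alpha, and the signals are assumptions chosen for illustration.

# Forward-Euler integration of the continuous-time RLS equations
#   dtheta/dt = P phi e,  e = y - phi^T theta,  dP/dt = alpha P - P phi phi^T P.
# The signals, alpha, and the step size dt are assumptions for illustration.
import numpy as np

dt, alpha = 0.001, 1.0
theta_true = np.array([2.0, -1.0])
theta, P = np.zeros(2), 100.0 * np.eye(2)

for k in range(int(50 / dt)):                    # simulate 50 time units
    t = k * dt
    phi = np.array([np.sin(t), np.cos(0.5 * t)]) # persistently exciting regressor (assumed)
    y = phi @ theta_true
    e = y - phi @ theta
    theta = theta + dt * (P @ phi * e)           # dtheta/dt = P phi e
    P = P + dt * (alpha * P - P @ np.outer(phi, phi) @ P)   # dP/dt = alpha P - P phi phi^T P
print("estimate after integration:", theta)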
Estimating Parameters in Dynamical Systems
Basic idea: rewrite the equations as a regression model!
Dynamical systems
- FIR models
- ARMA models
- Continuous-time models
- Nonlinear models
Experimental conditions
- Excitation
- Closed-loop identification

Finite Impulse Response (FIR) Models
  y(t) = b_1 u(t-1) + b_2 u(t-2) + \dots + b_n u(t-n)
or
  y(t) = \varphi^T(t-1)\theta
where
  \theta^T = [b_1 \ \dots \ b_n]
  \varphi^T(t-1) = [u(t-1) \ \dots \ u(t-n)]
A regression model! The predictor is
  \hat{y}(t) = \hat{b}_1(t-1)u(t-1) + \dots + \hat{b}_n(t-1)u(t-n)
(Figure: block diagram with the process output y, an adjustable FIR filter producing \hat{y} from u, the prediction error \varepsilon = y - \hat{y}, and an adjustment mechanism updating \theta.)

Pulse Transfer Function Models
  y(t) + a_1 y(t-1) + \dots + a_n y(t-n) = b_1 u(t-1) + \dots + b_n u(t-n)
Write as
  y(t) = \varphi(t-1)^T \theta
where
  \varphi(t-1) = [-y(t-1) \ \dots \ -y(t-n) \ u(t-1) \ \dots \ u(t-n)]^T
  \theta = [a_1 \ \dots \ a_n \ b_1 \ \dots \ b_n]^T
Autoregression! Equation error.

Transfer Function Models
Write the model
  d^n y/dt^n + a_1 d^{n-1}y/dt^{n-1} + \dots + a_n y = b_1 d^{n-1}u/dt^{n-1} + \dots + b_n u
as
  A(p) y(t) = B(p) u(t)
Introduce
  A(p) y_f(t) = B(p) u_f(t)
where
  y_f(t) = F(p) y(t), \quad u_f(t) = F(p) u(t)
and F(p) has pole excess greater than n. With
  \theta = [a_1 \ \dots \ a_n \ b_1 \ \dots \ b_n]^T
  \varphi_f(t) = [-p^{n-1} y_f \ \dots \ -y_f \ \ p^{n-1} u_f \ \dots \ u_f]^T
               = [-p^{n-1} F(p) y \ \dots \ -F(p) y \ \ p^{n-1} F(p) u \ \dots \ F(p) u]^T
the filtered model becomes
  p^n y_f(t) = \varphi_f^T(t)\theta
A regression model!
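For the pulse transfer function (ARX) model the regression is built by stacking delayed outputs and inputs. A minimal sketch, assuming a first-order example system and a random binary input (both assumptions for illustration):

# Least-squares fit of an ARX model y(t) + a1 y(t-1) = b1 u(t-1) + e(t)
# by stacking the regressors phi(t-1) = [-y(t-1)  u(t-1)]^T.
# The simulated system (a1 = -0.9, b1 = 0.5) is an assumption for illustration.
import numpy as np

rng = np.random.default_rng(3)
N = 500
u = np.sign(rng.standard_normal(N))             # random binary input (assumed)
y = np.zeros(N)
for t in range(1, N):
    y[t] = 0.9 * y[t - 1] + 0.5 * u[t - 1] + 0.05 * rng.standard_normal()

Phi = np.column_stack([-y[:-1], u[:-1]])        # rows phi^T(t-1) = [-y(t-1)  u(t-1)]
Y = y[1:]                                       # left-hand sides y(t)
a1, b1 = np.linalg.lstsq(Phi, Y, rcond=None)[0]
print(f"a1 = {a1:.3f}  (true -0.9),  b1 = {b1:.3f}  (true 0.5)")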
Nonlinear Models
Consider the model
  y(t) + a y(t-1) = b_1 u(t-1) + b_2 u^2(t-1)
Introduce
  \theta = [a \ b_1 \ b_2]^T
  \varphi^T(t) = [-y(t) \ u(t) \ u^2(t)]
Hence
  y(t) = \varphi^T(t-1)\theta
Autoregression! Linearity in the parameters.

Experimental Conditions
- Excitation
- Closed-loop identification
- Model structure

Persistent Excitation
For the FIR regressor \varphi(k) = [u(k-1) \ \dots \ u(k-n)]^T, the matrix \sum_{k=n+1}^{t} \varphi(k)\varphi^T(k) is given by

  [ \sum_{n+1}^{t} u^2(k-1)       ...  \sum_{n+1}^{t} u(k-1)u(k-n) ]
  [        ...                    ...          ...                 ]
  [ \sum_{n+1}^{t} u(k-1)u(k-n)   ...  \sum_{n+1}^{t} u^2(k-n)     ]

Define
  C_n = \lim_{t\to\infty} \frac{1}{t} \Phi^T \Phi =
  [ c(0)     c(1)     ...  c(n-1) ]
  [ c(1)     c(0)     ...  c(n-2) ]
  [  ...                          ]
  [ c(n-1)   c(n-2)   ...  c(0)   ]
where
  c(k) = \lim_{t\to\infty} \frac{1}{t} \sum_{i=1}^{t} u(i)u(i-k)
A signal u is called persistently exciting (PE) of order n if the matrix C_n is positive definite.
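The definition of persistent excitation can be checked numerically: form the empirical Toeplitz matrix C_n from the autocovariances c(k) and look at its smallest eigenvalue. A sketch with assumed test signals (step, sinusoid, white noise):

# Empirical check of persistent excitation: build C_n from c(k) = (1/t) sum u(i)u(i-k)
# and test positive definiteness via the smallest eigenvalue.
# The test signals below are assumptions chosen to illustrate the orders quoted on the next slide.
import numpy as np

def pe_matrix(u, n):
    t = len(u)
    c = np.array([np.dot(u[k:], u[:t - k]) / t for k in range(n)])   # c(0), ..., c(n-1)
    return np.array([[c[abs(i - j)] for j in range(n)] for i in range(n)])  # Toeplitz C_n

rng = np.random.default_rng(4)
t = 2000
signals = {
    "step":        np.ones(t),
    "sinusoid":    np.sin(0.3 * np.arange(t)),
    "white noise": rng.standard_normal(t),
}
for name, u in signals.items():
    for n in (1, 2, 3):
        eig_min = np.linalg.eigvalsh(pe_matrix(u, n)).min()
        print(f"{name:12s} n={n}: smallest eigenvalue of C_n = {eig_min:.4f}")

For finite data the step's C_2 comes out nearly, rather than exactly, singular; the near-zero eigenvalue still signals that the step is PE of order 1 only, while white noise gives well-conditioned C_n for every order tested.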
Another Characterization
A signal u is persistently exciting of order n if and only if
  U = \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{t} \left(A(q)u(k)\right)^2 > 0
for all nonzero polynomials A of degree n-1 or less.
Proof: Let the polynomial A be
  A(q) = a_0 q^{n-1} + a_1 q^{n-2} + \dots + a_{n-1}
A straightforward calculation gives
  U = \lim_{t\to\infty} \frac{1}{t} \sum_{k=1}^{t} \left(a_0 u(k+n-1) + \dots + a_{n-1} u(k)\right)^2 = a^T C_n a

Examples
Recall: a signal u is called persistently exciting (PE) of order n if the matrix C_n is positive definite. The condition above is an equivalent one.
- A step is PE of order 1: (q - 1)u(t) = 0
- A sinusoid is PE of order 2: (q^2 - 2q\cos\omega h + 1)u(t) = 0
- White noise
- PRBS
Physical meaning. Mathematical meaning.

Loss of Identifiability due to Feedback
  y(t) = a y(t-1) + b u(t-1) + e(t)
  u(t) = -k y(t)
Multiply the feedback law by \alpha and add it to the model, then
  y(t) = (a + \alpha k) y(t-1) + (b + \alpha) u(t-1) + e(t)
Same input-output relation for all \hat{a} and \hat{b} such that
  \hat{a} = a + \alpha k, \quad \hat{b} = b + \alpha
(Figure: in the (\hat{a}, \hat{b}) plane the possible estimates lie on a straight line, slope labelled -1/k, through the true value (a, b).)
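The rank deficiency caused by u(t) = -k y(t) is easy to see numerically: with proportional feedback the two regressor columns -y(t-1) and u(t-1) are proportional, so Phi^T Phi is singular, while a time-delayed feedback leaves it well conditioned. A sketch under an assumed first-order system and the two feedback laws used in the example later on:

# Loss of identifiability under proportional feedback u(t) = -k y(t):
# the regressor columns become linearly dependent and Phi^T Phi is (near) singular.
# The simulated system and gains are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(5)

def simulate(feedback, N=1000):
    y, u = np.zeros(N), np.zeros(N)
    for t in range(1, N):
        y[t] = 0.9 * y[t - 1] + 0.5 * u[t - 1] + 0.5 * rng.standard_normal()
        u[t] = feedback(y, t)
    Phi = np.column_stack([-y[:-1], u[:-1]])
    return np.linalg.cond(Phi.T @ Phi)          # condition number of the information matrix

print("u(t) = -0.2 y(t)    :", simulate(lambda y, t: -0.2 * y[t]))      # huge -> not identifiable
print("u(t) = -0.32 y(t-1) :", simulate(lambda y, t: -0.32 * y[t - 1])) # moderate -> identifiable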
Examples
- Excitation
- Model structure
- Closed-loop estimation
- Forgetting old data
- Gradient or least squares

Example
Model
  y(t) + a y(t-1) = b u(t-1) + e(t)
Parameters
  a = -0.9, \quad b = 0.5, \quad \sigma = 0.5
  \hat{\theta}(0) = 0, \quad P(0) = 100\, I
  \hat{\theta} = [\hat{a} \ \hat{b}]^T, \quad \varphi(t-1) = [-y(t-1) \ u(t-1)]^T

Excitation
Input: (a) unit pulse at t = 50, (b) square wave of unit amplitude and period 100.
(Figure: estimates \hat{a} and \hat{b} versus time over 1000 samples for the two inputs.)

Effect of Feedback
Case 1: u(t) = -0.2 y(t)
Case 2: u(t) = -0.32 y(t-1)
(Figure: estimate trajectories in the (\hat{a}, \hat{b}) plane for the two feedback cases.)
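A minimal simulation in the spirit of the excitation example above: the same first-order system is estimated with RLS (lambda = 1) for the two inputs, a unit pulse at t = 50 and a unit square wave of period 100. The parameter values follow the slide as reconstructed here; the noise realization and implementation details are assumptions.

# RLS estimation of y(t) + a y(t-1) = b u(t-1) + e(t) for two input signals,
# mirroring the example on the slide (a = -0.9, b = 0.5, sigma = 0.5, P(0) = 100 I).
# The noise realization and implementation details are assumptions.
import numpy as np

def run(u, a=-0.9, b=0.5, sigma=0.5, N=1000, seed=6):
    rng = np.random.default_rng(seed)
    y = np.zeros(N)
    theta, P = np.zeros(2), 100.0 * np.eye(2)   # theta_hat(0) = 0, P(0) = 100 I
    for t in range(1, N):
        y[t] = -a * y[t - 1] + b * u[t - 1] + sigma * rng.standard_normal()
        phi = np.array([-y[t - 1], u[t - 1]])
        K = P @ phi / (1.0 + phi @ P @ phi)
        theta = theta + K * (y[t] - phi @ theta)
        P = P - np.outer(K, phi @ P)
    return theta

N = 1000
pulse = np.zeros(N); pulse[50] = 1.0                         # unit pulse at t = 50
square = np.sign(np.sin(2 * np.pi * np.arange(N) / 100))     # square wave, period 100
print("pulse input      :", run(pulse))    # poor excitation -> b is only weakly determined
print("square-wave input:", run(square))   # persistent excitation -> estimates near (-0.9, 0.5)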