18.4 Inverse Problems and the Use of A Priori Information

Later discussion will be facilitated by some preliminary mention of a couple of mathematical points. Suppose that u is an "unknown" vector that we plan to determine by some minimization principle. Let A[u] > 0 and B[u] > 0 be two positive functionals of u, so that we can try to determine u by either

$$\text{minimize: } A[u] \quad\text{or}\quad \text{minimize: } B[u] \tag{18.4.1}$$

(Of course these will generally give different answers for u.) As another possibility, now suppose that we want to minimize A[u] subject to the constraint that B[u] have some particular value, say b. The method of Lagrange multipliers gives the variation

$$\frac{\delta}{\delta u}\Bigl\{A[u] + \lambda_1\bigl(B[u] - b\bigr)\Bigr\} = \frac{\delta}{\delta u}\bigl(A[u] + \lambda_1 B[u]\bigr) = 0 \tag{18.4.2}$$

where λ1 is a Lagrange multiplier. Notice that b is absent in the second equality, since it doesn't depend on u.

Next, suppose that we change our minds and decide to minimize B[u] subject to the constraint that A[u] have a particular value, a. Instead of equation (18.4.2) we have

$$\frac{\delta}{\delta u}\Bigl\{B[u] + \lambda_2\bigl(A[u] - a\bigr)\Bigr\} = \frac{\delta}{\delta u}\bigl(B[u] + \lambda_2 A[u]\bigr) = 0 \tag{18.4.3}$$

with, this time, λ2 the Lagrange multiplier. Multiplying equation (18.4.3) by the constant 1/λ2, and identifying 1/λ2 with λ1, we see that the actual variations are exactly the same in the two cases. Both cases will yield the same one-parameter family of solutions, say, u(λ1). As λ1 varies from 0 to ∞, the solution u(λ1) varies along a so-called trade-off curve between the problem of minimizing A and the problem of minimizing B. Any solution along this curve can equally well be thought of as either (i) a minimization of A for some constrained value of B, or (ii) a minimization of B for some constrained value of A, or (iii) a weighted minimization of the sum A + λ1B.

The second preliminary point has to do with degenerate minimization principles. In the example above, now suppose that A[u] has the particular form

$$A[u] = \left|\mathbf{A}\cdot u - c\right|^2 \tag{18.4.4}$$

for some matrix A and vector c. If A has fewer rows than columns, or if A is square but degenerate (has a nontrivial nullspace, see §2.6, especially Figure 2.6.1), then minimizing A[u] will not give a unique solution for u. (To see why, review §15.4, and note that for a "design matrix" A with fewer rows than columns, the matrix Aᵀ·A in the normal equations 15.4.10 is degenerate.) However, if we add any multiple λ times a nondegenerate quadratic form B[u], for example u·H·u with H a positive definite matrix, then minimization of A[u] + λB[u] will lead to a unique solution for u. (The sum of two quadratic forms is itself a quadratic form, with the second piece guaranteeing nondegeneracy.)
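To spell out why the added quadratic form restores uniqueness, here is the gradient algebra for the case above (a routine verification added for clarity, not part of the original text):

```latex
% With A[u] = |A u - c|^2 and B[u] = u^T H u, set the gradient to zero:
\[
\nabla_u\bigl(A[u] + \lambda B[u]\bigr)
  = 2\,A^T\!\bigl(A\,u - c\bigr) + 2\lambda\,H\,u = 0
\quad\Longrightarrow\quad
\bigl(A^T A + \lambda H\bigr)\,u = A^T c .
\]
% The coefficient matrix is positive definite, since for any v \neq 0
\[
v^T\bigl(A^T A + \lambda H\bigr)\,v = |A\,v|^2 + \lambda\, v^T H\, v > 0 ,
\]
% H being positive definite and \lambda > 0.  A positive definite matrix is
% nonsingular, so the minimizing u exists and is unique.
```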
We can combine these two points, for this conclusion: When a quadratic minimization principle is combined with a quadratic constraint, and both are positive, only one of the two need be nondegenerate for the overall problem to be well-posed. We are now equipped to face the subject of inverse problems.

The Inverse Problem with Zeroth-Order Regularization

Suppose that u(x) is some unknown or underlying (u stands for both unknown and underlying!) physical process, which we hope to determine by a set of N measurements ci, i = 1, 2, ..., N. The relation between u(x) and the ci's is that each ci measures a (hopefully distinct) aspect of u(x) through its own linear response kernel ri, and with its own measurement error ni. In other words,

$$c_i \equiv s_i + n_i = \int r_i(x)\,u(x)\,dx + n_i \tag{18.4.5}$$

(compare this to equations 13.3.1 and 13.3.2). Within the assumption of linearity, this is quite a general formulation. The ci's might approximate values of u(x) at certain locations xi, in which case ri(x) would have the form of a more or less narrow instrumental response centered around x = xi. Or, the ci's might "live" in an entirely different function space from u(x), measuring different Fourier components of u(x) for example.

The inverse problem is, given the ci's, the ri(x)'s, and perhaps some information about the errors ni such as their covariance matrix

$$S_{ij} \equiv \text{Covar}[n_i, n_j] \tag{18.4.6}$$

how do we find a good statistical estimator of u(x), call it û(x)?

It should be obvious that this is an ill-posed problem. After all, how can we reconstruct a whole function û(x) from only a finite number of discrete values ci? Yet, whether formally or informally, we do this all the time in science. We routinely measure "enough points" and then "draw a curve through them." In doing so, we are making some assumptions, either about the underlying function u(x), or about the nature of the response functions ri(x), or both. Our purpose now is to formalize these assumptions, and to extend our abilities to cases where the measurements and underlying function live in quite different function spaces. (How do you "draw a curve" through a scattering of Fourier coefficients?)

We can't really want every point x of the function û(x). We do want some large number M of discrete points xμ, μ = 1, 2, ..., M, where M is sufficiently large, and the xμ's are sufficiently evenly spaced, that neither u(x) nor ri(x) varies much between any xμ and xμ+1. (Here and following we will use Greek letters like μ to denote values in the space of the underlying process, and Roman letters like i to denote values of immediate observables.) For such a dense set of xμ's, we can replace equation (18.4.5) by a quadrature like

$$c_i = \sum_{\mu=1}^{M} R_{i\mu}\,u(x_\mu) + n_i \tag{18.4.7}$$

where the N × M matrix R has components

$$R_{i\mu} \equiv r_i(x_\mu)\,\bigl(x_{\mu+1} - x_{\mu-1}\bigr)/2 \tag{18.4.8}$$
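As a concrete illustration (an added sketch, not part of the original text), the following C fragment fills in the quadrature matrix of equation (18.4.8). The array r[i][mu] is assumed to hold the sampled kernel ri(xμ); the one-sided weights at the two endpoints are a guess, since equation (18.4.8) leaves μ = 1 and μ = M unspecified.

```c
/* Build the N x M quadrature matrix R of equation (18.4.8) from
   sampled response kernels.  Hypothetical inputs: r[i][mu] = r_i(x_mu),
   x[mu] = abscissas (not necessarily uniform).  Indices are 0-based
   here, unlike the 1-based convention of the text. */
void build_R(int N, int M, double **r, const double *x, double **R)
{
    for (int i = 0; i < N; i++)
        for (int mu = 0; mu < M; mu++) {
            double dx;
            if (mu == 0)          dx = x[1] - x[0];            /* one-sided at the ends */
            else if (mu == M - 1) dx = x[M-1] - x[M-2];
            else                  dx = (x[mu+1] - x[mu-1]) / 2.0;  /* eq. (18.4.8) */
            R[i][mu] = r[i][mu] * dx;
        }
}
```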
(or any other simple quadrature — it rarely matters which). We will view equations (18.4.5) and (18.4.7) as being equivalent for practical purposes.

How do you solve a set of equations like equation (18.4.7) for the unknown u(xμ)'s? Here is a bad way, but one that contains the germ of some correct ideas: Form a χ² measure of how well a model û(x) agrees with the measured data,

$$\chi^2 = \sum_{i=1}^{N}\sum_{j=1}^{N}\left[c_i - \sum_{\mu=1}^{M} R_{i\mu}\,\widehat{u}(x_\mu)\right] S^{-1}_{ij}\left[c_j - \sum_{\mu=1}^{M} R_{j\mu}\,\widehat{u}(x_\mu)\right] \approx \sum_{i=1}^{N}\left[\frac{c_i - \sum_{\mu=1}^{M} R_{i\mu}\,\widehat{u}(x_\mu)}{\sigma_i}\right]^2 \tag{18.4.9}$$

(compare with equation 15.1.5). Here S⁻¹ is the inverse of the covariance matrix, and the approximate equality holds if you can neglect the off-diagonal covariances, with σi ≡ (Covar[ni, ni])^{1/2}.

Now you can use the method of singular value decomposition (SVD) in §15.4 to find the vector û that minimizes equation (18.4.9). Don't try to use the method of normal equations; since M is greater than N they will be singular, as we already discussed. The SVD process will thus surely find a large number of zero singular values, indicative of a highly non-unique solution. Among the infinity of degenerate solutions (most of them badly behaved with arbitrarily large û(xμ)'s) SVD will select the one with smallest |û| in the sense of

$$\sum_{\mu}\bigl[\widehat{u}(x_\mu)\bigr]^2 \quad\text{a minimum} \tag{18.4.10}$$

(look at Figure 2.6.1). This solution is often called the principal solution. It is a limiting case of what is called zeroth-order regularization, corresponding to minimizing the sum of the two positive functionals

$$\text{minimize: } \chi^2[\widehat{u}\,] + \lambda\,(\widehat{u}\cdot\widehat{u}) \tag{18.4.11}$$

in the limit of small λ. Below, we will learn how to do such minimizations, as well as more general ones, without the ad hoc use of SVD.

What happens if we determine û by equation (18.4.11) with a non-infinitesimal value of λ? First, note that if M ≫ N (many more unknowns than equations), then û will often have enough freedom to be able to make χ² (equation 18.4.9) quite unrealistically small, if not zero. In the language of §15.1, the number of degrees of freedom ν = N − M, which is approximately the expected value of χ² when ν is large, is being driven down to zero (and, not meaningfully, beyond). Yet, we know that for the true underlying function u(x), which has no adjustable parameters, the number of degrees of freedom and the expected value of χ² should be about ν ≈ N.

Increasing λ pulls the solution away from minimizing χ² in favor of minimizing û·û. From the preliminary discussion above, we can view this as minimizing û·û subject to the constraint that χ² have some constant nonzero value. A popular choice, in fact, is to find that value of λ which yields χ² = N, that is, to get about as much extra regularization as a plausible value of χ² dictates. The resulting û(x) is called the solution of the inverse problem with zeroth-order regularization.
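For concreteness, here is a minimal C sketch of the diagonal-covariance case: chisq evaluates the right-hand form of equation (18.4.9), and zeroth_order carries out the minimization (18.4.11) by forming and solving the normal equations (RᵀWR + λI)·û = RᵀWc, with W = diag(1/σi²). These routines and their names are illustrative assumptions, not Numerical Recipes library code; note that it is the λI term that makes the otherwise singular M × M system solvable, per the discussion of equation (18.4.4).

```c
#include <stdlib.h>
#include <math.h>

/* chi^2 of a trial solution u: the diagonal-error form of (18.4.9). */
double chisq(int N, int M, double **R, const double *c,
             const double *sigma, const double *u)
{
    double sum = 0.0;
    for (int i = 0; i < N; i++) {
        double model = 0.0;
        for (int mu = 0; mu < M; mu++) model += R[i][mu] * u[mu];
        double t = (c[i] - model) / sigma[i];
        sum += t * t;
    }
    return sum;
}

/* Solve the n x n system a.x = b by Gaussian elimination with partial
   pivoting; a and b are destroyed.  Safe here because the caller's
   matrix is positive definite (nonzero pivots guaranteed). */
static void linsolve(int n, double **a, double *b, double *x)
{
    for (int k = 0; k < n; k++) {
        int piv = k;                              /* choose the pivot row */
        for (int i = k + 1; i < n; i++)
            if (fabs(a[i][k]) > fabs(a[piv][k])) piv = i;
        double *tr = a[k]; a[k] = a[piv]; a[piv] = tr;
        double tb = b[k]; b[k] = b[piv]; b[piv] = tb;
        for (int i = k + 1; i < n; i++) {         /* eliminate below it */
            double f = a[i][k] / a[k][k];
            for (int j = k; j < n; j++) a[i][j] -= f * a[k][j];
            b[i] -= f * b[k];
        }
    }
    for (int i = n - 1; i >= 0; i--) {            /* back-substitute */
        double s = b[i];
        for (int j = i + 1; j < n; j++) s -= a[i][j] * x[j];
        x[i] = s / a[i][i];
    }
}

/* Zeroth-order regularization (18.4.11) with diagonal errors sigma[i]:
   solve (R^T W R + lambda I) u = R^T W c, W = diag(1/sigma_i^2).
   R^T W R alone is singular when M > N; lambda > 0 makes the sum
   positive definite.  Indices are 0-based. */
void zeroth_order(int N, int M, double **R, const double *c,
                  const double *sigma, double lambda, double *u)
{
    double **a = malloc(M * sizeof(double *));
    double *b = malloc(M * sizeof(double));
    for (int mu = 0; mu < M; mu++) {
        a[mu] = calloc(M, sizeof(double));        /* zeroed row */
        b[mu] = 0.0;
    }
    for (int i = 0; i < N; i++) {                 /* accumulate R^T W R, R^T W c */
        double w = 1.0 / (sigma[i] * sigma[i]);
        for (int mu = 0; mu < M; mu++) {
            b[mu] += w * R[i][mu] * c[i];
            for (int nu = 0; nu < M; nu++)
                a[mu][nu] += w * R[i][mu] * R[i][nu];
        }
    }
    for (int mu = 0; mu < M; mu++) a[mu][mu] += lambda;
    linsolve(M, a, b, u);
    for (int mu = 0; mu < M; mu++) free(a[mu]);
    free(a); free(b);
}
```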
[Figure 18.4.1 appears here: a shaded region of achievable solutions plotted against axes Better Agreement and Better Smoothness, with labeled points for best agreement (independent of smoothness), best smoothness (independent of agreement), and the boundary curve of best solutions.]

Figure 18.4.1. Almost all inverse problem methods involve a trade-off between two optimizations: agreement between data and solution, or "sharpness" of mapping between true and estimated solution (here denoted A), and smoothness or stability of the solution (here denoted B). Among all possible solutions, shown here schematically as the shaded region, those on the boundary connecting the unconstrained minimum of A and the unconstrained minimum of B are the "best" solutions, in the sense that every other solution is dominated by at least one solution on the curve.

The value N is actually a surrogate for any value drawn from a Gaussian distribution with mean N and standard deviation (2N)^{1/2} (the asymptotic χ² distribution). One might equally plausibly try two values of λ, one giving χ² = N + (2N)^{1/2}, the other χ² = N − (2N)^{1/2}.

Zeroth-order regularization, though dominated by better methods, demonstrates most of the basic ideas that are used in inverse problem theory. In general, there are two positive functionals, call them A and B. The first, A, measures something like the agreement of a model to the data (e.g., χ²), or sometimes a related quantity like the "sharpness" of the mapping between the solution and the underlying function. When A by itself is minimized, the agreement or sharpness becomes very good (often impossibly good), but the solution becomes unstable, wildly oscillating, or in other ways unrealistic, reflecting that A alone typically defines a highly degenerate minimization problem.

That is where B comes in. It measures something like the "smoothness" of the desired solution, or sometimes a related quantity that parametrizes the stability of the solution with respect to variations in the data, or sometimes a quantity reflecting a priori judgments about the likelihood of a solution. B is called the stabilizing functional or regularizing operator. In any case, minimizing B by itself is supposed to give a solution that is "smooth" or "stable" or "likely" — and that has nothing at all to do with the measured data.
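Returning to the criterion χ² = N (or N ± (2N)^{1/2}): since χ² is a nondecreasing function of λ for this quadratic problem, the matching λ can be found by bisection. Below is a sketch under the same assumptions as above, reusing the hypothetical zeroth_order and chisq routines; the initial bracket is an arbitrary guess that should be rescaled to the problem at hand.

```c
#include <math.h>

/* Bisect on log(lambda) until chi^2(lambda) hits target (e.g. N, or
   N +/- sqrt(2N)).  chi^2 grows monotonically with lambda: more
   regularization, worse fit.  On return, u holds the solution for
   the final lambda.  The bracket [1e-12, 1e12] is an assumption. */
double find_lambda(int N, int M, double **R, const double *c,
                   const double *sigma, double *u, double target)
{
    double lo = log(1.0e-12), hi = log(1.0e12);
    for (int it = 0; it < 60; it++) {
        double mid = 0.5 * (lo + hi);
        zeroth_order(N, M, R, c, sigma, exp(mid), u);
        if (chisq(N, M, R, c, sigma, u) < target) lo = mid;  /* under-regularized */
        else hi = mid;                                       /* over-regularized */
    }
    return exp(0.5 * (lo + hi));
}
```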
The single central idea in inverse theory is the prescription

$$\text{minimize: } A + \lambda B \tag{18.4.12}$$

for various values of 0 < λ < ∞ along the so-called trade-off curve (see Figure 18.4.1), and then to settle on a "best" value of λ by one or another criterion, ranging from fairly objective (e.g., making χ² = N) to entirely subjective. Successful methods, several of which we will now describe, differ as to their choices of A and B, as to whether the prescription (18.4.12) yields linear or nonlinear equations, as to their recommended method for selecting a final λ, and as to their practicality for computer-intensive two-dimensional problems like image processing.

They also differ as to the philosophical baggage that they (or rather, their proponents) carry. We have thus far avoided the word "Bayesian." (Courts have consistently held that academic license does not extend to shouting "Bayesian" in a crowded lecture hall.) But it is hard, nor have we any wish, to disguise the fact that B has something to do with a priori expectation, or knowledge, of a solution, while A has something to do with a posteriori knowledge. The constant λ adjudicates a delicate compromise between the two. Some inverse methods have acquired a more Bayesian stamp than others, but we think that this is purely an accident of history. An outsider looking only at the equations that are actually solved, and not at the accompanying philosophical justifications, would have a difficult time separating the so-called Bayesian methods from the so-called empirical ones, we think.

The next three sections discuss three different approaches to the problem of inversion, which have had considerable success in different fields. All three fit within the general framework that we have outlined, but they are quite different in detail and in implementation.

CITED REFERENCES AND FURTHER READING:

Craig, I.J.D., and Brown, J.C. 1986, Inverse Problems in Astronomy (Bristol, U.K.: Adam Hilger).

Twomey, S. 1977, Introduction to the Mathematics of Inversion in Remote Sensing and Indirect Measurements (Amsterdam: Elsevier).

Tikhonov, A.N., and Arsenin, V.Y. 1977, Solutions of Ill-Posed Problems (New York: Wiley).

Tikhonov, A.N., and Goncharsky, A.V. (eds.) 1987, Ill-Posed Problems in the Natural Sciences (Moscow: MIR).

Parker, R.L. 1977, Annual Review of Earth and Planetary Sciences, vol. 5, pp. 35–64.

Frieden, B.R. 1975, in Picture Processing and Digital Filtering, T.S. Huang, ed. (New York: Springer-Verlag).

Tarantola, A. 1987, Inverse Problem Theory (Amsterdam: Elsevier).

Baumeister, J. 1987, Stable Solution of Inverse Problems (Braunschweig, Germany: Friedr. Vieweg & Sohn) [mathematically oriented].

Titterington, D.M. 1985, Astronomy and Astrophysics, vol. 144, pp. 381–387.

Jeffrey, W., and Rosner, R. 1986, Astrophysical Journal, vol. 310, pp. 463–472.
18.5 Linear Regularization Methods

What we will call linear regularization is also called the Phillips-Twomey method [1,2], the constrained linear inversion method [3], the method of regularization [4], and Tikhonov-Miller regularization [5-7]. (It probably has other names also