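Chapter 3 Fuzzy Identification and Estimation

Teaching content
This chapter covers estimation and identification using fuzzy systems. The central problem in fuzzy identification is how to construct a fuzzy system from known discrete data. The chapter first introduces the basic function approximation problem, then presents the conventional identification method, least squares: how batch least squares and recursive least squares are used to identify a system so that it matches input-output data. Finally, it describes how these two methods are used to train fuzzy systems directly.

Key points
The focus is the least squares algorithm. The least squares algorithms for fuzzy identification include the batch least squares algorithm and the recursive least squares algorithm.

Difficult points
An accurate grasp and understanding of the least squares algorithm; the key is applying fuzzy least squares to carry out fuzzy identification and estimation.

Requirements
Students should master the basic concepts and knowledge of fuzzy identification and estimation, and in particular the derivation and application of the least squares algorithm.

3.1 Overview

While up to this point we have focused on control, in this chapter we will examine how to use fuzzy systems for estimation and identification. The basic problem to be studied here is how to construct a fuzzy system from numerical data. This is in contrast to our discussion in Chapters 2 and 3, where we used linguistics as the starting point to specify a fuzzy system. If the numerical data is plant input-output data obtained from an experiment, we may identify a fuzzy system model of the plant. This may be useful for simulation purposes and sometimes for use in a controller. On the other hand, the data may come from other sources, and a fuzzy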
system may be used to provide for a parameterized nonlinear function that fits the data by using its basic interpolation capabilities. For instance, suppose that we have a human expert who controls some process and we observe how she or he does this by observing what numerical plant input the expert picks for the given numerical data that she or he observes. Suppose further that we have many such associations between "decision-making data." The methods in this chapter will show how to construct rules for a fuzzy controller from this data (i.e., identify a controller from the human-generated decision-making data), and in this sense they provide another method to design controllers.

Yet another problem that can be solved with the methods in this chapter is that of how to construct a fuzzy system that will serve as a parameter estimator. To do this, we need data that shows, roughly, how the input-output mapping of the estimator should behave (i.e., how it should estimate). One way to generate this data is to begin by establishing a simulation test bed for the plant for which parameter estimation must be performed. Then a set of simulations can be conducted, each with a different value for the parameter to be estimated. By coupling the test conditions and simulation-generated data with the parameter values, you can gather appropriate data pairs that allow for the construction of a fuzzy estimator. For some plants it may be possible to perform this procedure with actual experimental data (by physically adjusting the parameter to be
estimated). In a similar way, you could construct fuzzy predictors using the approaches developed in this chapter.

We begin this chapter by setting up the basic function approximation problem in Section 3.2, where we provide an overview of some of the fundamental issues in how to fit a function to input-output data, including how to incorporate linguistic information into the function that we are trying to force to match the data. We explain how to measure how well a function fits data and provide an example of how to choose a data set for an engine failure estimation problem (a type of parameter estimation problem in which, when estimates of the parameters take on certain values, we say that a failure has occurred).

In Section 3.3 we introduce conventional least squares methods for identification, explain how they can be used to tune fuzzy systems, provide a simple example, and offer examples of how they can be used to train fuzzy systems. Next, in Section 3.4 we show how gradient methods can be used to train a standard and Takagi-Sugeno fuzzy system. These methods are quite similar to the ones used to train neural networks (e.g., the "back-propagation technique"). We provide examples for standard and Takagi-Sugeno fuzzy systems. We highlight the fact that via either the recursive least squares method for fuzzy systems or the gradient method we can perform on-line parameter estimation. We will see in Chapter 6 that these methods can be combined with a controller
construction procedure to provide a method for adaptive fuzzy control.

In Section 3.5 we introduce two techniques for training fuzzy systems based on clustering. The first uses "c-means clustering" and least squares to train the premises and consequents, respectively, of the Takagi-Sugeno fuzzy system, while the second uses a nearest neighborhood technique to train standard fuzzy systems. In Section 3.6 we present two "learning from examples" (LFE) methods for constructing rules for fuzzy systems from input-output data. Compared to the previous methods, these do not use optimization to construct the fuzzy system parameters. Instead, the LFE methods are based on simple procedures to extract rules directly from the data. In Section 3.7 we show how hybrid methods for training fuzzy systems can be developed by combining the methods described in this chapter. Finally, in Section 3.8, we provide a design and implementation case study for parameter estimation in an internal combustion engine.

Overall, the objective of this chapter is to show how to construct fuzzy systems from numerical data. This will provide the reader with another general approach for fuzzy system design that may augment or extend the approach described in Chapters 2 and 3, where we start from linguistic information. With a good understanding of Chapter 2, the reader can complete this chapter without having read Chapters 3 and 4. The section on indirect adaptive control in Chapter 6 relies on the
gradient and least squares methods discussed in this chapter, and a portion of the section on gain schedule construction in Chapter 7 relies on the reader knowing at least one method from this chapter. In other words, this chapter is important since many adaptive control techniques depend on the use of an estimator. Moreover, the sections on neural networks and genetic algorithms in Chapter 8 depend on this chapter in the sense that if you understand this chapter and those sections, you will see how those techniques relate to the ones discussed here. Otherwise, the remainder of the book can be completed without this chapter; however, this chapter will provide for a deeper understanding of many of the concepts to be presented in Chapters 6 and 7. For example, the learning mechanism for the fuzzy model reference learning controller (FMRLC) described in Chapter 6 can be viewed as an identification algorithm that is used to tune a fuzzy controller.

3.2 Fitting Functions to Data

We begin this section by precisely defining the function approximation problem, in which you seek to synthesize a function to approximate another function that is inherently represented via a finite number of input-output associations (i.e., we only know how the function maps a finite number of points in its domain to its range). Next, we show how the problem of how to construct nonlinear system identifiers and nonlinear estimators is a special case of the problem of how to perform
function approximation. Finally, we discuss issues in the choice of the data that we use to construct the approximators, discuss the incorporation of linguistic information, and provide an example of how to construct a data set for a parameter estimation problem.

3.2.1 The Function Approximation Problem

Given some function $g: \bar{X} \to \bar{Y}$ where $\bar{X} \subset \Re^n$ and $\bar{Y} \subset \Re$, we wish to construct a fuzzy system $f: X \to Y$, where $X \subset \bar{X}$ and $Y \subset \bar{Y}$ are some domain and range of interest, by choosing a parameter vector $\theta$ (which may include membership function centers, widths, etc.) so that

$$g(x) = f(x \mid \theta) + e(x) \tag{3.1}$$

for all $x = [x_1, x_2, \ldots, x_n]^T \in X$, where the approximation error $e(x)$ is as small as possible. If we want to refer to the input at time $k$, we will use $x(k)$ for the vector and $x_j(k)$ for its $j$th component.

Assume that all that is available to choose the parameters $\theta$ of the fuzzy system $f(x \mid \theta)$ is some part of the function $g$ in the form of a finite set of input-output data pairs (i.e., the functional mapping implemented by $g$ is largely unknown). The $i$th input-output data pair from the system $g$ is denoted by $(x^i, y^i)$ where $x^i \in X$, $y^i \in Y$, and $y^i = g(x^i)$. We let $x^i = [x^i_1, x^i_2, \ldots, x^i_n]^T$ represent the input vector for the $i$th data pair. Hence, $x^i_j$ is the $j$th element of the $i$th data vector (it has a specific value and is
not a variable). We call the set of input-output data pairs the training data set and denote it by

$$G = \{(x^1, y^1), \ldots, (x^M, y^M)\} \subset X \times Y \tag{3.2}$$

where $M$ denotes the number of input-output data pairs contained in $G$. For convenience, we will sometimes use the notation $d(i)$ for data pair $(x^i, y^i)$.

To get a graphical picture of the function approximation problem, see Figure 3.1. This clearly shows the challenge; it can certainly be hard to come up with a good function $f$ to match the mapping $g$ when we know only a little bit about the association between $X$ and $Y$ in the form of data pairs $G$. Moreover, it may be hard to know when we have a good approximation, that is, when $f$ approximates $g$ over the whole space of inputs $X$.

FIGURE 3.1 Function mapping with three known input-output data pairs.

To make the function approximation problem even more concrete, consider a simple example. Suppose that $n = 2$, $X \subset \Re^2$, $Y = [0, 10]$, and $g: X \to Y$. Let $M = 3$ and the training data set
$$G = \left\{ \left( \begin{bmatrix} 0 \\ 2 \end{bmatrix}, 1 \right), \left( \begin{bmatrix} 2 \\ 4 \end{bmatrix}, 5 \right), \left( \begin{bmatrix} 3 \\ 6 \end{bmatrix}, 6 \right) \right\} \tag{3.3}$$

which partially specifies $g$ as shown in Figure 3.2. The function approximation problem amounts to finding a function $f(x \mid \theta)$ by manipulating $\theta$ so that $f(x \mid \theta)$ approximates $g$ as closely as possible. We will use this simple data set to illustrate several of the methods we develop in this chapter.

FIGURE 3.2 The training data $G$ generated from the function $g$ (axes: $x_1$, $x_2$, $y$).

How do we evaluate how closely a fuzzy system $f(x \mid \theta)$ approximates the function $g(x)$ for all $x \in X$ for a given $\theta$? Notice that

$$\sup_{x \in X} \{ |g(x) - f(x \mid \theta)| \} \tag{3.4}$$

is a bound on the approximation error (if it exists). Specifying such a bound, however, requires that the function $g$ be completely known, and as stated above, we know only a part of $g$, given by the finite set $G$. Therefore, we are only able to evaluate the accuracy of approximation by evaluating the error between $g(x)$ and $f(x \mid \theta)$ at certain points $x \in X$ given by available input-output data. We call this set of input-output data the test set and denote it as $\Gamma$, where

$$\Gamma = \{(x^1, y^1), \ldots, (x^{M_\Gamma}, y^{M_\Gamma})\} \subset X \times Y \tag{3.5}$$
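As a minimal sketch of what it means to evaluate an approximator only at known data points, the following Python snippet stores the three data pairs of the simple example ($[0, 2]^T \mapsto 1$, $[2, 4]^T \mapsto 5$, $[3, 6]^T \mapsto 6$) and computes the error at each pair. The affine function `f` and the values in `theta` are hypothetical placeholders chosen only for illustration; they are not a fuzzy system and are not taken from the text.

```python
# The three input-output pairs (x^i, y^i) of the simple example, with x^i in R^2.
G = [((0.0, 2.0), 1.0), ((2.0, 4.0), 5.0), ((3.0, 6.0), 6.0)]


def f(x, theta):
    """Hypothetical stand-in for f(x|theta): an affine map, NOT a fuzzy system."""
    return theta[0] + theta[1] * x[0] + theta[2] * x[1]


def pointwise_errors(data, theta):
    """Return |y^i - f(x^i|theta)| for each known pair, using y^i = g(x^i)."""
    return [abs(y - f(x, theta)) for x, y in data]


theta = (0.0, 1.0, 0.5)  # arbitrary illustrative parameter vector (assumption)
errors = pointwise_errors(G, theta)
worst = max(errors)  # largest error over the finite data set only

print(errors, worst)
```

With these placeholder values the error is zero at the first and third pairs and 1 at the second. The largest value is taken over three points only, so it says nothing about how well `f` matches $g$ elsewhere in $X$. The same function can be applied to any other set of known input-output pairs.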
The number of known input-output data pairs contained within the test set $\Gamma$ is denoted by $M_\Gamma$. It is important to note that the input-output data pairs $(x^i, y^i)$ contained in $\Gamma$ may not be contained in $G$, or vice versa. It also might be the case that the test set is equal to the training set ($G = \Gamma$); however, this choice is not always a good one. Most often you will want to test the system with at least some data that were not used to construct $f(x \mid \theta)$, since this will often provide a more realistic assessment of the quality of the approximation.

We see that evaluation of the error in approximation between $g$ and a fuzzy system $f(x \mid \theta)$ based on a test set $\Gamma$ may or may not be a true measure of the error between $g$ and $f$ for every $x \in X$, but it is the only evaluation we can make based on known information. Hence, you can use measures like

$$\sum_{(x^i, y^i) \in \Gamma} \left( g(x^i) - f(x^i \mid \theta) \right)^2 \tag{3.6}$$

or

$$\sup_{(x^i, y^i) \in \Gamma} \{ |g(x^i) - f(x^i \mid \theta)| \} \tag{3.7}$$

to measure the approximation error. Accurate function approximation requires that some expression of this nature be small; however, this clearly does not guarantee perfect representation of $g$ with $f$, since most often we cannot test that $f$ matches $g$ over all possible input points.

We would like to emphasize that the type of function that you
choose to adjust (i.e., $f(x \mid \theta)$) can have a significant impact on the ultimate accuracy of the approximator. For instance, it may be that a Takagi-Sugeno (or functional) fuzzy system will provide a better approximator than a standard fuzzy system for a particular application. We think of $f(x \mid \theta)$ as a structure for an approximator that is parameterized by $\theta$. In this chapter we will study the use of fuzzy systems as approximators, and use a fuzzy system as the structure for the approximator. The choice of the parameter vector $\theta$ depends on, for example, how many membership functions and rules you use. Generally, you want enough membership functions and rules to be able to get good accuracy, but not too many, since if your function is "overparameterized" this can actually degrade approximation accuracy. Often, it is best if the structure of the approximator is based on some physical knowledge of the system, as we explain how to do in Section 3.2.4 on page 228.

Finally, while in this book we focus primarily on fuzzy systems (or, if you understand neural networks, you will see that several of the methods of this chapter directly apply to those also), at times it may be beneficial to use other approximation structures such as neural networks, polynomials, wavelets, or splines (see Section 3.10, "For Further Study," on page 287).

3.2.2 Relation to Identification, Estimation, and Prediction

Many applications exist in the control and signal processing areas that