Structure       Description of decision regions
Single layer    Half plane bounded by hyperplane
Two layer       Arbitrary (complexity limited by number of hidden units)
Three layer     Arbitrary (complexity limited by number of hidden units)

Figure 8. A geometric interpretation of the role of hidden units in a two-dimensional input space. (The original figure also illustrates the regions each structure can form for the exclusive-OR problem, for classes with meshed regions, and for general region shapes; those illustrations are not reproduced here.)

perform only a few tasks well. The last column of Table 2 lists the tasks that each algorithm can perform. Due to space limitations, we do not discuss some other algorithms, including Adaline, Madaline,14 linear discriminant analysis,15 Sammon's projection,16 and principal component analysis.2 Interested readers can consult the corresponding references (this article does not always cite the first paper proposing the particular algorithms).

MULTILAYER FEED-FORWARD NETWORKS

Figure 7 shows a typical three-layer perceptron. In general, a standard L-layer feed-forward network (we adopt the convention that the input nodes are not counted as a layer) consists of an input stage, (L-1) hidden layers, and an output layer of units successively connected (fully or locally) in a feed-forward fashion, with no connections between units in the same layer and no feedback connections between layers.

Multilayer perceptron

The most popular class of multilayer feed-forward networks is multilayer perceptrons, in which each computational unit employs either the thresholding function or the sigmoid function. Multilayer perceptrons can form arbitrarily complex decision boundaries and represent any Boolean function.6 The development of the back-propagation learning algorithm for determining weights in a multilayer perceptron has made these networks the most popular among researchers and users of neural networks. We denote w_ij^(l) as the weight on the connection from the ith unit in layer (l-1) to the jth unit in layer l.

Let {(x^(1), d^(1)), (x^(2), d^(2)), ..., (x^(p), d^(p))} be a set of p training patterns (input-output pairs), where x^(i) ∈ R^n is the input vector in the n-dimensional pattern space, and d^(i) ∈ [0, 1]^m, an m-dimensional hypercube. For classification purposes, m is the number of classes. The squared-error cost function most frequently used in the ANN literature is defined as

E = \frac{1}{2} \sum_{i=1}^{p} \lVert y^{(i)} - d^{(i)} \rVert^2     (2)

where y^(i) is the network's actual output for the ith pattern. The back-propagation algorithm9 is a gradient-descent method to minimize the squared-error cost function in Equation 2 (see the "Back-propagation algorithm" sidebar).
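To make the cost function and its gradient-descent minimization concrete, here is a minimal sketch (not the article's implementation) that trains a one-hidden-layer perceptron with sigmoid units on the exclusive-OR problem. The layer sizes, learning rate, random seed, and epoch count are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
n, h, m = 2, 4, 1                          # input, hidden, output sizes (assumed)
W1 = rng.normal(0.0, 1.0, (h, n)); b1 = np.zeros(h)
W2 = rng.normal(0.0, 1.0, (m, h)); b2 = np.zeros(m)

# Toy training set: the exclusive-OR problem, p = 4 input-output pairs.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
D = np.array([[0], [1], [1], [0]], dtype=float)

eta = 0.5                                  # learning rate (assumed)
for epoch in range(10000):
    # Forward pass.
    A1 = sigmoid(X @ W1.T + b1)            # hidden activations, shape (p, h)
    Y = sigmoid(A1 @ W2.T + b2)            # network outputs, shape (p, m)
    # Backward pass: propagate the error (y - d) through the sigmoid derivatives.
    delta2 = (Y - D) * Y * (1.0 - Y)       # output-layer error terms
    delta1 = (delta2 @ W2) * A1 * (1.0 - A1)  # hidden-layer error terms
    # Gradient-descent updates for E = 1/2 * sum_i ||y(i) - d(i)||^2 (Equation 2).
    W2 -= eta * delta2.T @ A1; b2 -= eta * delta2.sum(axis=0)
    W1 -= eta * delta1.T @ X;  b1 -= eta * delta1.sum(axis=0)

Y = sigmoid(sigmoid(X @ W1.T + b1) @ W2.T + b2)
print(np.round(Y.ravel(), 2))              # should approach the targets [0, 1, 1, 0]
```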
A geometric interpretation (adopted and modified from Lippmann10), shown in Figure 8, can help explicate the role of hidden units (with the threshold activation function). Each unit in the first hidden layer forms a hyperplane in the pattern space; boundaries between pattern classes can be approximated by hyperplanes. A unit in the second hidden layer forms a hyperregion from the outputs of the first-layer units; a decision region is obtained by performing an AND operation on the hyperplanes. The output-layer units combine the decision regions made by the units in the second hidden layer by performing logical OR operations. Remember that this scenario is depicted only to explain the role of hidden units. Their actual behavior, after the network is trained, could differ.
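The hyperplane-then-AND picture can be demonstrated by hand-wiring threshold units for the exclusive-OR problem of Figure 8. The weights below are hand-chosen for illustration, not trained values: each first-layer unit cuts the input plane with a hyperplane (here a line), and the output unit performs the AND; because this XOR region is a single intersection of half planes, no OR layer is needed.

```python
import numpy as np

def step(z):
    return (z >= 0).astype(float)          # threshold activation

# Two first-layer units, each defining a half plane in the 2D pattern space:
#   unit 1 fires when  x1 + x2 - 0.5 >= 0  (at least one input is on)
#   unit 2 fires when -x1 - x2 + 1.5 >= 0  (not both inputs are on)
W1 = np.array([[ 1.0,  1.0],
               [-1.0, -1.0]])
b1 = np.array([-0.5, 1.5])

# The output unit ANDs the two half planes: it fires only if both units fire.
w2 = np.array([1.0, 1.0])
b2 = -1.5

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    hidden = step(W1 @ np.array(x, dtype=float) + b1)
    y = step(w2 @ hidden + b2)
    print(x, "->", int(y))                 # prints the XOR truth table 0, 1, 1, 0
```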
A two-layer network can form more complex decision boundaries than those shown in Figure 8. Moreover, multilayer perceptrons with sigmoid activation functions can form smooth decision boundaries rather than piecewise linear boundaries.

Radial Basis Function network

The Radial Basis Function (RBF) network,3 which has two layers, is a special class of multilayer feed-forward networks.
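The page breaks off at this point, but the two-layer structure just named can be sketched in outline: a hidden layer of radial basis units (a Gaussian is a common choice) responding to the distance between the input and stored centers, followed by a linear output layer. The centers, width, and output weights below are illustrative assumptions, not values from the article.

```python
import numpy as np

def rbf_forward(x, centers, width, w, b):
    """Forward pass of a two-layer RBF network: Gaussian hidden units
    followed by a linear output unit."""
    d2 = ((x - centers) ** 2).sum(axis=1)   # squared distances to the centers
    phi = np.exp(-d2 / (2.0 * width ** 2))  # hidden-layer (radial) activations
    return w @ phi + b                      # linear output layer

# Illustrative setup: two Gaussian units centered on the XOR-positive inputs.
centers = np.array([[0.0, 1.0], [1.0, 0.0]])
w, b = np.array([1.0, 1.0]), 0.0

for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    y = rbf_forward(np.array(x, dtype=float), centers, width=0.5, w=w, b=b)
    print(x, "->", round(float(y), 3))      # large near a center, small elsewhere
```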