SMCB-E-02252003-0112.R1

A Multi-Agent Genetic Algorithm for Global Numerical Optimization

ZHONG Weicai, LIU Jing, XUE Mingzhi, and JIAO Licheng, Senior Member, IEEE
Abstract—In this paper, multi-agent systems and genetic algorithms are integrated to form a new algorithm, Multi-Agent Genetic Algorithm (MAGA), for solving the global numerical optimization problem. An agent in MAGA represents a candidate solution to the optimization problem in hand. All agents live in a latticelike environment, with each agent fixed on a lattice-point. In order to increase energies, they compete or cooperate with their neighbors, and they can also use knowledge. Making use of these agent-agent interactions, MAGA realizes the purpose of minimizing the objective function value. Theoretical analyses show that MAGA converges to the global optimum. In the first part of the experiments, 10 benchmark functions are used to test the performance of MAGA, and the scalability of MAGA along the problem dimension is studied with great care. The results show that MAGA achieves a good performance when the dimensions are increased from 20 to 10,000. Moreover, even when the dimensions are increased to as high as 10,000, MAGA can still find high quality solutions at a low computational cost. Therefore, MAGA has good scalability and is a competent algorithm for solving high dimensional optimization problems. To the best of our knowledge, no researchers have ever optimized functions with 10,000 dimensions by means of evolution. In the second part of the experiments, MAGA is applied to a practical case, the approximation of linear systems, with a satisfactory result.

Index Terms—Genetic algorithms, linear system, multi-agent systems, numerical optimization.

I. INTRODUCTION

GENETIC algorithms (GAs) are stochastic global optimization methods inspired by the biological mechanisms of evolution and heredity, which were first developed by Holland in the 1960s [1]. In recent years, GAs have been widely used for numerical optimization, combinatorial optimization, classifier systems, and many other engineering problems [2]-[4].
Global numerical optimization problems arise in almost every field of science, engineering, and business. Since many of these problems cannot be solved analytically, GAs have become one of the popular methods for addressing them. But the major problem of GAs is that they may be trapped in the local optima of the objective function. Therefore, various new methods have been proposed, such as combining mutation operators [5], the macroevolutionary algorithm [6], the immune genetic algorithm [7], the orthogonal genetic algorithm [8], the microgenetic algorithm [9], and so on. These algorithms proved to be effective and boosted the development of genetic algorithms.

Agent-based computation has been studied for several years in the field of distributed artificial intelligence [10]-[13] and has been widely used in other branches of computer science [14], [15]. Problem solving is an area that many multi-agent-based applications are concerned with. It includes the following subareas: distributed solutions to problems, solving distributed problems, and distributed techniques for problem solving [12], [13].

Manuscript received February 25, 2003. This work was supported by the National Natural Science Foundation of China under Grant 60133010. ZHONG Weicai is with the National Key Laboratory for Radar Signal Processing and the Institute of Intelligent Information Processing, Xidian University, Xi'an, 710071, China (phone: 86-029-8209786; fax: 86-029-8201023; e-mail: neouma@163.com). LIU Jing is with the National Key Laboratory for Radar Signal Processing, Xidian University, Xi'an, 710071, China. XUE Mingzhi is with the National Key Laboratory for Radar Signal Processing, Xidian University, Xi'an, 710071, China and the Department of Mathematics, Shangqiu Teachers College, Shangqiu, 476000, China. JIAO Licheng is with the National Key Laboratory for Radar Signal Processing, Xidian University, Xi'an, 710071, China.
Reference [15] introduced an application of distributed techniques for solving constraint satisfaction problems. The authors solved the 7000-queen problem by an energy-based multi-agent model. Enlightened by them, this paper integrates multi-agent systems with GAs to form a new algorithm, Multi-Agent Genetic Algorithm (MAGA), for solving the global numerical optimization problem. In MAGA, all agents live in a latticelike environment. Making use of the search mechanism of GAs, MAGA realizes the ability of agents to sense and act on the environment that they live in. During the process of interacting with the environment and other agents, each agent increases its energy as much as possible, so that MAGA can achieve the ultimate purpose of minimizing the objective function value.

Being similar to MAGA, cellular genetic algorithms [16]-[18] also use a lattice-based population. In cellular GAs, each individual is located in a cell of the lattice. All operations of cellular GAs and traditional GAs are identical except that there is a neighborhood structure in the former while there is none in the latter. In essence, cellular GAs are greedy techniques and can present the same problem of premature convergence as traditional GAs [18]. That is, cellular GAs are only a kind of technique for enabling a fine-grained parallel implementation of GAs. But in MAGA, each individual is considered as an agent, which has its own purpose and behaviors. The experimental results show that MAGA achieves a good performance even for functions with 10,000 dimensions, which illustrates that MAGA overcomes the problem of premature convergence of traditional GAs to some
degree.

The rest of this paper is organized as follows: Section II describes MAGA and analyzes its convergence. Sections III and IV describe the experimental studies on the problems of global numerical optimization and the optimal approximation of linear systems, respectively. Finally, conclusions are presented in Section V.

II. MULTI-AGENT GENETIC ALGORITHM AND ITS CONVERGENCE

According to [13], [15], an agent is a physical or virtual entity that essentially has the following properties: (a) it is able to live and act in the environment; (b) it is able to sense its local environment; (c) it is driven by certain purposes; and (d) it has some reactive behaviors. Multi-agent systems are computational systems in which several agents interact or work together in order to achieve goals. As can be seen, the meaning of an agent is very comprehensive, and what an agent represents is different for different problems. In general, four elements should be defined when multi-agent systems are used to solve problems. The first is the meaning and the purpose of each agent. The second is the environment where all agents live. Since each agent has only local perceptivity, the third is the definition of the local environment. The last is the behaviors that each agent can take to achieve its purpose. In what follows, the definitions of these elements for global numerical optimization problems are described.

A. The Agent for Numerical Optimization

A global numerical optimization can be formulated as solving the following objective function:

  minimize $f(\mathbf{x})$, $\mathbf{x} = (x_1, \ldots, x_n) \in S$  (1)

where $S \subseteq \mathbb{R}^n$ defines the search space, an n-dimensional space bounded by the parametric constraints $\underline{x}_i \le x_i \le \overline{x}_i$, i=1,…,n. Thus, $S = [\underline{\mathbf{x}}, \overline{\mathbf{x}}]$, where $\underline{\mathbf{x}} = (\underline{x}_1, \underline{x}_2, \ldots, \underline{x}_n)$ and $\overline{\mathbf{x}} = (\overline{x}_1, \overline{x}_2, \ldots, \overline{x}_n)$. Because many 'x' notations have been used throughout this paper, they are explained explicitly to prevent confusion.
The '$\mathbf{x}$' in boldface represents a real-valued vector in the search space, and '$x_i$' with a subscript represents a component in the vector $\mathbf{x}$. The boldfaced '$\underline{\mathbf{x}}$' with an underline represents the vector of the lower bound of the search space, and '$\underline{x}_i$' with a subscript and an underline represents a component in the vector $\underline{\mathbf{x}}$. The boldfaced '$\overline{\mathbf{x}}$' with an overline represents the vector of the upper bound of the search space, and '$\overline{x}_i$' with a subscript and an overline represents a component in the vector $\overline{\mathbf{x}}$. An agent for numerical optimization problems is defined as follows:

Definition 1: An agent, a, represents a candidate solution to the optimization problem in hand. The value of its energy is equal to the negative value of the objective function,

  $\mathbf{a} \in S$ and $Energy(\mathbf{a}) = -f(\mathbf{a})$  (2)

The purpose of a is to increase its energy as much as possible. As can be seen, each agent carries all variables of the objective function to be optimized. In order to realize the local perceptivity of agents, the environment is organized as a latticelike structure, which is defined as follows:

Definition 2: All agents live in a latticelike environment, L, which is called an agent lattice. The size of L is $L_{size} \times L_{size}$, where $L_{size}$ is an integer. Each agent is fixed on a lattice-point and it can only interact with its neighbors. Suppose that the agent located at (i, j) is represented as $L_{i,j}$, i, j = 1, 2, …, $L_{size}$; then the neighbors of $L_{i,j}$, $Neighbors_{i,j}$, are defined as follows:

  $Neighbors_{i,j} = \{ L_{i',j}, L_{i,j'}, L_{i'',j}, L_{i,j''} \}$  (3)

where

  $i' = \begin{cases} i-1, & i \neq 1 \\ L_{size}, & i = 1 \end{cases}$  $j' = \begin{cases} j-1, & j \neq 1 \\ L_{size}, & j = 1 \end{cases}$  $i'' = \begin{cases} i+1, & i \neq L_{size} \\ 1, & i = L_{size} \end{cases}$  $j'' = \begin{cases} j+1, & j \neq L_{size} \\ 1, & j = L_{size} \end{cases}$

Therefore, the agent lattice can be represented as the one in Fig. 1. Each circle represents an agent, the data in a circle represents its position in the lattice, and two agents can interact with each other if and only if there is a line connecting them.
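Concretely, the wrap-around neighborhood of Definition 2 can be sketched as follows. This is an illustrative reading of (3), not the authors' code; the function name and the 1-based calling convention are our own choices.

```python
def neighbors(i, j, L_size):
    """Return the four neighbor positions of the agent at (i, j).

    Indices are 1-based, as in Definition 2, and the lattice wraps
    around, so agents on one edge neighbor agents on the opposite edge.
    """
    i_prev = i - 1 if i != 1 else L_size   # i'
    j_prev = j - 1 if j != 1 else L_size   # j'
    i_next = i + 1 if i != L_size else 1   # i''
    j_next = j + 1 if j != L_size else 1   # j''
    return [(i_prev, j), (i, j_prev), (i_next, j), (i, j_next)]
```

For example, on a 5×5 lattice the corner agent at (1, 1) has the neighbors (5, 1), (1, 5), (2, 1), and (1, 2), so every agent has exactly four neighbors regardless of its position.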
Fig. 1. The model of the agent lattice.

In traditional GAs, those individuals that will generate offspring are usually selected from all individuals according to their fitness. Therefore, the global fitness distribution of a population must be determined. But in nature, a global selection does not exist, and the global fitness distribution cannot be determined either. In fact, the real natural selection only occurs in a local environment, and each individual can only interact with those around it. That is, in some phase, the natural evolution is just a kind of local phenomenon. The information can be shared globally only after a process of diffusion.

In the aforementioned agent lattice, to achieve their purposes, agents will compete or cooperate with others so that they can gain more resources. Since each agent can only sense its local environment, its behaviors of competition and cooperation can only take place between the agent and its neighbors. There is no
global selection at all, so the global fitness distribution is not required. An agent interacts with its neighbors so that information is transferred to them. In such a manner, the information is diffused to the whole agent lattice. As can be seen, the model of the agent lattice is closer to the real evolutionary mechanism in nature than the model of the population in traditional GAs.

B. Four Evolutionary Operators for Agents

To achieve its purposes, each agent has some behaviors. In addition to the aforementioned behaviors of competition and cooperation, each agent can also increase its energy by using its knowledge. On the basis of such behaviors, four evolutionary operators are designed for the agents. The neighborhood competition operator and the neighborhood orthogonal crossover operator realize the behaviors of competition and cooperation, respectively. The mutation operator and the self-learning operator realize the behaviors of making use of knowledge. Suppose that the four operators are performed on the agent located at (i, j), $L_{i,j} = (l_1, l_2, \ldots, l_n)$, and $Max_{i,j} = (m_1, m_2, \ldots, m_n)$ is the agent with maximum energy among the neighbors of $L_{i,j}$; namely, $Max_{i,j} \in Neighbors_{i,j}$ and $\forall \mathbf{a} \in Neighbors_{i,j}$, $Energy(\mathbf{a}) \le Energy(Max_{i,j})$.

Neighborhood competition operator: If $L_{i,j}$ satisfies (4), it is a winner; otherwise, it is a loser.

  $Energy(L_{i,j}) > Energy(Max_{i,j})$  (4)

If $L_{i,j}$ is a winner, it can still live in the agent lattice. If $L_{i,j}$ is a loser, it must die, and its lattice-point will be occupied by $Max_{i,j}$. $Max_{i,j}$ has two strategies to occupy the lattice-point, and it selects them with probability $P_o$.
If $U(0,1) < P_o$, occupying strategy 1 is selected; otherwise, occupying strategy 2 is selected. In occupying strategy 1, $New_{i,j} = (e_1, e_2, \ldots, e_n)$ is determined by,

  $e_k = \begin{cases} \underline{x}_k, & m_k + U(-1,1) \times (m_k - l_k) < \underline{x}_k \\ \overline{x}_k, & m_k + U(-1,1) \times (m_k - l_k) > \overline{x}_k \\ m_k + U(-1,1) \times (m_k - l_k), & \text{otherwise} \end{cases}$, k=1,…,n  (5)

In occupying strategy 2, $Max_{i,j}$ is first mapped onto [0, 1] according to,

  $m'_k = (m_k - \underline{x}_k) / (\overline{x}_k - \underline{x}_k)$, k=1,…,n  (6)

Then, $New'_{i,j} = (e'_1, e'_2, \ldots, e'_n)$ is determined by,

  $New'_{i,j} = (m'_1, \ldots, m'_{i_1-1}, m'_{i_2}, m'_{i_2-1}, \ldots, m'_{i_1+1}, m'_{i_1}, m'_{i_2+1}, \ldots, m'_n)$  (7)

where $1 < i_1 < n$, $1 < i_2 < n$, and $i_1 < i_2$. Finally, $New_{i,j}$ is obtained by mapping $New'_{i,j}$ back to $[\underline{x}_k, \overline{x}_k]$ according to,

  $e_k = \underline{x}_k + e'_k \cdot (\overline{x}_k - \underline{x}_k)$, k=1,…,n  (8)

In this operator, the two strategies play different roles. When $L_{i,j}$ is a loser, it perhaps still has some useful information, so occupying strategy 1, a kind of heuristic crossover, is in favor of reserving some information of the loser. Occupying strategy 2 is similar to the inversion operation in AEA [19]. It has the function of random searching, but is better than random searching in that it makes use of the information of the winner. Therefore, occupying strategy 1 puts emphasis on exploitation while occupying strategy 2 puts emphasis on exploration.

Neighborhood orthogonal crossover operator: The orthogonal crossover operator is a new operator proposed by [8]. It generates new individuals by means of orthogonal design. Because we usually have no information about the location of the global optimum, and an orthogonal array can specify a small number of individuals that are scattered uniformly over the search space, the orthogonal crossover operator can generate a small but representative sample of potential individuals. In MAGA, this operator is performed on $L_{i,j}$ and $Max_{i,j}$ to achieve the purpose of cooperation. In what follows, its basic concept is introduced; for more details, see [8].
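Under our reading of (5)-(8), the two occupying strategies can be sketched as below. This is an illustrative sketch, not the authors' code: the function and variable names are ours, l and m stand for the loser $L_{i,j}$ and the winner $Max_{i,j}$, and lo/hi for the lower and upper bounds of the search space.

```python
import random

def occupy_strategy_1(l, m, lo, hi):
    # Heuristic crossover (5): perturb each winner component by a random
    # multiple of its distance to the loser, clipped to the bounds.
    new = []
    for lk, mk, lok, hik in zip(l, m, lo, hi):
        ek = mk + random.uniform(-1, 1) * (mk - lk)
        new.append(min(max(ek, lok), hik))
    return new

def occupy_strategy_2(m, lo, hi):
    # Inversion (6)-(8): normalize the winner onto [0, 1], reverse a
    # randomly chosen inner segment, and map the result back.
    u = [(mk - lok) / (hik - lok) for mk, lok, hik in zip(m, lo, hi)]
    i1, i2 = sorted(random.sample(range(1, len(m) - 1), 2))
    u[i1:i2 + 1] = u[i1:i2 + 1][::-1]
    return [lok + uk * (hik - lok) for uk, lok, hik in zip(u, lo, hi)]
```

Note that strategy 2 only permutes components of the winner (in normalized coordinates), which is why it behaves like a randomized search that still reuses the winner's information.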
Suppose that the search space defined by $L_{i,j}$ and $Max_{i,j}$ is $[\underline{\mathbf{x}}_{LM}, \overline{\mathbf{x}}_{LM}]$, where

  $\underline{\mathbf{x}}_{LM} = (\min(l_1, m_1), \min(l_2, m_2), \ldots, \min(l_n, m_n))$
  $\overline{\mathbf{x}}_{LM} = (\max(l_1, m_1), \max(l_2, m_2), \ldots, \max(l_n, m_n))$  (9)

The domain of the ith dimension is quantized into $Q_2$ levels, $\beta_{i,1}, \beta_{i,2}, \ldots, \beta_{i,Q_2}$, where

  $\beta_{i,j} = \begin{cases} \min(l_i, m_i), & j = 1 \\ \min(l_i, m_i) + (j-1) \cdot \frac{|l_i - m_i|}{Q_2 - 1}, & 2 \le j \le Q_2 - 1 \\ \max(l_i, m_i), & j = Q_2 \end{cases}$  (10)

(F−1) integers, $(k_1, k_2, \ldots, k_{F-1})$, are generated randomly such that $1 < k_1 < k_2 < \cdots < k_{F-1} < n$, and then the following F factors are created for any agent $\mathbf{a} = (x_1, x_2, \ldots, x_n)$:

  $\mathbf{f}_1 = (x_1, \ldots, x_{k_1})$, $\mathbf{f}_2 = (x_{k_1+1}, \ldots, x_{k_2})$, …, $\mathbf{f}_F = (x_{k_{F-1}+1}, \ldots, x_n)$  (11)

$Q_2$ levels for the ith factor $\mathbf{f}_i$ are defined as follows:

  $\mathbf{f}_i(1) = (\beta_{k_{i-1}+1,1}, \beta_{k_{i-1}+2,1}, \ldots, \beta_{k_i,1})$
  $\mathbf{f}_i(2) = (\beta_{k_{i-1}+1,2}, \beta_{k_{i-1}+2,2}, \ldots, \beta_{k_i,2})$
  …
  $\mathbf{f}_i(Q_2) = (\beta_{k_{i-1}+1,Q_2}, \beta_{k_{i-1}+2,Q_2}, \ldots, \beta_{k_i,Q_2})$  (12)

The orthogonal array $L_{M_2}(Q_2^F) = [b_{i,j}]_{M_2 \times F}$ is applied to generate the following $M_2$ agents:

  $(\mathbf{f}_1(b_{1,1}), \mathbf{f}_2(b_{1,2}), \ldots, \mathbf{f}_F(b_{1,F}))$
  $(\mathbf{f}_1(b_{2,1}), \mathbf{f}_2(b_{2,2}), \ldots, \mathbf{f}_F(b_{2,F}))$
  …
  $(\mathbf{f}_1(b_{M_2,1}), \mathbf{f}_2(b_{M_2,2}), \ldots, \mathbf{f}_F(b_{M_2,F}))$  (13)

Finally, the agent with maximum energy among the $M_2$ agents is selected to replace $L_{i,j}$. The details of the construction of the orthogonal array were given in [8]. For the sake of simplicity, in MAGA, $Q_2$, F, and $M_2$ are fixed at 3, 4, and 9, respectively, which are the same as those of [8]. Therefore, no parameter needs to be
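With $Q_2 = 3$, $F = 4$, and $M_2 = 9$ fixed, the operator reduces to sampling nine trial agents through the $L_9(3^4)$ array and keeping the best. The following sketch is our illustrative reading, not the paper's code (the helper names and the energy callback are our assumptions): it quantizes each dimension into the three levels of (10), cuts the dimensions into four contiguous factors as in (11), and assembles the nine agents of (13).

```python
import random

# Rows of the orthogonal array L9(3^4).
L9 = [(1, 1, 1, 1), (1, 2, 2, 2), (1, 3, 3, 3),
      (2, 1, 2, 3), (2, 2, 3, 1), (2, 3, 1, 2),
      (3, 1, 3, 2), (3, 2, 1, 3), (3, 3, 2, 1)]

def orthogonal_crossover(l, m, energy):
    n = len(l)  # requires n >= 4 so that four factors exist
    # (10) with Q2 = 3: the levels are min, midpoint, max of each dimension.
    levels = [[min(a, b), (a + b) / 2.0, max(a, b)] for a, b in zip(l, m)]
    # (11): three random cut points split the n dimensions into 4 factors.
    bounds = [0] + sorted(random.sample(range(1, n), 3)) + [n]
    trials = []
    for row in L9:  # (13): one trial agent per row of the array
        agent = []
        for f in range(4):
            lev = row[f] - 1  # the row prescribes this factor's level
            agent += [levels[k][lev] for k in range(bounds[f], bounds[f + 1])]
        trials.append(agent)
    return max(trials, key=energy)  # keep the best of the M2 = 9 agents
```

The nine rows cover every (factor, level) pair evenly, which is what makes the small sample representative of the box spanned by $L_{i,j}$ and $Max_{i,j}$.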
tuned for this operator, and the orthogonal array is $L_9(3^4)$, which is shown in (14).

  $L_9(3^4) = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 2 & 2 & 2 \\ 1 & 3 & 3 & 3 \\ 2 & 1 & 2 & 3 \\ 2 & 2 & 3 & 1 \\ 2 & 3 & 1 & 2 \\ 3 & 1 & 3 & 2 \\ 3 & 2 & 1 & 3 \\ 3 & 3 & 2 & 1 \end{bmatrix}$  (14)

Mutation operator: A new agent, $New_{i,j} = (e_1, e_2, \ldots, e_n)$, is generated as,

  $e_k = \begin{cases} l_k, & U(0,1) < \frac{1}{n} \\ l_k + G(0, \frac{1}{t}), & \text{otherwise} \end{cases}$, k=1,…,n  (15)

where $G(0, \frac{1}{t})$ is a Gaussian random number generator, and t is the evolution generation.

Self-learning operator: This operator performs a small-scale MAGA on the agent itself so that the agent can make use of its knowledge. For the agent $L_{i,j} = (x_1, x_2, \ldots, x_n)$, a new agent lattice, sL, of size $sL_{size} \times sL_{size}$ is first generated according to (16) and (17),

  $sL_{1,1} = L_{i,j}$  (16)

and each of the other agents, $(x'_1, x'_2, \ldots, x'_n)$, is determined by,

  $x'_k = \begin{cases} \underline{x}_k, & x_k \cdot U(1-sRadius, 1+sRadius) < \underline{x}_k \\ \overline{x}_k, & x_k \cdot U(1-sRadius, 1+sRadius) > \overline{x}_k \\ x_k \cdot U(1-sRadius, 1+sRadius), & \text{otherwise} \end{cases}$, k=1,…,n  (17)

where $sRadius \in [0, 1]$ represents the search radius. Next, the neighborhood competition operator and the mutation operator are iteratively performed on sL. Finally, $L_{i,j}$ is replaced by the agent with maximum energy found during the above process. For more clarity, ALGORITHM 1 describes the details of this operator.

ALGORITHM 1: Self-learning operator

$sL^r$ represents the agent lattice in the rth generation, and $sL^{r+1/2}$ is the mid-lattice between $sL^r$ and $sL^{r+1}$. $sBest^r$ is the best agent among $sL^0, sL^1, \ldots, sL^r$, and $sCBest^r$ is the best agent in $sL^r$. sPm is the probability of performing the mutation operator, and sGen is the number of generations.

Step 1: Generate $sL^0$ according to (16) and (17), update $sBest^0$, and r ← 0;
Step 2: Perform the neighborhood competition operator on each agent in $sL^r$, obtaining $sL^{r+1/2}$;
Step 3: For each agent in $sL^{r+1/2}$, if U(0,1) < sPm, perform the mutation operator on it, obtaining $sL^{r+1}$;
Step 4: Find $sCBest^{r+1}$ in $sL^{r+1}$; if $Energy(sCBest^{r+1}) > Energy(sBest^r)$, then $sBest^{r+1}$ ← $sCBest^{r+1}$; otherwise $sBest^{r+1}$ ← $sBest^r$, $sCBest^{r+1}$ ← $sBest^r$;
Step 5: If r < sGen, then r ← r+1, go to Step 2;
Step 6: $L_{i,j}$ ← $sBest^r$.

C. The Implementation of MAGA

In MAGA, the neighborhood competition operator is performed on each agent. As a result, the agents with low energy are cleaned out from the agent lattice so that there is more developing space for the agents with high energy. The neighborhood orthogonal crossover operator and the mutation operator are performed on each agent with probabilities Pc and Pm, respectively. In order to reduce the computational cost, the self-learning operator is only performed on the best agent in each generation, but it has an important effect on the performance of MAGA.
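As we read (15), mutation keeps each component with probability 1/n and otherwise adds Gaussian noise whose scale shrinks as the generation counter t grows. The sketch below is our interpretation, not the paper's code: the function name is ours, and we treat 1/t as the standard deviation of $G(0, \frac{1}{t})$.

```python
import random

def mutate(l, t):
    """Mutation operator sketch: each component l_k is kept with
    probability 1/n and otherwise perturbed by Gaussian noise G(0, 1/t),
    so the perturbation anneals as the generation t increases."""
    n = len(l)
    return [lk if random.random() < 1.0 / n
            else lk + random.gauss(0.0, 1.0 / t)
            for lk in l]
```

Early in a run (small t) the operator explores broadly; later (large t) it fine-tunes the agent around its current position.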
In general, the four operators utilize different methods to simulate the behaviors of agents, and play different roles in MAGA. ALGORITHM 2 describes the details of MAGA. ALGORITHM 2 Multi-Agent Genetic Algorithm Lt represents the agent lattice in the tth generation, and Lt+1/3 and Lt+2/3 are the mid-lattices between Lt and Lt+1. Bestt is the best agent among L0 , L1 , …, Lt, and CBestt is the best agent in Lt . Pc and Pm are the probabilities to perform the neighborhood orthogonal crossover operator and the mutation operator. Step 1: Initialize L0 , update Best0 , and t←0; Step 2: Perform the neighborhood competition operator on each agent in Lt, obtaining Lt+1/3; Step 3: For each agent in Lt+1/3, if U(0,1)<Pc, perform the neighborhood orthogonal crossover operator on it, obtaining Lt+2/3; Step 4: For each agent in Lt+2/3, if U(0,1)<Pm, perform the mutation operator on it, obtaining Lt+1; Step 5: Find CBestt+1 in Lt+1, and then perform the self-learning operator on CBestt+1;
  Step 6: If Energy(CBest^{t+1}) > Energy(Best^t), then Best^{t+1}←CBest^{t+1}; otherwise Best^{t+1}←Best^t and CBest^{t+1}←Best^t;
  Step 7: If the termination criteria are reached, output Best^t and stop; otherwise t←t+1 and go to Step 2.

D. The Convergence of MAGA

Obviously, encoding a variable, x_i, of the objective function with an M-bit sequence is equivalent to quantizing its search space, [x̲_i, x̄_i], into 2^M values, with precision ε equal to (x̄_i − x̲_i)/2^M. Therefore, the convergence of MAGA is analyzed under real coding directly. Suppose that the required precision is ε. The search space S can then be changed to a discrete space whose number of elements, |S|, is equal to ∏_{i=1}^n (x̄_i − x̲_i)/ε, and each element is a candidate solution, namely an agent. Let E = {Energy(a) | a ∈ S}. It is obvious that |E| ≤ |S|, and E can be represented as E = {E_1, E_2, …, E_|E|}, where E_1 > E_2 > … > E_|E|.

This immediately gives us the opportunity to partition S into a collection of nonempty subsets, {S_i | i = 1, 2, …, |E|}, where

  S_i = { a | a ∈ S and Energy(a) = E_i }    (18)

  ∑_{i=1}^{|E|} |S_i| = |S|;  S_i ≠ ∅, ∀i ∈ {1, 2, …, |E|};  S_i ∩ S_j = ∅, ∀i ≠ j;  ∪_{i=1}^{|E|} S_i = S.    (19)

Obviously, E_1 is the global optimum, and S_1 consists of all agents whose energies are E_1.

In MAGA, the number of agents remains constant throughout the evolutionary process. Let 𝕃 stand for the set of all agent lattices. Because an agent lattice may contain multiple copies of one or more agents, the number of elements in 𝕃 is the number of multisets of size Lsize × Lsize drawn from S,

  |𝕃| = C(|S| + Lsize×Lsize − 1, Lsize×Lsize).

Suppose that the energy of an agent lattice, L, is Energy(L), determined by

  Energy(L) = max{ Energy(L_{i,j}), i, j = 1, …, Lsize }.    (20)

Thus, ∀L ∈ 𝕃, E_|E| ≤ Energy(L) ≤ E_1. Therefore, 𝕃 can be partitioned into a collection of nonempty subsets {𝕃_i | i = 1, 2, …, |E|}, where

  𝕃_i = { L | L ∈ 𝕃 and Energy(L) = E_i }    (21)

  ∑_{i=1}^{|E|} |𝕃_i| = |𝕃|;  𝕃_i ≠ ∅, ∀i ∈ {1, 2, …, |E|};  𝕃_i ∩ 𝕃_j = ∅, ∀i ≠ j;  ∪_{i=1}^{|E|} 𝕃_i = 𝕃.    (22)

𝕃_1 consists of all the agent lattices whose energies are E_1. Let L_{ij}, i = 1, 2, …, |E|, j = 1, 2, …, |𝕃_i|, stand for the jth agent lattice in 𝕃_i. In any generation, the four evolutionary operators transform the agent lattice L_{ij} into another one, L_{kl}, and this process can be viewed as a transition from L_{ij} to L_{kl}. Let p_{ij.kl} be the probability of transition from L_{ij} to L_{kl}, p_{ij.k} the probability of transition from L_{ij} to any agent lattice in 𝕃_k, and p_{i.k} the probability of transition from any agent lattice in 𝕃_i to any agent lattice in 𝕃_k. It is obvious that

  p_{ij.k} = ∑_{l=1}^{|𝕃_k|} p_{ij.kl},   ∑_{k=1}^{|E|} p_{ij.k} = 1,   p_{i.k} ≥ p_{ij.k}.    (23)

Based on the concepts above and [20], [21], the convergence of MAGA is proved as follows.

Theorem 1 [22]: Let P: n′×n′ be a reducible stochastic matrix, which means that by applying the same permutations to rows and columns, P can be brought into the form [ C 0 ; R T ], where C: m×m is a primitive stochastic matrix and R, T ≠ 0. Then

  P^∞ = lim_{k→∞} P^k = lim_{k→∞} [ C^k 0 ; ∑_{i=0}^{k−1} T^i R C^{k−i}  T^k ] = [ C^∞ 0 ; R^∞ 0 ]    (24)

is a stable stochastic matrix with P^∞ = 1′p^∞, where p^∞ = p^0 P^∞ is unique regardless of the initial distribution, and p^∞ satisfies p_i^∞ > 0 for 1 ≤ i ≤ m and p_i^∞ = 0 for m < i ≤ n′.

Theorem 2: In multi-agent genetic algorithm, p_{i.k} > 0 for all k ≤ i, and p_{i.k} = 0 for all k > i.

Proof: ∀L_{ij} ∈ 𝕃_i, i = 1, 2, …, |E|, j = 1, 2, …, |𝕃_i|, there exists a* = (x_1, …, x_n) ∈ L_{ij} with Energy(a*) = E_i. Suppose that L_{ij} changes to L_{kl} after the four evolutionary operators are performed. If L_{ij} is the agent lattice in the tth generation, then L_{kl} is the one in the (t+1)th generation. Therefore, L_{ij} and L_{kl} are labeled as L^t and L^{t+1}, respectively.

Firstly, (25) can be obtained according to Step 6 in ALGORITHM 2:

  Energy(CBest^{t+1}) = Energy(Best^{t+1}) ≥ Energy(Best^t) ≥ Energy(a*).    (25)

Therefore,

  Energy(L^{t+1}) ≥ Energy(L^t) ⇒ k ≤ i ⇒ ∀k > i, p_{ij.kl} = 0 ⇒ ∀k > i, p_{ij.k} = ∑_{l=1}^{|𝕃_k|} p_{ij.kl} = 0 ⇒ ∀k > i, p_{i.k} = 0.    (26)

Secondly, in the neighborhood competition operator, a* must be a winner because its energy is greater than that of any other agent in L^t, so a* ∈ L^{t+1/3}. The probability of a* ∈ L^{t+2/3} is (1 − Pc), because the probability to perform the neighborhood orthogonal crossover operator on a* is Pc. Therefore, the probability to perform the mutation operator on a* is

  Pr_1 = (1 − Pc) · Pm > 0.    (27)

For any k ≤ i, there exists an agent a′ with Energy(a′) = E_k.
Suppose that there are n_1 variables, x′_1, …, x′_{n_1}, in a′ that are different from the corresponding ones in a*. Then the probability to generate a′ from a* by the mutation operator is

  Pr_2 = (1/n)^{n−n_1} × ∏_{i=1}^{n_1} ( (1 − 1/n) · (t/√(2π)) · e^{−t²(x′_i − x_i)²/2} ) > 0.    (28)

Thus, the probability of transition from L_{ij} to any agent lattice in 𝕃_k by the four evolutionary operators is

  p_{ij.k} ≥ Pr_1 × Pr_2 > 0.    (29)

Therefore, ∀k ≤ i, p_{i.k} ≥ p_{ij.k} > 0.

It follows from this theorem that there is always a positive probability of transiting from an agent lattice to one with identical or higher energy, and a zero probability of transiting to one with lower energy. Thus, once MAGA enters 𝕃_1, it will never go out.

Theorem 3: Multi-agent genetic algorithm converges to the global optimum.

Proof: It is clear that one can consider each 𝕃_i, i = 1, 2, …, |E|, as a state in a homogeneous finite Markov chain. According to Theorem 2, the transition matrix of the Markov chain can be written as follows:

  P = [ p_{1.1} 0 … 0 ; p_{2.1} p_{2.2} … 0 ; … ; p_{|E|.1} p_{|E|.2} … p_{|E|.|E|} ] = [ C 0 ; R T ].    (30)

Obviously, R = (p_{2.1}, p_{3.1}, …, p_{|E|.1})^T > 0, T ≠ 0, and C = (p_{1.1}) = (1) ≠ 0. According to Theorem 1, P^∞ is given by

  P^∞ = lim_{k→∞} P^k = [ C^∞ 0 ; R^∞ 0 ],    (31)

where C^∞ = 1 and R^∞ = (1, 1, …, 1)^T. Thus, P^∞ is a stable stochastic matrix, and

  P^∞ = [ 1 0 … 0 ; 1 0 … 0 ; … ; 1 0 … 0 ].    (32)

Therefore,

  lim_{t→∞} Pr{ Energy(L^t) = E_1 } = 1,    (33)

where Pr stands for the probability. This implies that multi-agent genetic algorithm converges to the global optimum.

III. EXPERIMENTAL STUDIES ON GLOBAL NUMERICAL OPTIMIZATION

In order to test the performance of MAGA, 10 benchmark functions have been used:

Generalized Schwefel's Problem 2.26: f_1(x) = ∑_{i=1}^n ( −x_i sin(√|x_i|) ), S = [−500, 500]^n;

Generalized Rastrigin's Function: f_2(x) = ∑_{i=1}^n [ x_i² − 10cos(2πx_i) + 10 ], S = [−5.12, 5.12]^n;

Ackley's Function: f_3(x) = −20 exp( −0.2 √( (1/n) ∑_{i=1}^n x_i² ) ) − exp( (1/n) ∑_{i=1}^n cos(2πx_i) ) + 20 + e, S = [−32, 32]^n;

Generalized Griewank Function: f_4(x) = (1/4000) ∑_{i=1}^n x_i² − ∏_{i=1}^n cos( x_i/√i ) + 1, S = [−600, 600]^n;

Generalized Penalized Function 1: f_5(x) = (π/n){ 10 sin²(πy_1) + ∑_{i=1}^{n−1} (y_i − 1)²[1 + 10 sin²(πy_{i+1})] + (y_n − 1)² } + ∑_{i=1}^n u(x_i, 10, 100, 4), where y_i = 1 + (x_i + 1)/4 and

  u(x_i, a, k, m) = k(x_i − a)^m if x_i > a;  0 if −a ≤ x_i ≤ a;  k(−x_i − a)^m if x_i < −a,

S = [−50, 50]^n;

Generalized Penalized Function 2: f_6(x) = 0.1{ sin²(3πx_1) + ∑_{i=1}^{n−1} (x_i − 1)²[1 + sin²(3πx_{i+1})] + (x_n − 1)²[1 + sin²(2πx_n)] } + ∑_{i=1}^n u(x_i, 5, 100, 4), S = [−50, 50]^n;

Sphere Model: f_7(x) = ∑_{i=1}^n x_i², S = [−100, 100]^n;

Schwefel's Problem 2.22: f_8(x) = ∑_{i=1}^n |x_i| + ∏_{i=1}^n |x_i|, S = [−10, 10]^n;

Schwefel's Problem 1.2: f_9(x) = ∑_{i=1}^n ( ∑_{j=1}^i x_j )², S = [−100, 100]^n;

Schwefel's Problem 2.21: f_10(x) = max_i{ |x_i|, 1 ≤ i ≤ n }, S = [−100, 100]^n.

f_1~f_6 are multimodal functions in which the number of local minima increases with the problem dimension. For example, the number of local minima of f_1 is about 10^n in the given search space. f_7~f_10 are unimodal functions.

Some parameters must be assigned before MAGA is used to solve problems. Lsize × Lsize is equivalent to the population size in traditional GAs, so Lsize can be chosen from 5 to 10. Pc determines whether MAGA puts emphasis on exploitation or on exploration: when Pc < 0.5, MAGA puts emphasis on searching in new space, while when Pc > 0.5, it puts emphasis on making use of available information. It is better to let Pc be smaller than 0.5; otherwise the computational cost increases greatly. Pm is similar to the mutation probability in traditional GAs. The self-learning operator is a small-scale MAGA, so its four parameters can be chosen easily. On account of the computational cost, it is better to let sLsize be smaller than 5 and to choose sGen from 5 to 10. sRadius controls the size of the local search area, so it is better to assign a small value to sRadius.
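For concreteness, several of the benchmarks listed above translate directly into code. The sketch below (an illustration in Python, not part of the original study) implements f_2, f_3, f_4, and f_7 from their formulas; each of these four attains its global minimum value 0 at x = (0, …, 0).

```python
import math

# Generalized Rastrigin's function f2: global minimum 0 at x = 0.
def rastrigin(x):
    return sum(xi * xi - 10.0 * math.cos(2.0 * math.pi * xi) + 10.0 for xi in x)

# Ackley's function f3: global minimum 0 at x = 0.
def ackley(x):
    n = len(x)
    s1 = sum(xi * xi for xi in x) / n
    s2 = sum(math.cos(2.0 * math.pi * xi) for xi in x) / n
    return -20.0 * math.exp(-0.2 * math.sqrt(s1)) - math.exp(s2) + 20.0 + math.e

# Generalized Griewank function f4: global minimum 0 at x = 0.
def griewank(x):
    s = sum(xi * xi for xi in x) / 4000.0
    p = math.prod(math.cos(xi / math.sqrt(i + 1)) for i, xi in enumerate(x))
    return s - p + 1.0

# Sphere model f7: global minimum 0 at x = 0.
def sphere(x):
    return sum(xi * xi for xi in x)

zeros = [0.0] * 30
# Each value is 0 up to floating-point round-off.
print(rastrigin(zeros), ackley(zeros), griewank(zeros), sphere(zeros))
```

At the origin all four return 0 up to floating-point round-off, which is the optimum the experiments below aim for.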
sPm is similar to Pm. In the following experiments, the parameter settings are: Lsize = 5, Po = 0.2, Pc = 0.1, Pm = 0.1, sLsize = 3, sRadius = 0.2, sPm = 0.05, and sGen = 10.

A. Descriptions of the Compared Algorithms

Since MAGA is compared with FEP [23], OGA/Q [8], BGA [24], and AEA [19] in the following experiments, we first give a brief description of the four algorithms.

1) FEP [23]: This is a modified version of the classical evolutionary programming (CEP). It differs from CEP in generating new individuals. Suppose that the selected individual is x = (x_1, …, x_n). In CEP, the new individual, x′ = (x′_1, …, x′_n), is generated as x′_i = x_i + η_i N_i(0,1), i = 1, 2, …, n, where the η_i are standard deviations for Gaussian mutations, and N_i(0,1) denotes a normally distributed one-dimensional random number with mean zero and standard deviation one. In FEP, x′ is generated as x′_i = x_i + η_i δ_i, i = 1, 2, …, n, where δ_i is a Cauchy random variable with scale parameter 1.

2) OGA/Q [8]: This is a modified version of the classical genetic algorithm (CGA). It is the same as CGA, except that it uses the orthogonal design to generate the initial population and the offspring of the crossover operator.

3) BGA [24]: It is based on artificial selection similar to that used by human breeders, and is a recombination of evolution strategies (ES) and GAs. BGA uses truncation selection as performed by breeders. This selection scheme is similar to the (μ, λ)-strategy in ES. The search process of BGA is mainly driven by recombination, making BGA a genetic algorithm. Thus, BGA can be described by (P_g^0, N, T, Γ, Δ, HC, F, term), where P_g^0 is the initial population, N the size of the population, T the truncation threshold, Γ the recombination operator, Δ the mutation operator, HC the hill-climbing method, F the fitness function, and term the termination criterion.

4) AEA [19]: This is a modified version of BGA. Besides the new recombination operator and the mutation operator, each individual of AEA is coded as a vector with components all in the unit interval, and inversion is applied with some probability to the parents before recombination is performed.

B. The Comparison between FEP, OGA/Q, and MAGA on Functions with 30 Dimensions

FEP [23] and OGA/Q [8] are two recently proposed methods that achieve good performance on numerical optimization problems. Reference [8] compared OGA/Q with traditional GAs and five existing algorithms, and the results showed that OGA/Q outperforms all the other methods. In [23], the termination criterion of FEP was to run 1500 generations for f_3 and f_5~f_7, 2000 generations for f_4 and f_8, 5000 generations for f_2 and f_9~f_10, and 9000 generations for f_1. In [8], the termination criterion of OGA/Q was that the quality of the solution could not be further improved in 50 successive generations after 1000 generations. Since the termination criteria of FEP and OGA/Q are different, to make a fair comparison we let the computational cost of MAGA be less than those of FEP and OGA/Q, and compare the qualities of their final solutions at the given computational cost. Therefore, the termination criterion of MAGA is to run 150 generations for each function. The results averaged over 50 trials are shown in Table I, where n = 30.

TABLE I
THE COMPARISON BETWEEN FEP, OGA/Q, AND MAGA ON FUNCTIONS WITH 30 DIMENSIONS

       Mean function value (standard deviation)                                      Mean number of function evaluations
       FEP                    OGA/Q                       MAGA                       FEP       OGA/Q     MAGA
f1     -12554.5 (52.6)        -12569.4537 (6.447x10^-4)   -12569.4866 (7.121x10^-12) 900,000   302,166   10,862
f2     4.6x10^-2 (1.2x10^-2)  0 (0)                       0 (0)                      500,000   224,710   11,427
f3     1.8x10^-2 (2.1x10^-3)  4.440x10^-16 (3.989x10^-17) 4.440x10^-16 (0)           150,000   112,421   9,656
f4     1.6x10^-2 (2.2x10^-2)  0 (0)                       0 (0)                      200,000   134,000   9,777
f5     9.2x10^-6 (3.6x10^-6)  6.019x10^-6 (1.159x10^-6)   1.142x10^-18 (4.390x10^-18) 150,000  134,556   10,545
f6     1.6x10^-4 (7.3x10^-5)  1.869x10^-4 (2.615x10^-5)   1.039x10^-17 (4.196x10^-18) 150,000  134,143   11,269
f7     5.7x10^-4 (1.3x10^-4)  0 (0)                       0 (0)                      150,000   112,559   9,502
f8     8.1x10^-3 (7.7x10^-4)  0 (0)                       0 (0)                      200,000   112,612   9,591
f9     1.6x10^-2 (1.4x10^-2)  0 (0)                       0 (0)                      500,000   112,576   9,479
f10    0.3 (0.5)              0 (0)                       0 (0)                      500,000   112,893   9,603

As can be seen, MAGA finds the exact global optimum, 0, in all trials for six out of the ten functions. For all ten functions, both the mean function value and the mean number of function evaluations of MAGA are much better than those of FEP. For f_1, f_5, and f_6, the solutions of MAGA are better than those of OGA/Q, and for the other functions, the solutions of MAGA are as good as those of OGA/Q. Moreover, the mean number of function evaluations of MAGA is about 10,000 for all functions, while that of OGA/Q ranges from 100,000 to 300,000.
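The only difference between the CEP and FEP updates described in 1) above is the noise distribution. The sketch below is illustrative only (the fixed step size eta is our simplification; FEP actually self-adapts the eta_i): it contrasts the two update rules, sampling a standard Cauchy variate by the inverse-CDF formula tan(pi*(U - 1/2)).

```python
import math
import random

def cep_step(x, eta, rng):
    """CEP update: x' = x + eta * N(0, 1), Gaussian perturbation."""
    return x + eta * rng.gauss(0.0, 1.0)

def fep_step(x, eta, rng):
    """FEP update: x' = x + eta * delta, where delta is a standard Cauchy
    variate, sampled here as tan(pi * (U - 1/2)) with U uniform on (0, 1)."""
    delta = math.tan(math.pi * (rng.random() - 0.5))
    return x + eta * delta

rng = random.Random(1)
eta = 0.1  # hypothetical fixed step size, for illustration only
print(cep_step(1.0, eta, rng))
print(fep_step(1.0, eta, rng))
```

The heavy tails of the Cauchy distribution give FEP occasional long jumps, which is the usual explanation of its better escape from local optima compared with CEP.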
Therefore, the computational cost of MAGA is much lower than that of OGA/Q. To summarize, the results show that MAGA outperforms FEP and OGA/Q, and is competent for numerical optimization problems.

C. The Performance of MAGA on Functions with 20~1000 Dimensions

Because the size of the search space and the number of local minima increase with the problem dimension, the higher the dimension is, the more difficult the problem is. Therefore, this experiment studies the performance of MAGA on functions with 20~1000 dimensions. The termination criterion of MAGA is that one of the objectives, |f_best − f_min| < ε·|f_min|, or |f_best| < ε if f_min = 0, is achieved, where f_best and f_min represent the best solution found until the current generation and the global optimum, respectively. To be consistent, ε = 10^-4 is used for all functions. Table II gives the mean number and the standard deviations of function evaluations of MAGA averaged over 50 trials.

TABLE II
THE MEAN NUMBER OF FUNCTION EVALUATIONS OF MAGA ON FUNCTIONS WITH 20~1000 DIMENSIONS

   n     f1      f2      f3     f4     f5      f6      f7     f8     f9     f10
   20    2,483   4,301   3,583  2,566  2,827   3,745   2,420  2,956  4,151  6,823
   100   5,443   10,265  5,410  4,447  4,907   7,929   4,199  5,638  6,351  8,920
   200   7,284   14,867  6,051  5,483  6,870   9,732   4,966  6,757  6,949  9,307
   400   12,368  17,939  6,615  6,249  9,305   12,820  5,576  7,753  7,474  9,662
   800   19,992  20,306  7,069  6,883  10,572  16,070  6,079  8,692  7,902  9,823
   1000  22,827  20,083  7,288  7,358  11,214  17,829  6,273  9,465  8,024  9,945

As can be seen, for f_1 and f_2, when the dimension increases from 20 to 1000, the number of evaluations increases to about 20,000. For f_5 and f_6, when the dimension increases to 1000, the number of evaluations only increases to 11,214 and 17,829. For the six other functions, MAGA obtains high quality solutions (ε = 10^-4) with only thousands of evaluations at all selected dimensions. In addition, because ε is assigned to 10^-4 in the termination criterion, small standard deviations are obtained for all functions at all selected dimensions.

BGA [24] and AEA [19] are also tested on f_1~f_4 with 20, 100, 200, 400, and 1000 dimensions. In [24] and [19], the termination criteria of BGA and AEA were the same as that of MAGA, but they used ε = 10^-4 for f_1, ε = 10^-1 for f_2, and ε = 10^-3 for f_3 and f_4. Therefore, a comparison is made between BGA, AEA, and MAGA, which is shown in Table III.

TABLE III
THE MEAN NUMBER OF FUNCTION EVALUATIONS OF BGA, AEA, AND MAGA ON f1~f4 WITH 20~1000 DIMENSIONS

           f1                             f2
   n      BGA        AEA       MAGA      BGA        AEA      MAGA
   20     16,100     1,603     2,483     3,608      1,247    4,301
   100    92,000     5,106     5,443     25,040     4,789    10,265
   200    248,000    8,158     7,284     52,948     10,370   14,867
   400    700,000    13,822    12,368    112,634    23,588   17,939
   1000              23,867    22,827    337,570    46,024   20,083

           f3                             f4
   n      BGA        AEA       MAGA      BGA        AEA      MAGA
   20     19,420     7,040     3,583     66,000     3,581    2,566
   100    53,860     22,710    5,410     361,722    17,228   4,447
   200    107,800    43,527    6,051     748,300    36,760   5,483
   400    220,820    78,216    6,615     1,630,000  61,975   6,249
   1000   548,306    160,940   7,288                97,660   7,358

As can be seen, the number of evaluations of MAGA is much smaller than that of BGA for all four functions. For f_1, when n ≤ 100, the number of evaluations of MAGA is slightly greater than that of AEA, while when 200 ≤ n ≤ 1000, it is slightly smaller than that of AEA. For f_2, when n ≤ 200, the number of evaluations of MAGA is greater than that of AEA, while when 400 ≤ n ≤ 1000, it is smaller than that of AEA. For both f_3 and f_4, the number of evaluations of MAGA is much smaller than that of AEA at all dimensions. In general, MAGA obtains better solutions (ε = 10^-4) at a lower computational cost than BGA and AEA, and displays a good performance in solving high dimensional problems.

D. The Performance of MAGA on Functions with 1000~10,000 Dimensions

In order to study the scalability of MAGA along the problem dimension further, in this experiment MAGA is used to optimize f_1~f_10 with higher dimensions. The problem dimension is increased from 1000 to 10,000 in steps of 500. To the best of our knowledge, no researchers have ever optimized functions with such high dimensions by means of evolution. Therefore, no existing results can be used for a direct comparison. In the previous two sections, MAGA was compared with four algorithms, and the results show that MAGA is much better than FEP and BGA.
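The termination criterion used in Sections III.C and III.D can be written as a one-line test. The helper below is a sketch (the function name is ours, not the paper's): it checks |f_best − f_min| < ε·|f_min|, falling back to |f_best| < ε when f_min = 0.

```python
def reached_precision(f_best, f_min, eps=1e-4):
    """Termination test of Sections III.C/III.D:
    |f_best - f_min| < eps * |f_min|, or |f_best| < eps when f_min == 0."""
    if f_min == 0:
        return abs(f_best) < eps
    return abs(f_best - f_min) < eps * abs(f_min)

# Example: f1 with n = 30 has a global optimum near -12569.5, so a candidate
# within the relative tolerance terminates the run.
print(reached_precision(-12569.4866, -12569.5))   # True
print(reached_precision(0.05, 0.0))               # False: |0.05| >= 1e-4
```

The relative form keeps the stopping rule meaningful for functions like f_1, whose optimum is far from zero, while the absolute fallback handles the many benchmarks whose optimum is exactly 0.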
SMCB-E-02252003-0112.R1 9

performance of OGA/Q is also good, the size of the memory needed to store the sampling points of the orthogonal design increases dramatically with the problem dimension. For example, when the dimension is 200, 604.3 MB of memory is needed. So OGA/Q cannot be used to optimize high-dimensional functions in its current form.

We implement AEA and run it at the same high dimensions at which MAGA is tested. Both MAGA and AEA used the termination criterion in Section III.C with ε = 10^-4. Fig. 2 shows the mean number of function evaluations of MAGA and AEA, where the results of MAGA are averaged over 50 trials and the results of AEA are averaged over 10 trials, since the running time of AEA is much longer than that of MAGA. Because the number of evaluations of AEA is much greater than that of MAGA, the results of AEA and MAGA for each function are depicted in two separate figures so that the effect of dimension on the performance of MAGA can be shown more clearly. The figures in the same row represent the results for the same function, where the left one shows the result of AEA and the right one that of MAGA. In order to study the complexity of MAGA further, in Fig. 2 the number of function evaluations is approximated by O(n^a) (0<a<2). For more clarity, the comparison between AEA and MAGA in the mean number of function evaluations for functions with 10,000 dimensions, together with the derived O(n^a), is shown in Table IV.

As can be seen, the number of evaluations of AEA increases dramatically with the dimension and is much greater than that of MAGA. In MAGA, for f1 and f6, although the number of evaluations obviously increases with the dimension, they are only 195,292 and 121,370, respectively, even when the dimensions increase to 10,000. For f2, the dimension has no obvious effect on the number of evaluations; in particular, when n>1000, it still lies between 10,000 and 20,000. For f4 and f5, the number of evaluations increases slowly with the dimension, and is only 28,815 and 27,645, respectively, even when n increases to 10,000. For f3, f7, f8, f9, and f10, MAGA uses only about 10,000 evaluations even when n increases to 10,000.

From the results approximated by O(n^a), we can see that the complexities of AEA for 9 out of the 10 functions are worse than O(n), and only the complexity for f3 is O(n^0.78). At the same time, the complexities of MAGA for all 10 functions are very low: they are better than O(n) for the 9 functions other than f2. Although the number of evaluations for f2 does not change obviously with the dimension, it is smaller than 25,000 at all dimensions. The worst complexities of MAGA among the 10 functions are those of f1 and f6, which are O(n^0.78) and O(n^0.80), respectively. The complexities for f4 and f5 are O(n^0.41) and O(n^0.39), respectively, and Fig. 2(h) and (j) show that most results for these two functions are better than these complexities. The complexities for f3 and f7-f10 are only about O(n^0.1). f7-f10 are unimodal functions, and MAGA displays similar performance on them. Therefore, the complexity of MAGA for unimodal functions is very low.

[Fig. 2, panels (a)-(d): mean number of function evaluations versus dimension for AEA (left column) and MAGA (right column), with the fitted O(n^a) curves.]

Fig. 2. The comparison between AEA and MAGA on f1-f10 with 20~10,000 dimensions (to be continued)
SMCB-E-02252003-0112.R1 10

[Fig. 2, panels (e)-(j): mean number of function evaluations versus dimension for AEA (left column) and MAGA (right column), with the fitted O(n^a) curves.]

Fig. 2. (continued) The comparison between AEA and MAGA on f1-f10 with 20~10,000 dimensions (to be continued)
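The O(n^a) complexity estimates discussed above (e.g. about O(n^0.8) for f1 and f6) can be derived by a least-squares fit of log(evaluations) against log(n), whose slope is the exponent a. The sketch below is not the paper's procedure verbatim; it uses synthetic power-law data in place of the measured evaluation counts, purely to illustrate the fit:

```python
import math

def fit_exponent(dims, evals):
    """Least-squares slope of log(evals) vs log(n), i.e. the exponent a in O(n^a)."""
    xs = [math.log(n) for n in dims]
    ys = [math.log(e) for e in evals]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    # Slope of the regression line through (log n, log evals).
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    c = math.exp(my - a * mx)  # constant factor, so evals ≈ c * n^a
    return a, c

# Synthetic data: an exact 2000 * n^0.8 power law over the tested dimensions
# (1000 to 10,000 in steps of 500), standing in for the Fig. 2 measurements.
dims = list(range(1000, 10001, 500))
evals = [2000.0 * n ** 0.8 for n in dims]

a, c = fit_exponent(dims, evals)
print(round(a, 2))  # → 0.8, recovering the assumed exponent
```

In the paper's setting, `dims` would hold the tested dimensions and `evals` the mean evaluation counts averaged over the trials; the fitted exponent a is what the O(n^a) curves in Fig. 2 and Table IV report.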