This article was downloaded by: [New York University] On: 08 November 2011, At: 12:09 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK International Journal of General Systems Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/ggen20 Requirements for total uncertainty measures in Dempster–Shafer theory of evidence Joaquín Abellán a & Andrés Masegosa a a Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain Available online: 03 Nov 2008 To cite this article: Joaquín Abellán & Andrés Masegosa (2008): Requirements for total uncertainty measures in Dempster–Shafer theory of evidence, International Journal of General Systems, 37:6, 733-747 To link to this article: http://dx.doi.org/10.1080/03081070802082486 PLEASE SCROLL DOWN FOR ARTICLE Full terms and conditions of use: http://www.tandfonline.com/page/terms-and-conditions This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. The publisher does not give any warranty express or implied or make any representation that the contents will be complete or accurate or up to date. The accuracy of any instructions, formulae, and drug doses should be independently verified with primary sources. The publisher shall not be liable for any loss, actions, claims, proceedings, demand, or costs or damages whatsoever or howsoever caused arising directly or indirectly in connection with or arising out of the use of this material
Requirements for total uncertainty measures in Dempster–Shafer theory of evidence

Joaquín Abellán* and Andrés Masegosa

Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain

(Received 9 October 2007; final version received 6 March 2008)

*Corresponding author. Email: {jabellan,andrew}@decsai.ugr.es
ISSN 0308-1079 print/ISSN 1563-5104 online. © 2008 Taylor & Francis. DOI: 10.1080/03081070802082486. http://www.informaworld.com

Recently, an alternative measure of total uncertainty in Dempster–Shafer theory of evidence (DST) has been proposed in place of the maximum entropy measure. It is based on the pignistic probability of a basic probability assignment and it is proved that this measure verifies a set of needed properties for such a type of measure. The proposed measure is motivated by the problems that maximum (upper) entropy has. In this paper, we analyse the requirements, presented in the literature, for total uncertainty measures in DST and the shortcomings found on them. We extend the set of requirements, which we consider as a set of requirements of properties, and we use the set of shortcomings found on them to define a set of requirements of the behaviour for total uncertainty measures in DST. We present the differences of the principal total uncertainty measures presented in DST taking into account their properties and behaviour. Also, an experimental comparative study of the performance of total uncertainty measures in DST on a special type of belief decision trees is presented.

Keywords: imprecise probabilities; theory of evidence; uncertainty based information; total uncertainty; conflict; non-specificity

1. Introduction

In the classical theory of probability, Shannon's entropy (Shannon 1948) is the tool used for quantifying uncertainty. Its main virtue is that it verifies a set of desirable properties for probability distributions. In situations where the probabilistic representation is inadequate, an imprecise probability theory can be used as seen in Walley (1991), such as Dempster–Shafer's theory (DST) (Dempster 1967, Shafer 1976), interval-valued probabilities (Campos et al. 1994), order-two capacities (Choquet 1953/54), upper-lower probabilities (Suppes 1974, Fine 1983) or general convex sets of probability distributions (Good 1962, Levi 1980, Walley 1991), also called credal sets.

In order to quantify the uncertainty represented by these situations, Shannon's entropy has been used as the starting point. It can be justified in different ways, but the most common one is the axiomatic approach, i.e. by assuming a set of necessary basic properties that a measure must verify (Klir and Wierman 1998).

In Dempster–Shafer's theory (DST), Yager (1983) distinguishes between two types of uncertainty: conflict (or randomness or discord) and non-specificity. A total uncertainty measure is also justified in this theory by an axiomatic approach considering the one used in probability theory as a reference. We believe that some aspects of DST that do not
appear in classical probability theory, such as monotonicity, should be taken into account when studying the axiomatic approach.

Maeda and Ichihashi (1993) proposed a total uncertainty measure on DST adding up the generalised Hartley measure and upper entropy. They proved that this total uncertainty measure verifies all the necessary basic properties except for the required range. This property could, however, be discussed since there are more types of uncertainty in DST than in probability theory.

DST is considered as a particular theory of credal sets. By applying the uncertainty invariance principle, a total uncertainty measure on general credal sets will be a generalisation of a total uncertainty measure on DST. With this aim, various studies of the quantification of uncertainty on credal sets have been published (Abellán and Moral 2003a, 2005b; Abellán et al. 2006). In Abellán and Moral (2003a), it is proved that the maximum of Shannon's entropy (upper entropy) verifies on general credal sets all the basic properties that it verifies on DST. In Abellán and Moral (2005a) and Klir and Smith (2001), the use of maximum entropy on credal sets as a good measure of total uncertainty is justified. The problem lies in separating these functions into others, which really do measure the conflict and non-specificity parts by using a credal set to represent the information.

In DST, we have two total uncertainty measures that verify a set of basic required properties: Maeda and Ichihashi's total uncertainty measure and upper entropy. More recently, however, Jousselme et al. (2006) presented a new total uncertainty measure in DST based on the pignistic distribution. The authors proved that this measure verifies the necessary properties and that it resolves other shortcomings of upper entropy. We therefore have three total uncertainty measures in DST verifying all the required properties, and we will study these measures in this paper.

In this paper, we justify an extension of the set of required properties for a total uncertainty measure on DST and we refer to this set as the requirements of properties. We present a comparative study of the properties verified for each total uncertainty measure, and in doing so, we will see that Jousselme et al.'s total uncertainty measure has some undesirable defects.

We will also analyse the shortcomings reported for the upper entropy by certain authors (Jousselme et al. 2006). In order to do so, we will revise the set of requirements of behaviour that a total uncertainty measure in DST must verify and study these requirements on the most significant total uncertainty measures defined in DST. By considering the new results on upper entropy, we will see that the upper entropy behaves correctly, with the criticisms being not totally justified.

One important aspect of total uncertainty measures in DST is their applicability, and this involves a not too complicated calculation. In Appendix A of this paper, we present an application of these measures, having conducted an experimental study of these total uncertainty measures on a special type of belief decision trees (Abellán and Moral 2003b, 2005a), i.e. decision trees where the DST is used to represent the information expressed by a database on a query variable. In this procedure, the way to quantify the information plays an important role in the success obtained. In our experimentation, we will use the different total uncertainty measures analysed in this paper as tools to quantify the information.

In Section 2, we will introduce some necessary basic concepts and notation. Section 3 presents the extended set of basic properties verified by each total uncertainty measure. In Section 4, we analyse the shortcomings that total uncertainty measures present. Starting with this set of shortcomings, we will define a set of requirements of behaviour for this type of measure. Section 5 discusses our conclusions.
2. Previous concepts

2.1 Dempster–Shafer theory of evidence

Let $X$ be a finite set considered as a set of possible situations, $|X| = n$, $\wp(X)$ the power set of $X$ and $x$ any element in $X$.

The Dempster–Shafer theory is based on the concept of basic probability assignment. A basic probability assignment (b.p.a.), also called a mass assignment, is a mapping $m : \wp(X) \to [0, 1]$ such that $m(\emptyset) = 0$ and $\sum_{A \subseteq X} m(A) = 1$. A set $A$ where $m(A) > 0$ is called a focal element of $m$.

Let $X$, $Y$ be finite sets. Considering the product space of the possible situations $X \times Y$ and $m$ a b.p.a. on $X \times Y$, the marginal b.p.a. on $X$, $m_X$ (and similarly on $Y$, $m_Y$), is defined in the following way:

\[ m_X(A) = \sum_{R \,\mid\, A = R_X} m(R), \quad \forall A \subseteq X, \]

where $R_X$ is the set projection of $R$ on $X$.

There are two functions associated with each basic probability assignment: a belief function, $Bel$, and a plausibility function, $Pl$:

\[ Bel(A) = \sum_{B \subseteq A} m(B), \qquad Pl(A) = \sum_{A \cap B \neq \emptyset} m(B). \]

These can be seen as the lower and upper probability of $A$, respectively.

We may note that belief and plausibility functions are inter-related for all $A \in \wp(X)$ by

\[ Pl(A) = 1 - Bel(A^c), \]

where $A^c$ denotes the complement of $A$. Furthermore, $Bel(A) \le Pl(A)$.

2.2 Uncertainty in DST

The classical measure of entropy (Shannon 1948) on probability theory is defined by the following continuous function:

\[ S(p) = -\sum_{x \in X} p(x) \log_2 (p(x)), \]

where $p = (p(x))_{x \in X}$ is a probability distribution on $X$, $p(x)$ is the probability of value $x$ and $\log_2$ is normally used to quantify the value in bits.¹ The value² $S(p)$ quantifies the only type of uncertainty presented on probability theory and it verifies a large set of desirable properties (Shannon 1948, Klir and Wierman 1998).

In DST, Yager (1983) distinguishes between two types of uncertainty: the first is associated with cases where the information focuses on sets with empty intersections, and the second is associated with cases where the information focuses on sets with greater-than-one cardinality. These are called conflict (or randomness or discord) and non-specificity, respectively.

The following function, introduced by Dubois and Prade (1984), has its origin in classical Hartley measure (Hartley 1928) on classical set theory and in the extended Hartley measure on possibility theory (Higashi and Klir 1983). It represents a measure of non-specificity associated with a b.p.a. It is expressed as follows:

\[ I(m) = \sum_{A \subseteq X} m(A) \log (|A|). \]

$I(m)$ attains its minimum, zero, when $m$ is a probability distribution. The maximum, $\log(|X|)$, is obtained for a b.p.a. $m$ with $m(X) = 1$ and $m(A) = 0$, $\forall A \subset X$.

Many measures were introduced to quantify the conflict degree that a b.p.a. represents (Klir and Wierman 1998). One of the most representative conflict functions was introduced by Yager (1983):

\[ E(m) = -\sum_{A \subseteq X} m(A) \log Pl(A). \]

This function, however, does not verify all the required properties on DST.
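The definitions of this section can be made concrete with a short sketch. The Python code below is only an illustration under stated assumptions: a b.p.a. is represented as a dictionary from `frozenset` focal elements to masses, the natural logarithm is used (the text leaves the base of $\log$ in $I(m)$ and $E(m)$ open), and the function names and the two-element example are hypothetical, not taken from the paper.

```python
from itertools import combinations
from math import log


def powerset(universe):
    """Return every subset of `universe` as a frozenset, the empty set included."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def belief(m, a):
    """Bel(A): total mass of the focal elements contained in A."""
    return sum(mass for b, mass in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): total mass of the focal elements that intersect A."""
    return sum(mass for b, mass in m.items() if b & a)


def nonspecificity(m):
    """I(m) = sum_A m(A) log|A| (natural log is an assumption of this sketch)."""
    return sum(mass * log(len(a)) for a, mass in m.items())


def yager_conflict(m):
    """E(m) = -sum_A m(A) log Pl(A), summed over the focal elements only."""
    return -sum(mass * log(plausibility(m, a))
                for a, mass in m.items() if mass > 0)


# Hypothetical b.p.a. on X = {a, b}: half the mass on {a}, half on X itself.
X = frozenset({"a", "b"})
m = {frozenset({"a"}): 0.5, X: 0.5}

for a in powerset(X):
    # Bel(A) <= Pl(A) and Pl(A) = 1 - Bel(complement of A) hold for every A.
    print(sorted(a), belief(m, a), plausibility(m, a), 1 - belief(m, X - a))

print(nonspecificity(m), yager_conflict(m))
```

Storing only the focal elements keeps each sum linear in their number; enumerating the power set is needed only when a property has to be checked for every subset of $X$.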
Harmanec and Klir (1996) proposed the measure $S^*(m)$, which is equal to the maximum of the entropy (upper entropy) of the probability distributions verifying

\[ Bel(A) \le \sum_{x \in A} p(x) \le Pl(A), \quad \forall A \subseteq X. \]

This set of probability distributions is the credal set associated with a b.p.a. $m$, and will be denoted as $K_m$.

Maeda and Ichihashi (1993) proposed a total uncertainty measure using the above measures, which quantifies the conflict and non-specificity contained in a b.p.a. on $X$ in the following way:

\[ MI(m) = I(m) + S^*(m), \]

where $I(m)$ is used as a non-specificity function and $S^*(m)$ is used as a measure of conflict. This measure was analysed in Abellán and Moral (1999).

Harmanec and Klir (1996) proposed $S^*$ as a total uncertainty measure in DST, i.e. as a measure that quantifies conflict and non-specificity, but they do not separate this into parts that quantify these two types of uncertainty on DST. More recently, Abellán et al. (2006) proposed upper entropy as an aggregate measure on more general theories than DST, coherently separating conflict and non-specificity. These parts can also be obtained in DST in a similar way. In DST, we can consider

\[ S^*(m) = S_*(m) + (S^* - S_*)(m), \]

where $S^*(m)$ represents maximum entropy and $S_*(m)$ represents minimum entropy on the credal set $K_m$ associated with a b.p.a. $m$, with $S_*(m)$ coherently quantifying the conflict part and $(S^* - S_*)(m)$ its non-specificity part.

Quite recently, Jousselme et al. (2006) presented a measure to quantify ambiguity (discord or conflict and non-specificity) in DST, i.e. a total uncertainty measure on DST. This measure is based on the pignistic distribution on DST: let $m$ be a b.p.a. on a finite set $X$, then the pignistic probability distribution $BetP_m$, on all the subsets $A$ in $X$, is defined by

\[ BetP_m(A) = \sum_{B \subseteq X} m(B) \frac{|A \cap B|}{|B|}. \]

For a singleton set $A = \{x\}$, we have $BetP_m(\{x\}) = \sum_{x \in B} [m(B)/|B|]$. Therefore, the ambiguity measure for a b.p.a. $m$ on a finite set $X$ is defined as

\[ AM(m) = -\sum_{x \in X} BetP_m(x) \log (BetP_m(x)). \]

3. Basic properties of total uncertainty measures in DST

In Klir and Wierman (1998), we can find five requirements for a total uncertainty measure (TU) in DST, i.e. for a measure which captures both conflict and non-specificity. Using the above notation, these requirements can be expressed in the following way:

(P1) Probabilistic consistency: when all the focal elements of a b.p.a. $m$ are singletons, then a total uncertainty measure must be equal to the Shannon entropy:

\[ TU(m) = -\sum_{x \in X} m(x) \log m(x). \]
(P2) Set consistency: when a set $A$ exists such that $m(A) = 1$, then a TU must collapse to the Hartley measure:

\[ TU(m) = \log |A|. \]

(P3) Range: the range of $TU(m)$ is $[0, \log |X|]$.

(P4) Subadditivity: let $m$ be a b.p.a. on the space $X \times Y$, $m_X$ and $m_Y$ its marginal b.p.a.s on $X$ and $Y$, respectively, then a TU must satisfy the following inequality:

\[ TU(m) \le TU(m_X) + TU(m_Y). \]

(P5) Additivity: let $m$ be a b.p.a. on the space $X \times Y$, $m_X$ and $m_Y$ its marginal b.p.a.s on $X$ and $Y$, respectively, such that these marginals are not interactive ($m(A \times B) = m_X(A)\, m_Y(B)$, with $A \subseteq X$, $B \subseteq Y$ and $m(C) = 0$ if $C \neq A \times B$), then a TU must satisfy the equality

\[ TU(m) = TU(m_X) + TU(m_Y). \]

With these requirements, we hope to extend those of Shannon entropy in probability theory, although there are situations in DST that can never appear in probability theory. For instance, a probability distribution can never contain another probability distribution. In DST, however, the information of a b.p.a. can be contained by the information of another b.p.a. Let us consider the following example.

Example 1. In a first situation, we have three pieces of evidence ($e_1$, $e_2$ and $e_3$) about the type of disease ($d_1$, $d_2$ or $d_3$) which a patient has. Hence, an expert quantifies the information available using a basic probability assignment and considers the following b.p.a. on the universal set $X = \{d_1, d_2, d_3\}$:

\[ e_1 \to m_1(\{d_1, d_2\}) = \tfrac{1}{3}; \quad e_2 \to m_1(\{d_1, d_3\}) = \tfrac{1}{2}; \quad e_3 \to m_1(\{d_2, d_3\}) = \tfrac{1}{6}. \]

Let us now assume that the expert finds that the reasons for discarding $d_3$ in $e_1$ are false and that it is necessary to change his b.p.a. to the following:

\[ e_1 \to m_2(\{d_1, d_2, d_3\}) = \tfrac{1}{3}; \quad e_2 \to m_2(\{d_1, d_3\}) = \tfrac{1}{2}; \quad e_3 \to m_2(\{d_2, d_3\}) = \tfrac{1}{6}. \]

In the above example, we go from a first situation with one amount of information to another more confused situation. It is logical to consider that the second situation involves a greater level of uncertainty (less information).
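The two assignments of Example 1 can also be compared numerically. The sketch below is an illustration only: the dictionary representation, the function names and the use of the natural logarithm are assumptions of this sketch, not part of the paper. It checks on every subset of $X$ whether one b.p.a. is less informative than the other in terms of belief and plausibility, and it evaluates the pignistic distribution and the ambiguity measure $AM$ of Section 2.2 on both assignments.

```python
from itertools import combinations
from math import log

X = frozenset({"d1", "d2", "d3"})

# Basic probability assignments of Example 1 (focal element -> mass).
m1 = {frozenset({"d1", "d2"}): 1 / 3,
      frozenset({"d1", "d3"}): 1 / 2,
      frozenset({"d2", "d3"}): 1 / 6}
m2 = {X: 1 / 3,
      frozenset({"d1", "d3"}): 1 / 2,
      frozenset({"d2", "d3"}): 1 / 6}


def subsets(universe):
    """Return every subset of `universe` as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]


def belief(m, a):
    """Bel(A): total mass of the focal elements contained in A."""
    return sum(mass for b, mass in m.items() if b <= a)


def plausibility(m, a):
    """Pl(A): total mass of the focal elements that intersect A."""
    return sum(mass for b, mass in m.items() if b & a)


def is_less_informative(m_wide, m_narrow, universe, tol=1e-12):
    """True if Bel_wide <= Bel_narrow and Pl_narrow <= Pl_wide on every subset."""
    return all(
        belief(m_wide, a) <= belief(m_narrow, a) + tol
        and plausibility(m_narrow, a) <= plausibility(m_wide, a) + tol
        for a in subsets(universe)
    )


def pignistic(m, universe):
    """BetP_m({x}): sum of m(B)/|B| over the focal elements B containing x."""
    return {x: sum(mass / len(b) for b, mass in m.items() if x in b)
            for x in universe}


def ambiguity(m, universe):
    """AM(m): Shannon entropy of the pignistic distribution (natural log assumed)."""
    return -sum(p * log(p) for p in pignistic(m, universe).values() if p > 0)


print(is_less_informative(m2, m1, X))  # compares m2 against m1 on all subsets
for name, m in (("m1", m1), ("m2", m2)):
    print(name, pignistic(m, X), round(ambiguity(m, X), 3))
```

The helper `is_less_informative` is a hypothetical name for the interval comparison only; it does not compute any total uncertainty measure by itself.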
Here, we have $Bel_2(A) \le Bel_1(A)$ and
$Pl_1(A) \le Pl_2(A)$, $\forall A \subseteq X$; implying a larger level of uncertainty for $m_2$. This also implies that $K_{m_1} \subseteq K_{m_2}$, where $K_{m_1}$ and $K_{m_2}$ are the credal sets associated to $m_1$ and $m_2$, respectively.

We consider that the situation expressed by Example 1 should be taken into account for a total uncertainty measure in DST. This situation allows us to consider the following property:

(P6) Monotonicity: a total uncertainty measure in DST must not decrease the total quantity of uncertainty in a situation where a clear decrease in information (increment of uncertainty) is produced. Formally, let $m_1$ and $m_2$ be two b.p.a.s on a finite set $X$ verifying that $K_{m_1} \subseteq K_{m_2}$; then
$$TU(m_1) \le TU(m_2).$$

Here, we must remark that the monotone dispensability definition of Harmanec (1995) could be used, but we prefer a more general one, which can be extended in a direct way to general credal sets. Monotone dispensability always implies the monotonicity axiom, but not the converse, as can be easily checked.³

If we use the results of the works of Klir and Wierman (1998), Maeda and Ichihashi (1993) and Jousselme et al. (2006), it can be checked that the MI, $S^*$ and AM functions verify the following sets of requirements in DST:

MI: P1, P4, P5 and P6.
$S^*$: P1, P2, P3, P4, P5 and P6.
AM: P1, P2, P3 and P5.

Considering the list above, we should mention the following:

1. Function MI does not satisfy the P2 and P3 requirements. Its range is $[0, 2 \log |X|]$ because it uses a clear split between the quantification of the two types of uncertainty,⁴ each with range $[0, \log |X|]$.
2. Jousselme et al. presented a proof that function AM satisfies the P4 requirement, but Klir and Lewis (2007) recently found an error in this proof and gave a counterexample showing that AM does not satisfy the P4 requirement.
3. Function AM does not satisfy the P6 requirement.
If we consider Example 1, it can be proved that $K_{m_1} \subseteq K_{m_2}$, and we have
$$BetP_{m_1}(\{d_1\}) = \tfrac{5}{12}; \quad BetP_{m_1}(\{d_2\}) = \tfrac{3}{12}; \quad BetP_{m_1}(\{d_3\}) = \tfrac{4}{12};$$
$$BetP_{m_2}(\{d_1\}) = \tfrac{13}{36}; \quad BetP_{m_2}(\{d_2\}) = \tfrac{7}{36}; \quad BetP_{m_2}(\{d_3\}) = \tfrac{16}{36}.$$
Hence, $AM(m_1) = 1.078 > AM(m_2) = 1.047$.

We can therefore see that only $S^*$ satisfies all the proposed requirements.

4. Requirements of behaviour for total uncertainty measures in DST

The paper by Jousselme et al. analyses certain shortcomings of the $S^*$ function (upper entropy) in DST in order to compare this function with the AM function. These
shortcomings have been presented in publications by Klir et al. and can be expressed in the following way:

1. Computing complexity.
2. Concealment of the two types of uncertainty coexisting in the evidence theory: conflict and non-specificity.
3. Insensitivity to changes in evidence.

We consider Klir et al.’s considerations about the behaviour of a total uncertainty measure (TU) in DST to be very important, because a TU in DST makes no sense if it verifies all the basic properties (P1–P6) but its calculation is unfeasible. A TU in DST should also give us information about the quantification of the two types of uncertainty coexisting in DST. Finally, a TU should be sensitive to changes in evidence directly or via its parts of conflict or non-specificity, since it is possible for an increase in conflict to cause a decrease in non-specificity, and vice versa, and we could have two situations with similar total uncertainty values but with different conflict and non-specificity parts of the uncertainty.

This set of shortcomings found in certain total uncertainty measures could therefore be used to present a set of requirements of behaviour of a TU in DST. P1–P6 can be considered as requirements of properties that a total uncertainty measure in DST must satisfy. The set of requirements of behaviour (RB) of a TU in DST could be expressed in the following way:

(RB1) The calculation of a TU should not be too complex.
(RB2) A TU must not conceal the two types of uncertainty (conflict and non-specificity) co-existing in the evidence theory.
(RB3) A TU must be sensitive to changes in evidence either directly or via its parts of conflict and non-specificity.

There are certain situations where the information available is more suitable for being mathematically quantified with more general models than the DST.
In such cases, we are talking about a ‘Generalised Information Theory’ [see Klir (2006)] and take into account the following principle of Klir:

Principle of uncertainty invariance: the amount of uncertainty (and information) must be preserved when a representation of uncertainty in one mathematical theory is transformed into its counterpart in another theory. That is, the principle guarantees that no information is unwittingly added or eliminated solely by changing the mathematical framework by which a particular phenomenon is formalised.

By this principle, it should be possible to extend a TU in DST to more general theories than DST. This can be considered as a requirement of behaviour for a TU in DST, which we call extensibility:

(RB4) The extension of a TU in DST to more general theories must be possible.

We will now review the above requirements of behaviour for functions MI, $S^*$ and AM:

RB1: As we can see in Jousselme et al. (2006), the AM function has a simpler calculation than the other functions (MI includes $S^*$ in its definition), and it is only necessary to obtain the pignistic probability distribution of a b.p.a. The calculation of $S^*$ in DST has a high computational complexity. Meyerowitz et al.’s algorithm (1994) was the first to obtain this value. More recently, the computational cost of this algorithm was reduced by
Liu et al. (2007). Although the computational cost of every TU in DST is clearly different, the calculation of every TU in DST is simple.

RB2: MI can be separated coherently into conflict and non-specificity by definition. Here, $S^*$ is used as a conflict measure and function $I$ as a non-specificity measure. Recently, Abellán et al. (2006) separated $S^*$ into two parts that coherently quantify conflict and non-specificity for more general theories than DST. Here, $S_*$ (minimum of entropy) is used as a conflict measure and $S^* - S_*$ is used as a non-specificity function. In order to obtain these parts, Abellán and Moral (2005b) present a branch and bound algorithm to obtain $S_*$ on more general theories than DST. Only the AM function has no clear separation between conflict and non-specificity. In Jousselme et al. (2006), AM is presented as a special case of the function $\delta S^* + (1 - \delta) I$, for an unknown $\delta \in (0, 1)$. Therefore, when the value $AM(m)$ is used, it is impossible to know what quantity corresponds to conflict and what to non-specificity.

RB3: At this point, we want to review the analysis presented in Jousselme et al. (2006) about the sensitivity of $S^*$ in DST using an example by Klir and Smith (1999):

Example 2. Suppose there are two elements in the given frame of discernment $X = \{1, 2\}$, and we know $m(\{1\}) = m_1$, $m(\{2\}) = m_2$, so $m(\{1, 2\}) = m_{12} = 1 - m_1 - m_2$. At this point, we should mention that by definition (Yager 1983), the non-specificity part of $m$ depends only on the $m_{12}$ value and the conflict part of $m$ depends on the interaction between the $m_1$ and $m_2$ values.

In Klir and Smith’s original example (1999), $S^*$ was identified as ‘highly insensitive to changes in evidence’, an ‘unsatisfactory situation’. $S^*$ gives the same value for all bodies of evidence for which both $m_1$ and $m_2$ fall into the range $[0, 0.5]$. When $m_2 \in [0.5, 1]$, the $S^*$ measure is entirely independent of the value of $m_1$, and vice versa. Jousselme et al.
(2006) proved that the AM measure does not behave in the same way and that AM is independent of neither $m_1$ nor $m_2$ by considering the above example. We will use the example and will apply it to every TU considered in this paper. Without loss of generality, it is supposed that $m_1$ is known. They then consider that three cases appear: $m_1 > 0.5$, $m_1 = 0.5$, and $m_1 < 0.5$.

(1) $m_1 > 0.5$, e.g. $m_1 = 0.6$. Here, $m_{12} = 0.4 - m_2$. We have
$$S^*(m) = S(0.6, 0.4); \quad S_*(m) = S(1 - m_2, m_2);$$
$$I(m) = (0.4 - m_2) \log 2; \quad AM(m) = S\left(0.8 - \frac{m_2}{2},\; 0.2 + \frac{m_2}{2}\right).$$

MI: The conflict part of this function ($S^*$) is constant, and does not vary when $m_2$ changes. No distinction is made between the values of $m_2$, and this does not make any sense because of the definition of the conflict part of a b.p.a.

$S^*$: The variations of the conflict part of this function ($S_*$) make sense; if $m_2$ increases then so does the conflict part. The non-specificity part behaves in a similar way: a decrease in $m_2$ leads to an increase in $m_{12}$ and, as we can see via its non-specificity part
$$S^* - S_* = S(0.6, 0.4) - S(1 - m_2, m_2),$$
it results in an increase in non-specificity. We therefore consider that the constant value of $S^*$ as a total uncertainty value makes sense because, in this example with $X = \{1, 2\}$, an increase in conflict can result in a decrease in non-specificity, and vice versa, so that the two can compensate each other.

AM: This function is not constant and depends on the value of $m_2$. This situation makes sense, but we do not know what happens to the variations of conflict or non-specificity when we modify the value of $m_2$.

(2) $m_1 = 0.5$ and $m_{12} = 0.5 - m_2$. We have:
$$S^*(m) = S(0.5, 0.5); \quad S_*(m) = S(1 - m_2, m_2);$$
$$I(m) = (0.5 - m_2) \log 2; \quad AM(m) = S\left(0.75 - \frac{m_2}{2},\; 0.25 + \frac{m_2}{2}\right).$$
In this case, we can observe a similar situation to that in Case (1).

(3) $m_1 < 0.5$, e.g. $m_1 = 0.2$. Here, $m_{12} = 0.8 - m_2$. We have:
$$S^*(m) = \begin{cases} S(0.5, 0.5) & \text{if } m_2 < 0.5, \\ S(1 - m_2, m_2) & \text{if } m_2 \ge 0.5; \end{cases}$$
$$S_*(m) = \begin{cases} S(a, 1 - a) & \text{if } m_2 < 0.5 \quad (a = \min\{m_1, m_2\}), \\ S(0.2, 0.8) & \text{if } m_2 \ge 0.5; \end{cases}$$
$$I(m) = (0.8 - m_2) \log 2; \quad AM(m) = S\left(0.6 - \frac{m_2}{2},\; 0.4 + \frac{m_2}{2}\right).$$

MI: If $m_2 < 0.5$, the conflict part of this function ($S^*$) is constant and does not vary when there is a change in $m_2$. It does distinguish between the values of $m_2$ and this makes more sense because of the definition of the conflict part of a b.p.a. than in Situation (1). However, if $m_2 \ge 0.5$, the conflict part depends on $m_2$, and here AM makes more sense than in Case (1).

$S^*$: Here, $S^*$ is constant when $m_2 < 0.5$ and it depends on $m_2$ (or $m_{12}$) when $m_2 \ge 0.5$. Analysing the parts of $S^*$, the variations of the conflict part of this function ($S_*$) make sense. It depends on the minimum value of $m_1$ and $m_2$, called $a$. Studying the non-specificity part is more complex. We must now take into account the following situations: (3.1) $m_2 \ge 0.5$; (3.2) $m_1 \le m_2 < 0.5$; and (3.3) $m_2 \le m_1 < 0.5$. In (3.1) and (3.3), we have similar results as in Case (1). In (3.2),
$$S^* - S_* = S(0.5, 0.5) - S(0.8, 0.2),$$
which is a constant value. If $m_2$ decreases by a quantity $\varepsilon$, with $0 < \varepsilon < m_2 - m_1$, we obtain the same value of non-specificity.
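The closed forms used in Example 2 can be reproduced with a short numerical sketch. It is written in Python under assumptions that are not part of the paper: the function names `S` and `measures` are hypothetical, natural logarithms are used, and the credal set of $m$ on $X = \{1, 2\}$ is described by the interval $[m_1, 1 - m_2]$ for $p(1)$ (its belief and plausibility), so that $S^*$ and $S_*$ are the maximum and minimum of the binary entropy over that interval.

```python
from math import log

def S(p, q):
    """Shannon entropy (natural logarithm) of the two-point distribution (p, q)."""
    return -sum(t * log(t) for t in (p, q) if t > 0)

def measures(m1, m2):
    """Uncertainty values for the b.p.a. m({1}) = m1, m({2}) = m2, m({1, 2}) = 1 - m1 - m2."""
    m12 = 1.0 - m1 - m2
    lo, hi = m1, 1.0 - m2              # p(1) ranges over [Bel({1}), Pl({1})]
    p_max = min(max(0.5, lo), hi)      # point of the interval closest to 1/2
    s_upper = S(p_max, 1.0 - p_max)    # maximum entropy over the credal set
    # Entropy is concave, so its minimum over an interval is reached at an endpoint.
    s_lower = min(S(lo, 1.0 - lo), S(hi, 1.0 - hi))
    nonspec = m12 * log(2)             # I(m): only the focal set {1, 2} has more than one element
    bet1 = m1 + m12 / 2.0              # pignistic probability of element 1
    return {"S_upper": s_upper, "S_lower": s_lower,
            "I": nonspec, "AM": S(bet1, 1.0 - bet1)}

# Situation (3.2): m1 = 0.2 and m1 <= m2 < 0.5, for two different values of m2.
for m2 in (0.4, 0.3):
    r = measures(0.2, m2)
    print(m2, round(r["S_upper"] - r["S_lower"], 4), round(r["AM"], 4))
```

For $m_1 = 0.2$, the difference $S^* - S_*$ returned by the sketch is the same for $m_2 = 0.4$ and $m_2 = 0.3$, while the AM value changes; the non-specificity part of $S^*$ is therefore insensitive to such a change in $m_2$.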
This could be considered to be an ‘unsatisfactory situation’. However, we think that here it makes sense to obtain a maximum total