Decision Making under Uncertainty
Jun Zhu
AI Lab, Tsinghua University
http://cs.cmu.edu/~junzhu
Nov 6, 2011

Decision Making is Ubiquitous
• Determining which decision, from a set of possible alternatives, is optimal
MLA 2011, Tsinghua University

Two Challenges
• The set of outcomes can be large and complex → Utility Theory
  ❑ The agent must weigh different factors in determining the most preferred outcomes
  ❑ E.g., when deciding which job to take, we must consider work time, amount of salary, company reputation, officemates, …
• The outcome of an action is not fully determined → Probability Theory
  ❑ We must consider the probabilities of various outcomes and the agent's preferences between these outcomes

Decision Theory
• The theory of statistical decision functions [Wald, 1950]
• … is concerned with the process of making decisions and explicitly includes the payoffs that may result
  ❑ determining which decision, from a set of possible alternatives, is optimal for a particular set of conditions

Decision Making in Statistical Learning
• Statistical learning is distribution-free: only training examples are provided
  ❑ Either use the data to estimate the complete distribution and then derive a decision rule (density estimation is hard!)
  ❑ Or use the training examples to make decisions directly, e.g., by empirical risk minimization
• Decision theory (DT) assumes the complete data distribution is given

Outline
• Utilities and Decisions (Chap 22)
  ❑ Maximum Expected Utility (MEU) principle
  ❑ Utility theory
• Structured decision problems (Chap 23)
  ❑ Decision tree
  ❑ Influence diagrams

Decision-making Situation
• A decision-making situation $\mathcal{D}$ consists of
  ❑ a set of outcomes $\mathcal{O} = \{o_1, \ldots, o_N\}$
  ❑ a set of possible actions that an agent can take, $\mathcal{A} = \{a_1, \ldots, a_K\}$
  ❑ a probabilistic outcome model $P : \mathcal{A} \mapsto \Delta_{\mathcal{O}}$, which defines a lottery $\pi_a$ for each action $a$
  ❑ a utility function $U : \mathcal{O} \mapsto \mathbb{R}$, where $U(o)$ is the agent's preference for the outcome $o$
• Lottery $\pi_a = [o_1 : \pi_a(o_1); \ldots; o_N : \pi_a(o_N)]$
  ❑ a probability distribution over outcomes given that the action $a$ was taken
  ❑ Preference ordering: $\pi_1 \succ \pi_2$ if the agent prefers $\pi_1$; $\pi_1 \sim \pi_2$ if the agent is indifferent between $\pi_1$ and $\pi_2$
• Compound lottery
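
The lottery definition above can be made concrete with a minimal Python sketch. The dictionary representation, the function names `is_valid_lottery` and `reduce_compound`, and the "win"/"lose" outcomes are all assumptions made for illustration; they do not come from the slides. The sketch treats a compound lottery in the standard sense, as a lottery whose prizes are themselves lotteries, and flattens it by multiplying probabilities along each path.

```python
from math import isclose

def is_valid_lottery(lottery, tol=1e-9):
    """Return True if the probabilities are non-negative and sum to 1."""
    probs = list(lottery.values())
    return all(p >= 0 for p in probs) and isclose(sum(probs), 1.0, abs_tol=tol)

def reduce_compound(compound):
    """Flatten a compound lottery into a simple lottery over outcomes.

    `compound` is a list of (weight, simple_lottery) pairs, where each
    simple lottery is a dict mapping outcome -> probability.
    """
    flat = {}
    for weight, lottery in compound:
        for outcome, p in lottery.items():
            # Probability of reaching `outcome` through this branch.
            flat[outcome] = flat.get(outcome, 0.0) + weight * p
    return flat

# Hypothetical example: two simple lotteries mixed with equal weight.
pi_1 = {"win": 0.5, "lose": 0.5}
pi_2 = {"win": 1.0}
mixed = reduce_compound([(0.5, pi_1), (0.5, pi_2)])
print(mixed)  # {'win': 0.75, 'lose': 0.25}
```

A plain dictionary keeps the sketch short; a real implementation might validate each inner lottery before flattening.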

Maximum Expected Utility (MEU) Principle
• The MEU principle asserts that, in a decision-making situation $\mathcal{D}$, we should choose the action $a$ that maximizes the expected utility
  $\mathrm{EU}[\mathcal{D}[a]] = \sum_{o \in \mathcal{O}} \pi_a(o)\, U(o)$
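
As a rough illustration of the MEU principle, the sketch below computes the expected utility of each action and picks the maximizer. The function names `expected_utility` and `meu_action`, and the umbrella scenario with its utilities and probabilities, are hypothetical and are not taken from the slides.

```python
def expected_utility(lottery, utility):
    """Sum of pi_a(o) * U(o) over the outcomes o of one lottery."""
    return sum(p * utility[o] for o, p in lottery.items())

def meu_action(outcome_model, utility):
    """Return (best_action, best_expected_utility).

    `outcome_model` maps each action to its lottery, i.e. a dict
    mapping outcome -> probability.
    """
    scores = {a: expected_utility(lot, utility) for a, lot in outcome_model.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical decision: carry an umbrella or not.
utility = {"dry": 10.0, "dry_but_burdened": 8.0, "wet": 0.0}
outcome_model = {
    "take_umbrella": {"dry_but_burdened": 1.0},
    "leave_umbrella": {"dry": 0.5, "wet": 0.5},
}
best_action, best_eu = meu_action(outcome_model, utility)
print(best_action, best_eu)  # take_umbrella 8.0
```

If two actions tie, `max` returns the first one encountered, which is one arbitrary but deterministic way to break ties.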

One State/One Action Example
• From state s0, action A1 leads to
  ❑ s1 with probability 0.2 (utility 100)
  ❑ s2 with probability 0.7 (utility 50)
  ❑ s3 with probability 0.1 (utility 70)
• EU[S0] = 100 × 0.2 + 50 × 0.7 + 70 × 0.1 = 20 + 35 + 7 = 62

One State/Two Actions Example
• From state s0, action A1 leads to
  ❑ s1 with probability 0.2 (utility 100)
  ❑ s2 with probability 0.7 (utility 50)
  ❑ s3 with probability 0.1 (utility 70)
• From state s0, action A2 leads to
  ❑ s2 with probability 0.2 (utility 50), the only branch assignment consistent with EU2[S0] = 74
  ❑ s4 with probability 0.8 (utility 80)
• EU1[S0] = 62
• EU2[S0] = 50 × 0.2 + 80 × 0.8 = 10 + 64 = 74
• EU[S0] = max{EU1[S0], EU2[S0]} = 74
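
The numbers in this example can be checked with a few lines of Python. The sketch assumes that the 0.2 branch of A2 leads to s2 (utility 50); the flattened diagram does not state this, but it is the assignment that reproduces EU2[S0] = 74. The variable names are illustrative only.

```python
from math import isclose

# Utilities of the leaf states, as read from the slide.
utility = {"s1": 100, "s2": 50, "s3": 70, "s4": 80}

# Outcome probabilities per action. The A2 -> s2 branch is an assumption
# (see the note above); the A1 branches follow the slide directly.
lotteries = {
    "A1": {"s1": 0.2, "s2": 0.7, "s3": 0.1},
    "A2": {"s2": 0.2, "s4": 0.8},
}

# Expected utility of each action, then the MEU choice at s0.
eu = {a: sum(p * utility[s] for s, p in lot.items()) for a, lot in lotteries.items()}
best = max(eu, key=eu.get)

print({a: round(v, 6) for a, v in eu.items()}, best)  # {'A1': 62.0, 'A2': 74.0} A2
```

Rounding before printing avoids showing floating-point noise; comparisons should use `math.isclose` rather than `==`.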