6.042/18.] Mathematics for Computer Science April 21, 2005 Srini devadas and Eric Lehman Lecture notes Conditional Probability Suppose that we pick a random person in the world. Everyone has an equal chance of being selected. Let A be the event that the person is an MIT student, and let B be the event that the person lives in Cambridge. What are the probabilities of these events? Intuitively were picking a random point in the big ellipse shown below and asking how likely that point is to fall into region A or B et of all people in the world B set of mit set of people who students live in Cambridge The vast majority of people in the world neither live in Cambridge nor are MIT students, o events A and B both have low probability. But what is the probability that a person is an MIT student, given that the person lives in Cambridge? This should be much greater- but what it is exactly? What were asking for is called a conditional probability; that is, the probability that one event happens, given that some other event definitely happens. Questions about conditional probabilities come up all the time What is the probability that it will rain this afternoon, given that it is cloudy this morning What is the probability that two rolled dice sum to 10, given that both are odd? What is the probability that I'll get four-of-a-kind in Texas No Limit Hold'Em Poker, given that I'm initially dealt two queens? There is a special notation for conditional probabilities. In general, Pr(A l B)denotes the probability of event A, given that event B happens. So, in our example, Pr(A l B)is the probability that a random person is an MIT student, given that he or she is a Cam bridge resident
6.042/18.062J Mathematics for Computer Science April 21, 2005 Srini Devadas and Eric Lehman Lecture Notes Conditional Probability Suppose that we pick a random person in the world. Everyone has an equal chance of being selected. Let A be the event that the person is an MIT student, and let B be the event that the person lives in Cambridge. What are the probabilities of these events? Intuitively, we’re picking a random point in the big ellipse shown below and asking how likely that point is to fall into region A or B: A B set of all people in the world set of people who live in Cambridge set of MIT students The vast majority of people in the world neither live in Cambridge nor are MIT students, so events A and B both have low probability. But what is the probability that a person is an MIT student, given that the person lives in Cambridge? This should be much greater— but what it is exactly? What we’re asking for is called a conditional probability; that is, the probability that one event happens, given that some other event definitely happens. Questions about conditional probabilities come up all the time: • What is the probability that it will rain this afternoon, given that it is cloudy this morning? • What is the probability that two rolled dice sum to 10, given that both are odd? • What is the probability that I’ll get fourofakind in Texas No Limit Hold ’Em Poker, given that I’m initially dealt two queens? There is a special notation for conditional probabilities. In general, Pr (A | B) denotes the probability of event A, given that event B happens. So, in our example, Pr (A | B) is the probability that a random person is an MIT student, given that he or she is a Cambridge resident
Conditional Probability How do we compute Pr(A B)? Since we are given that the person lives in Cambridge, we can forget about everyone in the world who does not. Thus, all outcomes outside event B are irrelevant. So, intuitively, Pr(a b) should be the fraction of Cambridge residents that are also MIT students; that is, the answer should be the probability that the person is in set An B(darkly shaded) divided by the probability that the person is in set B(lightly shaded). This motivates the definition of conditional probability P(4|B)=P(B) If Pr(B)=0, then the conditional probability Pr(A l B)is undefined Probability is generally counterintuitive, but conditional probability is the worst! Con ditioning can subtly alter probabilities and produce unexpected results in randomized algorithms and computer systems as well as in betting games. Yet, the mathematical definition of conditional probability given above is very simple and should give you no trouble- provided you rely on formal reasoning and not intuition 1 The halting problem The Halting problem is the canonical undecidable problem in computation theory that was first introduced by Alan Turing in his seminal 1936 paper. The problem is to determine tantly, it is the name of the MIT EECS department's famed C-league hockey team. por whether a Turing machine halts on a given blah, blah, blah. Anyway, much more In a best-of-three tournament, the Halting Problem wins the first game with probability 2. In subsequent games, their probability of winning is determined by the outcome of the previous game. If the Halting Problem won the previous game, then they are envigorated y victory and win the current game with probability 3. If they lost the previous game, then they are demoralized by defeat and win the current game with probablity only What is the probability that the Halting Problem wins the tournament, given that they win the first game? 1.1 Solution to the halting problem This is a question about a conditional probability. Let A be the event that the Halting Problem wins the tournament and let b be the event that they win the first game. Our goal is then to determine the conditional probability Pr(A l B) We can tackle conditional probability questions just like ordinary probability prob below, followed by an explanation of its construction and use te tree diagram is showr lems: using a tree diagram and the four-step method. A com
2 Conditional Probability How do we compute Pr (A | B)? Since we are given that the person lives in Cambridge, we can forget about everyone in the world who does not. Thus, all outcomes outside event B are irrelevant. So, intuitively, Pr (A | B) should be the fraction of Cambridge residents that are also MIT students; that is, the answer should be the probability that the person is in set A ∩ B (darkly shaded) divided by the probability that the person is in set B (lightly shaded). This motivates the definition of conditional probability: Pr (A | B) = Pr (A ∩ B) Pr (B) If Pr (B) = 0, then the conditional probability Pr (A | B) is undefined. Probability is generally counterintuitive, but conditional probability is the worst! Conditioning can subtly alter probabilities and produce unexpected results in randomized algorithms and computer systems as well as in betting games. Yet, the mathematical definition of conditional probability given above is very simple and should give you no trouble— provided you rely on formal reasoning and not intuition. 1 The Halting Problem The Halting Problem is the canonical undecidable problem in computation theory that was first introduced by Alan Turing in his seminal 1936 paper. The problem is to determine whether a Turing machine halts on a given blah, blah, blah. Anyway, much more importantly, it is the name of the MIT EECS department’s famed Cleague hockey team. In a bestofthree tournament, the Halting Problem wins the first game with probability 1 2 . In subsequent games, their probability of winning is determined by the outcome of the previous game. If the Halting Problem won the previous game, then they are envigorated by victory and win the current game with probability 2 3 . If they lost the previous game, then they are demoralized by defeat and win the current game with probablity only 1 3 . What is the probability that the Halting Problem wins the tournament, given that they win the first game? 1.1 Solution to the Halting Problem This is a question about a conditional probability. Let A be the event that the Halting Problem wins the tournament, and let B be the event that they win the first game. Our goal is then to determine the conditional probability Pr (A | B). We can tackle conditional probability questions just like ordinary probability problems: using a tree diagram and the fourstep method. A complete tree diagram is shown below, followed by an explanation of its construction and use
Conditional Probability WW WLW l/18 W WLL 1/9 LWW W W LWL 1/18 1 I st game从 event A: event B outcome 2nd game rd game outcome win the win the outcome utcome outcome series? Ist game? probability Step 1: Find the Sample Space Each internal vertex in the tree diagram has two children, one corresponding to a win for the Halting Problem (labeled W) and one corresponding to a loss (labeled L). The complete sample space is S=WW, WLW, WLL, LWW, LWL, LLI Step 2: Define Events of Interest The event that the Halting Problem wins the whole tournament is T=WW, WLW, LWWI And the event that the halting problem wins the first game is F=WW,WLW,WLLI The outcomes in these events are indicated with checkmarks in the tree diagram Step 3: Determine Outcome Probabilities Next, we must assign a probability to each outcome. We begin by labeling edges as spec ified in the problem statement. Specifically, The Halting Problem has a 1/2 chance of winning the first game, so the two edges leaving the root are each assigned probability 1/ 2. Other edges are labeled 1/3 or 2 /3 based on the outcome of the preceding game We then find the probability of each outcome by multiplying all probabilities along the corresponding root-to-leaf path. For exan 7 ple, the probability of outcome WLL
Conditional Probability 3 2/3 L 1/2 W 1/2 W 1/3 L 2/3 L 1/3 W 2/3 L L 1/3 W 2/3 W 1/3 1st game outcome 2nd game outcome 3rd game outcome probability outcome 1/3 1/18 1/9 1/9 1/18 1/3 event B: win the 1st game? event A: win the series? WW WLW WLL LWW LWL LL outcome Step 1: Find the Sample Space Each internal vertex in the tree diagram has two children, one corresponding to a win for the Halting Problem (labeled W) and one corresponding to a loss (labeled L). The complete sample space is: S = {WW, W LW, W LL, LWW, LW L, LL} Step 2: Define Events of Interest The event that the Halting Problem wins the whole tournament is: T = {WW, W LW, LWW} And the event that the Halting Problem wins the first game is: F = {WW, W LW, W LL} The outcomes in these events are indicated with checkmarks in the tree diagram. Step 3: Determine Outcome Probabilities Next, we must assign a probability to each outcome. We begin by labeling edges as specified in the problem statement. Specifically, The Halting Problem has a 1/2 chance of winning the first game, so the two edges leaving the root are each assigned probability 1/2. Other edges are labeled 1/3 or 2/3 based on the outcome of the preceding game. We then find the probability of each outcome by multiplying all probabilities along the corresponding roottoleaf path. For example, the probability of outcome W LL is: 1 1 2 1 = 2 · 3 · 3 9
Conditional Probability Step 4: Compute Event Probabilities We can now compute the probability that The Halting Problem wins the tournament, given that they win the first game Pr(AB Pr(A∩B) Pr(B Pr(ww,WLwD Pr(wW, WLW,WLLn 1/3+1/18 /3+1/18+1/9 Were done! If the halting problem wins the first game then they win the whole touna ment with probability 7/9 1.2 Why Tree Diagrams Work Weve now settled into a routine of solving probability problems using tree diagrams But weve left a big question unaddressed: what is the mathematical justfication behind those funny little pictures? Why do they work? The answer involves conditional probabilities. In fact, the probabilities that weve been recording on the edges of tree diagrams are conditional probabilities. For example, con- sider the uppermost path in the tree diagram for the Halting Problem, which corresponds to the outcome ww. The first edge is labeled 1/2, which is the probability that the halt ing Problem wins the first game. The second edge is labeled 2/3, which is the probability that the Halting Problem wins the second game, given that they won the first--that's a conditional probability! More generally, on each edge of a tree diagram, we record the probability that the experiment proceeds along that path, given that it reaches the parent vertex So weve been using conditional probabilities all along. But why can we multiply edge probabilities to get outcome probabilities? For example, we concluded that Pr(Ww)= Why is this correct? The answer goes back to the definition of conditional probability. Rewriting this in a lightly different form gives the Product rule for probabilities
4 Conditional Probability Step 4: Compute Event Probabilities We can now compute the probability that The Halting Problem wins the tournament, given that they win the first game: Pr (A | B) = Pr (A ∩ B) Pr (B) Pr ({WW, W LW}) = Pr ({WW, W LW, W LL}) 1/3 + 1/18 = 1/3 + 1/18 + 1/9 7 = 9 We’re done! If the Halting Problem wins the first game, then they win the whole tournament with probability 7/9. 1.2 Why Tree Diagrams Work We’ve now settled into a routine of solving probability problems using tree diagrams. But we’ve left a big question unaddressed: what is the mathematical justficiation behind those funny little pictures? Why do they work? The answer involves conditional probabilities. In fact, the probabilities that we’ve been recording on the edges of tree diagrams are conditional probabilities. For example, consider the uppermost path in the tree diagram for the Halting Problem, which corresponds to the outcome WW. The first edge is labeled 1/2, which is the probability that the Halting Problem wins the first game. The second edge is labeled 2/3, which is the probability that the Halting Problem wins the second game, given that they won the first— that’s a conditional probability! More generally, on each edge of a tree diagram, we record the probability that the experiment proceeds along that path, given that it reaches the parent vertex. So we’ve been using conditional probabilities all along. But why can we multiply edge probabilities to get outcome probabilities? For example, we concluded that: 1 2 Pr (WW) = 2 · 3 1 = 3 Why is this correct? The answer goes back to the definition of conditional probability. Rewriting this in a slightly different form gives the Product Rule for probabilities:
Conditional Probability Rule(product Rule for 2 Events). If Pr(A2)#0, then Pr(A1n A2)=Pr(A1). Pr(A2 A1) Multiplying edge probabilities in a tree diagram amounts to evaluating the right side of this equation. For example Pr(win first game n win second game Pr(win first game)- Pr(win second game win first game So the Product Rule is the formal justification for multiplying edge probabilities to get outcome probabilities To justify multiplying edge probabilities along longer paths, we need a more general form the product rule Rule (product Rule for n Events). IfPr(A1n..n An-1)#0, then Pr(A1∩.∩An)=Pr(A1)Pr(A2|A1)Pr(A3|A1∩A2)…Pr(An|A1∩.∩An-1) Let's interpret this big formula in terms of tree diagrams. Suppose we want to compute the probability that an experiment traverses a particular root-to-leaf path of length n. Let Ai be the event that the experiment traverses the i-th edge of the path. Then A1n...n An is the event that the experiment traverse the whole path. The Product Rule says that the probability of this is the probability that the experiment takes the first edge times the probability that it takes the second, given it takes the first edge, times the probability it takes the third, given it takes the first two edges, and so forth. In other words, the probability of an outcome is the product of the edge probabilities along the corresponding root-to-leaf path 2 A Posteriori probabilities Suppose that we turn the hockey question around: what is the probability that the Halting Problem won their first game, given that they won the series? This seems like an absurd question! After all, if the Halting Problem won the series, then the winner of the first game has already been determined Therefore who won the first game is a question of fact, not a question of probability. However, our mathemati- cal theory of probability contains no notion of one event preceding another- there is no notion of time at all. Therefore, from a mathematical perspective, this is a perfectly valid question. And this is also a meaningful question from a practical perspective. Suppose that you're told that the Halting Problem won the series, but not told the results of indi vidual games. Then, from your perspective, it makes perfect sense to wonder how likely it is that The Halting Problem won the first game
� Conditional Probability 5 Rule (Product Rule for 2 Events). If Pr (A2) = 0 � , then: Pr (A1 ∩ A2) = Pr (A1) · Pr (A2 | A1) Multiplying edge probabilities in a tree diagram amounts to evaluating the right side of this equation. For example: Pr (win first game ∩ win second game) = Pr (win first game) · Pr (win second game | win first game) 1 2 = 2 · 3 So the Product Rule is the formal justification for multiplying edge probabilities to get outcome probabilities! To justify multiplying edge probabilities along longer paths, we need a more general form the Product Rule: Rule (Product Rule for n Events). If Pr (A1 ∩ . . . ∩ An−1) = 0, then: Pr (A1 ∩ . . . ∩ An) = Pr (A1) · Pr (A2 | A1) · Pr (A3 | A1 ∩ A2)· · ·Pr (An | A1 ∩ . . . ∩ An−1) Let’s interpret this big formula in terms of tree diagrams. Suppose we want to compute the probability that an experiment traverses a particular roottoleaf path of length n. Let Ai be the event that the experiment traverses the ith edge of the path. Then A1 ∩ . . . ∩ An is the event that the experiment traverse the whole path. The Product Rule says that the probability of this is the probability that the experiment takes the first edge times the probability that it takes the second, given it takes the first edge, times the probability it takes the third, given it takes the first two edges, and so forth. In other words, the probability of an outcome is the product of the edge probabilities along the corresponding roottoleaf path. 2 A Posteriori Probabilities Suppose that we turn the hockey question around: what is the probability that the Halting Problem won their first game, given that they won the series? This seems like an absurd question! After all, if the Halting Problem won the series, then the winner of the first game has already been determined. Therefore, who won the first game is a question of fact, not a question of probability. However, our mathematical theory of probability contains no notion of one event preceding another— there is no notion of time at all. Therefore, from a mathematical perspective, this is a perfectly valid question. And this is also a meaningful question from a practical perspective. Suppose that you’re told that the Halting Problem won the series, but not told the results of individual games. Then, from your perspective, it makes perfect sense to wonder how likely it is that The Halting Problem won the first game
Conditional Probability A conditional probability Pr(b| A) is called an a posteriori if event B precedes event A in time. Here are some other examples of a posteriori probabilities The probability it was cloudy this morning, given that it rained in the afternoon The probability that I was initially dealt two queens in Texas No Limit Hold'Em poker, given that I eventually got four-of-a-kind Mathematically, a posteriori probabilities are no different from ordinary probabilities; the distinction is only at a higher, philosophical level. Our only reason for drawing attention to them is to say, "Dont let them rattle you. Let's return to the original problem. The probability that the Halting Problem won their first game, given that they won the series is Pr(B l A). We can compute this using the definition of conditional probability and our earlier tree diagram Pr(B|4)=P(B∩ Pr(a) 1/3+1/18 1/3+1/18+1/9 This answer is suspicious! In the preceding section, we showed that Pr(A B)was also 7/9. Could it be true that Pr(a B)=Pr (b a)in general? Some reflection suggests this is unlikely. For example, the probability that I feel uneasy, given that I was abducted b liens, is pretty large But the probability that I was abducted by aliens, given that I feel uneasy, is rather small Let's work out the general conditions under which Pr(a B)= Pr(b a). By the definition of conditional probability, this equation holds if an only if Pr(A∩B)Pr(A∩B) Pr(B Pr This equation, in turn, holds only if the denominators are equal or the numerator is 0 Pr(B)=Pr(a) The former condition holds in the hockey example; the probability that the Halting Prob- lem wins the series(event A)is equal to the probability that it wins the first game(event B). In fact, both probabilities are 1 /2 1 A Coin problem Suppose you have two coins. One coin is fair; that is, comes up heads with probability 1/ 2 and tails with probability 1/2. The other is a trick coin; it has heads on both sides, and so
6 Conditional Probability A conditional probability Pr (B | A) is called an a posteriori if event B precedes event A in time. Here are some other examples of a posteriori probabilities: • The probability it was cloudy this morning, given that it rained in the afternoon. • The probability that I was initially dealt two queens in Texas No Limit Hold ’Em poker, given that I eventually got fourofakind. Mathematically, a posteriori probabilities are no different from ordinary probabilities; the distinction is only at a higher, philosophical level. Our only reason for drawing attention to them is to say, “Don’t let them rattle you.” Let’s return to the original problem. The probability that the Halting Problem won their first game, given that they won the series is Pr (B | A). We can compute this using the definition of conditional probability and our earlier tree diagram: Pr (B | A) = Pr (B ∩ A) Pr (A) 1/3 + 1/18 = 1/3 + 1/18 + 1/9 7 = 9 This answeris suspicious! In the preceding section, we showed that Pr (A | B) was also 7/9. Could it be true that Pr (A | B) = Pr (B A| ) in general? Some reflection suggests this is unlikely. For example, the probability that I feel uneasy, given that I was abducted by aliens, is pretty large. But the probability that I was abducted by aliens, given that I feel uneasy, is rather small. Let’s work out the general conditions under which Pr (A | B) = Pr (B A| ). By the definition of conditional probability, this equation holds if an only if: Pr (A ∩ B) Pr (A ∩ B) = Pr (B) Pr (A) This equation, in turn, holds only if the denominators are equal or the numerator is 0: Pr (B) = Pr (A) or Pr (A ∩ B) = 0 The former condition holds in the hockey example; the probability that the Halting Problem wins the series (event A) is equal to the probability that it wins the first game (event B). In fact, both probabilities are 1/2. 2.1 A Coin Problem Suppose you have two coins. One coin is fair; that is, comes up heads with probability 1/2 and tails with probability 1/2. The other is a trick coin; it has heads on both sides, and so
Conditional Probability always comes up heads. Now you choose a coin at random so that you're equally likely to pick each one. If you flip the coin you select and get heads, then what is the probability that you flipped the fair coin? This is another a posteriori problem since we want the probability of an event(that the fair coin was chosen) given the outcome of a later event(that heads came up). Intuition may fail us, but the standard four-step method works perfectly well Step 1: Find the Sample Space The sample space is worked out in the tree diagram below l/4 1/2 1/2 fair 1/2 unfa 1/2 event a: event b: ever outcome choice of result probability fair coin? heads? coin flip Step 2: Define Events of Interest Let a be the event that the fair coin was chosen. Let b the event that the result of the flip was heads. The outcomes in each event are marked in the figure. We want to compute Pr(AI B), the probability that the fair coin was chosen, given that the result of the flip was head Step 2: Compute Outcome Probabilities First, we assign probabilities to edges in the tree diagram. Each coin is chosen with prob ability 1/2. If we choose the fair coin, then head and tails each come up with probability 1/2. If we choose the trick coin, then heads comes up with probability 1. By the Product Rule, the probability of an outcome is the product of the probabilities on the correspond ing root-to-leaf path. All of these probabilities are shown in the tree diagram
Conditional Probability 7 always comes up heads. Now you choose a coin at random so that you’re equally likely to pick each one. If you flip the coin you select and get heads, then what is the probability that you flipped the fair coin? This is another a posteriori problem since we want the probability of an event (that the fair coin was chosen) given the outcome of a later event (that heads came up). Intuition may fail us, but the standard fourstep method works perfectly well. Step 1: Find the Sample Space The sample space is worked out in the tree diagram below. fair unfair choice of coin result H flip 1/2 1/2 1/2 1/2 H T 1/2 1/4 1/4 event A: outcome probability fair coin? event B: outcome heads? event chose A B? Step 2: Define Events of Interest Let A be the event that the fair coin was chosen. Let B the event that the result of the flip was heads. The outcomes in each event are marked in the figure. We want to compute Pr(A | B), the probability that the fair coin was chosen, given that the result of the flip was heads. Step 2: Compute Outcome Probabilities First, we assign probabilities to edges in the tree diagram. Each coin is chosen with probability 1/2. If we choose the fair coin, then head and tails each come up with probability 1/2. If we choose the trick coin, then heads comes up with probability 1. By the Product Rule, the probability of an outcome is the product of the probabilities on the corresponding roottoleaf path. All of these probabilities are shown in the tree diagram
Conditional Probability Step 4: Compute Event Probabilities Pr(A B Pr(A∩ Pr(B) 1 The first equation uses the Product Rule. On the second line, we use the fact that the probability of an event is the sum of the probabilities of the outcomes it contains final line is simplification. The probability that the fair coin was chosen, given that result of the flip was heads, is 1 3 2.2 A Variant of the two coins problem Let's consider a variant of the two coins problem. Someone hands you either a fair coin or a trick coin with heads on both sides. You flip the coin 100 times and see heads every time What can you say about the probability that you flipped the fair coin? Remarkably In order to make sense out of this outrageous claim, let's formalize the problem. The sample space is worked out in the tree diagram below. We do not know the probability that you were handed the fair coin initially- you were just given one coin or the other- o lets call that p result of event A: event B: 100 flips coin gIven given fair flipp to you coin? all heads? probab all heads X p/2 fair coin 1-12 ome tails 1-p trick coin 112 all heads Let a be the event that you were handed the fair coin, and let b be the event that you lipped 100 heads. Now, were looking for Pr(A I B), the probability that you were
8 Conditional Probability Step 4: Compute Event Probabilities Pr(A | B) = Pr(A ∩ B) Pr(B) 1 4 = 1 1 4 + 2 1 = 3 The first equation uses the Product Rule. On the second line, we use the fact that the probability of an event is the sum of the probabilities of the outcomes it contains. The final line is simplification. The probability that the fair coin was chosen, given that the result of the flip was heads, is 1/3. 2.2 A Variant of the Two Coins Problem Let’s consider a variant of the two coins problem. Someone hands you either a fair coin or a trick coin with heads on both sides. You flip the coin 100 times and see heads every time. What can you say about the probability that you flipped the fair coin? Remarkably— nothing! In order to make sense out of this outrageous claim, let’s formalize the problem. The sample space is worked out in the tree diagram below. We do not know the probability that you were handed the fair coin initially— you were just given one coin or the other— so let’s call that p. result of 100 flips 1/2100 1/2100 100 1−1/2 event A: given fair coin? event B: flipped all heads? coin given to you fair coin trick coin all heads some tails all heads probability X X X X 1−p p / 2 p − p / 2 100 100 p 1−p Let A be the event that you were handed the fair coin, and let B be the event that you flipped 100 heads. Now, we’re looking for Pr(A | B), the probability that you were
Conditional Probability handed the fair coin, given that you flipped 100 heads. The outcome probabilities are worked out in the tree diagram. Plugging the results into the definition of conditional probability gives Pr(AIB)=Pr(An B) Pr(B P/2 P+p/2 0(1-p)+P This expression is very small for moderate values of p because of the 2 00 term in the denominator. For example, if p=1 /2, then the probability that you were given the fair coin is essentially zero But we do not know the probability p that you were given the fair coin. And perhaps the value of p is not moderate; in fact, maybe p= 1-2-100. Then there is nearly an even chance that you have the fair coin, given that you flipped 100 heads. In fact, maybe you were handed the fair coin with probability p= 1. Then the probability that you were given the fair coin is, well, 1! a similar problem arises in polling before an election a pollster picks a random amer- ican and asks his or her party affiliation. If this process is repeated many times, what can be said about the population as a whole? To clarify the analogy, suppose that the country contains only two people. There is either one Republican and one Democrat (like the fair coin), or there are two Republicans (like the trick coin). The pollster picks a random cit izen 100 times, which is analogous to flipping the coin 100 times. Suppose that he picks a Republican every single time. We just showed that, even given this polling data, the probability that there is one citizen in each party could still be anywhere between 0 and What the pollster can say is that either 1. Something earth-shatteringly unlikely happened during the poll 2. There are two Republicans This is as far as probability theory can take us; from here, you must draw your own con- clusions. Based on life experience, many people would consider the second possibility more plausible. However, if you are just convinced that the country isnt entirely Repub lican(say, because you' re a citizen and a Democrat), then you might believe that the first possibility is actually more likely. 3 Medical Testing There is a deadly disease called X that has infected 10% of the population. There are no symptoms; victims just drop dead one day. Fortunately, there is a test for the disease. The
Conditional Probability 9 handed the fair coin, given that you flipped 100 heads. The outcome probabilities are worked out in the tree diagram. Plugging the results into the definition of conditional probability gives: Pr (A | B) = Pr (A ∩ B) Pr (B) p/2100 = 1 − p + p/2100 p = 2100(1 − p) + p This expression is very small for moderate values of p because of the 2100 term in the denominator. For example, if p = 1/2, then the probability that you were given the fair coin is essentially zero. But we do not know the probability p that you were given the fair coin. And perhaps the value of p is not moderate; in fact, maybe p = 1 − 2−100 . Then there is nearly an even chance that you have the fair coin, given that you flipped 100 heads. In fact, maybe you were handed the fair coin with probability p = 1. Then the probability that you were given the fair coin is, well, 1! A similar problem arises in polling before an election. A pollster picks a random American and asks his or her party affiliation. If this process is repeated many times, what can be said about the population as a whole? To clarify the analogy, suppose that the country contains only two people. There is either one Republican and one Democrat (like the fair coin), or there are two Republicans (like the trick coin). The pollster picks a random citizen 100 times, which is analogous to flipping the coin 100 times. Suppose that he picks a Republican every single time. We just showed that, even given this polling data, the probability that there is one citizen in each party could still be anywhere between 0 and 1! What the pollster can say is that either: 1. Something earthshatteringly unlikely happened during the poll. 2. There are two Republicans. This is as far as probability theory can take us; from here, you must draw your own conclusions. Based on life experience, many people would consider the second possibility more plausible. However, if you are just convinced that the country isn’t entirely Republican (say, because you’re a citizen and a Democrat), then you might believe that the first possibility is actually more likely. 3 Medical Testing There is a deadly disease called X that has infected 10% of the population. There are no symptoms; victims just drop dead one day. Fortunately, there is a test for the disease. The
Conditional Probability test is not perfect, however If you have the disease, there is a 10% chance that the test will say you do not. (These are called"false negatives") If you do not have disease, there is a 30% chance that the test will say you do. (These are"false pe slaves A random person is tested for the disease. If the test is positive, then what is the probability that the person has the disease? Step 1: Find the Sample Space The sample space is found with the tree diagram below. y pos person has X? 63 test result outcome event A: event B probability disease? positive? event a∩B? Step 2: Define Events of Interest Let a be the event that the person has the disease let b be the event that the test was positive. The outcomes in each event are marked in the tree diagram. We want to find Pr(A B), the probability that a person has disease X, given that the test was positive Step 3: Find Outcome Probabilities First, we assign probabilities to edges. These probabilities are drawn directly from the problem statement. By the Product Rule, the probability of an outcome is the product of the probabilities on the corresponding root-to-leaf path all probabilities are shown the figure
10 Conditional Probability test is not perfect, however: • If you have the disease, there is a 10% chance that the test will say you do not. (These are called “false negatives”.) • If you do not have disease, there is a 30% chance that the test will say you do. (These are “false positives”.) A random person is tested for the disease. If the test is positive, then what is the probability that the person has the disease? Step 1: Find the Sample Space The sample space is found with the tree diagram below. person has X? test result outcome probability event A B? yes no pos neg pos neg .1 .9 .9 .1 .3 .7 .09 .01 .27 .63 event A: event B: has disease? test positive? Step 2: Define Events of Interest Let A be the event that the person has the disease. Let B be the event that the test was positive. The outcomes in each event are marked in the tree diagram. We want to find Pr (A | B), the probability that a person has disease X, given that the test was positive. Step 3: Find Outcome Probabilities First, we assign probabilities to edges. These probabilities are drawn directly from the problem statement. By the Product Rule, the probability of an outcome is the product of the probabilities on the corresponding roottoleaf path. All probabilities are shown in the figure