6.042/18.] Mathematics for Computer Science April 22, 2005 Srini devadas and Eric Lehman Notes for recitation 18 The Law of Total Probability is a handy tool for breaking down the computation of a pro ability into distinct cases. More precisely, suppose we are interested in the probability of an event E: Pr(E). Suppose also that the random experiment can evolve in two different ways; that is, two different cases X and X are possible. Suppose also that it is easy to find the probability of each case: Pr(X)and Pr(X it easy to find the probability of the event in each case: Pr(E I X)and Pr(EI X) Then finding the probability of E is only two multiplications and an addition away Theorem 1(Law of Total Probability). Let e and X be events, and 0< Pr(x)<1. Then Pr(e) Pr(X)·Pr(E|X)+Pr(X)·Pr(E Proof. Let's simplify the right-hand side P(E|X)·Pr(X)+Pr(E|X)·Pr(X) Pr(E∩X) Pr(E∩X) Pr(X)+ E∩X)+Pr(E∩X) Pr(E) The first step uses the definition of conditional probability. On the next-to-last line, were adding the probabilities of all outcomes in E and X to the probabilities of all outcomes in E and not in X. Since every outcome in E is either in X or not in X, this is the sum of the probabilities of all outcomes in E, which equals Pr(E) What happens if the experiment can evolve in more than two different ways? That is, what if there are n cases, X1,..., Xn, which are mutually exclusive(no two cases can hap pen simultaneously) and collectively exhaustive (at least one case must happen? If it is still easy to find the probability of each case and the probability of the event in each case, then again finding Pr(E)is trivial Theorem 2. Let e be an event and let X1,..., Xn be disjoint events whose union exhausts the sample space. The P(E)=∑P(E|x)Pr(X) provided that Pr(X)≠0
� � � � � � � � � � � � � � � 6.042/18.062J Mathematics for Computer Science April 22, 2005 Srini Devadas and Eric Lehman Notes for Recitation 18 The Law of Total Probability is a handy tool for breaking down the computation of a probability into distinct cases. More precisely, suppose we are interested in the probability of an event E: Pr (E). Suppose also that the random experiment can evolve in two different ways; that is, two different cases X and X are possible. Suppose also that • it is easy to find the probability of each case: Pr (X) and Pr X , • it easy to find the probability of the event in each case: Pr (E | X) and Pr E X | . Then finding the probability of E is only two multiplications and an addition away. Theorem 1 (Law of Total Probability). Let E and X be events, and 0 < Pr (X) < 1. Then Pr (E) = Pr (X) · Pr (E | X) + Pr X · Pr E | X Proof. Let’s simplify the righthand side. Pr (E | X) · Pr (X) + Pr E X | · Pr X � � Pr E ∩ X � = Pr (E ∩ X) · Pr (X) + � � Pr � X Pr (X) Pr X · = Pr (E ∩ X) + Pr E ∩ X = Pr (E) The first step uses the definition of conditional probability. On the nexttolast line, we’re adding the probabilities of all outcomes in E and X to the probabilities of all outcomes in E and not in X. Since every outcome in E is either in X or not in X, this is the sum of the probabilities of all outcomes in E, which equals Pr (E). What happens if the experiment can evolve in more than two different ways? That is, what if there are n cases, X1, . . . , Xn, which are mutually exclusive (no two cases can happen simultaneously) and collectively exhaustive (at least one case must happen)? If it is still easy to find the probability of each case and the probability of the event in each case, then again finding Pr (E) is trivial. Theorem 2. Let E be an event and let X1, . . . , Xn be disjoint events whose union exhausts the sample space. Then n Pr (E) = Pr (E | Xi) · Pr (Xi) i=1 provided that Pr (Xi) =� 0
Recitation 18 Problem 1. There is a rare and deadly disease called nerditosis which afflicts about 1 per son in 1000. On symption is a compulsion to refer to everything- fields of study, classes, buildings, etc- using numbers. It's horrible. As victims enter their final, downward spiral, theyre awarded a degree from MIT. Two doctors claim that they can diagnose Nerditosis (a) Doctor X received his degree from Harvard Medical School. He practices at Mas- sachusetts General Hospital and has access to the latest scanners, lab tests, and re- search. Suppose you ask Doctor X whether you have the disease If you have Nerditosis, he says"yes"with probability 0.99 If you dont have it, he says"no"with probability 0.97 Let d be the event that you have the disease, and let e be the event that the diag- nosis is erroneous. Use the Total Probability Law to compute Pr(E), the probability that Doctor X makes a mistake Solution. By the Total Probability Law Pr(E)=Pr(EI D). Pr(D)+Pr(EID). Pr(D =0.01.0.001+0.03·0.999 (b)" Doctor"Y received his genuine degree from a fully-accredited university for $49.95 via a special internet offer. He knows that Nerditosis stikes 1 person in 1000, but is a little shakey on how to interpret this. So if you ask him whether you have the disease, he'll helpfully say"yes" with probability 1 in 1000 regardless of whether you actually do or not Let d be the event that you have the disease, and let F be the event that the diag nosis is faulty. Use the Total Probability Law to compute Pr(F), the probability that Doctor y made a mistake Solution by the Total Probability Law Pr(F)=Pr(FI D). Pr(D)+Pr(FID). PrD 0.999·0.001+0.001.0.999 0.001998 (c) Which doctor is more reliable? Solution. Doctor X makes more than 15 times as many errors as doctor y
� � � � � � � � Recitation 18 2 Problem 1. There is a rare and deadly disease called Nerditosis which afflicts about 1 person in 1000. On symption is a compulsion to refer to everything— fields of study, classes, buildings, etc.— using numbers. It’s horrible. As victims enter their final, downward spiral, they’re awarded a degree from MIT. Two doctors claim that they can diagnose Nerditosis. (a) Doctor X received his degree from Harvard Medical School. He practices at Massachusetts General Hospital and has access to the latest scanners, lab tests, and research. Suppose you ask Doctor X whether you have the disease. • If you have Nerditosis, he says “yes” with probability 0.99. • If you don’t have it, he says “no” with probability 0.97. Let D be the event that you have the disease, and let E be the event that the diagnosis is erroneous. Use the Total Probability Law to compute Pr (E), the probability that Doctor X makes a mistake. Solution. By the Total Probability Law: Pr (E) = Pr (E | D) · Pr (D) + Pr E D | · Pr D = 0.01 · 0.001 + 0.03 · 0.999 = 0.02998 (b) “Doctor” Y received his genuine degree from a fullyaccredited university for $49.95 via a special internet offer. He knows that Nerditosis stikes 1 person in 1000, but is a little shakey on how to interpret this. So if you ask him whether you have the disease, he’ll helpfully say “yes” with probability 1 in 1000 regardless of whether you actually do or not. Let D be the event that you have the disease, and let F be the event that the diagnosis is faulty. Use the Total Probability Law to compute Pr (F), the probability that Doctor Y made a mistake. Solution. By the Total Probability Law: Pr (F) = Pr (F D| ) · Pr (D) + Pr F | D · Pr D = 0.999 · 0.001 + 0.001 · 0.999 = 0.001998 (c) Which doctor is more reliable? Solution. Doctor X makes more than 15 times as many errors as Doctor Y
Recitation 18 Problem 2. a Barglesnort makes its lair in one of three caves The Barglesnort inhabits cave 1 with probability ,, cave 2 with probability 1, and cave 3 with probability. a rabbit subsequently moves into one of the two unoccupied caves selected with equal probability. With probability 3, the rabbit leaves tracks at the entranc to its cave. (Barglesnorts are much too clever to leave tracks )What is the probability that the Barglesnort lives in cave 3, given that there are no tracks in front of cave 2? Use a tree diagram and the four-step method Solution. A tree diagram is given below. Let B be the event that the Barglesnort inhabits cave 3, and let T2 be the event that there are tracks in front of cave 2. Taking data from the tree diagram, we can compute the desired probability as follows: Pt(B1|72)=P(B3n 5 In the denominator, we apply the formula Pr(T2)=1-Pr(T2)for convenience Rabbits prob. Bargle's yes-1/2
� � � � � � Recitation 18 3 Problem 2. A Barglesnort makes its lair in one of three caves: 1 2 3 The Barglesnort inhabits cave 1 with probability 1 2 , cave 2 with probability 1 , and cave 3 4 with probability 1 . A rabbit subsequently moves into one of the two unoccupied caves, 4 selected with equal probability. With probability 1 , the rabbit leaves tracks at the entrance 3 to its cave. (Barglesnorts are much too clever to leave tracks.) What is the probability that the Barglesnort lives in cave 3, given that there are no tracks in front of cave 2? Use a tree diagram and the fourstep method. Solution. A tree diagram is given below. Let B3 be the event that the Barglesnort inhabits cave 3, and let T2 be the event that there are tracks in front of cave 2. Taking data from the tree diagram, we can compute the desired probability as follows: Pr B3 | T2 = Pr B � 3 ∩ � T2 Pr T2 1 1 1 24 12 12 + + = 1 1 1 − 12 − 24 5 = 21 In the denominator, we apply the formula Pr T2 = 1 − Pr (T2) for convenience. 1 2 3 1 3 3 2 1 2 Bargle’s lair Rabbit’s hole rabbit tracks? no yes 1/2 1/2 1/4 1/4 1/2 1/2 1/2 1/2 1/2 1/3 1/3 1/3 1/3 1/3 1/3 2/3 2/3 2/3 2/3 2/3 2/3 prob. 1/12 1/12 1/6 1/6 1/24 1/12 1/24 1/12 1/24 1/12 1/24 1/12 Bargle yes yes yes yes in 3? Tracks at 2? yes yes
Recitation 18 Problem 3. There is a deck of cards on the table. Either John or Mary shuffled it and we have no reason to believe in one case more than the other. Now John is a well-known cheater with well-known preferences: he always steals the ace of diamonds while shuf fling. Mary, on the other hand, is a very honest girl: a deck suffled by her is always a full card deck do any calculations: Who is more likely to have shuffled the deck ExPlain (a)You pick the topmost card on the deck and you see a queen of hearts. Before you Solution. A shuffling by John strictly increases the fraction of non-aces in the deck Hence, between the two worlds (1) the world where John has shuffled the deck and (2) the world where Mary has shuffled the deck it is the first world rather than the second one that favors the event of the topmost card being a non-ace. Since we know this event is a fact and the two worlds are otherwise equally likely, we should bet we live in world (1) (b)Now calculate What is the probability that John has shuffled the deck? What is the probability that it has been Mary? Solution. Let J be the event that John shuffles the deck and A that the topmost card is an ace. We want the probabilities Pr(J|A) and Pr(M| A) J and therefore Pr(M A)= Pr(J| A)=1-Pr(J| A), so that we need only calculate the probability about John. By the definition of conditional probability(first equation) and then the product rule(on the enumerator)and the law of total probability(on the denominator), we know Pr(J∩A) Pr(J)- Pr(AlJ) Pr(④)Pr(J)·Pr(A|J+Pr(M)Pr(A|M) and everything in this last fraction is known (J|A)=2=15 52+51103 which is(slightly, but) strictly greater than 2, as expected Like John, Peter is also a well-known cheater: when he shuffles the deck, he also steals a card from it; but (unlike John) he steals a random card. That is, every card is equally likely to be stolen when Peter is shuffling
� � � � � � � � � � � � � � � � Recitation 18 4 Problem 3. There is a deck of cards on the table. Either John or Mary shuffled it and we have no reason to believe in one case more than the other. Now, John is a wellknown cheater with wellknown preferences: he always steals the ace of diamonds while shuf fling. Mary, on the other hand, is a very honest girl: a deck suffled by her is always a full 52card deck. (a) You pick the topmost card on the deck and you see a queen of hearts. Before you do any calculations: Who is more likely to have shuffled the deck? Explain. Solution. A shuffling by John strictly increases the fraction of nonaces in the deck. Hence, between the two worlds: (1) the world where John has shuffled the deck and (2) the world where Mary has shuffled the deck it is the first world rather than the second one that favors the event of the topmost card being a nonace. Since we know this event is a fact and the two worlds are otherwise equally likely, we should bet we live in world (1). (b) Now calculate. What is the probability that John has shuffled the deck? What is the probability that it has been Mary? Solution. Let J be the event that John shuffles the deck and A that the topmost card is an ace. We want the probabilities Pr J | A and Pr M | A Clearly, M = J and therefore Pr M | A = Pr J | A = 1 − Pr J A | , so that we need only calculate the probability about John. By the definition of conditional probability (first equation) and then the product rule (on the enumerator) and the law of total probability (on the denominator), we know � � � Pr (J) � · Pr A | J Pr J | A = Pr J � ∩ � A = Pr (J) Pr A | J + Pr (M) Pr � A M � Pr A · · | and everything in this last fraction is known: 1 48 1 51 1 51 + 1 52 52 52 = = 52 + 51 103 51 · 2 Pr J | A = 1 48 1 48 = · + · 2 51 2 52 1 which 2 is (slightly, but) strictly greater than , as expected. Like John, Peter is also a wellknown cheater: when he shuffles the deck, he also steals a card from it; but (unlike John) he steals a random card. That is, every card is equally likely to be stolen when Peter is shuffling
Recitation 18 (c)Suppose you know that Mary shuffled the deck and you are about to pick the topmost card. what is the probability that you will see an ace? Solution. If we know Mary shuffled the deck, we know the deck if a full 52-card deck. So, easily, the probability is 5? (d) Suppose you know that Peter shuffled the deck and you are about to pick the topmost card. What is the probability that you will see an ace?(Hint: What is this probability if Peter steals an ace? What if Peter steals a non-ace Solution. Suppose Peter shuffles the deck. Then there are two cases about what card he steals: it's either an ace or a non-ace. let sa be the event that he steals an ace. Since he steals a card at random, we know he steals an ace with probability Pr(Sa)=2 and a non-ace with probability Pr(SA)=52 Now, as before, let A be the event that the topmost card is an ace. If we know what case we are in with respect to the stolen card, it is easy to calculate the probability 小: Pr(AISA=3 and Pr(AISA So, we know the probability in each case and we also know the probability of each case. This rings the bell of the law of total probability Pr(4)=Pr(SA)Pr(4|SA)+P(SA)Pr(A|S4)=是·晶+盏·==是 So the probability that the topmost card is an ace is a2 (e) Anything strange with the answers to parts(c)and( d)? Solution The two answers are identical. In other words whether the deck is miss- ing a card or not does not affect the probability that the topmost card is an ace! How can that be? Here is an explanation. The situation of part(c)is identical to the following what is the probability that the first card is an ace?Ck we pick the top two cards of a well-shuffled 52-card deck; (because the selection of the second card is irrelevant). Similarly, the situation of part(d)is identical to the following we pick the top two cards of a well-shuffled 52-card deck That is the probability that the second card (because the effect of Peter stealing a card at random and shuffling is identical to the effect of us drawing the topmost card). Now the two situations we have just introduced are identical, because the number of pairs where the first card is an ace is equal to the number of pairs where the second card is an ace, for obvious reasons of symmetry
� � � � � � Recitation 18 5 (c) Suppose you know that Mary shuffled the deck and you are about to pick the topmost card. What is the probability that you will see an ace? Solution. If we know Mary shuffled the deck, we know the deck if a full 52card 4 deck. So, easily, the probability is . 52 (d) Suppose you know that Peter shuffled the deck and you are about to pick the topmost card. What is the probability that you will see an ace? (Hint: What is this probability if Peter steals an ace? What if Peter steals a nonace?) Solution. Suppose Peter shuffles the deck. Then there are two cases about what card he steals: it’s either an ace or a nonace. Let SA be the event that he steals an ace. Since he steals a card at random, we know he steals an ace with probability 4 48 Pr (SA) = 52 and a nonace with probability Pr SA = . 52 Now, as before, let A be the event that the topmost card is an ace. If we know what case we are in with respect to the stolen card, it is easy to calculate the probability of A: 3 � � 4 Pr (A | SA) = and Pr A | SA = 51 51 So, we know the probability in each case and we also know the probability of each case. This rings the bell of the law of total probability: 4 3 4 4·(3+48) = 4 Pr (A) = Pr (SA) · Pr (A | SA) + Pr SA Pr A | SA = 52 · 51 + 48 = . 52 · 51 52·51 52 · 4 So the probability that the topmost card is an ace is . 52 (e) Anything strange with the answers to parts (c) and (d)? Solution. The two answers are identical. In other words, whether the deck is missing a card or not does not affect the probability that the topmost card is an ace! How can that be? Here is an explanation. The situation of part (c) is identical to the following: we pick the top two cards of a wellshuffled 52card deck; what is the probability that the first card is an ace? (because the selection of the second card is irrelevant). Similarly, the situation of part (d) is identical to the following: we pick the top two cards of a wellshuffled 52card deck; what is the probability that the second card is an ace? (because the effect of Peter stealing a card at random and shuffling is identical to the effect of us drawing the topmost card). Now the two situations we have just introduced are identical, because the number of pairs where the first card is an ace is equal to the number of pairs where the second card is an ace, for obvious reasons of symmetry