Massachusetts Institute of Technology: Genetics lecture notes (lecture4.pdf)

HANDBOOK for PROBABILITY CALCULATIONS

Many problems in diploid genetics rely on basic concepts of probability. This is because each individual inherits at random only one of two possible copies of a gene from each parent. Thus, breeding experiments or inheritance in human pedigrees have probabilistic rather than absolute outcomes. Everyone has an intuitive sense of probability, but what we need is a precise definition that will allow probabilities to be manipulated quantitatively.

Probabilities are usually defined in terms of possible outcomes of a trial. A trial could be the toss of a coin, the roll of a die, or two parents having a child. If we define a specific event a, then p(a), the probability of a, can be defined as follows: after a very large number of trials, p(a) is simply the fraction of trials that give outcome a.

In principle, we could determine p(a) by actually performing a large number of trials and directly measuring the fraction of trials that produce event a. This is sometimes called the "Monte Carlo method", named after a famous European casino, and it works well for computer simulations of complicated phenomena. However, in many cases there is a much simpler way to calculate probabilities. To directly calculate classical probabilities one must know enough about a process to break down the possible outcomes of a trial into some number of equally probable events. In these cases the probability of event a is:

$$p(a) = \frac{n_a}{N}$$

where $n_a$ is the number of outcomes that satisfy the criteria for a and $N$ is the total number of equally probable outcomes. Note that since $N$ includes all possible outcomes, $n_a \le N$ and $0 \le p(a) \le 1$.

Example: A couple has two children. What is the probability that they are both girls? Assuming that the chances of having a boy or a girl are equal, there are 4 equally probable ways of having two children (boy, boy; girl, boy; boy, girl; girl, girl) and the probability of two girls is 1/4 or 0.25.
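The two routes to p(a) described above, counting equally probable outcomes and running a large number of trials, can be compared directly on the two-girls example. The following Python sketch is illustrative only and is not part of the original handout; the function names `classical_probability` and `monte_carlo_probability`, the number of trials, and the random seed are assumptions chosen for this example.

```python
import random
from fractions import Fraction
from itertools import product


def classical_probability(outcomes, event):
    """Return n_a / N for a collection of equally probable outcomes."""
    outcomes = list(outcomes)
    n_a = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(n_a, len(outcomes))


def monte_carlo_probability(trial, event, n_trials=100_000, seed=0):
    """Estimate p(a) as the fraction of simulated trials that give event a."""
    rng = random.Random(seed)  # fixed seed (an assumption) for repeatability
    hits = sum(1 for _ in range(n_trials) if event(trial(rng)))
    return hits / n_trials


def two_children(rng):
    """One trial: a couple has two children, each equally likely boy or girl."""
    return (rng.choice(["boy", "girl"]), rng.choice(["boy", "girl"]))


def both_girls(family):
    return all(child == "girl" for child in family)


# The 4 equally probable families: (boy, boy), (boy, girl), (girl, boy), (girl, girl)
families = list(product(["boy", "girl"], repeat=2))

exact = classical_probability(families, both_girls)
estimate = monte_carlo_probability(two_children, both_girls)

print(exact)     # 1/4
print(estimate)  # an approximation that should lie close to 0.25
```

The exact count needs only four outcomes, while the simulation needs many trials to get close to the same value. That is the sense in which the classical calculation is the simpler method when the equally probable outcomes are known.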
For classical probability problems you will always be able to arrive at the correct answer by writing out all of the possible outcomes of a trial and counting the fraction of outcomes that satisfy the criteria for a given event. Often, enumerating all of the outcomes for a trial is time-consuming and error-prone. It is usually faster and easier to break a problem down into simple parts and then to combine the probabilities for the individual parts. The following are useful ways that probabilities can be combined to speed probability calculations.

PRODUCT RULE

p(a and b) = p(a) × p(b) if a and b are independent.

Two events are considered independent if they do not influence one another. The criterion of independence is very important: applying the product rule to events that are not independent will give an incorrect answer.
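The warning about independence can be checked by counting outcomes. The Python sketch below is an illustration, not part of the original handout. The coin and die events (`first_heads`, `second_heads`, `is_even`, `is_high`) and the helper `prob` are assumptions chosen for the example. Two coin tosses give independent events, while "even" and "greater than 3" on a single die roll influence one another.

```python
from fractions import Fraction
from itertools import product


def prob(outcomes, event):
    """Classical probability: fraction of equally probable outcomes in the event."""
    n_a = sum(1 for outcome in outcomes if event(outcome))
    return Fraction(n_a, len(outcomes))


# Independent events: the results of two separate coin tosses.
tosses = list(product("HT", repeat=2))


def first_heads(toss):
    return toss[0] == "H"


def second_heads(toss):
    return toss[1] == "H"


joint_coins = prob(tosses, lambda t: first_heads(t) and second_heads(t))
product_coins = prob(tosses, first_heads) * prob(tosses, second_heads)

# Events that are NOT independent: two properties of the same die roll.
rolls = [1, 2, 3, 4, 5, 6]


def is_even(roll):
    return roll % 2 == 0


def is_high(roll):
    return roll > 3


joint_die = prob(rolls, lambda r: is_even(r) and is_high(r))
product_die = prob(rolls, is_even) * prob(rolls, is_high)

print(joint_coins, product_coins)  # 1/4 1/4  (product rule holds)
print(joint_die, product_die)      # 1/3 1/4  (product rule gives the wrong answer)
```

For the die, only the rolls 4 and 6 satisfy both criteria, so the true value is 2/6 = 1/3, whereas multiplying 1/2 × 1/2 would give 1/4.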

Examples: To find the probability that a couple with three children have three boys, we first note that the sex of one child has no influence on the sex of another, so the sexes of the three children constitute independent events. For each child, p(boy) = 1/2, and by the product rule p(3 boys) = 1/2 × 1/2 × 1/2 = 1/8.

Next, consider a cross where both parents are heterozygous for recessive mutations in two unlinked genes, A and B, and find the probability that one of their progeny expresses both recessive traits. First, for a recessive trait to be expressed the progeny must inherit the recessive allele from both the mother and the father. Since the probability of inheriting a given allele from a heterozygote is 1/2, p(mutant from mother and mutant from father) = 1/2 × 1/2 = 1/4. Second, since unlinked genes are inherited independently, we can use the product rule again to calculate p(recessives at gene A and recessives at gene B) = 1/4 × 1/4 = 1/16.

SUM RULE

The probability that either a or b will occur can be written as p(a or b). If two events a and b cannot both occur, they are mutually exclusive and the number of outcomes that satisfy a or b is $n_a + n_b$. It should be apparent from our definition of probability that:

$$p(a \text{ or } b) = \frac{n_a + n_b}{N} = \frac{n_a}{N} + \frac{n_b}{N} = p(a) + p(b)$$

A useful special case of the sum rule arises when we consider p(not a). By definition, a and not a are mutually exclusive and they encompass all possible outcomes. Thus:

$$p(a \text{ or not } a) = 1 = p(a) + p(\text{not } a) \quad \text{and} \quad p(\text{not } a) = 1 - p(a)$$

Examples: Find the probability that a family with three children has at least one girl. We begin by noting that instead of trying to count all possible families with at least one girl it is easier to realize that p(at least one girl) is the same as p(not all boys). Since p(all boys) = 1/8, p(not all boys) = 1 - 1/8 = 7/8 = p(at least one girl).

In a cross where both parents are heterozygous for recessive mutations in two unlinked genes, what is the probability that one of their progeny will express at least one of the dominant traits? p(at least one dominant) = 1 - p(both recessive), and from above, p(both recessive) = 1/16. Therefore p(at least one dominant) = 1 - 1/16 = 15/16.
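The "at least one girl" shortcut can be confirmed by writing out all three-child families and counting, which is the slower method the complement rule avoids. This Python sketch is illustrative and not part of the original handout; the variable names are assumptions, and boys and girls are taken to be equally likely, as in the text.

```python
from fractions import Fraction
from itertools import product

# All 2 x 2 x 2 = 8 equally probable three-child families.
families = list(product(["boy", "girl"], repeat=3))

# Product rule: the sexes of the three children are independent events.
p_all_boys = Fraction(1, 2) * Fraction(1, 2) * Fraction(1, 2)

# Complement rule: p(not a) = 1 - p(a).
p_at_least_one_girl = 1 - p_all_boys

# Direct count of the families that contain at least one girl.
n_with_girl = sum(1 for family in families if "girl" in family)
p_counted = Fraction(n_with_girl, len(families))

print(p_all_boys)           # 1/8
print(p_at_least_one_girl)  # 7/8
print(p_counted)            # 7/8
```

Only one of the eight families (boy, boy, boy) lacks a girl, so the direct count and the complement rule agree.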
In cases where two events a and b are independent but not mutually exclusive, we can still calculate p(a or b). In this case we note that the two events a and (b and not a) are mutually exclusive and encompass all outcomes that satisfy a or b or both. For these mutually exclusive events we can apply the sum rule. Thus:

$$p(a \text{ or } b) = p(a \text{ or } [b \text{ and not } a]) = p(a) + p(b \text{ and not } a)$$

Since b and (not a) are independent:

$$p(a) + p(b \text{ and not } a) = p(a) + p(b) \times p(\text{not } a) = p(a) + p(b) \times [1 - p(a)] = p(a) + p(b) - [p(a) \times p(b)]$$

Note that the subtracted term p(a) × p(b) is p(a and b) for independent events. In the case where a and b are mutually exclusive, p(a and b) = 0 and the formula reduces to the sum rule.

Example: We can use this formula as another way to solve the last example, which is a case in which the two events are independent but not mutually exclusive.

$$\begin{aligned}
p(\text{at least one dominant}) &= p(\text{dominant at gene A or dominant at gene B}) \\
&= p(\text{dominant at gene A}) + p(\text{dominant at gene B}) - [p(\text{dominant at gene A}) \times p(\text{dominant at gene B})] \\
&= \frac{3}{4} + \frac{3}{4} - \left[\frac{3}{4} \times \frac{3}{4}\right] = \frac{6}{4} - \frac{9}{16} = \frac{15}{16}
\end{aligned}$$
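The 15/16 result can also be checked by enumerating the cross itself. The Python sketch below is an illustration, not part of the original handout. It assumes both parents have genotype Aa Bb, with uppercase letters marking the dominant alleles, and that each parent passes on either allele of each gene with probability 1/2. The allele labels and the function names `p_or_independent`, `dominant_at_A` and `dominant_at_B` are assumptions made for this example.

```python
from fractions import Fraction
from itertools import product


def p_or_independent(p_a, p_b):
    """p(a or b) = p(a) + p(b) - p(a) x p(b), valid for independent a and b."""
    return p_a + p_b - p_a * p_b


# Each offspring is one of 16 equally probable allele combinations:
# (maternal A allele, paternal A allele, maternal B allele, paternal B allele)
offspring = list(product("Aa", "Aa", "Bb", "Bb"))


def dominant_at_A(child):
    # The dominant trait shows if at least one dominant allele was inherited.
    return "A" in child[:2]


def dominant_at_B(child):
    return "B" in child[2:]


def prob(event):
    n_a = sum(1 for child in offspring if event(child))
    return Fraction(n_a, len(offspring))


p_dom_A = prob(dominant_at_A)
p_dom_B = prob(dominant_at_B)

by_formula = p_or_independent(p_dom_A, p_dom_B)
by_counting = prob(lambda c: dominant_at_A(c) or dominant_at_B(c))

print(p_dom_A, p_dom_B)  # 3/4 3/4
print(by_formula)        # 15/16
print(by_counting)       # 15/16
```

Of the 16 combinations, only the one that is recessive at both genes fails to show a dominant trait, which matches both the complement-rule answer and the formula for independent, non-exclusive events.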