6.042/18.062J Mathematics for Computer Science                    March 15, 2005
Srini Devadas and Eric Lehman                                      Lecture Notes

Sums, Approximations, and Asymptotics II

1 Block Stacking

How far can a stack of identical blocks overhang the end of a table without toppling over? Can a block be suspended entirely beyond the table's edge?

[Figure: a stack of blocks overhanging the edge of a table.]

Physics imposes some constraints on the arrangement of the blocks. In particular, the stack falls off the desk if its center of mass lies beyond the desk's edge. Moreover, the center of mass of the top k blocks must lie above the (k + 1)st block; otherwise, the top k would fall over.

In order to find the best configuration of blocks satisfying these constraints, we'll need a fact about centers of mass.

Fact 1. If two objects have masses m_1 and m_2 and centers of mass at positions z_1 and z_2, then the center of mass of the two objects together is at position:

    (z_1 m_1 + z_2 m_2) / (m_1 + m_2)

Define the offset of a particular configuration of blocks to be the horizontal distance from its center of mass to its rightmost edge.

[Figure: a stack of blocks on a table; the offset is the horizontal distance from the stack's center of mass to its rightmost edge.]
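Fact 1 is easy to play with numerically. Here is a minimal Python sketch (ours, not part of the original notes; the function name is arbitrary) that computes the combined center of mass of two objects:

```python
def combined_center_of_mass(m1, z1, m2, z2):
    """Center of mass of two objects with masses m1, m2 at positions z1, z2 (Fact 1)."""
    return (z1 * m1 + z2 * m2) / (m1 + m2)

# Two equal masses at positions 0 and 1 balance halfway between them.
print(combined_center_of_mass(1, 0.0, 1, 1.0))   # 0.5
# A heavier object pulls the combined center of mass toward itself.
print(combined_center_of_mass(3, 0.0, 1, 1.0))   # 0.25
```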
The offset determines precisely how far that configuration can extend beyond the desk, since at best the center of mass lies right above the desk's edge. So hereafter we can focus on maximizing the offset of a stack of n blocks rather than the overhang. As a result, we need only be concerned with whether the stack is internally stable and not with whether the whole configuration is too far to the right and falls off the table.

We can establish the greatest possible offset of a stack of n blocks with an inductive argument. This is an instance where induction not only serves as a proof technique, but also turns out to be a nice tool for reasoning about the problem.

Theorem 1. The greatest possible offset of a stack of n ≥ 1 blocks is:

    X_n = 1/2 + 1/4 + 1/6 + ... + 1/(2n)

Proof. We use induction on n, the number of blocks. Let P(n) be the proposition that the greatest possible offset of a stack of n ≥ 1 blocks is 1/2 + 1/4 + ... + 1/(2n).

Base case: For a single block, the center of mass is distance X_1 = 1/2 from its rightmost edge. So the offset is 1/2 and P(1) is true.

Inductive step: For n ≥ 2, we assume that P(n − 1) is true in order to prove P(n). A stack of n blocks consists of the bottom block together with a stack of n − 1 blocks on top. In order to achieve the greatest possible offset with n blocks, the top n − 1 blocks should themselves have the greatest possible offset, which is X_{n−1}; otherwise, we could do better by replacing the top n − 1 blocks with a different configuration that has greater offset. Furthermore, the center of mass of the top n − 1 blocks should lie directly above the right edge of the bottom block; otherwise, we could do better by sliding the top n − 1 blocks farther to the right.

[Figure: the top n − 1 blocks rest on the bottom block; their center of mass is at distance X_{n−1} from their rightmost edge, which is X_{n−1} + 1/2 from the rightmost edge of the whole stack.]

We've now identified the configuration of n blocks with the greatest possible offset. What remains is to determine what that offset is. To that end, we'll apply Fact 1, where positions are measured relative to the rightmost edge of the n-block stack. In particular, we have n − 1 blocks with center of mass at position X_{n−1}, and 1 block with center of mass at position X_{n−1} + 1/2.
So Fact 1 implies that the maximum possible offset of a stack of n blocks is:

    X_n = [X_{n−1} · (n − 1) + (X_{n−1} + 1/2) · 1] / n
        = X_{n−1} + 1/(2n)
        = 1/2 + 1/4 + 1/6 + ... + 1/(2n)

We use the assumption P(n − 1) in the last step. This proves P(n). The theorem follows by the principle of induction.

1.1 Harmonic Numbers

Sums similar to the one in Theorem 1 come up all the time in computer science. In particular,

    1/1 + 1/2 + 1/3 + ... + 1/n

is called a harmonic sum. Its value is called the nth harmonic number and is denoted H_n. In these terms, the greatest possible offset of a stack of n blocks is (1/2) H_n.

We can tabulate the maximum achievable overhang with n = 1, 2, 3, and 4 blocks by computing the first few harmonic numbers:

    # of blocks    maximum overhang
        1          (1/2) H_1 = (1/2)(1/1)                    = 1/2
        2          (1/2) H_2 = (1/2)(1/1 + 1/2)              = 3/4
        3          (1/2) H_3 = (1/2)(1/1 + 1/2 + 1/3)        = 11/12
        4          (1/2) H_4 = (1/2)(1/1 + 1/2 + 1/3 + 1/4)  = 25/24 > 1

The last line reveals that we can suspend the fourth block beyond the edge of the table! Here's the configuration that does the trick:

[Figure: a stack of four blocks in which successive blocks are offset by 1/2, 1/4, 1/6, and 1/8 of a block width from top to bottom, so the top block lies entirely beyond the table's edge.]
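The table above is easy to reproduce. The following Python sketch (ours, not from the notes) computes H_n exactly with rationals and confirms that four blocks are the fewest whose maximum overhang exceeds one block width:

```python
from fractions import Fraction

def harmonic(n):
    """The nth harmonic number H_n = 1/1 + 1/2 + ... + 1/n, computed exactly."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in range(1, 5):
    overhang = harmonic(n) / 2          # maximum overhang with n blocks, in block widths
    print(n, overhang, float(overhang))
# n = 4 gives 25/24 > 1, so the top block can be suspended entirely beyond the edge.
```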
1.2 Bounding a Sum with an Integral

We need to know more about harmonic sums to determine what can be done with a large number of blocks. Unfortunately, there is no closed form for H_n. But, on the bright side, we can get good lower and upper bounds on harmonic sums using a general technique involving integration that applies to many other sums as well.

Here's how it works. First, we imagine a bar graph where the area of the kth bar is equal to the kth term in the sum. In particular, each bar has width 1 and height equal to the value of the kth term. For example, the bar graph corresponding to the harmonic sum

    H_n = 1/1 + 1/2 + 1/3 + ... + 1/n

is shown below.

[Figure: a bar graph over the interval from 0 to n; the bars have width 1 and heights 1, 1/2, 1/3, 1/4, ..., 1/n.]

Now the value of the sum is equal to the area of the bars, and we can estimate the area of the bars using integration. Suppose we draw a smooth curve that runs just below the bars; in fact, there's a natural choice: the curve described by the function y = 1/(x + 1).

[Figure: the same bar graph with the curve y = 1/(x + 1) drawn just beneath the tops of the bars.]
The area of the bars is at least the area under this curve, so we have a lower bound on the nth harmonic sum:

    H_n = 1/1 + 1/2 + 1/3 + ... + 1/n
        ≥ ∫_0^n 1/(x + 1) dx
        = ln(n + 1)

Remember that n blocks can overhang the edge of a table by (1/2) H_n block widths. So if we had, say, a million blocks, then this lower bound implies that we could achieve an overhang of at least

    ln(1,000,000 + 1) / 2 = 6.907...

block widths! In fact, since the lower bound of (1/2) ln(n + 1) grows arbitrarily large, there is no limit on how far the stack can overhang. Of course, this assumes no breeze, deformation of the blocks, or gravitational variation as our stack grows thousands of miles high.
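As a quick check of the million-block figure, here is a small Python sketch (ours; overhang is measured in block widths) comparing the exact overhang (1/2) H_n against the lower bound (1/2) ln(n + 1):

```python
import math

def harmonic_float(n):
    """H_n as a float; summing the smallest terms first reduces rounding error."""
    return sum(1.0 / k for k in range(n, 0, -1))

n = 1_000_000
overhang = harmonic_float(n) / 2        # maximum overhang with n blocks
lower = math.log(n + 1) / 2             # the integral lower bound derived above
print(overhang)                          # about 7.20 block widths
print(lower)                             # about 6.91 block widths
assert overhang >= lower
```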
We can get an upper bound on the nth harmonic number by playing essentially the same game. Now we need a curve that skims just above the bar graph. The curve defined by y = 1/x fits the bill.

[Figure: the bar graph again, now with the curve y = 1/x drawn just above the tops of the bars.]

The area under this curve is an upper bound on the area of the bar graph and thus on the nth harmonic sum. But there's a problem: the area under the curve is infinite, because y = 1/x has a bad asymptote at x = 0. This is a common problem when bounding sums with integrals, and there's a simple solution: take the exact value of the first term (1/1) and then bound the remaining terms (1/2 + 1/3 + ... + 1/n) with an integral. In this case, we get the upper bound:

    H_n = 1/1 + 1/2 + 1/3 + ... + 1/n
        ≤ 1 + ∫_1^n 1/x dx
        = 1 + ln n

So even though there is no closed form for the nth harmonic sum, we now know that the harmonic numbers grow logarithmically:

    ln(n + 1) ≤ H_n ≤ 1 + ln n

There is a refinement to the integration method we've just employed, called Euler-Maclaurin summation, which yields a sequence of terms that correct errors in the initial estimate. This technique gives an absurdly accurate formula for harmonic numbers:

    H_n = ln n + γ + 1/(2n) − 1/(12n^2) + ε(n)/(120n^4)

The second term is Euler's constant, γ = 0.577215664.... This is probably the third most important mathematical constant, behind e and π. It is somewhat mysterious; for example, no one knows whether γ is rational or irrational, though if it equals a ratio of integers, then the denominator must be at least 10^{242,080}. In the final term of the formula above, ε(n) denotes some number between 0 and 1. You'll never need this kind of precision in this course or, probably, anywhere else.
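To get a feel for how accurate the Euler-Maclaurin formula is, the sketch below (ours; it simply drops the ε(n)/(120n^4) term, so the remaining error should be about that size) compares the approximation with the exact sum:

```python
import math

EULER_GAMMA = 0.5772156649015329   # Euler's constant gamma

def harmonic_exact(n):
    """H_n as a float, summing the smallest terms first for accuracy."""
    return sum(1.0 / k for k in range(n, 0, -1))

def harmonic_euler_maclaurin(n):
    """Euler-Maclaurin approximation; the eps(n)/(120 n^4) error term is omitted."""
    return math.log(n) + EULER_GAMMA + 1 / (2 * n) - 1 / (12 * n * n)

for n in (10, 100, 1000):
    exact = harmonic_exact(n)
    approx = harmonic_euler_maclaurin(n)
    # The gap shrinks roughly like 1/(120 n^4), up to floating-point noise.
    print(n, exact, approx, abs(exact - approx))
```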
2 The Factorial Function

One of the most common elements in messy mathematical expressions is the factorial function:

    n! = 1 · 2 · 3 · · · (n − 1) · n

Factorial comes up particularly often in combinatorics and probability, which happen to be the major topics in the remainder of 6.042. Within a few weeks, you are going to have factorials coming out your ears.

A good way to deal with any product is to convert it into a sum by taking a logarithm:

    ln (∏_{k=1}^{n} f(k)) = ∑_{k=1}^{n} ln f(k)

Then we can apply all our summing tools and exponentiate at the end to undo the effect of the log. Let's use this strategy to get bounds on n!. First, we take a logarithm to make the product into a sum:

    ln n! = ln(1 · 2 · 3 · · · n)
          = ln 1 + ln 2 + ln 3 + ... + ln n

This sum is rather nasty, but we can still get bounds by the integration method. A picture can be extremely helpful in working out the proper bounding functions and limits of integration.

[Figure: a bar graph with bars of heights ln 1, ln 2, ln 3, ..., ln n over the interval from 0 to n, squeezed between the curves y = ln x (below) and y = ln(x + 1) (above).]

In this case, we get the bounds:

    ∫_1^n ln x dx  ≤  ln n!  ≤  ∫_0^n ln(x + 1) dx

    [x ln x − x]_1^n  ≤  ln n!  ≤  [(x + 1) ln(x + 1) − (x + 1)]_0^n

    n ln n − n + 1  ≤  ln n!  ≤  (n + 1) ln(n + 1) − (n + 1) + 1

Finally, we exponentiate to get bounds on n!:

    e · (n/e)^n  ≤  n!  ≤  e · ((n + 1)/e)^{n+1}

This gives some indication how big n! is: about (n/e)^n. This estimate is often good enough.
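Here is a quick numerical sanity check of these bounds (a sketch of ours, not from the notes); even for small n, the exact value of n! lands between e(n/e)^n and e((n + 1)/e)^{n+1}:

```python
import math

def crude_factorial_bounds(n):
    """Bounds on n! obtained above by integrating ln x and ln(x + 1)."""
    lower = math.e * (n / math.e) ** n
    upper = math.e * ((n + 1) / math.e) ** (n + 1)
    return lower, upper

for n in (5, 10, 20):
    lower, upper = crude_factorial_bounds(n)
    exact = math.factorial(n)
    print(n, lower, exact, upper)
    assert lower <= exact <= upper
```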
If you need a little more accuracy, you can add one more term to get Stirling's Formula:

    n! ≈ √(2πn) · (n/e)^n

Stirling's formula is worth committing to memory. We'll often use it to rewrite expressions involving n!. Now, one might object that the expression on the left actually looks a lot better than the expression on the right. But that's just an artifact of notation. If you actually wanted to compute n!, you'd need n − 1 multiplications. However, the expression on the right is a closed form; evaluating it requires only a handful of basic operations, regardless of the value of n. Furthermore, when n! appears inside a larger expression, you usually can't do much with it. It doesn't readily cancel or combine with other terms. In contrast, the expression on the right looks scary, but melds nicely into larger formulas. So don't be put off: Stirling's Formula is your friend.

(Stepping back a bit, Stirling's formula is fairly amazing. Who would guess that the product of the first n positive integers could be so precisely described by a formula involving both e and π?)

If you're feeling a little crazy, you might pull out these even more precise bounds:

    √(2πn) · (n/e)^n · e^{1/(12n+1)}  ≤  n!  ≤  √(2πn) · (n/e)^n · e^{1/(12n)}

These bounds are ridiculously close. For example, if n = 100, then we get:

    100! ≥ √(200π) · (100/e)^{100} · e^{1/1201}     (the last factor is 1.000832...)
    100! ≤ √(200π) · (100/e)^{100} · e^{1/1200}     (the last factor is 1.000833...)

The upper bound is less than 7 hundred-thousandths of 1% greater than the lower bound!
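The claims about n = 100 are easy to verify by machine. The sketch below (ours, not from the notes) evaluates Stirling's formula and the tighter bounds in floating point; 100! still fits comfortably in a double, so the ratios are meaningful:

```python
import math

def stirling(n):
    """Stirling's approximation: sqrt(2*pi*n) * (n/e)**n."""
    return math.sqrt(2 * math.pi * n) * (n / math.e) ** n

n = 100
exact = math.factorial(n)
lower = stirling(n) * math.exp(1 / (12 * n + 1))
upper = stirling(n) * math.exp(1 / (12 * n))
print(exact / stirling(n))       # about 1.000833: Stirling is off by less than 0.1%
print(lower <= exact <= upper)   # True
print(upper / lower)             # about 1.0000007: the two bounds nearly coincide
```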
3 Asymptotic Notation

Our final topic is a special notation, called asymptotic notation, designed to sweep mathematical messes under the rug. Asymptotic notation is sort of like the Force from Star Wars. It's everywhere. It's an essential part of the language of computer science. And mostly it is a tool for good. At best, asymptotic notation suppresses irrelevant details, while allowing the essential features of a mathematical analysis to shine through. However, as we'll see, asymptotic notation also has an alluring dark side.

3.1 Asymptotic Notation: The Jedi Perspective

Suppose you want to know how long a computer takes to multiply two n × n matrices. You could tally up all the multiplications and additions and loop variable increments and comparisons, and perhaps factor in hardware-specific considerations such as page faults and cache misses and branch mispredicts and floating-point unit availability, and all this would give you one sort of answer. (Whew!) Such an answer would be very accurate, but not very insightful. Given a very complicated formula for the running time, we would be hard-pressed to answer basic questions: How would doubling the size of the matrices alter the running time? What is the biggest matrix we can handle? Furthermore, a minor change in the procedure or hardware would invalidate the answer.

On the other hand, each of the n^2 entries in the product matrix takes about n steps to compute. So the running time is roughly n^2 · n = n^3. This answer is certainly less precise, but it was easy to derive and is easy to interpret. For example, we can see that doubling the size of the matrices from n × n to 2n × 2n would increase the running time from about n^3 to about (2n)^3 = 8n^3, a factor of 8. And, assuming a computer performs billions of operations in a reasonable time (as opposed to millions or trillions), the largest matrices we could handle would be roughly 1000 × 1000. Furthermore, this approximate answer is independent of tiny implementation and hardware details. It remains valid even after you upgrade your computer.

So approximate answers are quite attractive. And asymptotic notation allows one to formulate vague statements like "roughly n^3" in a very precise way.

3.2 Six Funny-Lookin' Symbols

Asymptotic notation involves six weird little symbols:

    O     Ω       Θ       o           ω              ∼
    oh    omega   theta   little-oh   little-omega   tilde

We'll focus on O, which is the most widely used. The others are about equally popular, except for ω, which is the Jar-Jar of the lot.

Suppose that f and g are functions. Then the statement

    f(x) = O(g(x))

means that there exist constants x_0 and c > 0 such that

    |f(x)| ≤ c · g(x)

for all x > x_0. Now this definition is pretty hairy. But what it's trying to say, with all its cryptic little constants, is that f grows no faster than g. A bit more precisely, it says that f is at most a constant times greater than g, except maybe for small values of x. For example, here's a true statement:

    5x + 100 = O(x)

This holds because the left side is only about 5 times larger than the right. Of course, for small values of x (like x = 1) the left side is many times larger than the right, but the definition of O is cleverly designed to sweep aside such inconvenient details.

Let's work carefully through a sequence of examples involving O in order to better understand this definition.

Claim 2. 5x + 100 = O(x)
Proof. We must show that there exist constants x_0 and c > 0 such that |5x + 100| ≤ c · x for all x > x_0. Let c = 10 and x_0 = 20 and note that:

    |5x + 100| ≤ 5x + 5x = 10x     for all x > 20

Here's another claim that points out a very common error involving O notation.

Claim 3. x = O(x^2)

Proof. We must show that there exist constants x_0 and c > 0 such that |x| ≤ c · x^2 for all x > x_0. Let c = 1 and x_0 = 1 and note that

    |x| ≤ 1 · x^2     for all x > 1

Many people fail to realize that a statement of the form f(x) = O(g(x)) only places an upper bound on the growth of the function f. So, for example, you'll often hear people say things like, "I can't use an algorithm that runs in O(x^2) steps because that's too slow." People who say things like that are dumb and you should make fun of them until they cry. Okay, maybe not. But they are incorrect; we just proved that a fast algorithm that requires only x steps also runs in time O(x^2)! One properly expresses a lower bound using the Ω notation, which we'll come to presently.

What about the reverse of the previous claim? Is x^2 = O(x)? On an informal basis, this means x^2 grows no faster than x, which is false. Let's prove this formally.

Claim 4. x^2 ≠ O(x)

Proof. We argue by contradiction; suppose that there exist constants x_0 and c such that:

    |x^2| ≤ c · x     for all x > x_0

Dividing both sides of the inequality by x gives:

    x ≤ c     for all x > x_0

But this is false when x = 1 + max(x_0, c).

We can show that x^2 ≠ O(100x) by essentially the same argument; intuitively, x^2 grows quadratically, while 100x grows only linearly. Generally, changes in multiplicative constants do not affect the validity of an assertion involving O. However, constants in exponentials can be critical:

Claim 5. 4^x ≠ O(2^x)
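A quick numerical experiment (ours, not from the notes) makes these constants concrete: one fixed c works for 5x + 100 versus x, no c works for x^2 versus x, and the ratio 4^x / 2^x = 2^x is unbounded, which is the intuition behind Claim 5:

```python
def dominated(f, g, c, x0, xs):
    """Check |f(x)| <= c * g(x) at every sample x > x0 (numerical evidence for f = O(g))."""
    return all(abs(f(x)) <= c * g(x) for x in xs if x > x0)

xs = range(1, 10_001)

# Claim 2: 5x + 100 = O(x), witnessed by c = 10, x0 = 20.
print(dominated(lambda x: 5 * x + 100, lambda x: x, c=10, x0=20, xs=xs))   # True

# Claim 4: x^2 != O(x); for any candidate c, the bound fails once x > c.
print(dominated(lambda x: x * x, lambda x: x, c=1000, x0=0, xs=xs))        # False

# Claim 5: 4^x != O(2^x); the ratio 4^x / 2^x = 2^x grows without bound.
print(max(4 ** x / 2 ** x for x in range(1, 50)))                          # 2**49, already huge
```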