Data Structures and Algorithm Xiaoqing Zheng Zhengxq@fudan.edu.cn
Data Structures and Algorithm Xiaoqing Zheng zhengxq@fudan.edu.cn
What are algorithms? d A sequence of computational steps that transform the input into the output Sorting problem: Input: A sequence of n numbers Oupu:
What are algorithms? A sequence of computational steps that transform the input into the output Sorting problem: Input: A sequence of n numbers Output: A permutation (reordering) such that a' 1 ≤ a' 2 ≤ … ≤ a' n Instance of sorting problem: Input: Output:
Example of insert sort 822222 8 433 44884 999986 333398 66666 done
Example of insert sort 8 2 493 6 2 8 493 6 24 8 93 6 24 89 3 6 23 489 6 23 468 9 done
Insertion sort INSERTION-SORT(A) I for j←2 to length[4] 2 do key←A 3 / Insert AG] into the sorted sequence All.j-1] 4 while i>0 and ai>key doA[i+11←A[ 7 8[i+1]←key A Sorted
Insertion sort INSERTION-SORT(A) 1 for j ← 2 to length[A] 2 do key ← A[j] 3 // Insert A[j] into the sorted sequence A[1 .. j − 1] 4 i ← j − 1 5 while i > 0 and A[i] > key 6 do A[i + 1] ← A[i] 7 i ← i − 1 8 A[i + 1] ← key A: 1 i j key n Sorted
Running time a The running time depends on the input: an already sorted sequence is easier to sort a Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones a Generally, we seek upper bounds on the running time, because everybody likes a guarantee
Running time The running time depends on the input: an already sorted sequence is easier to sort. Parameterize the running time by the size of the input, since short sequences are easier to sort than long ones. Generally, we seek upper bounds on the running time, because everybody likes a guarantee
Kinds of analyses Worst-case:(usually) T(n)=maximum time of algorithm on any input of size n Average-case:(sometimes) T(n)=expected time of algorithm over all inputs of size n Need assumption of statistical distribution of inputs Best-case:(bogus) Cheat with a slow algorithm that works fast on some input
Kinds of analyses Worst-case: (usually) T( n) = maximum time of algorithm on any input of size n. Average-case: (sometimes) T( n) = expected time of algorithm over all inputs of size n. Need assumption of statistical distribution of inputs. Best-case: (bogus) Cheat with a slow algorithm that works fast on some input
Analysis of insertion sort INSERTION-SORT(A) times I for j←2 to length[4] do key←A[ C 2345678 // Insert AG] into the sorted sequence while i>0 and aG>key doA[i+1←A[i ∑2(1-1) A[i+11←key 7(n)=cm+2(n-1)+c1(n-1)+c∑m21+c∑=2(1-1)+c∑=2(1-1)+c(n-1
Analysis of insertion sort INSERTION-SORT(A) 1 for j ← 2 to length [A] 2 do key ← A [j] 3 // Insert A [j] into the sorted sequence A[1 .. j − 1] 4 i ← j − 1 5 while i > 0 and A [ i] > key 6 do A [ i + 1] ← A [ i] 7 i ← i − 1 8 A [ i + 1] ← key cost c1 c 2 0 c 4 c 5 c 6 c 7 c 8 times n n − 1 n − 1 n − 1 n − 1 2 n j j t ∑ = 2 ( 1) n j j t = ∑ − 2 ( 1) n j j t = ∑ − 12 4 5 6 7 8 22 2 ( ) ( 1) ( 1) ( 1) ( 1) ( 1) nn n jj j jj j T n cn c n c n c t c t c t c n == = = + −+ −+ + −+ −+ − ∑ ∑ ∑
Analvsis of insertion sort: best and worst T(m)=cn+c2(n-1)+c(n-1)+c∑21+c∑2(1-1)+c∑2(1-1)+c(n-1) Best case e The best case occurs if the array is already sorted (n)=c1n+c2(n-1) 1)+c3(n-1) (C+C,+C4+C+C8)n-(C,+C4+C+c) an+ b (inear function) Worst case The worst case occurs if the array is in reverse sorted order 7(n)=c1n+c2(n-1)+c4(n-1)+c5( n(n+1) -1)+c6( n(n-1)、,。n(n-1) )+c7( )+c3(n-1) ++-)n2+(c+c2+c +cs3)n-(c2+c4+c5+c3) an+ bn +c(quadratic function)
Analysis of insertion sort: best and worst 12 4 5 6 7 8 22 2 ( ) ( 1) ( 1) ( 1) ( 1) ( 1) nn n jj j jj j T n cn c n c n c t c t c t c n == = = + −+ −+ + −+ −+ − ∑ ∑ ∑ Best case: y The best case occurs if the array is already sorted 12 4 5 8 T n cn c n c n c n c n ( ) ( 1) ( 1) ( 1) ( 1) = + −+ −+ −+ − 12458 2458 = ( )( ) c c c c cn c c c c ++++ − +++ an + b (linear function) Worst case: y The worst case occurs if the array is in reverse sorted order 12 4 5 6 7 8 ( 1) ( 1) ( 1) ( ) ( 1) ( 1) ( 1) ( ) ( ) ( 1) 2 22 nn nn nn T n cn c n c n c c c c n + − − = + −+ −+ −+ + + − 567 567 2 124 8 2458 ( ) ( )( ) 222 222 ccc ccc = + + + +++ − − + − +++ n c c c cn c c c c an 2 + bn + c (quadratic function)
Machine-independent time a What is insertion sort's worst-case? It depends on the speed of our computer Relative speed (on the same machine), Absolute speed (on different machines) a BIG IDEA Ignore machine-dependent constants Look at growth of r(n)as n>00 "Asymptotic Analysis
Machine-independent time What is insertion sort's worst-case? It depends on the speed of our computer: y Relative speed (on the same machine), y Absolute speed (on different machines). "Asymptotic Analysis" "Asymptotic Analysis" BIG IDEA: y Ignore machine-dependent constants. y Look at growth of T( n ) as n → ∞
notation a Math o(8(n)=fn): there exist positive constants c1 C2,and no such that 0 sC1gn)sf(n)s c2g(n) for all n2 no 3 ENgineering Drop low-order terms; ignore leading constants Example: 3n3+90n2-5n +6046=o(n
Θ-notation Math: Θ(g(n)) = { f(n) : there exist positive constants c1, c2, and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤ c2g(n) for all n ≥ n0 } Engineering: y Drop low-order terms; ignore leading constants. y Example: 3n3 + 90n2 – 5n + 6046 = Θ(n3)