Overview 第六讲:分摊分析法 E Aggregate analysis The accounting metho Amortized Analysi 啼 The potential method 宋斌恒 啼 Example dYnamic tables Overview 聚集分析法 噜由分摊分析法解决的问题 mA sequence of n operations takes worst Sequence of operation on a dada structure case time t()in total 4Average cost per operation 团【比如网站的整体效率问题与此问题的关系】 a The average cost T(n)/n is called the amortized cost 类主要分析手段 mthe amortized cost applies to each AGgregate, worst case average operation, even they are different. ACcounting, Amortized cost aPotential, is a whole index for the data structure Stack operation 粗略分析 w Let s be a stack object, it has two basic c In the worst case of these operations, the operations worst cost is n. the size of the dada pUshy rA sequence of m operations cost at most Adding another operation AMultiPop(k), which pops k elements at once its cost is at most k popos cost w The amortized cost of the operations is e Attention: what it occurs when less than O( cIt is right but not tight!
1 清华大学 宋斌恒 1 第六讲:分摊分析法 Amortized Analysis 宋斌恒 清华大学 宋斌恒 2 Overview Aggregate analysis The accounting method The potential method Example: Dynamic tables 清华大学 宋斌恒 3 Overview 由分摊分析法解决的问题 Sequence of operation on a dada structure Average cost per operation 【比如网站的整体效率问题与此问题的关系】 三类主要分析手段 Aggregate, worst case + average Accounting, Amortized cost Potential, is a whole index for the data structure 清华大学 宋斌恒 4 聚集分析法 A sequence of n operations takes worstcase time T(n) in total The average cost T(n)/n is called the amortized cost, the amortized cost applies to each operation, even they are different. 清华大学 宋斌恒 5 Stack operation Stack operation Let S be a stack object, it has two basic operations: Pop() Push() Adding another operation MultiPop(k), which pops k elements at once, its cost is at most k pop()’s cost Attention: what it occurs when S.getSize()<k? 清华大学 宋斌恒 6 粗略分析 In the worst case of these operations, the worst cost is n, the size of the dada structure; A sequence of m operations cost at most m*n The amortized cost of the operations is less than O(n) It is right but not tight!
聚集分析法 示例:计数器 w Consider the entire sequence of operations a Consider the problem of incrementing a k- 中 A MultiHop(k) takes at most m=mn化k bit binary counter that counts upward from S. getSizeO operations e It has at least m push operation had taken c Array A=[O,1,. k-1]represents the k bits before the pops. w On an initial empty Stack, a sequence of n operations of push, pop, multipop takes at most O(n)steps of operations m Amortized cost =O(n)/n=O(1) AJ]A[2 A[T]A[O] Algorithm 1.i0 2. while i length(a] and A0==1 5. if i length (A] then A0<1 Analysis the algorithm Total cost - The most costing operation is k, the length cThe total number of flips of the bits the worst cost will not O(nk)where n is the operation numbers 啼 A tighten analysis aHalf the operations take 1 operation 岁1<n81/2<2n aQuarter of the operations takes 2 operations 2-of operations takes i operations
2 清华大学 宋斌恒 7 聚集分析法 Consider the entire sequence of operations: A MultiPop(k) takes at most m = min{k, S.getSize()} operations It has at least m push operation had taken before the pops. On an initial empty Stack, a sequence of n operations of push, pop, multipop takes at most O(n) steps of operations. Amortized cost = O(n)/n = O(1) 清华大学 宋斌恒 8 示例:计数器 Consider the problem of incrementing a kbit binary counter that counts upward from 0. Array A=[0,1,…,k-1] represents the k bits 清华大学 宋斌恒 9 Algorithm Increment(A) 1. i Å 0 2. while i < length[A] and A[i]==1 3. { 1. A[i]Å0; 2. i++; 4. } 5. if i < length[A] then A[i]Å1 清华大学 宋斌恒 10 16 0 1 0 0 0 0 31 15 0 0 1 1 1 1 26 14 0 0 1 1 1 0 25 13 0 0 1 1 0 1 23 12 0 0 1 1 0 0 22 11 0 0 1 0 1 1 19 10 0 0 1 0 1 0 18 9 0 0 1 0 0 1 16 8 0 0 1 0 0 0 15 7 0 0 0 1 1 1 11 6 0 0 0 1 1 0 10 5 0 0 0 1 0 1 8 4 0 0 0 1 0 0 7 3 0 0 0 0 1 1 4 2 0 0 0 0 1 0 3 1 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 Total cost Counter A[5] A[4] A[3] A[2] A[1] A[0] value 清华大学 宋斌恒 11 Analysis the algorithm The most costing operation is k, the length of the bits, the worst cost will not exceed O(nk) where n is the operation numbers; A tighten analysis Half the operations take 1 operation Quarter of the operations takes 2 operations 2-i of operations takes i operations 清华大学 宋斌恒 12 Total cost Total cost The total number of flips. ⎣ ⎦ n n i i i n i i 1 / 2 2 2 0 lg 0 ∑ < ∑ < ∞ = =
讨论 小结 啼长整数二进制加法的线性性证明 啼聚集分析方法关键点 中?高位相加快?低位相加快?或者一样? 匕所有n个运算集中考虑 不考虑不同运算的消费不 区只考虑所有运算的总复杂度 M最后分摊到每个运算 The accounting method Stack operation 4 In the accounting method we assign m The actual cost of operations were different amortized cost for different pUsh less or greater than its actual cost EMultipop min k, s) mLet a, be the amortized cost for the ith 啼 Amortized cost operation and c, is its actual cost than it aPush should satisfies for all n.【分摊约束条件】 花”账中成本的含义 Amortized analysis for incrementing the binary counter 啼Push的实际成本为1,同时为其在Pop时花 啼 Actual cost for a flip 的费用存入资金1,故其分摊成本为2 lSet a bit from o to 1 cost 1 dollar 啼Pop的实际成本为1,但是由于已经预付 Set a bit from 1 to o cost 1 dolla 故其分摊成本为0。 c Amortized cost for a fl MultiHop的实际成本为m,但由于已经预 LSet a bit from 0 to 1 cost 2 dollar, pay for its 付,故其分摊成本为0 reverse transfer et a bit from 1 to o cost o dollar because it is
3 清华大学 宋斌恒 13 讨论 长整数二进制加法的线性性证明: ?高位相加快?低位相加快?或者一样? 清华大学 宋斌恒 14 小结 聚集分析方法关键点 所有n个运算集中考虑 不考虑不同运算的消费不同 只考虑所有运算的总复杂度 最后分摊到每个运算 清华大学 宋斌恒 15 The accounting method The accounting method In the accounting method we assign different amortized cost for different operations, some of the amortized cost is less or greater than its actual cost. Let ai be the amortized cost for the ith operation and ci is its actual cost than it should satisfies for all n. 【分摊约束条件】 ∑ ∑ = = ≥ n i i n i i a c 1 1 清华大学 宋斌恒 16 Stack operation Stack operation The actual cost of operations were Push 1 Pop 1 Multipop min{k, s} Amortized cost Push 2 Pop 0 Multipop 0 清华大学 宋斌恒 17 “花”账中成本的含义 Push 的实际成本为1,同时为其在Pop时花 的费用存入资金1,故其分摊成本为2。 Pop 的实际成本为1,但是由于已经预付, 故其分摊成本为0。 MultiPop的实际成本为m,但由于已经预 付,故其分摊成本为0。 清华大学 宋斌恒 18 Amortized analysis for Amortized analysis for incrementing the binary counter incrementing the binary counter Actual cost for a flip Set a bit from 0 to 1 cost 1 dollar. Set a bit from 1 to 0 cost 1 dollar. Amortized cost for a flip Set a bit from 0 to 1 cost 2 dollar, pay for its reverse transfer. Set a bit from 1 to 0 cost 0 dollar, because it is paid already
Amortized Cost for an incrementing 做账分摊法小结 ◆ Actual cost 啼可以为不同的操作赋予不同的分摊成本 sEveral 1 to O flips plus 1 flip of 0 to1 分摊成本与实际成本要满足分摊约束条件 Amortized cost 啼分摊的经验 n0+1*2=2 如果一个操作的作用可以被另外一个或几个操 作消除,则可以把消除函数的实际成本分摊到 比操作本身,而其消除操作的分摊成本为0 匕复杂情况可以通过上述情况的方程式进行求解 The potential method 势方法 The potential method represent the m Start with an initial data structure Do prepaid work as credit stored with the wOn which there are n operations being specific objects in the data structure. We performed call the prepaid work as"potential energy c After each operation the data structure or just the "potential change to D 转下页 势方法(续 如何理解势方法 a D R 势方法适合那种有状态的数据结构上的各 B Actual cost is c. amortized cost is a. then 种操作成本的分摊 啼a1=c+d①)-ΦD1) 啼每个操作的分摊成本是 Require b①D)-D≥0 区实际成本十势能的增加 以上分摊成本可以理解为,一个操作不仅 要为自己付帐,同时要为自己对环境的影 响而付帐【可能是收益】
4 清华大学 宋斌恒 19 Amortized Cost for an incrementing Actual cost Several 1 to 0 flips plus 1 flip of 0 to1 Amortized cost n*0+1*2=2 清华大学 宋斌恒 20 做账分摊法小结 可以为不同的操作赋予不同的分摊成本 分摊成本与实际成本要满足分摊约束条件 分摊的经验 如果一个操作的作用可以被另外一个或几个操 作消除,则可以把消除函数的实际成本分摊到 此操作本身,而其消除操作的分摊成本为0 复杂情况可以通过上述情况的方程式进行求解 清华大学 宋斌恒 21 The potential method The potential method The potential method represent the prepaid work as credit stored with the specific objects in the data structure. We call the prepaid work as “potential energy” or just the “potential” 清华大学 宋斌恒 22 势方法 Start with an initial data structure D0 On which there are n operations being performed After each operation the data structure change to Di 转下页 清华大学 宋斌恒 23 势方法(续) Φ: Di ÆR Actual cost is ci , amortized cost is ai then ai =ci + Φ(Di ) – Φ(Di-1) Require Φ(Di ) – Φ(D0) ≥ 0 清华大学 宋斌恒 24 如何理解势方法 势方法适合那种有状态的数据结构上的各 种操作成本的分摊 每个操作的分摊成本是 实际成本+势能的增加 以上分摊成本可以理解为,一个操作不仅 要为自己付帐,同时要为自己对环境的影 响而付帐【可能是收益】
Stack operation Incrementing a binary counter 啼d(D)= Length(D) m =b the number of ]'s in the counter Push a1=c1+ΦD)-①①1)=1+1=2 after the ith opertion, 啼Popa1=c1+Φ(D)-ΦD)=11=0 w The ith opertaion reset t bits b, sb; 1-t+1 Multihop(k)a1=c1+Φ①D)-D1)=k Φb(D)-dD1) k=0 ≤(t1+1)+(1t 中解释:【课堂提问】 As o(d is nonnegative hence the n sequence of operation cannot exceed 2n of cost hence the amortized cost is o(1 势方法小结 Sample: Dynamic tables 中当因素太多时,做账分摊太难 m Dynamic table is a base data structure for 势方法把帐做到一起:势 store variation objects in a table ≌操作对势有贡献【一般是原始操作,为其后的 动态表上可以加许多操作,我们先考虑 肖除操作付帐】,增加势 下两种: 区操作对势有索取【一般是消除操作,有人已经 iNsert【事实上是 Append】 付帐】,势减少 Deee【事实上是删除最后的数据 区混合操作,则根据不同操作分别累计即可 Insert an element TABLE-INSERT(T,x) I if size[T- 啼插入过程 2 then allocate tabler with I slot sie[←1 alf there are enough space to store the element, just put it after the last element, then allocate new- table with2· size[门 slots ZIf there is not enough space, than apply a insert all items in table[l]into new-table double more space, and then copy the free table[7] original value to the new one and append the table[]+new-table new element sie[们←2·sie[们 11nam[门nm[7+1 5
5 清华大学 宋斌恒 25 Stack operation Φ(Di )=length(Di ) Push ai =ci + Φ(Di ) – Φ(Di-1)=1+1=2 Pop ai =ci + Φ(Di ) – Φ(Di-1)=1-1=0 Multipop(k) ai =ci + Φ(Di ) – Φ(Di-1)=kk=0 解释:【课堂提问】 清华大学 宋斌恒 26 Incrementing a binary counter Incrementing a binary counter Φ= bi the number of 1’s in the counter after the ith opertion, The ith opertaion reset ti bits bi ≤bi-1 –ti +1 ai =ci + Φ(Di ) – Φ(Di-1) ≤(ti +1) + (1-ti ) ≤2 As Φ(Di ) is nonnegative hence the n sequence of operation cannot exceed 2n of cost hence the amortized cost is O(1) 清华大学 宋斌恒 27 势方法小结 当因素太多时,做账分摊太难 势方法把帐做到一起:势 操作对势有贡献【一般是原始操作,为其后的 消除操作付帐】,增加势 操作对势有索取【一般是消除操作,有人已经 付帐】,势减少 混合操作,则根据不同操作分别累计即可 清华大学 宋斌恒 28 Sample: Dynamic tables Sample: Dynamic tables Dynamic table is a base data structure for store variation objects in a table. 动态表上可以加许多操作,我们先考虑一 下两种: Insert【事实上是Append】 Delete【事实上是删除最后的数据】 清华大学 宋斌恒 29 Insert an element Insert an element 插入过程: If there are enough space to store the element, just put it after the last element, If there is not enough space, than apply a double more space, and then copy the original value to the new one and append the new element. 清华大学 宋斌恒 30 TABLE-INSERT( INSERT(T,x) 1 if size[T]=0 2 2 then allocate allocate table[T] with 1 slot ] with 1 slot 3 3 size[T]← 1 4 if num[T]=size[T] 5 5 then allocate allocate new-table with 2 • size[T] slots ] slots 6 insert all items in 6 insert all items in table[T]into new-table 7 7 free table[T] 8 8 table[T]←new-table 9 9 size[T]←2 • size[T] 10 insert 10 insert x into table[T] 11 num[T]←num[T]+1
Analysis a dynamic table from zero size with only insertion operation The total cost of n TABLE-INSERT oper A sequence of n insert operations e The initial capacity of the table is 1 +Analysis the cost of these n insertion Amortized feeling 势函数 啼 Insert实际成本 啼 Potential function【 capacity=size】 团没有扩张:1 m p(T)=2num[[]-capacitylT] ≌有扩张:n+1,不计算分配空间的成本 nsert均摊成本:把未来的扩张成本预付 团插入:1 区为消费掉一个空间付帐:1【为扩张安排自己 的位置做准备】 团为准备一个新空间付帐:1【为扩张安排同样 多的其它数据做准备】 If the i th operation does not trigger an a If the i th operation does trigger an expansion, then we have: hen we have capacity i- -num;- num-l,which implies that 注意: capacity= capacity-,没有变化 capacity; =2(num-1). Thus, the amortized cost of the operation is
6 清华大学 宋斌恒 31 Analysis a dynamic table from Analysis a dynamic table from zero size with only insertion zero size with only insertion operation operation A sequence of n insert operations The initial capacity of the table is 1 Analysis the cost of these n insertion: 清华大学 宋斌恒 32 The total cost of n TABLE-INSERT operations is therefore ⎩ ⎨ ⎧ − = 1 . 1 2, otherwise i if i is an exact power of ci < n + 2n = 3n, [ ] ∑ ∑ = = ≤ + n j j n i c i n lg 1 0 2 清华大学 宋斌恒 33 Amortized feeling Amortized feeling Insert实际成本 没有扩张:1 有扩张:n+1,不计算分配空间的成本 Insert均摊成本:把未来的扩张成本预付 插入:1 为消费掉一个空间付帐:1【为扩张安排自己 的位置做准备】 为准备一个新空间付帐:1【为扩张安排同样 多的其它数据做准备】 清华大学 宋斌恒 34 势函数 Potential function【 capacity =size】 Φ(T) = 2num[T] –capacity[T] 清华大学 宋斌恒 35 注意: capacity i = capacity i-1,没有变化 1 ˆi = i + Φi − Φi− c c 1 (2 capacity ) (2 capacity ) −1 − −1 = + ⋅ − − ⋅ i i i i num num 1 (2 capacity ) (2( 1) capacity ) − − − − −1 = + ⋅ i i i i num num = 3. If the i th operation does not trigger an expansion, then we have: 清华大学 宋斌恒 36 If the i th operation does trigger an expansion, then we have capacity i =2· capacity i-1 and capacity i-1=numi-1=numi -1,which implies that capacity i =2·(numi -1). Thus, the amortized cost of the operation is
带有删除操作的动态表 当<12时插入 当a<12但a;≥1/2,则 删除操作:<1/2
7 清华大学 宋斌恒 37 1 ˆi = i + Φi − Φi− c c (2 capacity ) (2 capacity ) −1 − −1 = + ⋅ − − ⋅ i i i i i num num num = + (2 ⋅ − 2 ⋅( −1)) − (2( −1) − ( −1)) i i i i i num num num num num = + 2 − ( −1) numi numi = 3. 清华大学 宋斌恒 38 0 4 8 12 16 20 24 28 32 0 2 4 6 8 10 12 14 16 18 20 22 24 26 28 30 32 势能 数量 容量 清华大学 宋斌恒 39 带有删除操作的动态表 ⎩ ⎨ ⎧ − < ⋅ − ≥ Φ = = [ ]/ 2 [ ] ( ) 1/ 2. 2 [ ] [ ] ( ) 1/ 2, ( ) ( ) [ ]/ [ ] size T num T if T num T size T if T T T num T size T α α α 清华大学 宋斌恒 40 当αi <1/2时插入 1 ˆi = i + Φi − Φi− c c 1 ( / 2 ) ( / 2 ) = + i − i − i−1 − numi−1 size num size = 1+ ( / 2 − ) − ( / 2 − ( −1)) i i i numi size num size = 0 清华大学 宋斌恒 41 当 αi-1<1/2 但 αi ≥1/2, 则 1 ˆi = i + Φi − Φi− c c 1 (2 ) ( / 2 ) − − −1 − −1 = + ⋅ i i i i num size size num 1 (2( 1) ) ( / 2 ) = + i−1 + − i−1 − i−1 − i−1 num size size num 3 2 3 3 = ⋅ numi−1 − sizei−1 + 3 2 3 = 3αi−1sizei−1 − sizei−1 + 3 2 3 2 1 1 3 < sizei− − sizei− + = 3. 清华大学 宋斌恒 42 删除操作: αi <1/2 1 ˆi = i + Φi − Φi− c c 1 ( / 2 ) ( / 2 ) = + i − i − i−1 − numi−1 size num size = 1+ ( / 2 − ) − ( / 2 − ( +1)) i i i i size num size num = 2
删除操作:∝1≥1/2 思考 啼动态表的实现: 1.考虑该表地址查找频繁 2.考虑该表地址查找稀少 总结 中分摊分析把一个系统作为整体考虑,有三 种常用方法 聚集 团做账 团势 统筹安排—另一种设计思路 团利用提高某些操作的成本,来达到减少其它操 作成本,最后达到整体操作成本最优
8 清华大学 宋斌恒 43 删除操作: αi ≥1/2 1 ˆi = i + Φi − Φi− c c ( 1) ( / 2 ) ( / 2 ) = i + + i − i − i−1 − i−1 num size num size num = ( +1) + (( +1) − ) − ((2 ⋅ + 2) − ( +1)) numi numi numi numi numi = 1. 清华大学 宋斌恒 44 思考 动态表的实现: 1. 考虑该表地址查找频繁 2. 考虑该表地址查找稀少 清华大学 宋斌恒 45 总结 分摊分析把一个系统作为整体考虑,有三 种常用方法: 聚集 做账 势 统筹安排——另一种设计思路: 利用提高某些操作的成本,来达到减少其它操 作成本,最后达到整体操作成本最优