当前位置:高等教育资讯网  >  中国高校课件下载中心  >  大学文库  >  浏览文档

香港中文大学:《Topics in Theoretical Computer Science》课程教学资源(PPT课件讲稿)Lecture 3 Algorithms for data streams

资源类别:文库,文档格式:PPTX,文档页数:33,文件大小:1.88MB,团购合买
点击下载完整版文档(PPTX)

clMSc5706TopicsinTheoreticalcomputerScience Week 3:Streaming and Sketching Instructor:Shengyu Zhang 1

Instructor: Shengyu Zhang 1

Map Motivations and model Problem 1:Missing numbers Problem 2:Count-Min sketch Lower bounds 0( Communication complexity 2

Map ◼ Motivations and model ◼ Problem 1: Missing numbers ◼ Problem 2: Count-Min sketch ◼ Lower bounds ❑ Communication complexity 2

Motivations Big mass of data. Data comes as a stream. Cannot see future data Relatively small space."sketch" Cannot store past data Need to process each item fast. Quick update time. Examples:Phone calls,Internet packets, satellite pictures,.. 3

Motivations ◼ Big mass of data. ◼ Data comes as a stream. ❑ Cannot see future data. ◼ Relatively small space. “sketch” ❑ Cannot store past data ◼ Need to process each item fast. ❑ Quick update time. ◼ Examples: Phone calls, Internet packets, satellite pictures, … 3

Problem 1:Missing numbers A set of numbers S={1,2,...,n} n-1 of them come in a stream x1,x2,...,one number is missing. 3,25,6,19,1,10,… Task:identify which one is missing. ▣Using small space

Problem 1: Missing numbers ◼ A set of numbers 𝑆 = {1,2, … , 𝑛} ◼ 𝑛 − 1 of them come in a stream 𝑥1, 𝑥2, …, 𝑥𝑛−1; one number is missing. ◼ Task: identify which one is missing. ❑ Using small space. 4 3, 25, 6, 19,1,10,…

A simple algorithm Maintain the sum of the input numbers. sum =0 ■fori=1ton-1 sumsum xi return n(n+1 2 .-sum 5

A simple algorithm ◼ Maintain the sum of the input numbers. ◼ 𝑠𝑢𝑚 = 0 ◼ for 𝑖 = 1 to 𝑛 − 1 𝑠𝑢𝑚 = 𝑠𝑢𝑚 + 𝑥𝑖 ◼ return 𝑛 𝑛+1 2 − 𝑠𝑢𝑚 5

Space complexity sum is at mostduring the algorithm. 2 Thus it takes at most log(lg bits to write it down. Space complexity:0(1og2 n). Much smaller than storing the whole stream, which takes at least 0(nlogn). 6

Space complexity ◼ 𝑠𝑢𝑚 is at most 𝑛 𝑛+1 2 during the algorithm. ◼ Thus it takes at most log2 𝑛 𝑛+1 2 = 𝑂 log2 𝑛 bits to write it down. ◼ Space complexity:𝑂 log2 𝑛 . ◼ Much smaller than storing the whole stream, which takes at least 𝑂(𝑛 log 𝑛). 6

More complicated Now the task gets harder. n-2 of them come in a stream x1,x2,...xn-2,two numbers are missing. 3,25,6,19,1,10,… Task:identify which two are missing. ▣Using small space

More complicated ◼ Now the task gets harder. ◼ 𝑛 − 2 of them come in a stream 𝑥1, 𝑥2, …, 𝑥𝑛−2, two numbers are missing. ◼ Task: identify which two are missing. ❑ Using small space. 7 3, 25, 6, 19,1,10,…

First try Maintain the sum and product of the input numbers. sum =0;product =1 ■fori=1ton-2 sumsum xi product=product·xi n(n+1-sum,b=n!/product 2 solve equations x+y=a,xy=b ▣return(x,y) 8

First try ◼ Maintain the sum and product of the input numbers. ◼ 𝑠𝑢𝑚 = 0; 𝑝𝑟𝑜𝑑𝑢𝑐𝑡 = 1 ◼ for 𝑖 = 1 to 𝑛 − 2 𝑠𝑢𝑚 = 𝑠𝑢𝑚 + 𝑥𝑖 𝑝𝑟𝑜𝑑𝑢𝑐𝑡 = 𝑝𝑟𝑜𝑑𝑢𝑐𝑡 ⋅ 𝑥𝑖 ◼ 𝑎 = 𝑛 𝑛+1 2 − 𝑠𝑢𝑚, 𝑏 = 𝑛!/𝑝𝑟𝑜𝑑𝑢𝑐𝑡 ◼ solve equations 𝑥 + 𝑦 = 𝑎, 𝑥 ⋅ 𝑦 = 𝑏 ◼ return (𝑥, 𝑦) 8

Problem and solution Issue:product is at least (n-2)! Thus even writing down the number needs log2(n-2)!=Θ(nlog n)bits. Too much compared to 0(logn)before. How to do?

Problem and solution ◼ Issue: 𝑝𝑟𝑜𝑑𝑢𝑐𝑡 is at least 𝑛 − 2 ! ◼ Thus even writing down the number needs log2 𝑛 − 2 ! = Θ(𝑛 log 𝑛) bits. ❑ Too much compared to 𝑂(log𝑛) before. ◼ How to do? 9

Improvement Note that we don't need to maintain product We can maintain anything,as long as finally we can reconstruct the solution from the stored results. One summary that is much smaller than product:sum of squares. "Recall::12+22+…+n2= n(n+1)(2n+1) 6 10

Improvement ◼ Note that we don’t need to maintain product. ◼ We can maintain anything, as long as finally we can reconstruct the solution from the stored results. ◼ One summary that is much smaller than product: sum of squares. ◼ Recall: 1 2 + 2 2 + ⋯ + 𝑛 2 = 𝑛 𝑛+1 2𝑛+1 6 10

点击下载完整版文档(PPTX)VIP每日下载上限内不扣除下载券和下载次数;
按次数下载不扣除下载券;
注册用户24小时内重复下载只扣除一次;
顺序:VIP每日次数-->可用次数-->下载券;
共33页,可试读12页,点击继续阅读 ↓↓
相关文档

关于我们|帮助中心|下载说明|相关软件|意见反馈|联系我们

Copyright © 2008-现在 cucdc.com 高等教育资讯网 版权所有