
Peking University: "Mass Data Processing: Cloud Computing" course teaching resources (PPT slides), MapReduce Theory and Practice

Resource category: library; document format: PPT; pages: 18; file size: 2.35 MB


Data
◼ The term data refers to groups of information that represent the qualitative or quantitative attributes of a variable or set of variables.
◼ Data (plural of "datum", which is seldom used) are typically the results of measurements and can be the basis of graphs, images, or observations of a set of variables.
◼ Data are often viewed as the lowest level of abstraction from which information and knowledge are derived.
◼ Raw data refers to a collection of unprocessed numbers, characters, images, or other outputs from devices that convert physical quantities into symbols.

Bit
◼ A bit, also called a binary digit, is a single digit in the binary system and the smallest unit of information. "Bit" is short for "binary digit".
◼ Suppose an event occurs as either A or B, each with probability 0.5. One bit can then represent which of A or B occurred. For example, a bit can represent:
  ◼ a simple positive or negative sign
  ◼ a two-state switch (such as a light switch)
  ◼ a transistor being on or off
  ◼ the presence or absence of voltage on a wire
  ◼ an abstract logical yes or no

Multiples of bits

SI decimal prefixes                              IEC binary prefixes
Name (symbol)      Standard SI   Binary usage    Name (symbol)      Value
kilobit (kbit)     10^3          2^10            kibibit (Kibit)    2^10
megabit (Mbit)     10^6          2^20            mebibit (Mibit)    2^20
gigabit (Gbit)     10^9          2^30            gibibit (Gibit)    2^30
terabit (Tbit)     10^12         2^40            tebibit (Tibit)    2^40
petabit (Pbit)     10^15         2^50            pebibit (Pibit)    2^50
exabit (Ebit)      10^18         2^60            exbibit (Eibit)    2^60
zettabit (Zbit)    10^21         2^70            zebibit (Zibit)    2^70
yottabit (Ybit)    10^24         2^80            yobibit (Yibit)    2^80
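The SI decimal and IEC binary prefixes for bit multiples drift further apart as the multiples grow. The following Python sketch makes that gap concrete. The names `SI_PREFIXES`, `IEC_PREFIXES`, `si_bits`, `iec_bits`, and `prefix_gap` are illustrative and do not come from the slides.

```python
# Illustrative names only; none of these identifiers appear in the slides.
SI_PREFIXES = ["kilo", "mega", "giga", "tera", "peta", "exa", "zetta", "yotta"]
IEC_PREFIXES = ["kibi", "mebi", "gibi", "tebi", "pebi", "exbi", "zebi", "yobi"]


def si_bits(n: int) -> int:
    """Number of bits in the n-th SI decimal multiple (n=1 is one kilobit)."""
    return 10 ** (3 * n)


def iec_bits(n: int) -> int:
    """Number of bits in the n-th IEC binary multiple (n=1 is one kibibit)."""
    return 2 ** (10 * n)


def prefix_gap(n: int) -> float:
    """Fraction by which the n-th binary multiple exceeds the decimal one."""
    return iec_bits(n) / si_bits(n) - 1.0


if __name__ == "__main__":
    for n, (si, iec) in enumerate(zip(SI_PREFIXES, IEC_PREFIXES), start=1):
        print(f"{si}bit = 10^{3 * n:<2}  {iec}bit = 2^{10 * n:<2}  "
              f"gap = {prefix_gap(n):.1%}")
```

The gap is about 2.4% between a kilobit and a kibibit and grows to roughly 21% between a yottabit and a yobibit, which is why the two prefix families are kept distinct.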

Byte
◼ A byte is commonly used as the unit for measuring computer information, regardless of the type of data being stored. One byte is eight bits. The name is often explained as an abbreviation of "Binary Term", although the word was coined by Werner Buchholz in 1956 as a deliberate respelling of "bite".
[Figure: cover of BYTE magazine, "the small systems journal", issue #1, September 1975]


History of "Information"
◼ Latin origin: a representation implanted in the mind -> idea
◼ Language and coding: hide information in messages and then decode them, for example Morse code
◼ Mathematics: in his work on channel transmission, Shannon defined the information content of a message as the negative base-2 logarithm of its probability of occurring at the source, measured in bits.
◼ Logic and linguistics: the communication-oriented sense of information involves semantic meaning and knowledge
◼ Society: information as something that is contained in the message used to inform. "Information is the tennis ball of communication."
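Shannon's measure can be sketched in a few lines of Python. The sketch uses the negative base-2 logarithm of a message's probability, so rarer messages carry more bits. The function names `self_information` and `entropy` are chosen here for illustration and are not from the slides.

```python
import math
from typing import Sequence


def self_information(p: float) -> float:
    """Information content, in bits, of a message that occurs with probability p."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must be in the interval (0, 1]")
    return -math.log2(p)


def entropy(probs: Sequence[float]) -> float:
    """Average information, in bits per message, of a source with these probabilities."""
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    # Outcomes with probability 0 never occur and contribute nothing.
    return sum(p * self_information(p) for p in probs if p > 0.0)


if __name__ == "__main__":
    print(self_information(0.5))   # one of two equally likely messages
    print(self_information(0.25))  # one of four equally likely messages
    print(entropy([0.5, 0.5]))     # a fair two-way choice
```

A message with probability 0.5 carries exactly one bit, which matches the description of a bit as a choice between two equally likely alternatives.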

[Figure: examples of data volumes (source: www.intel.com). Entries: Human Genomics (7000 PB); Wikipedia (10 GB); World Wide Web (1 PB); Particle Physics, Large Hadron Collider (15 PB); Personal Digital Photos (1000 PB+); Annual Email Traffic, no spam (300 PB+); Internet Archive (1 PB+); Estimated On-line RAM in Google (8 PB); 200 of London's Traffic Cams (8 TB/day); 2004 Walmart Transaction DB (500 TB); Typical Oil Company (350 TB+); Merck Bio Research DB (1.5 TB/qtr); UPMC Hospitals Imaging Data (500 TB/yr); MIT Babytalk Speech Experiment (1.4 PB); Terashake Earthquake Model of LA Basin (1 PB); One Day of Instant Messaging in 2002 (750 GB). Several entries are annotated with growth rates of 100% or 200% CAGR. Total digital data to be created this year: 270,000 PB (IDC).]


How much data?
◼ Google processes 20 PB a day (2008)
◼ Wayback Machine has 3 PB + 100 TB/month (3/2009)
◼ Facebook has 2.5 PB of user data + 15 TB/day (4/2009)
◼ eBay has 6.5 PB of user data + 50 TB/day (5/2009)
◼ CERN's LHC will generate 15 PB a year (??)
"640K ought to be enough for anybody."
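To get a feel for these volumes, the following Python sketch converts a daily volume into a sustained rate and estimates how long a full scan would take. It assumes decimal units (1 PB = 10^15 bytes), a hypothetical read rate of 100 MB/s per machine, and perfectly parallel reads. The names `ASSUMED_DISK_MB_PER_S`, `sustained_rate_gb_per_s`, and `days_to_scan` are illustrative and not from the slides.

```python
PB = 10 ** 15  # decimal petabyte, in bytes
GB = 10 ** 9
MB = 10 ** 6
SECONDS_PER_DAY = 24 * 60 * 60

# Assumption for illustration only: sequential read rate of one machine.
ASSUMED_DISK_MB_PER_S = 100.0


def sustained_rate_gb_per_s(pb_per_day: float) -> float:
    """Average rate, in GB/s, needed to process pb_per_day petabytes every day."""
    return pb_per_day * PB / SECONDS_PER_DAY / GB


def days_to_scan(petabytes: float, machines: int,
                 mb_per_s: float = ASSUMED_DISK_MB_PER_S) -> float:
    """Days needed to read the data once, assuming perfectly parallel reads."""
    total_bytes_per_s = machines * mb_per_s * MB
    return petabytes * PB / total_bytes_per_s / SECONDS_PER_DAY


if __name__ == "__main__":
    print(f"20 PB/day is about {sustained_rate_gb_per_s(20):.0f} GB/s sustained")
    for machines in (1, 1_000, 100_000):
        print(f"{machines:>7} machine(s): {days_to_scan(20, machines):.2f} days")
```

Under these assumptions a single machine would need more than six years to read one day's 20 PB, while 1,000 machines reading in parallel would need a little over two days.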

"We are living in exponential times"


Information Overload
◼ Political theorist Neil Postman spoke to the German Informatics Society in 1990, claiming that we are informing ourselves to death. He argued that the development of computer technology is not as positive as it has been heralded to be. With our focus on technology, we are forfeiting our humanity. We are drowning in information that contains empty promises of improving our lives (Postman 1990).


How do we cope with information overload?


What's the matter with ME?!
◼ What do you want to do with 1,000 PCs, or even 100,000 PCs?
