Shanghai Jiao Tong University: Multicore Architecture and Parallel Computing, course teaching resources (PPT lecture slides), Lecture 9 MapReduce

Document format: PPTX; pages: 56; file size: 2.6 MB

CS427 Multicore Architecture and Parallel Computing
Lecture 9 MapReduce
Prof. Li Jiang
2014/11/19

What is MapReduce
• Originated at Google [OSDI'04]
• A simple programming model
• Functional model
• For large-scale data processing
  – Exploits a large set of commodity computers
  – Executes processes in a distributed manner
  – Offers high availability

Motivation
• Large-scale data processing
  – Want to use 1000s of CPUs
  – But don't want the hassle of managing things
• MapReduce provides
  – Automatic parallelization & distribution
  – Fault tolerance
  – I/O scheduling
  – Monitoring & status updates

Benefit of MapReduce
• Map/Reduce
  – Programming model from Lisp
  – (and other functional languages)
• Many problems can be phrased this way
• Easy to distribute across nodes
• Nice retry/failure semantics

Distributed Word Count
[Figure: very big data → split data (4 splits) → count (one per split) → per-split counts → merge → merged count]
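The split, count, and merge stages in the figure can be sketched on a single machine as follows. This is only an illustration under stated assumptions: the function names (`split_data`, `count`, `merge`, `distributed_word_count`) are hypothetical, the splits are processed sequentially rather than on separate workers, and words are assumed to be whitespace-separated.

```python
from collections import Counter

def split_data(words, n_splits):
    """Cut the word list into at most n_splits contiguous chunks."""
    size = max(1, -(-len(words) // n_splits))  # ceiling division, at least 1
    return [words[i:i + size] for i in range(0, len(words), size)]

def count(chunk):
    """Count the words of one split (the per-split 'count' box)."""
    return Counter(chunk)

def merge(partial_counts):
    """Add the per-split counts together into the merged count."""
    total = Counter()
    for partial in partial_counts:
        total.update(partial)  # Counter.update adds counts, it does not overwrite
    return total

def distributed_word_count(text, n_splits=4):
    """Hypothetical end-to-end pipeline: split -> count -> merge."""
    words = text.split()
    return merge(count(chunk) for chunk in split_data(words, n_splits))

if __name__ == "__main__":
    print(distributed_word_count("a b a c b a"))
```

Because each `count` call only sees its own chunk, the calls are independent and could run in parallel; only `merge` needs all partial results.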

Distributed Grep
[Figure: very big data → split data (4 splits) → grep (one per split) → matches → cat → all matches]
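The same shape applies to grep: each split is searched independently and the per-split matches are concatenated. The sketch below is a single-process illustration, not a cluster implementation; the function names are hypothetical and the pattern is assumed to be a Python regular expression.

```python
import re

def split_lines(lines, n_splits):
    """Cut the list of lines into at most n_splits contiguous chunks."""
    size = max(1, -(-len(lines) // n_splits))  # ceiling division, at least 1
    return [lines[i:i + size] for i in range(0, len(lines), size)]

def grep(pattern, chunk):
    """Return the lines of one split that contain a match (the 'grep' box)."""
    return [line for line in chunk if re.search(pattern, line)]

def cat(match_lists):
    """Concatenate the per-split match lists (the 'cat' box)."""
    return [line for matches in match_lists for line in matches]

def distributed_grep(pattern, lines, n_splits=4):
    """Hypothetical end-to-end pipeline: split -> grep -> cat."""
    return cat(grep(pattern, chunk) for chunk in split_lines(lines, n_splits))

if __name__ == "__main__":
    sample = ["map one", "reduce two", "map three", "other"]
    print(distributed_grep(r"map", sample, n_splits=2))
```

Here the merge step is plain concatenation, so no per-key aggregation is needed after the parallel stage.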

Map+Reduce
• Map
  – Accepts input key/value pair
  – Emits intermediate key/value pair
• Reduce
  – Accepts intermediate key/value* pair
  – Emits output key/value pair
[Figure: very big data → MAP → partitioning function → REDUCE → result]
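The slide names a partitioning function between the map and reduce stages but does not define it. A minimal sketch, assuming the common default of hash(key) mod R with R reducers, might look like this. The names `partition` and `shuffle` are hypothetical, keys are assumed to be strings, and CRC32 stands in for the hash so that results are stable across Python processes.

```python
import zlib
from collections import defaultdict

def partition(key, n_reducers):
    """Assumed default: hash(key) mod R. CRC32 is used because Python's
    built-in hash() of a str changes between interpreter runs."""
    return zlib.crc32(key.encode("utf-8")) % n_reducers

def shuffle(intermediate_pairs, n_reducers):
    """Route each intermediate (key, value) pair to one reducer and group
    the values by key, giving one {key: [values]} dict per reducer."""
    buckets = [defaultdict(list) for _ in range(n_reducers)]
    for key, value in intermediate_pairs:
        buckets[partition(key, n_reducers)][key].append(value)
    return buckets

if __name__ == "__main__":
    pairs = [("a", 1), ("b", 1), ("a", 1)]
    for reducer_id, bucket in enumerate(shuffle(pairs, 3)):
        print(reducer_id, dict(bucket))
```

The property that matters is that every pair with the same key lands on the same reducer, which is what lets reduce see the full value list (the key/value* pair) for that key.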

Map+Reduce
• map(key, val) is run on each item in set
  – emits new-key / new-val pairs
• reduce(key, vals) is run for each unique key emitted by map()
  – emits final output
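To make the calling convention concrete, here is a small in-memory driver that runs a map function on every item, groups the emitted values by key, and then runs a reduce function once per unique key. It is a sketch only: `run_mapreduce`, `identity_map`, and `sum_reduce` are hypothetical names, and nothing here is distributed or fault tolerant.

```python
from collections import defaultdict

def run_mapreduce(items, map_fn, reduce_fn):
    """items: iterable of (key, val) pairs.
    map_fn(key, val) yields (new_key, new_val) pairs.
    reduce_fn(new_key, vals) yields final output records."""
    intermediate = defaultdict(list)
    for key, val in items:                      # map is run on each item
        for new_key, new_val in map_fn(key, val):
            intermediate[new_key].append(new_val)
    output = []
    for new_key, vals in intermediate.items():  # reduce is run once per unique key
        output.extend(reduce_fn(new_key, vals))
    return output

def identity_map(key, val):
    """Example map: pass each pair through unchanged."""
    yield (key, val)

def sum_reduce(key, vals):
    """Example reduce: sum all values seen for a key."""
    yield (key, sum(vals))

if __name__ == "__main__":
    data = [("x", 1), ("y", 2), ("x", 3)]
    print(run_mapreduce(data, identity_map, sum_reduce))  # [('x', 4), ('y', 2)]
```

The grouping dictionary plays the role of the shuffle between the two phases; a real system would spread it across machines.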

Square Sum
• (map f list [list2 list3 …])
• (map square '(1 2 3 4))
  – (1 4 9 16)
• (reduce + '(1 4 9 16))
  – (+ 16 (+ 9 (+ 4 1)))
  – 30
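For readers more familiar with Python than Lisp, the same square-sum computation can be written with the built-in `map` and `functools.reduce`. The helper name `square` mirrors the slide. Note that `functools.reduce` folds from the left, ((1 + 4) + 9) + 16, whereas the slide shows the nesting (+ 16 (+ 9 (+ 4 1))); for addition both orders give the same result.

```python
from functools import reduce
import operator

def square(x):
    """Unary function applied to each element by map."""
    return x * x

# (map square '(1 2 3 4))
squares = list(map(square, [1, 2, 3, 4]))   # [1, 4, 9, 16]

# (reduce + '(1 4 9 16))
total = reduce(operator.add, squares)       # 30

# (map f list list2): with several lists, f receives one element from each
pairwise = list(map(operator.add, [1, 2], [10, 20]))  # [11, 22]

if __name__ == "__main__":
    print(squares, total, pairwise)
```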

Word Count
– Input consists of (url, contents) pairs
– map(key=url, val=contents):
  • For each word w in contents, emit (w, "1")
– reduce(key=word, values=uniq_counts):
  • Sum all "1"s in values list
  • Emit result "(word, sum)"
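The pseudocode above translates almost directly into Python. The sketch below keeps the slide's convention of emitting the string "1" for each word. The small `word_count` driver that groups values by key is an assumption added so the example runs on its own, and the URLs in the usage lines are placeholders.

```python
from collections import defaultdict

def map_fn(url, contents):
    """map(key=url, val=contents): emit (w, "1") for each word w."""
    for w in contents.split():
        yield (w, "1")

def reduce_fn(word, values):
    """reduce(key=word, values=uniq_counts): sum all "1"s and emit (word, sum)."""
    yield (word, sum(int(v) for v in values))

def word_count(pages):
    """Hypothetical in-memory driver: run map on every (url, contents) pair,
    group the emitted values by word, then run reduce once per word."""
    grouped = defaultdict(list)
    for url, contents in pages:
        for w, one in map_fn(url, contents):
            grouped[w].append(one)
    return dict(pair for w, vals in grouped.items() for pair in reduce_fn(w, vals))

if __name__ == "__main__":
    pages = [("http://example.com/1", "a b a"), ("http://example.com/2", "b c")]
    print(word_count(pages))  # {'a': 2, 'b': 2, 'c': 1}
```

The map function ignores the url key entirely; it is carried along only because the model always passes a key/value pair.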
