
Big Data Integration (PPT lecture slides)

Resource category: document library; format: PPTX; pages: 109; file size: 9.87 MB
• Motivation
  – Why do we need big data integration?
  – How has “small” data integration been done?
  – Challenges in big data integration
• Schema alignment
• Record linkage
• Data fusion
• Emerging topics

Big Data Integration
Xin Luna Dong (Google Inc.)
Divesh Srivastava (AT&T Labs-Research)

What is “Big Data Integration?”
• Big data integration = Big data + data integration
• Data integration: easy access to multiple data sources [DHI12]
  – Virtual: mediated schema, query reformulation, link + fuse answers
  – Warehouse: materialized data, easy querying, consistency issues
• Big data: all about the V’s ☺
  – Size: large volume of data, collected and analyzed at high velocity
  – Complexity: huge variety of data, of questionable veracity
  – Utility: data of considerable value

What is “Big Data Integration?”
• Big data integration = Big data + data integration
• Data integration: easy access to multiple data sources [DHI12]
  – Virtual: mediated schema, query reformulation, link + fuse answers
  – Warehouse: materialized data, easy querying, consistency issues
• Big data in the context of data integration: still about the V’s ☺
  – Size: large volume of sources, changing at high velocity
  – Complexity: huge variety of sources, of questionable veracity
  – Utility: sources of considerable value
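
The “virtual” style of data integration named above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the slides: the two toy sources, their attribute names and values, the `MAPPINGS` table, and the helper names `reformulate` and `query`. Linking is done by a naive lowercase name match, and fusion simply keeps the first value seen.

```python
from collections import defaultdict

# Hypothetical sources, each with its own local schema (toy values).
SOURCE_A = [
    {"title": "Taj Mahal", "city": "Agra"},
    {"title": "Eiffel Tower", "city": "Paris"},
]
SOURCE_B = [
    {"name": "taj mahal", "built": 1653},
    {"name": "Eiffel Tower", "built": 1889},
]
SOURCES = {"A": SOURCE_A, "B": SOURCE_B}

# Mediated-schema attribute -> local attribute, per source (assumed mappings).
MAPPINGS = {
    "A": {"name": "title", "location": "city"},
    "B": {"name": "name", "year_built": "built"},
}


def reformulate(source_id, wanted):
    """Rewrite mediated attributes into the local attributes this source can answer."""
    mapping = MAPPINGS[source_id]
    return {med: mapping[med] for med in wanted if med in mapping}


def query(wanted):
    """Answer a mediated-schema query: ask each source, then link and fuse."""
    fused = defaultdict(dict)
    for source_id, rows in SOURCES.items():
        local = reformulate(source_id, ["name"] + wanted)
        for row in rows:
            # Link: a normalised name decides which records describe the same entity.
            key = row[local["name"]].strip().lower()
            for med, loc in local.items():
                # Fuse: keep the first value seen for each mediated attribute.
                fused[key].setdefault(med, row[loc])
    return dict(fused)


result = query(["location", "year_built"])
```

A warehouse-style system would instead run the same mapping once, store the fused rows, and answer queries from the stored copy, which is where the consistency issues mentioned on the slide come from.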

Outline
• Motivation
  – Why do we need big data integration?
  – How has “small” data integration been done?
  – Challenges in big data integration
• Schema alignment
• Record linkage
• Data fusion
• Emerging topics

Why Do We Need “Big Data Integration?”
• Building web-scale knowledge bases
[Figure: logos of web-scale knowledge bases: Google knowledge graph, Freebase, MSR knowledge base (ProBase; “A Little Knowledge Goes a Long Way”), NELL]

Why Do We Need “Big Data Integration?”
• Reasoning over linked data

Why Do We Need “Big Data Integration?”
• Geo-spatial data fusion
[Figure: Geospatial Data Fusion. Source: http://axiomamuse.wordpress.com/2011/04/18/]

Why Do We Need “Big Data Integration?”
• Scientific data analysis
[Figure labels: Genes, Genotypes, Disease Models, Expression, Recombinases (cre), Function, Pathways, Strains/SNPs, Orthology, Tumors. Source: http://scienceline.org/2012/01/from-index-cards-to-information-overload/]

Outline
• Motivation
  – Why do we need big data integration?
  – How has “small” data integration been done?
  – Challenges in big data integration
• Schema alignment
• Record linkage
• Data fusion
• Emerging topics

“Small” Data Integration: What Is It?
• Data integration = solving lots of jigsaw puzzles
  – Each jigsaw puzzle (e.g., Taj Mahal) is an integrated entity
  – Each piece of a puzzle comes from some source
  – Small data integration → solving small puzzles
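
The jigsaw analogy can be made concrete with a small sketch in which each “piece” is one attribute value contributed by one source, and a “puzzle” is the assembled entity. The piece format, the source names, and the `assemble` helper are hypothetical, introduced only for illustration. The sketch also assumes each piece already names the entity it belongs to, which is the part that is hard in practice.

```python
from collections import defaultdict

# Hypothetical pieces: (source, entity, attribute, value).
PIECES = [
    ("source_1", "Taj Mahal", "city", "Agra"),
    ("source_2", "Taj Mahal", "country", "India"),
    ("source_3", "Taj Mahal", "material", "marble"),
    ("source_2", "Eiffel Tower", "city", "Paris"),
]


def assemble(pieces):
    """Group pieces by entity, remembering which source supplied each attribute."""
    puzzles = defaultdict(dict)
    for source, entity, attribute, value in pieces:
        puzzles[entity][attribute] = (value, source)
    return dict(puzzles)


puzzles = assemble(PIECES)
```

Keeping the source next to each value preserves provenance, so a later step can still tell which source contributed which piece of the assembled entity.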
