
Chinese Academy of Sciences: Thematic CERN School of Computing, "T-CSC Data Storage" course teaching resources (lecture notes): Many ways to store data-pres

Resource type: library document; format: PDF; pages: 81; file size: 421.32 KB

Many ways to store data
Sébastien Ponce, sebastien.ponce@cern.ch
CERN
Thematic CERN School of Computing 2018

Overall Course Structure

Many ways to store data
- Storage devices and their specificities
- Distributing and parallelizing storage

Preserving data
- Data consistency
- Data safety

Key ingredients to achieve efficient I/O
- Synchronous vs asynchronous I/O
- I/O optimizations and caching

Outline

1 Storage devices
  - Existing devices
  - Hierarchical storage
2 Distributed storage
  - Data distribution
  - Data federation
3 Parallelizing files' storage
  - Striping
  - Introduction to Map/Reduce
4 Conclusion

Storage devices

1 Storage devices
  - Existing devices
  - Hierarchical storage
2 Distributed storage
3 Parallelizing files' storage
4 Conclusion

A variety of storage devices

Main differences
- Capacities from 1 GB to 10 TB per unit
- Prices from 1 to 300 for the same capacity
- Very different reliability
- Very different speeds

Typical numbers in 2018

        Capacity per unit   Latency   $/TB    Speed      Reliability
RAM     16 GB               5 ns      9000    10 GB/s    volatile
SSD     500 GB              10 µs     300     550 MB/s   poor
HD      6 TB                3 ms      25      150 MB/s   average
Tape    10 TB               100 s     20      500 MB/s   good
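As a rough illustration of what the 2018 figures above imply, the following sketch computes the raw media cost of 1 PB and the time to stream 1 TB sequentially from a single unit of each device type. The prices and speeds are the ones from the table; the `DEVICES` dictionary and the function names are illustrative assumptions, and the calculation ignores latency, redundancy and operating costs.

```python
# Typical 2018 figures from the table above: ($ per TB, sequential speed in MB/s).
DEVICES = {
    "RAM":  (9000, 10_000),
    "SSD":  (300, 550),
    "HD":   (25, 150),
    "Tape": (20, 500),
}

def cost_for_capacity(price_per_tb, capacity_tb):
    """Raw media cost in $ for the given capacity (no redundancy, no servers)."""
    return price_per_tb * capacity_tb

def stream_time_s(speed_mb_s, size_tb):
    """Seconds to read size_tb sequentially from one device, ignoring latency."""
    size_mb = size_tb * 1_000_000  # decimal units: 1 TB = 10^6 MB
    return size_mb / speed_mb_s

if __name__ == "__main__":
    for name, (price, speed) in DEVICES.items():
        cost = cost_for_capacity(price, 1000)   # 1 PB = 1000 TB
        hours = stream_time_s(speed, 1) / 3600  # 1 TB from a single unit
        print(f"{name:5s} 1 PB costs ${cost:>9,}  1 TB streams in {hours:5.2f} h")
```

Under these assumptions a single hard disk needs close to two hours to stream 1 TB, which is one reason large installations spread data over many devices.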

A variety of storage devices

You cannot have everything
[Figure: RAM, SSD, HD and Tape positioned relative to three competing goals: cheap, reliability, speed]

Reliability in real world (CERN)

For disks
- probability of losing a disk per year: a few %, up to 10%
  - with 60K disks, it's around 10 per day
  - and all files on a lost disk are lost
- one unrecoverable bit error in 10^14 bits read/written
  - for 10 GB files, that's one file corrupted per 1000 files written

For tapes
- probability of losing a tape per year: 10^-4
  - and you recover most of the data on it
  - net result is 10^-7 file loss per year
- one unrecoverable bit error in 10^19 bits read/written
  - for 10 GB files, that's one file corrupted per 100M files written
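The orders of magnitude above can be checked with a few lines of arithmetic. This is a minimal sketch: the function names are illustrative, and the 3% and 5% failure rates are assumed sample values for "a few %", not figures from the lecture.

```python
def disk_failures_per_day(n_disks, annual_failure_prob):
    """Expected number of disks lost per day across the whole fleet."""
    return n_disks * annual_failure_prob / 365

def files_per_corruption(bits_per_error, file_size_bytes):
    """Average number of files written before one hits an unrecoverable bit error."""
    bits_per_file = file_size_bytes * 8
    return bits_per_error / bits_per_file

if __name__ == "__main__":
    ten_gb = 10 * 10**9  # 10 GB in bytes (decimal units)
    # 10% is the upper bound quoted above; 3% and 5% are assumed sample values.
    for p in (0.03, 0.05, 0.10):
        lost = disk_failures_per_day(60_000, p)
        print(f"60K disks at {p:.0%}/year: {lost:.1f} lost per day")
    disk_files = files_per_corruption(10**14, ten_gb)
    tape_files = files_per_corruption(10**19, ten_gb)
    print(f"disk, 1 error per 10^14 bits: 1 file in {disk_files:,.0f}")
    print(f"tape, 1 error per 10^19 bits: 1 file in {tape_files:,.0f}")
```

At 5% per year this gives about 8 disks lost per day, and the bit error rates give one corrupted file in 1250 for disks and one in 125 million for tapes, consistent with the rounded figures quoted above.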

Practical Mass Storage - Real Big Data
when you count in 100s of PetaBytes...

The constraints
- disks or tapes are the only possible solutions
- disks are unreliable at that scale, and need redundancy
  - we'll see that extensively
- tapes are cheaper long term storage by factor 2-2.5
- tape latency forces data access to go through disk
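To see why tape latency pushes data access onto disk, one can compare the throughput a reader actually observes once the initial latency is included. The sketch below reuses the HD and tape latency and speed from the 2018 table in this lecture; the simple "latency plus transfer" model and the function names are assumptions made for illustration.

```python
# (latency in s, sequential speed in MB/s), from the 2018 table in this lecture.
HD = (3e-3, 150)
TAPE = (100, 500)

def access_time_s(device, size_mb):
    """Time to fetch one file: initial latency plus sequential transfer."""
    latency_s, speed_mb_s = device
    return latency_s + size_mb / speed_mb_s

def effective_speed_mb_s(device, size_mb):
    """Throughput actually seen by the reader once latency is included."""
    return size_mb / access_time_s(device, size_mb)

if __name__ == "__main__":
    for size_mb in (1, 1_000, 10_000, 1_000_000):
        hd = effective_speed_mb_s(HD, size_mb)
        tape = effective_speed_mb_s(TAPE, size_mb)
        print(f"{size_mb:>9,} MB file: HD {hd:7.2f} MB/s, tape {tape:7.2f} MB/s")
```

In this model a 10 GB file read from tape takes 120 s, an effective rate of about 83 MB/s, and the nominal 500 MB/s is only approached for files of the order of 1 TB.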
