The Google File System

Sanjay Ghemawat, Howard Gobioff, and Shun-Tak Leung
Google*

ABSTRACT

We have designed and implemented the Google File System, a scalable distributed file system for large distributed data-intensive applications. It provides fault tolerance while running on inexpensive commodity hardware, and it delivers high aggregate performance to a large number of clients.

While sharing many of the same goals as previous distributed file systems, our design has been driven by observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system assumptions. This has led us to reexamine traditional choices and explore radically different design points.

The file system has successfully met our storage needs. It is widely deployed within Google as the storage platform for the generation and processing of data used by our service as well as research and development efforts that require large data sets. The largest cluster to date provides hundreds of terabytes of storage across thousands of disks on over a thousand machines, and it is concurrently accessed by hundreds of clients.

In this paper, we present file system interface extensions designed to support distributed applications, discuss many aspects of our design, and report measurements from both micro-benchmarks and real world use.

Categories and Subject Descriptors

D [4]: 3—Distributed file systems

General Terms

Design, reliability, performance, measurement

Keywords

Fault tolerance, scalability, data storage, clustered storage

*The authors can be reached at the following addresses: {sanjay,hgobioff,shuntak}@google.com.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
SOSP'03, October 19-22, 2003, Bolton Landing, New York, USA.
Copyright 2003 ACM 1-58113-757-5/03/0010 ...$5.00.

1. INTRODUCTION

We have designed and implemented the Google File System (GFS) to meet the rapidly growing demands of Google's data processing needs. GFS shares many of the same goals as previous distributed file systems such as performance, scalability, reliability, and availability. However, its design has been driven by key observations of our application workloads and technological environment, both current and anticipated, that reflect a marked departure from some earlier file system design assumptions. We have reexamined traditional choices and explored radically different points in the design space.

First, component failures are the norm rather than the exception. The file system consists of hundreds or even thousands of storage machines built from inexpensive commodity parts and is accessed by a comparable number of client machines. The quantity and quality of the components virtually guarantee that some are not functional at any given time and some will not recover from their current failures. We have seen problems caused by application bugs, operating system bugs, human errors, and the failures of disks, memory, connectors, networking, and power supplies. Therefore, constant monitoring, error detection, fault tolerance, and automatic recovery must be integral to the system.
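A back-of-the-envelope calculation makes this concrete. The sketch below uses an assumed, illustrative per-machine daily failure probability rather than a measurement from our clusters, but it shows why a cluster of this size is essentially never fully healthy:

```cpp
#include <cmath>
#include <cstdio>

// Illustrative only: the failure probability below is an assumed figure,
// not a measurement from any real cluster.
int main() {
    const int machines = 1000;         // a cluster of a thousand machines
    const double p_fail_daily = 0.01;  // assumed chance a given machine fails on a given day

    // Expected number of machine failures per day across the cluster.
    double expected_failures = machines * p_fail_daily;

    // Probability that every machine stays healthy for a whole day.
    double all_healthy = std::pow(1.0 - p_fail_daily, machines);

    std::printf("expected failures per day: %.1f\n", expected_failures);  // 10.0
    std::printf("P(no failures today):      %.6f\n", all_healthy);        // ~0.000043
    return 0;
}
```

Even at this modest assumed rate, roughly ten machines fail on a typical day and the chance of a fully healthy day is well under one in ten thousand, so recovery cannot be an occasional, manual activity.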
Second, files are huge by traditional standards. Multi-GB files are common. Each file typically contains many application objects such as web documents. When we are regularly working with fast growing data sets of many TBs comprising billions of objects, it is unwieldy to manage billions of approximately KB-sized files even when the file system could support it. As a result, design assumptions and parameters such as I/O operation and block sizes have to be revisited.

Third, most files are mutated by appending new data rather than overwriting existing data. Random writes within a file are practically non-existent. Once written, the files are only read, and often only sequentially. A variety of data share these characteristics. Some may constitute large repositories that data analysis programs scan through. Some may be data streams continuously generated by running applications. Some may be archival data. Some may be intermediate results produced on one machine and processed on another, whether simultaneously or later in time. Given this access pattern on huge files, appending becomes the focus of performance optimization and atomicity guarantees, while caching data blocks in the client loses its appeal.
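The sketch below illustrates this access pattern with a hypothetical, in-memory stand-in for a file handle (the actual GFS interface extensions are presented later in the paper): producers only ever append records, and consumers stream through the data once, sequentially.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical in-memory stand-in for a file handle, for illustration only.
class AppendOnlyFile {
 public:
  // Append one record at the end of the file. Callers never overwrite
  // existing data or write at arbitrary offsets.
  void Append(const std::string& record) { records_.push_back(record); }

  // Sequentially read the next record; returns false at end of file.
  bool ReadNext(std::string* record) {
    if (cursor_ >= records_.size()) return false;
    *record = records_[cursor_++];
    return true;
  }

 private:
  std::vector<std::string> records_;
  std::size_t cursor_ = 0;
};

int main() {
  AppendOnlyFile file;

  // Producer side: a continuously generated data stream (e.g. crawled
  // documents) is only ever appended, never edited in place.
  for (const std::string& doc : {"doc-a", "doc-b", "doc-c"}) {
    file.Append(doc);
  }

  // Consumer side: an analysis program scans the whole file once, front to
  // back, which is why caching data blocks in the client buys little.
  std::string record;
  long long bytes_scanned = 0;
  while (file.ReadNext(&record)) {
    bytes_scanned += static_cast<long long>(record.size());
  }
  std::printf("scanned %lld bytes sequentially\n", bytes_scanned);
  return 0;
}
```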
Fourth, co-designing the applications and the file system API benefits the overall system by increasing our flexibility.