
Building High-Performance and Cost-Effective Storage Systems with Flash Memory based Solid State Drives

Evolution of Storage and New Demand
• Hard disk drive (HDD)
– Major storage device since 1956
• Merits
– Large capacity, low cost
– Most commonly used storage
• Mechanical nature
– Unsatisfactory performance
– High power consumption
• Timeline
– 1956: IBM 305 RAMAC computer with hard disk (5MB/1,200RPM)
– 1973: IBM 3340, 35-70MB
– 2007: Hitachi GST Deskstar 7K1000, 1st 1TB
– Emerging and future: SSD -100x
[Figure: disk platter layout (tracks, sectors, spindle, read/write head, actuator arm). Source: Computer Architecture, Memory System Design, B. Parhami, UCSB]
[Figure: performance gap between DRAM and disk, access time in cycles, 1980-2000. Source: Bryant and O'Hallaron, "Computer Systems: A Programmer's Perspective", Prentice Hall, 2003]

High power/latency from mechanical operations
• Disk access time is dominated by mechanical operations
– Seek (57%) + rotation (27%) + data fetch (7%) + other overhead (6%) = 97%
– Data transfer time is only 3%
Source: Configuration and Capacity Planning for Solaris Servers, Sun Microsystems
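The breakdown above can be checked with a few lines of arithmetic. This is a minimal sketch: the percentages are the ones quoted on the slide, while the names `breakdown_pct`, `overhead_pct`, and `access_time_ms` are hypothetical and exist only for this illustration.

```python
# Percentage breakdown of a random disk access, as quoted on the slide.
breakdown_pct = {
    "seek": 57,
    "rotation": 27,
    "data_fetch": 7,
    "other_overhead": 6,
    "data_transfer": 3,
}


def overhead_pct(parts):
    """Sum every component except the actual data transfer."""
    return sum(v for k, v in parts.items() if k != "data_transfer")


def access_time_ms(transfer_ms, parts):
    """Scale a transfer time up to a full access time using the breakdown.

    Assumes the transfer accounts for exactly parts['data_transfer'] percent
    of the whole access, which is a simplification for illustration.
    """
    return transfer_ms * 100 / parts["data_transfer"]


# The five components should cover the whole access.
assert sum(breakdown_pct.values()) == 100

print("overhead share (%):", overhead_pct(breakdown_pct))
print("full access for a 3 ms transfer (ms):", access_time_ms(3.0, breakdown_pct))
```

Put differently, under this breakdown a random access takes roughly 33 times as long as the data transfer it delivers.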

A Scientific Discovery Started a Revolution in Disks
• Giant magnetoresistance (GMR) was discovered in 1988
– By Peter Gruenberg (Germany) and Albert Fert (France)
– GMR: giant resistance changes in materials made of alternating, very thin (nanometer-scale) layers when exposed to magnetic fields
– This discovery laid the foundation for increasing HDD density
– The first GMR-based commercial HDD, a 16 GB drive from IBM, appeared in 1997
– Starting in 2007, 1,000+ GB (terabyte) HDDs are available in the market
– Next-generation fast, high-density memory: magnetoresistive RAM
– Gruenberg and Fert received the 2007 Nobel Prize in Physics for GMR

Evolution of the 5 Minute Rule
• First version: Jim Gray and Franco Putzolu (1987, SIGMOD)
– Background: disk capacity is low and expensive, latency is not an issue
– Accessing 1 KB of data on disk costs $2,000, but only $5 in main memory
– Rule: pages referenced every 5 minutes should be memory resident
• Second version: Jim Gray and P. Shenoy (2000, ICDE)
– Background: capacity is up 1,000x, bandwidth only 40x, very low price
– The 5 minute rule becomes a caching rule for performance because:
– (1) Disk accesses slow 10x per decade; (2) disk scanning time increases
• A recent version: G. Graefe (CACM, 2009)
– Background: SSD is still expensive; disk space is almost free but slow
– For small blocks, the 5 minute rule holds between DRAM and SSD
– For very large blocks, the 5 minute rule holds between SSD and disk
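The 1987 figures above are enough to reproduce the rule as a break-even calculation. The sketch below assumes the slide's $2,000 is the cost of sustaining one disk access per second and the $5 is the cost of keeping a 1 KB page in main memory. That reading is an interpretation, not something the slide states. The function names are hypothetical.

```python
def break_even_seconds(cost_per_access_per_sec, memory_cost_per_page):
    """Reference interval at which caching a page costs as much as re-reading it.

    A page referenced once every T seconds consumes 1/T disk accesses per
    second, so keeping it on disk costs cost_per_access_per_sec / T.
    Setting that equal to memory_cost_per_page gives T.
    """
    return cost_per_access_per_sec / memory_cost_per_page


def keep_in_memory(reference_interval_s, cost_per_access_per_sec, memory_cost_per_page):
    """True if a page touched every reference_interval_s seconds is cheaper to cache."""
    limit = break_even_seconds(cost_per_access_per_sec, memory_cost_per_page)
    return reference_interval_s <= limit


# Slide figures (interpretation assumed, see lead-in): $2,000 vs. $5 per 1 KB page.
interval = break_even_seconds(2000.0, 5.0)
print(f"{interval:.0f} s = {interval / 60:.1f} min")  # 400 s = 6.7 min
```

The result is 400 seconds, the same order as the five minutes in the rule's name. Pages referenced more often than that are cheaper to keep in memory.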

HDD Improvement has been focused on Density
• Huge-capacity disks with low price and small size still have
– Low speed and high energy consumption (current stage)
– High access latency caused by high capacity (for more than 10 years)
• Specific issues and concerns
– Capacity and bandwidth increase significantly, but so does latency
– Space is almost free, but accessing data is increasingly expensive
– Economic model: a disk should be accessed infrequently, as archival storage
– A DRAM buffer can address the performance issues, but not the power consumption
• Fast, low-power storage is highly desirable

Flash Memory based Solid State Drive
• Solid State Drive (SSD)
– A semiconductor device
– No mechanical components
• Technical merits
– Low latency (e.g. 75 µs)
– High bandwidth (e.g. 250 MB/sec)
– Low power: 0.06 W (idle) to 2.4 W (active)
– Shock resistance
– Lifespan: 100 GB/day → 5 years (X25-M)
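The lifespan figure can be turned into a total write volume with simple arithmetic. This sketch uses only the 100 GB/day and 5 year numbers from the slide. The function name, 365-day years, and decimal units (1 TB = 1,000 GB) are assumptions for illustration.

```python
DAYS_PER_YEAR = 365  # assumption: leap days ignored


def total_writes_tb(gb_per_day, years):
    """Total data written over the rated lifespan, in decimal terabytes."""
    total_gb = gb_per_day * DAYS_PER_YEAR * years
    return total_gb / 1000


# Slide figures for the X25-M: 100 GB/day sustained for 5 years.
print(total_writes_tb(100, 5), "TB")  # 182.5 TB
```

So the quoted lifespan corresponds to roughly 180 TB of writes over the life of the drive.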

The SSD cell state changes as voltage changes
• Each SSD cell is made up of transistors
• Electrons are stored in the "floating gate"
– A large voltage difference between source and drain maintains a level of electrons in the floating gate
– The voltage level determines the level of electrons, which is the state of the cell

Single Level Cell (SLC) vs. Multi-level Cell (MLC)
• SLC: two voltage states for 0 and 1 (1 bit per cell)
– Low density, high access speed, high endurance, low power, high cost
• MLC: four voltage states for 00, 01, 10, 11 (2 bits per cell)
– High density, lowered access speed, lowered endurance (10x), increased power, low cost
• TLC: triple level cell, 3 bits in each cell => an enhanced MLC
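The relationship between bits per cell and voltage states is a power of two, which a short sketch makes concrete. The names `CELL_TYPES`, `voltage_states`, and `state_labels` are hypothetical. The bit counts come from the slide.

```python
# Bits stored per cell for each cell type named on the slide.
CELL_TYPES = {"SLC": 1, "MLC": 2, "TLC": 3}


def voltage_states(bits_per_cell):
    """A cell storing n bits must distinguish 2**n voltage states."""
    return 2 ** bits_per_cell


def state_labels(bits_per_cell):
    """Bit patterns represented by each state, e.g. 00, 01, 10, 11 for MLC."""
    return [format(i, f"0{bits_per_cell}b") for i in range(voltage_states(bits_per_cell))]


for name, bits in CELL_TYPES.items():
    print(name, voltage_states(bits), state_labels(bits))
```

Each extra bit doubles the number of states that must fit in the same voltage range, which is consistent with the lowered speed and endurance listed for MLC.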

Flash Memory based Solid State Drive
• Architecture of solid state drives (SSD)
– Host interface logic: SATA, IDE, SCSI, etc.
– SSD controller: processor, buffer manager, flash controller
– Integrated/dedicated RAM buffer
– An array of flash memory packages
[Figure: SSD block diagram. The host connects over IDE/SATA to the host interface logic, which connects to the SSD controller (processor, buffer manager, flash controller), the RAM buffer, and an array of flash memory packages. Adapted from USENIX'08 (Agrawal et al.)]
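The component list above can be sketched as a small data model. This is only an illustration of how the parts relate. The class names, field names, and example sizes (16 MB buffer, four 8 GB packages) are hypothetical and do not describe any real drive.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class FlashPackage:
    capacity_gb: int  # hypothetical per-package capacity


@dataclass
class SSDController:
    # Sub-components named on the slide.
    components: Tuple[str, ...] = ("processor", "buffer manager", "flash controller")


@dataclass
class SSD:
    host_interface: str          # e.g. "SATA", "IDE", "SCSI"
    ram_buffer_mb: int           # integrated or dedicated RAM buffer
    controller: SSDController = field(default_factory=SSDController)
    packages: List[FlashPackage] = field(default_factory=list)

    def capacity_gb(self):
        """Raw capacity is the sum over the array of flash packages."""
        return sum(p.capacity_gb for p in self.packages)


# Example values are made up for illustration.
drive = SSD("SATA", ram_buffer_mb=16, packages=[FlashPackage(8) for _ in range(4)])
print(drive.host_interface, drive.capacity_gb(), "GB")
```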