The Datacenter As A Computer


The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines

Synthesis Lectures on Computer Architecture

Editor: Mark D. Hill, University of Wisconsin, Madison

Synthesis Lectures on Computer Architecture publishes 50 to 150 page publications on topics pertaining to the science and art of designing, analyzing, selecting and interconnecting hardware components to create computers that meet functional, performance and cost goals.

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines. Luiz André Barroso and Urs Hölzle. 2009
Computer Architecture Techniques for Power-Efficiency. Stefanos Kaxiras and Margaret Martonosi. 2008
Chip Multiprocessor Architecture: Techniques to Improve Throughput and Latency. Kunle Olukotun, Lance Hammond, James Laudon. 2007
Transactional Memory. James R. Larus, Ravi Rajwar. 2007
Quantum Computing for Computer Architects. Tzvetan S. Metodi, Frederic T. Chong. 2006

Copyright 2009 by Morgan & Claypool

All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopy, recording, or any other), except for brief quotations in printed reviews, without the prior permission of the publisher.

The Datacenter as a Computer: An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso and Urs Hölzle
www.morganclaypool.com

ISBN: 9781598295566 paperback
ISBN: 9781598295573 ebook
DOI: 10.2200/S00193ED1V01Y200905CAC006

A Publication in the Morgan & Claypool Publishers series
SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE
Lecture #6
Series Editor: Mark D. Hill, University of Wisconsin, Madison
Series ISSN: 1935-3235 (print), 1935-3243 (electronic)

The Datacenter as a Computer
An Introduction to the Design of Warehouse-Scale Machines
Luiz André Barroso and Urs Hölzle
Google Inc.
SYNTHESIS LECTURES ON COMPUTER ARCHITECTURE #6

Abstract

As computation continues to move into the cloud, the computing platform of interest no longer resembles a pizza box or a refrigerator, but a warehouse full of computers. These new large datacenters are quite different from traditional hosting facilities of earlier times and cannot be viewed simply as a collection of co-located servers. Large portions of the hardware and software resources in these facilities must work in concert to efficiently deliver good levels of Internet service performance, something that can only be achieved by a holistic approach to their design and deployment. In other words, we must treat the datacenter itself as one massive warehouse-scale computer (WSC). We describe the architecture of WSCs, the main factors influencing their design, operation, and cost structure, and the characteristics of their software base. We hope it will be useful to architects and programmers of today’s WSCs, as well as those of future many-core platforms which may one day implement the equivalent of today’s WSCs on a single board.

Keywords

computer organization and design, Internet services, energy efficiency, fault-tolerant computing, cluster computing, data centers, distributed systems, cloud computing.

Acknowledgments

While we draw from our direct involvement in Google’s infrastructure design and operation over the past several years, most of what we have learned and now report here is the result of the hard work, the insights, and the creativity of our colleagues at Google. The work of our Platforms Engineering, Hardware Operations, Facilities, Site Reliability and Software Infrastructure teams is most directly related to the topics we cover here, and therefore, we are particularly grateful to them for allowing us to benefit from their experience. Ricardo Bianchini, Fred Chong and Mark Hill provided extremely useful feedback despite being handed a relatively immature early version of the text. Our Google colleagues Jeff Dean and Jimmy Clidaras also provided extensive and particularly useful feedback on earlier drafts. Thanks to the work of Kristin Weissman at Google and Michael Morgan at Morgan & Claypool, we were able to make this lecture available electronically without charge, which was a condition for our accepting this task. We were fortunate that Gerry Kane volunteered his technical writing talent to significantly improve the quality of the text. We would also like to thank Catherine Warner for her proofreading and improvements to the text at various stages. Finally, we are very grateful to Mark Hill and Michael Morgan for inviting us to this project, for their relentless encouragement and much needed prodding, and their seemingly endless patience.

Contents

1. Introduction
  1.1 Warehouse-Scale Computers
  1.2 Emphasis on Cost Efficiency
  1.3 Not Just a Collection of Servers
  1.4 One Datacenter vs. Several Datacenters
  1.5 Why WSCs Might Matter to You
  1.6 Architectural Overview of WSCs
    1.6.1 Storage
    1.6.2 Networking Fabric
    1.6.3 Storage Hierarchy
    1.6.4 Quantifying Latency, Bandwidth, and Capacity
    1.6.5 Power Usage
    1.6.6 Handling Failures
2. Workloads and Software Infrastructure
  2.1 Datacenter vs. Desktop
  2.2 Performance and Availability Toolbox
  2.3 Cluster-Level Infrastructure Software
    2.3.1 Resource Management
    2.3.2 Hardware Abstraction and Other Basic Services
    2.3.3 Deployment and Maintenance
    2.3.4 Programming Frameworks
  2.4 Application-Level Software
    2.4.1 Workload Examples
    2.4.2 Online: Web Search
    2.4.3 Offline: Scholar Article Similarity
  2.5 A Monitoring Infrastructure
    2.5.1 Service-Level Dashboards
    2.5.2 Performance Debugging Tools
    2.5.3 Platform-Level Monitoring
  2.6 Buy vs. Build
  2.7 Further Reading
3. Hardware Building Blocks
  3.1 Cost-Efficient Hardware
    3.1.1 How About Parallel Application Performance?
    3.1.2 How Low-End Can You Go?
    3.1.3 Balanced Designs
4. Datacenter Basics
  4.1 Datacenter Tier Classifications
  4.2 Datacenter Power Systems
    4.2.1 UPS Systems
    4.2.2 Power Distribution Units
  4.3 Datacenter Cooling Systems
    4.3.1 CRAC Units
    4.3.2 Free Cooling
    4.3.3 Air Flow Considerations
    4.3.4 In-Rack Cooling
    4.3.5 Container-Based Datacenters
5. Energy and Power Efficiency
  5.1 Datacenter Energy Efficiency
    5.1.1 Sources of Efficiency Losses in Datacenters
    5.1.2 Improving the Energy Efficiency of Datacenters
  5.2 Measuring the Efficiency of Computing
    5.2.1 Some Useful Benchmarks
    5.2.2 Load vs. Efficiency
  5.3 Energy-Proportional Computing
    5.3.1 Dynamic Power Range of Energy-Proportional Machines
    5.3.2 Causes of Poor Energy Proportionality
    5.3.3 How to Improve Energy Proportionality
  5.4 Relative Effectiveness of Low-Power Modes
  5.5 The Role of Software in Energy Proportionality
  5.6 Datacenter Power Provisioning
    5.6.1 Deployment and Power Management Strategies
    5.6.2 Advantages of Oversubscribing Facility Power
  5.7 Trends in Server Energy Usage
  5.8 Conclusions
    5.8.1 Further Reading
6. Modeling Costs
  6.1 Capital Costs
  6.2 Operational Costs
  6.3 Case Studies
    6.3.1 Real-World Datacenter Costs
    6.3.2 Modeling a Partially Filled Datacenter
7. Dealing with Failures and Repairs
  7.1 Implications of Software-Based Fault Tolerance
  7.2 Categorizing Faults
    7.2.1 Fault Severity
    7.2.2 Causes of Service-Level Faults
  7.3 Machine-Level Failures
    7.3.1 What Causes Machine Crashes?
    7.3.2 Predicting Faults
  7.4 Repairs
  7.5 Tolerating Faults, Not Hiding Them
8. Closing Remarks
  8.1 Hardware
  8.2 Software
  8.3 Economics
  8.4 Key Challenges
    8.4.1 Rapidly Changing Workloads
    8.4.2 Building Balanced Systems from Imbalanced Components
    8.4.3 Curbing Energy Usage
    8.4.4 Amdahl’s Cruel Law
  8.5 Conclusions
References
Author Biographies

Chapter 1: Introduction

The ARPANET is about to turn forty, and the World Wide Web is approaching its 20th anniversary. Yet the Internet technologies that were largely sparked by these two remarkable milestones continue to transform industries and our culture today and show no signs of slowing down. More recently the emergence of such popular Internet services as Web-based email, search and social networks plus the increased worldwide availability of high-speed connectivity have accelerated a trend toward server-side or “cloud” computing.

Increasingly, computing and storage are moving from PC-like clients to large Internet services. While early Internet services were mostly informational, today many Web applications offer services that previously resided in the client, including email, photo and video storage and office applications. The shift toward server-side computing is driven not only by the need for user experience improvements, such as ease of management (no configuration or backups needed) and ubiquity of access (a browser is all you need), but also by the advantages it offers to vendors. Software as a service allows faster application development because it is simpler for software vendors to make changes and improvements. Instead of updating many millions of clients (with a myriad of peculiar hardware and software configurations), vendors need only coordinate improvements and fixes inside their datacenters and can restrict their hardware deployment to a few well-tested configurations. Moreover, datacenter economics allow many application services to run at a low cost per user. For example, servers may be shared among thousands of active users (and many more inactive ones), resulting in better utilization. Similarly, the computation itself may become cheaper in a shared service (e.g., an email attachment received by multiple users can be stored once rather than many times). Finally, servers and storage in a datacenter can be easier to manage than the desktop or laptop equivalent because they are under control of a single, knowledgeable entity.

Some workloads require so much computing capability that they are a more natural fit for a massive computing infrastructure than for client-side computing. Search services (Web, images, etc.) are a prime example of this class of workloads, but applications such as language translation can also run more effectively on large shared computing installations because of their reliance on massive-scale language models.
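
The remark that an email attachment received by multiple users can be stored once rather than many times can be illustrated with a small sketch. It is a toy under stated assumptions, not a description of any real mail service: the class `SingleInstanceStore` and its methods are hypothetical names, and the sketch assumes attachments are identified by a SHA-256 digest of their bytes.

```python
import hashlib


class SingleInstanceStore:
    """Hypothetical attachment store that keeps one copy per distinct payload."""

    def __init__(self):
        self._blobs = {}      # content digest -> attachment bytes (stored once)
        self._mailboxes = {}  # user -> list of digests the user received

    def deliver(self, user, attachment):
        """Record that `user` received `attachment`; store the bytes only if new."""
        digest = hashlib.sha256(attachment).hexdigest()
        self._blobs.setdefault(digest, attachment)
        self._mailboxes.setdefault(user, []).append(digest)
        return digest

    def stored_bytes(self):
        """Bytes actually held by the shared service."""
        return sum(len(blob) for blob in self._blobs.values())

    def logical_bytes(self):
        """Bytes that would be held if every recipient kept a private copy."""
        return sum(
            len(self._blobs[digest])
            for digests in self._mailboxes.values()
            for digest in digests
        )
```

Delivering the same 1,000-byte payload to three users leaves `stored_bytes()` at 1,000 while `logical_bytes()` is 3,000, which is the saving the text alludes to.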

The trend toward server-side computing and the exploding popularity of Internet services have created a new class of computing systems that we have named warehouse-scale computers, or WSCs. The name is meant to call attention to the most distinguishing feature of these machines: the massive scale of their software infrastructure, data repositories, and hardware platform. This perspective is a departure from a view of the computing problem that implicitly assumes a model where one program runs in a single machine. In warehouse-scale computing, the program is an Internet service, which may consist of tens or more individual programs that interact to implement complex end-user services such as email, search, or maps. These programs might be implemented and maintained by different teams of engineers, perhaps even across organizational, geographic, and company boundaries (e.g., as is the case with mashups).

The computing platform required to run such large-scale services bears little resemblance to a pizza-box server or even the refrigerator-sized high-end multiprocessors that reigned in the last decade. The hardware for such a platform consists of thousands of individual computing nodes with their corresponding networking and storage subsystems, power distribution and conditioning equipment, and extensive cooling systems. The enclosure for these systems is in fact a building structure and often indistinguishable from a large warehouse.

1.1 WAREHOUSE-SCALE COMPUTERS

Had scale been the only distinguishing feature of these systems, we might simply refer to them as datacenters. Datacenters are buildings where multiple servers and communication gear are co-located because of their common environmental requirements and physical security needs, and for ease of maintenance. In that sense, a WSC could be considered a type of datacenter. Traditional datacenters, however, typically host a large number of relatively small- or medium-sized applications, each running on a dedicated hardware infrastructure that is de-coupled and protected from other systems in the same facility. Those datacenters host hardware and software for multiple organizational units or even different companies. Different computing systems within such a datacenter often have little in common in terms of hardware, software, or maintenance infrastructure, and tend not to communicate with each other at all.

WSCs currently power the services offered by companies such as Google, Amazon, Yahoo, and Microsoft’s online services division. They differ significantly from traditional datacenters: they belong to a single organization, use a relatively homogeneous hardware and system software plat
