COVER FEATURE: From Microprocessors to Nanostores

Parthasarathy Ranganathan, HP Labs

The confluence of emerging technologies and new data-centric workloads offers a unique opportunity to rethink traditional system architectures and memory hierarchies in future designs.

What will future computing systems look like? We are entering an exciting era for systems design. Historically, the first computer to achieve terascale computing (10^12, or one trillion, operations per second) was demonstrated in the late 1990s. In the 2000s, the first petascale computer was demonstrated with a thousand times better performance. Extrapolating these trends, we can expect the first exascale computer (with one million trillion operations per second) to appear around the end of this next decade.

In addition to continued advances in performance, we are also seeing tremendous improvements in power, sustainability, manageability, reliability, and scalability. Power management, in particular, is now a first-class design consideration. Recently, system designs have gone beyond optimizing operational energy consumption to examining the total life-cycle energy consumption of systems for improved environmental sustainability. Similarly, in addition to introducing an exciting new model for delivering computing, the emergence of cloud computing has enabled significant advances in scalability as well as innovations in the software stack.

Looking further out, emerging technologies such as photonics, nonvolatile memory, 3D stacking, and new data-centric workloads offer compelling new opportunities. The confluence of these trends motivates a rethinking of the basic building blocks of future systems and a likely new design approach called nanostores that focuses on data-centric workloads and hardware-software codesign for upcoming technologies.

THE DATA EXPLOSION

The amount of data being created is exploding, growing significantly faster than Moore's law. For example, the amount of online data indexed by Google is estimated to have increased from 5 exabytes (one exabyte = 1 million trillion bytes) in 2002 to 280 exabytes in 2009,1 a 56-fold increase in seven years. In contrast, an equivalent Moore's law growth in computing for the corresponding time would deliver only a 16-fold increase.

This data growth is not limited to the Internet alone, but is pervasive across all markets. In the enterprise space, the size of the largest data warehouse has been increasing at a cumulative annual growth rate of 173 percent,2 again significantly more than Moore's law.
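To make the comparison above concrete, the following back-of-the-envelope check reproduces the two growth figures quoted in the text. The roughly 21-month compute doubling period is an assumption chosen only so that seven years yields the article's 16-fold number; it is not a figure from the text.

```python
# Back-of-the-envelope check of the growth figures quoted above.
# The ~21-month doubling period is an assumption used for illustration.

years = 7                       # 2002 to 2009
data_growth = 280 / 5           # Google-indexed data: 5 EB -> 280 EB
doubling_period = 1.75          # assumed compute doubling period, in years
compute_growth = 2 ** (years / doubling_period)

print(f"data growth:    {data_growth:.0f}x")     # ~56x
print(f"compute growth: {compute_growth:.0f}x")  # ~16x
```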

New kinds of data

Some common categories for data growth include those pertaining to bringing traditionally offline data online and to new digital media creation, including webpages, personal images, scanned records, audio files, government databases, digitized movies, personal videos, satellite images, scientific databases, census data, and scanned books. A recent estimate indicates that 24 hours of video are uploaded to YouTube every minute. At HD rates of 2-5 Mbps, that is close to 45-75 terabytes of data per day. Given that only about 5 percent of the world's data is currently digitized,3 growth in this data category is likely to continue for several more years.

More recently, large-scale sensor deployment has contributed to the explosion in data growth. Developments in nanoscale sensors have enabled tracking multiple dimensions (including vibration, tilt, rotation, airflow, light, temperature, chemical signals, humidity, pressure, and location) to collect real-time data sampled at very fine granularities. These advances have motivated researchers to discuss the notion of developing a "central nervous system for the earth" (CeNSE),4 with intriguing sample applications of rich sensor networks in areas including retail sales, defense, traffic, seismic and oil exploration, weather and climate modeling, and wildlife tracking. This vision will lead to data creation and analysis significantly beyond anything we have seen so far.

The pervasive use of mobile devices by a large part of the world's population, and the ability to gather and disseminate information through these devices, contributes to additional real-time rich data creation. For example, at the time of Michael Jackson's death in June 2009, Twitter estimated about 5,000 tweets per minute, and AT&T estimated about 65,000 texts per second. Currently, over a 90-day period, 20 percent of Internet search queries are typically "new data."1

Significantly, this large-scale growth in data is happening in combination with a rich diversity in the type of data being created. In addition to the diversity in media types (text, audio, video, images, and so on), there is also significant diversity in how the data is organized: structured (accessible through databases), unstructured (accessed as a collection of files), or semistructured (for example, XML or e-mail).

New kinds of data processing

This growth in data is leading to a corresponding growth in data-centric applications that operate in diverse ways: capturing, classifying, analyzing, processing, archiving, and so on. Examples include Web search, recommendation systems, decision support, online gaming, sorting, compression, sensor networks, ad hoc queries, cubing, media transcoding and streaming, photo processing, social network analysis, personalization, summarization, index building, song recognition, aggregation, Web mashups, data mining, and encryption. Figure 1 presents a taxonomy of data-centric workloads that summarizes this space.

Compared to traditional enterprise workloads such as online transaction processing and Web services, emerging data-centric workloads change many assumptions about system design. These workloads typically operate at larger scale (hundreds of thousands of servers) and on more diverse data (structured, unstructured, rich media) with I/O-intensive, often random, data access patterns and limited locality. In addition, these workloads are characterized by innovations in the software stack targeted at increased scalability and commodity hardware, such as Google's MapReduce and BigTable.
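Since the text singles out MapReduce as one of these software-stack innovations, here is a minimal, single-process sketch of the map/reduce pattern (a toy word count). It only illustrates the programming model; it is not Google's implementation or its distributed runtime.

```python
# Toy illustration of the map/reduce programming model: word count.
# This runs in one process; a real MapReduce runtime distributes the
# map and reduce phases across thousands of commodity servers.
from collections import defaultdict

def map_phase(doc_id, text):
    # Emit (key, value) pairs: one count per word occurrence.
    for word in text.split():
        yield word.lower(), 1

def reduce_phase(word, counts):
    # Combine all values emitted for the same key.
    return word, sum(counts)

def run_mapreduce(documents):
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in map_phase(doc_id, text):
            grouped[key].append(value)          # shuffle: group by key
    return dict(reduce_phase(k, v) for k, v in grouped.items())

docs = {"d1": "data centric data", "d2": "centric workloads"}
print(run_mapreduce(docs))   # {'data': 2, 'centric': 2, 'workloads': 1}
```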
Looking ahead, it's clear that we're only at the beginning of an even more fundamental shift in what we do with data. As an illustrative example, consider what happens when we search for an address on the Web.

In the past, this request would be sent to a back-end webserver that would respond with the image of a map showing the address's location. However, in recent years, more sophisticated data analysis has been added to the response to this query. For example, along with just accessing the map database, the query could potentially access multiple data sources: satellite imagery, prior images from other users, webpages associated with location information, overlays of transit maps, and so on. Beyond just static images, dynamic data sources can be brought into play, such as live traffic or real-time weather information, current Twitter feeds, or live news or video. Synthetic data, such as images from user-provided 3D models of buildings or outputs from trend analyzers and visualizers, also can be superimposed on the map.

Adding personalization and contextual responses to the mix introduces another layer of data processing complexity. For example, different data can be presented to the user based on the last two searches prior to this search, on the user's prior behavior when doing the same search, or on locational information (for example, if the current location matches the location where the user searched previously).

Social networks and recommendation systems add yet another layer of data processing complexity. Examples include on-map visualization of individuals' locations drawn from social networks, inferred preferences, and prescriptive recommendations based on social trends. Advertisements and, more generally, business monetization of search add another layer of data processing in terms of accessing more data sources and more sophisticated algorithms for user preference and content relevance.
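One way to picture the layered processing just described is as a pipeline of enrichment stages applied to a base map result. The sketch below is purely illustrative; the stage names and data sources are assumptions, not a description of any real search engine's pipeline.

```python
# Illustrative only: a map-search response built up by successive
# enrichment layers, mirroring the layers described in the text.
# All stage names and data sources here are hypothetical.

def base_map(address):
    return {"address": address, "layers": ["street map"]}

def add_layer(response, source):
    response["layers"].append(source)
    return response

ENRICHMENT_STAGES = [
    "satellite imagery",         # additional static data sources
    "transit overlay",
    "live traffic",              # dynamic, real-time data sources
    "weather feed",
    "personalized suggestions",  # personalization and context
    "friends nearby",            # social-network layer
    "local advertisements",      # monetization layer
]

def answer_query(address):
    response = base_map(address)
    for source in ENRICHMENT_STAGES:
        response = add_layer(response, source)  # each stage adds data processing
    return response

print(answer_query("123 Main St"))
```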

In many cases, all this data processing comes with fairly small latency requirements for response, even requiring real-time responses in some scenarios.

This scenario shows how, from simple Web search and content serving, online data processing is evolving to allow more complex meaning extraction across multiple data repositories and more sophisticated cross-correlations, including more complex I/O movement. In a continuum of data processing operations including upload/ingress; download/egress; search (tree traversal); read, modify, write; pattern matching; aggregation; correlation/join; index building; cubing; classification; prediction; and social network analysis, recent trends show a strong movement toward operations with more complex data movement patterns.5

Similar trends can be seen in enterprise data management across the information-insight-outcome lifecycle. There is an increasing emphasis on real-time feeds of business information, often across multiple formal or ad hoc data repositories, reduced latencies between events and decisions, and sophisticated combinations of parallel analytics, business intelligence, and search and extraction operations. Jim Gray alluded to similar trends in scientific computing when discussing a new era in which scientific phenomena are understood through large-scale data analysis.6 Such trends can also be seen in other important workloads of the future, with applications like computational journalism, urban planning, natural-language processing, smart grids, crowdsourcing, and defense applications. The common traits in all these future workloads are an emphasis on complex cross-correlations across multiple data repositories and new data analysis/compute assumptions.

Together, this growing complexity and dynamism in extraction of meaning from data, combined with the large-scale diversity in the amount of data generated, represent an interesting inflection point in the future data-centric era. The "Implications of Data-Centric Workloads for System Architectures" sidebar provides additional information about this trend for system designs.

Figure 1. Data-centric workload taxonomy:
- Response time: Real-time (real-time or interactive responses required); Background (response time is not critical for user needs).
- Access pattern: Random (unpredictable access to regions of datastore); Sequential (sequential access of data chunks); Permutation (data is redistributed across the system).
- Working set: All (the entire dataset is accessed); Partial (only a subset of data is accessed).
- Data type: Structured (metadata/schema/type are used for data records); Unstructured (no explicit data structure, for example, text/binary files); Rich media (audio/video and image data with inherent structures and specific processing algorithms).
- Read vs. write: Read heavy (data reads are significant for processing); Write heavy (data writes are significant for processing).
- Processing complexity: High (complex processing is required per data item; examples: video transcoding, classification, prediction); Medium (simpler processing is required per data item; examples: pattern matching, search, encryption); Low (dominated by data access with low compute ratio; examples: sort, upload, download, filtering, and aggregation).

IT'S A NEW WORLD: AN INFLECTION POINT IN TECHNOLOGY

Concurrently, recent trends point to several potential technology disruptions on the horizon.

On the compute side, recent microprocessors have favored multicore designs emphasizing multiple simpler cores for greater throughput. This is well matched with the large-scale distributed parallelism in data-centric workloads. Operating cores at near-threshold voltage has been shown to significantly improve energy efficiency.7

Similarly, recent advances in networking show a strong growth in bandwidth for communication between different compute elements at various system design levels.

However, the most important technology changes pertinent to data-centric computing relate to the advances in and adoption of nonvolatile memory. Flash memories have been widely adopted in popular consumer systems (for example, Apple's iPhone) and are gaining adoption in the enterprise market (for example, Fusion-io).

Figure 2 shows the trends in costs for these technologies relative to traditional hard disks and DRAM memories. Emerging nonvolatile memories have been demonstrated to have properties superior to flash memories, most notably phase-change memory (PCM)8 and, more recently, memristors.9 Trends suggest that future nonvolatile memories can be viable DRAM replacements, achieving competitive speeds at lower power consumption, with nonvolatility properties similar to disks but without the power overhead. Additionally, recent studies have identified a slowing of DRAM growth due to scaling challenges for charge-based memories.10,11 The adoption of NVRAM as a DRAM replacement can potentially be accelerated due to such limitations in scaling DRAM.
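To make the Figure 1 taxonomy above concrete, the sketch below encodes its six dimensions as a small data structure and classifies one example workload. The specific classification chosen for video transcoding is an illustrative assumption; the article only lists it as a high-complexity example.

```python
# Encoding the Figure 1 taxonomy as data and classifying one workload.
# The classification values chosen for video transcoding are assumptions
# made for illustration.

TAXONOMY = {
    "response_time":         {"real-time", "background"},
    "access_pattern":        {"random", "sequential", "permutation"},
    "working_set":           {"all", "partial"},
    "data_type":             {"structured", "unstructured", "rich media"},
    "read_vs_write":         {"read heavy", "write heavy"},
    "processing_complexity": {"high", "medium", "low"},
}

def classify(workload, **dims):
    # Validate a workload description against the taxonomy dimensions.
    for dim, value in dims.items():
        assert value in TAXONOMY[dim], f"{value!r} is not a valid {dim}"
    return {"workload": workload, **dims}

video_transcoding = classify(
    "video transcoding",
    response_time="background",
    access_pattern="sequential",
    working_set="partial",
    data_type="rich media",
    read_vs_write="read heavy",
    processing_complexity="high",   # listed as a high-complexity example in Figure 1
)
print(video_transcoding)
```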

Implications of Data-Centric Workloads for System Architectures

An important trend in the emergence of data-centric workloads has been the introduction of complex analysis at immense scale, closely coupled with the growth of large-scale Internet Web services. Traditional data-centric workloads like Web serving and online transaction processing are being superseded by workloads like real-time multimedia streaming and conversion; history-based recommendation systems; searches of text, images, and even videos; and deep analysis of unstructured data (for example, Google Squared).

From a system architecture viewpoint, a common characteristic of these workloads is their general implementation on highly distributed systems that scale by partitioning data across individual nodes. Both the total amount of data involved in a single task and the number of distributed compute nodes required to process the data reflect their large scale. Additionally, these workloads are I/O intensive, often with random access patterns to small-size objects over large datasets.

Many of these applications operate on larger fractions of data in memory. According to a recent report, the amount of DRAM used in Facebook for nonimage data is approximately 75 percent of the total data size.1 While this trend partly reflects the low latency requirements and the limited locality due to complex linkages between data for the Facebook workload, similar trends toward larger memory capacities can be seen for memcached servers and TPC-H benchmark winners over the past decade. Similarly, search algorithms such as the one from Google have evolved to store their search indices entirely in DRAM. These trends motivate a rethinking of the balance between memory and disk-based storage in traditional designs.

Interestingly, as datasets and the need to operate on larger fractions of the data in memory continue to increase, there will likely be an inflection point at which conventional system architectures based on faster and more powerful processors and ever deeper memory hierarchies are not likely to work from an energy perspective (Figure A). Indeed, a recent exascale report identifies the amount of energy consumed in transporting data across different levels as a key limiting factor.2 Complex power-hungry processors also are sometimes a mismatch with data-intensive workloads, leading to further energy inefficiencies.

Recent data-centric workloads have been characterized by numerous commercially deployed innovations in the software stack, for example, Google's BigTable and MapReduce, Amazon's Dynamo, Yahoo's PNUTS, Microsoft's Dryad, Facebook's Memcached, and LinkedIn's Voldemort. Indeed, according to a recent presentation, the software stack behind the very successful Google search engine was significantly rearchitected four times in the past seven years to achieve better performance at increased scale.3

The growing importance of this class of workloads, their focus on large-scale distributed systems with ever-increasing memory use, the potential inadequacy of existing architectural approaches, and the relative openness to software-level innovations in the emerging workloads offer an opportunity for a corresponding clean-slate architecture design targeted at data-centric computing.

Figure A. Changing workload trends motivate a rethinking of the traditional designs with deep hierarchies. (The original figure depicts conventional boards with cores, L1/L2 caches, memory controllers, DIMMs, hard disks, I/O, and a network interface.)
References

1. J. Ousterhout et al., "The Case for RAMClouds: Scalable High-Performance Storage Entirely in DRAM," ACM SIGOPS Operating Systems Rev., vol. 43, no. 4, 2009, pp. 92-105.
2. P. Kogge, ed., "ExaScale Computing Study: Technology Challenges in Achieving Exascale Systems," 2008; 20exascale%20-%20hardware%20(2008).pdf.
3. J. Dean, "Challenges in Building Large-Scale Information Retrieval Systems," keynote talk, Proc. 2nd Ann. ACM Conf. Web Search and Data Mining (WSDM 09), ACM Press, 2009; http://wsdm2009.org/proceedings.php.

Density and endurance have been traditional limitations of NVRAM technologies, but recent trends suggest that these limitations can be addressed. Multilevel designs can achieve increased density, potentially allowing multiple layers per die.12 At a single-chip level, 3D die stacking using through-silicon vias for interdie communication can further increase density. Such 3D stacking also has the additional advantage of closely integrating the processor and memory for higher bandwidth and lower power (due to short-length, low-capacitance wires).

Structures like wire bonding in system-in-package or package-on-package 3D stacking are already integrated into products currently on the market, such as mobile systems, while more sophisticated 3D-stacking solutions have been demonstrated in the lab.

In terms of endurance, compared to flash memories, PCMs and memristors offer significantly better functionality: 10^7-10^8 writes per cell compared to the 10^5 writes per cell for flash. Optimizations at the technology, circuit, and systems levels have been shown to further address endurance issues, and more improvements are likely as the technologies mature and gain widespread adoption.11,13 More details about emerging nonvolatile memories can be found in several recent overviews and tutorials,14,15 for example, HotChips 2010 (www.hotchips.org).

Figure 2. Nonvolatile memory cost trends (cost per gigabyte, 1975-2015, for HDD, DRAM, and NAND flash). These trends suggest that future nonvolatile memories can be viable DRAM replacements.

These trends suggest that technologies like PCM and memristors, especially when viewed in the context of advances like 3D die stacking, multicores, and improved networking, can induce more fundamental architectural change for data-intensive computing than traditional approaches that use them as solid-state disks or as another intermediate level in the memory hierarchy.

NANOSTORES: A NEW SYSTEM ARCHITECTURE BUILDING BLOCK?

A single nanostore chip consists of multiple 3D-stacked layers of dense silicon nonvolatile memory with a top layer of power-efficient compute cores. Through-silicon vias are used to provide wide, low-energy datapaths between the processors and the datastores. Each nanostore can act as a full-fledged system with a network interface. Individual such nanostores are networked through onboard connectors to form a large-scale distributed system or cluster akin to current large-scale clusters for data-centric computing. The system can support different network topologies, including traditional fat trees or recent proposals like HyperX.16

In terms of physical organization, multiple nanostore chips are organized into small daughter boards (microblades) that, in turn, plug
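A minimal sketch of the nanostore organization described above: each nanostore combines compute cores, stacked nonvolatile memory, and a network interface, and chips are grouped onto microblades to form a cluster. All counts and capacities below are invented for illustration; they are not figures from the article.

```python
# Illustrative model of the nanostore building block described above.
# Every number here (cores, capacity, chips per microblade) is an
# assumption for illustration, not a specification from the article.
from dataclasses import dataclass, field
from typing import List

@dataclass
class NanostoreChip:
    cores: int = 8                       # power-efficient compute cores on the top layer
    nvm_gbytes: int = 64                 # 3D-stacked nonvolatile memory (PCM/memristor)
    has_network_interface: bool = True   # each chip acts as a full-fledged system

@dataclass
class Microblade:
    chips: List[NanostoreChip] = field(default_factory=list)

def build_cluster(num_microblades=4, chips_per_blade=16):
    # Nanostores networked together form a large-scale distributed system.
    return [Microblade([NanostoreChip() for _ in range(chips_per_blade)])
            for _ in range(num_microblades)]

cluster = build_cluster()
total_chips = sum(len(blade.chips) for blade in cluster)
total_nvm_tb = sum(chip.nvm_gbytes for blade in cluster for chip in blade.chips) / 1024
print(f"{total_chips} nanostores, {total_nvm_tb:.1f} TB of colocated NVM")
```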

