
International Symposium on Computer Architecture

Understanding and Designing New Server Architectures for Emerging Warehouse-Computing Environments

Kevin Lim*, Parthasarathy Ranganathan†, Jichuan Chang†, Chandrakant Patel†, Trevor Mudge*, Steven Reinhardt*‡
* University of Michigan, Ann Arbor   † Hewlett-Packard Labs   ‡ Reservoir Labs
{ktlim,tnm,stever}@eecs.umich.edu, {partha.ranganathan, jichuan.chang, chandrakant.patel}@hp.com
1063-6897/08 $25.00 © 2008 IEEE   DOI 10.1109/ISCA.2008.37

Abstract

This paper seeks to understand and design next-generation servers for emerging "warehouse-computing" environments. We make two key contributions. First, we put together a detailed evaluation infrastructure including a new benchmark suite for warehouse-computing workloads, and detailed performance, cost, and power models, to quantitatively characterize bottlenecks. Second, we study a new solution that incorporates volume non-server-class components in novel packaging solutions, with memory sharing and flash-based disk caching. Our results show that this approach has promise, with a 2X improvement on average in performance-per-dollar for our benchmark suite.

1. Introduction

The server market is facing an interesting inflection point. Recent market data identifies the "internet sector" as the fastest growing segment of the overall server market, growing by 40-65% every year, and accounting for more than 65% of low-end server revenue growth last year. By contrast, the rest of the server market is not growing. Indeed, several recent keynotes [3,6,19,25,26,33,36] point to growing recognition of the importance of this area.

However, the design of servers for this market poses several challenges. Internet sector infrastructures have millions of users, often running on hundreds of thousands of servers, and consequently, scale-out is a key design constraint. This has been likened to the design of a large warehouse-style computer [3], with the programs being distributed applications like mail, search, etc. The datacenter infrastructure is often the largest capital and operating expense. Additionally, a significant fraction of the operating costs is power and cooling.

These constraints, in turn, lead to several design decisions specific to internet sector infrastructures. The focus on costs motivates leveraging the "sweet spot" of commodity pricing and energy efficiency, and is also reflected in decisions to move high-end hardware features (e.g., high availability, manageability) into the application stack. Additionally, the high volume in this market and the relative dominance of a few key players – e.g., Google, Yahoo, eBay, MSN (Microsoft), Amazon – allow for exploring options like custom-designed servers in "green-field" datacenters built from scratch. Indeed, Google's and MSN's purchases of real estate near the internet backbone or power grid for this purpose have received a lot of recent press [19].

All these trends motivate the need for research on understanding these workloads, and on new system architectures targeted at this market with compelling cost/performance advantages. This paper seeks to address this challenge. In particular, we make the following contributions. We put together a detailed evaluation infrastructure including the first-ever benchmark suite for warehouse-computing workloads, along with detailed performance, cost, and power models and metrics. Using these tools, we identify four key areas for improvement (CPU, packaging, memory, disk) and study a new system architecture that takes a holistic approach to addressing these bottlenecks. Our proposed solution has novel features including the use of low-cost, low-power components from the high-volume embedded/mobile space and novel packaging solutions, along with memory sharing and flash-based disk caching. Our results are promising, combining to provide a two-fold improvement in performance-per-dollar for our benchmark suite. More importantly, they point to the strong potential of cost-efficient ensemble-level design for this class of workloads.

2. Evaluation Environment

A challenge in studying new architectures for warehouse-computing environments has been the lack of access to internet-sector workloads. The proprietary nature and the large scale of deployment are key impediments in duplicating these environments. These environments have a strong focus on cost and power efficiency, but there are currently no complete system-level cost or power models publicly available, further exacerbating the difficulties.

2.1 A benchmark suite for the internet sector

In order to perform this study we have created a new benchmark suite with four workloads representative of the different services in internet sector datacenters.

Websearch: We choose this to be representative of unstructured data processing in internet sector workloads. The goal is to service requests to search large amounts of data in sub-second time. Our benchmark uses the Nutch search engine [21] running on the Tomcat application server and Apache web server. We study a 20GB dataset with a 1.3GB index of parts of www.dmoz.org and Wikipedia. The keywords in the queries are based on a Zipf distribution of the frequency of indexed words, and the number of keywords is based on observed real-world query patterns [40]. Performance is measured as the number of requests per second (RPS) for comparable Quality of Service (QoS) guarantees. This benchmark emphasizes high throughput with reasonable amounts of data processing per request.

Webmail: This benchmark seeks to represent interactive internet services seen in web 2.0 applications. It uses the PHP-based SquirrelMail server running on top of Apache. The IMAP and SMTP servers are installed on a separate machine using courier-imap and exim. The clients interact with the servers in sessions, each consisting of a sequence of actions (e.g., login, read email and attachments, reply/forward/delete/move, compose and send). The size distributions are based on statistics collected internally within the University of Michigan, and the client actions are modeled after the MS Exchange Server LoadSim "heavy-usage" profile [35]. Performance is measured as the number of RPS for comparable QoS guarantees. Our benchmark includes a lot of network activity to interact with the backend server.

Ytube: This benchmark is representative of web 2.0 trends of using rich media types, and models media servers servicing requests for video files. Our benchmark consists of a heavily modified SPECweb2005 Support workload driven with Youtube traffic characteristics observed in edge servers by [14]. We modify the pages, files, and download sizes to reflect the distributions seen in [14], and extend the QoS requirement to model streaming behavior. Usage patterns are modeled after a Zipf distribution. Performance is measured as the number of requests per second, while ensuring that the QoS violations are similar across runs. The workload behavior is predominantly I/O-bound.

Mapreduce: This benchmark is representative of workloads that use the web as a platform. We model a cluster running offline batch jobs of the kind amenable to the MapReduce [7] style of computation, consisting of a series of "map" and "reduce" functions performed on key/value pairs stored in a distributed file system. We use the open-source Hadoop implementation [13] and run two applications – (1) mapreduce-wc, which performs a word count over a large corpus (5 GB), and (2) mapreduce-write, which populates the file system with randomly-generated words. Performance is measured as the amount of time to perform the task. The workload involves both CPU and I/O activity.

For websearch and webmail, the servers are exercised by a Perl-based client driver, which generates and dispatches requests (with user-defined think time), and reports transaction rate and QoS results. The client driver can also adapt the number of simultaneous clients according to recently observed QoS results, to achieve the highest level of throughput without overloading the servers. For ytube we use a modified SPECweb2005 client driver, which has similar functionality to the other client drivers.

We believe that our benchmark suite is a good representative of internet sector workloads for this study. However, this suite is a work in progress, and Section 4 discusses further extensions to model our target workloads even more accurately.

Table 1: Summary details of the new benchmark suite to represent internet sector workloads.

  websearch – emphasizes the role of unstructured data. Open-source Nutch-0.9, Tomcat 6 with clustering, and Apache2; 1.3GB index corresponding to 1.3 million indexed documents, 25% of index terms cached in memory; 2GB Java heap size. QoS requires 95% of queries to take < 0.5 seconds. Perf metric: requests-per-sec (RPS) w/ QoS.

  webmail – emphasizes interactive internet services. Squirrelmail v1.4.9 with Apache2 and PHP4, Courier-IMAP v4.2 and Exim 4.5; 1000 virtual users with 7GB of mail stored; email/attachment sizes and usage patterns modeled after MS Exchange 2003 LoadSim heavy users. QoS requires 95% of requests to take < 0.8 seconds. Perf metric: RPS w/ QoS.

  ytube – emphasizes the use of rich media. Modified SPECweb2005 Support workload with Youtube traffic characteristics; Apache2/Tomcat6 with Rock httpd server. Perf metric: RPS w/ QoS.

  mapreduce – emphasizes the web as a platform. Hadoop v0.14 with 4 threads per CPU and 1.5GB Java heap size; two workloads – distributed file write (mapred-wr) and word count (mapred-wc). Perf metric: execution time.
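As a concrete illustration of the websearch query generation described above, here is a minimal sketch (the function names, exponent, and vocabulary are our own, not taken from the paper's driver) of drawing query keywords from a Zipf distribution over indexed-word frequency ranks:

```python
import random

def zipf_weights(n_words, s=1.0):
    # Unnormalized Zipf weights: the k-th most frequent word gets weight 1/k^s.
    return [1.0 / (k ** s) for k in range(1, n_words + 1)]

def make_query(vocabulary, weights, n_keywords, rng):
    # Draw n_keywords indexed words, skewed so high-rank words dominate queries.
    return rng.choices(vocabulary, weights=weights, k=n_keywords)

vocab = [f"word{k}" for k in range(1000)]   # stand-in for the indexed terms
wts = zipf_weights(len(vocab))
rng = random.Random(42)
query = make_query(vocab, wts, 3, rng)      # e.g., a three-keyword query
```

In a full driver, the keyword count per query would itself be drawn from the observed real-world distribution [40] rather than fixed at three.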

2.2 Metrics and models

Metrics: The key performance/price metric for internet sector environments is the sustainable performance (Perf) divided by total cost of ownership (abbreviated as TCO-$). For performance, we use the definition specific to each workload in Table 1. For total lifecycle cost, we assume a three-year depreciation cycle and consider costs associated with base hardware, burdened power and cooling, and real estate. In our discussion of specific trends, we also consider other metrics like performance-per-Watt (Perf/W) and performance per specific cost components such as infrastructure only (Perf/inf-$) and power and cooling (Perf/P&C-$).

Cost model: The two main components of our cost model are (1) the base hardware costs, and (2) the burdened power and cooling costs. For the base hardware costs, we collect the costs of the individual components – CPU, memory, disk, board, and power and cooling (P&C) components such as power supplies, fans, etc. – at a per-server level. We cumulate these costs at the rack level, and consider additional switch and enclosure costs at that level. We use a variety of sources to obtain our cost data, including publicly available data from various vendors (newegg, Micron, Seagate, Western Digital, etc.) and industry-proprietary cost information through personal communications with individuals at HP, Intel/AMD, ARM, etc. Wherever possible, we also validated the consistency of the overall costs and the breakdowns with prior publications from internet-sector companies [3]. For example, our total server costs were similar to those listed by Silicon Mechanics [16].

For the power and cooling costs, we have two subcomponents. We first compute the rack-level power consumption (P_consumed). This is computed as a sum of power at the CPU, memory, disk, power-and-cooling, and the rest of the board, at the per-server level, and additional switch power at the rack level. Given that nameplate power is often overrated [11], we tried to obtain the maximum operational power consumption of various components from spec sheets […,18,24,30,32] or personal communications with vendors. This still suffers from some inaccuracies, since actual power consumption has been documented to be lower than worst-case power consumption [27]. We use an "activity factor" of 0.75 to address this discrepancy. As validation, we compared the output of our model to systems that we had access to, and our model was relatively close to the actual consumption. (We also studied a range of activity factors from 0.5 to 1.0; our results are qualitatively similar, so we don't present them.) Second, we use P_consumed as input to determine the burdened cost of power using the methodology discussed by Patel et al. [27,28]:

  Power&CoolingCost = (1 + K1 + L1 + K2 * L1) * U$,grid * P_consumed

This model considers the burdened power and cooling costs to consist of electricity costs at the rack level, the amortized infrastructure costs for power delivery (K1), the electricity costs for cooling (L1), and the amortized capital expenditure for the cooling infrastructure (K2). For our default configuration, we use published data on default values for K1, L1, and K2 [28]. There is a wide variation possible in the electricity tariff rate (from $50/MWh to $170/MWh), but in this paper we use a default electricity tariff rate of $100/MWh [27]. Figure 1 illustrates our cost model.

Figure 1: Cost models and breakdowns.
(a) Cost and power model for srvr1 and srvr2:

                                Srvr1      Srvr2
  Per-server cost ($)
    CPU                         1,700        650
    Memory                        350        350
    Disk                          275        120
    Board mgmt                    400        250
    Power/fans                    500        250
    Total                       3,225      1,620
  Switch/rack cost ($)          2,750      2,750
  Server power (Watt)
    CPU                           210        105
    Memory                         25         25
    Disk                           15         10
    Board mgmt                     50         40
    Power/fans                     40         35
    Total                         340        215
  Switch/rack power (Watt)         40         40
  Servers per rack                 40         40
  Activity factor                0.75       0.75
  K1 / L1 / K2            1.33 / 0.8 / 0.667
  3-yr power & cooling ($)      2,464      1,561
  Total costs ($)               5,758      3,249

(b) A pie-chart breakdown of srvr2 total costs, split per component into hardware (HW) and burdened power and cooling (P&C); CPU HW and CPU P&C are the two largest slices.

Performance evaluation: To evaluate performance, we used HP Labs' COTSon simulator [10], which is based on AMD's SimNow [5] infrastructure. It is a validated full-system x86/x86-64 simulator that can boot an unmodified Linux OS and execute complex applications. The simulator guest runs 64-bit Debian Linux with the 2.6.15 kernel. The benchmarks were compiled directly in the simulated machines where applicable. Most of the benchmarks use Java, and were run using Sun's Linux Java SDK 5.0 update 12. C/C++ code was compiled with gcc 4.1.2 and g++ 4.0.4. We also developed an offline model using the simulation traces to evaluate memory sharing; this is discussed more in Section 3.4.

3. A New Server Architecture

3.1 Cost Analysis and Approach Taken

Figure 1(a) lists the hardware component costs, the baseline power consumption, and the burdened costs of power and cooling for two existing server configurations (srvr1 and srvr2). Figure 1(b) presents a pie-chart breakdown of the total costs for srvr2, separated as infrastructure (HW) and burdened power and cooling (P&C). Our data shows several interesting trends. First, power and cooling costs are comparable to hardware costs. This is consistent with recent studies from internet sector workloads that highlight the same trend [11]. Furthermore, the CPU hardware and CPU power and cooling are the two largest components of total costs (contributing 20% and 22% respectively). However, it can be seen that a number of other components together contribute equally to the overall costs. Consequently, to achieve truly compelling performance/$ advantages, solutions need to holistically address multiple components.

Below, we examine one such holistic solution. Specifically, we consider four key issues: (1) Can we reduce overall costs from the CPU (hardware and power) by using high-volume, lower-cost, lower-power (but also lower-performance) non-server processors? (2) Can we reduce the burdened costs of power by novel packaging solutions? (3) Can we reduce the overall costs for memory by sharing memory across a cluster/ensemble? (4) Can we reduce the overall costs for the disk component by using lower-power (but lower-performance) disks, possibly with emerging non-volatile memory?

Answering each of these questions in detail is not possible within the space constraints of this paper. Our goal here is to evaluate, first, if considerable gains are possible in each of these areas when the architecture is viewed from the ensemble perspective rather than as a collection of individual systems, and second, if the combination of the improvements in each of these areas can lead to an overall design that improves significantly on the current state of the art. Below, we evaluate each of these ideas in isolation (3.2-3.5), and then consider the net benefits when these solutions are used together (3.6).

3.2 Low-power low-cost CPUs

Whereas servers for databases or HPC have traditionally focused on obtaining the highest performance per server, the scale-out nature of the internet sector allows for a focus on performance/$, utilizing systems that offer superior performance per dollar. Indeed, publications by large internet sector companies such as Google [4] exhibit the usefulness of building servers using commodity desktop PC parts. The intuition is that volume drives cost; compared to servers, which have a limited market and higher price margins, commodity PCs have a much larger market that allows for lower prices. Additionally, these processors do not include cost premiums for features like multiprocessor support and advanced ECC that are made redundant by reliability support in the software stack for internet sector workloads.

In this section, we quantitatively evaluate the benefits of such an approach, studying the effectiveness of low-end servers and desktops for this market. We take this focus on performance/$ one step further, exploring an alternative commodity market – the embedded/mobile segment. Trends in transistor scaling and embedded processor design have brought powerful, general-purpose processors to the embedded space, many of which are multicore processors. Devices using embedded CPUs are shipped in even more volume than desktops – leading to even better cost savings. They are often designed for minimal power consumption due to their use in mobile systems. Power is a large portion of the total lifecycle costs, so greater power-efficiency can help reduce costs. The key open question, of course, is whether these cost and power benefits can offset the performance degradation relative to the baseline server.

In this section, we consider six different system configurations (Table 2). Srvr1 and srvr2 represent mid-range and low-end server systems; desk represents desktop systems, mobl represents mobile systems, and emb1 and emb2 represent a mid-range and low-end embedded system respectively. All servers have 4 GB of memory, using FB-DIMM (srvr1, srvr2), DDR2 (desk, mobl, emb1), or DDR1 (emb2) technologies. Srvr1 has a 15k RPM disk and a 10 Gigabit NIC, while all others have a 7.2k RPM disk and a 1 Gigabit NIC. Note that the lower-end systems are not balanced from a memory provisioning point of view (reflected in the higher costs and power for the lower-end systems than one would intuitively expect). However, our goal is to isolate the effect of the processor type, so we keep memory and disk capacity constant (but in the different technologies specific to the platform). Later sections examine changing this assumption.

Table 2: Summary of systems considered.

  System  "Similar to"             System features                              Watt   Inf-$
  Srvr1   Xeon MP, Opteron MP      2p x 4 cores, 2.6 GHz, OoO, 64K/8MB L1/L2     340   3,294
  Srvr2   Xeon, Opteron            1p x 4 cores, 2.6 GHz, OoO, 64K/8MB L1/L2     215   1,689
  Desk    Core 2, Athlon 64        1p x 2 cores, 2.2 GHz, OoO, 32K/2MB L1/L2     135     849
  Mobl    Core 2 Mobile, Turion    1p x 2 cores, 2.0 GHz, OoO, 32K/2MB L1/L2      78     989
  Emb1    PA Semi, Emb. Athlon 64  1p x 2 cores, 1.2 GHz, OoO, 32K/1MB L1/L2      52     499
  Emb2    AMD Geode, VIA Eden      1p x 1 core, 600MHz, in-ord., 32K/128K L1/L2   35     379
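The cost and power ratios quoted in the evaluation of Section 3.2 can be recomputed directly from the Watt and Inf-$ columns of Table 2; this small script (our own, purely for illustration) does the arithmetic:

```python
# Hardware cost (Inf-$) and power (Watt) per system, from Table 2.
systems = {
    "srvr1": {"watts": 340, "inf_cost": 3294},
    "srvr2": {"watts": 215, "inf_cost": 1689},
    "desk":  {"watts": 135, "inf_cost": 849},
    "mobl":  {"watts": 78,  "inf_cost": 989},
    "emb1":  {"watts": 52,  "inf_cost": 499},
    "emb2":  {"watts": 35,  "inf_cost": 379},
}

def relative_to_srvr1(name, key):
    # Ratio of a system's cost or power to the srvr1 baseline.
    return systems[name][key] / systems["srvr1"][key]

print(f"desk HW cost vs srvr1: {relative_to_srvr1('desk', 'inf_cost'):.0%}")  # ~26%
print(f"emb1 HW cost vs srvr1: {relative_to_srvr1('emb1', 'inf_cost'):.0%}")  # ~15%
print(f"emb1 power vs srvr1:   {relative_to_srvr1('emb1', 'watts'):.0%}")     # ~15%, i.e. ~85% savings
```

These match the claims in the text that desk costs about a quarter of srvr1, and emb1 about 15% in both hardware cost and power.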

Figure 2: Summary of benefits from using low-cost low-power CPUs from non-server markets. (a) Inf.-$ breakdown; (b) P&C-$ breakdown; (c) performance, cost, and power efficiencies.

Evaluation: Figure 2 presents our evaluation results. The breakdown of infrastructure costs and the burdened power and cooling costs are summarized in Figures 2(a) and 2(b) respectively. Figure 2(c) shows the variation in performance, performance/$, and performance/Watt; for better illustration of the benefits, performance/$ is shown as performance/total-costs and performance/infrastructure-costs (performance/power-and-cooling-costs can be inferred). Also listed is the average, computed as the harmonic mean of the throughput and reciprocal of execution times across the benchmarks.

Looking at Figure 2(a), we can see that, at a per-system level, the hardware costs are dramatically lower for the consumer systems. The biggest cost reductions come in the CPU component. The use of consumer technologies, like DDR2 for memory, leads to reductions in other components as well. The desk system is only 25% of the cost of the srvr1 system, while emb1 is only 15% of the cost. The mobl system sees higher costs relative to the desktop because of the higher premium for low-power components in this market. Similar trends can be seen for power and cooling costs in Figure 2(b). As one would expect, the desktop system has 60% lower P&C costs compared to srvr1, but the emb1 system does even better, saving 85% of the costs. Unlike with hardware costs, there is a more gradual progression of savings in the power and cooling.

Figure 2(c) highlights several interesting trends for performance. As expected, the lower-end systems see performance degradation compared to srvr1. However, the relative rate of performance degradation varies with benchmark and the system considered. The mapreduce workloads and ytube see relatively smaller degradations in performance compared to websearch and webmail, and also see a much more dramatic inflection at the transition between the emb1 and emb2 systems. This is intuitive given that these workloads are not CPU-intensive and are primarily network- or disk-bound. The desktop system sees 10-30% performance degradation for mapreduce and ytube, and 65-80% performance loss for websearch and webmail. In comparison, the emb1 system sees 20-50% degradations for the former two workloads and 75-90% loss for the remaining two. Emb2 consistently underperforms for all workloads.

Comparing the relative losses in performance to the benefits in costs, one can see significant improvements in performance/Watt and performance/$ for desk, mobl, and emb1; emb2 does not perform as well. For example, emb1 achieves improvements of 3-6X in performance/total-costs for ytube and mapreduce, and an improvement of 60% for websearch, compared to srvr1. Webmail sees a net degradation in performance/$ because of the significant decrease in performance, but emb1 still performs competitively with srvr1, and does better than the other systems. Performance/W results show similar trends, except for stronger improvements for the mobile systems.

Overall, our workloads show a benefit in improved performance per cost when using lower-end consumer platforms optimized for power and costs, compared to servers such as srvr1 and srvr2.
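The averaging rule used for Figure 2's HMean column (harmonic mean of throughputs relative to a baseline, with reciprocals for execution-time metrics) can be sketched as follows; the workload numbers here are hypothetical, purely to show the mechanics:

```python
def normalized_scores(metrics, baseline, lower_is_better):
    # Normalize each benchmark to the baseline system; for execution-time
    # metrics (lower is better) use the reciprocal, so higher = better.
    scores = []
    for bench, value in metrics.items():
        base = baseline[bench]
        if lower_is_better.get(bench, False):
            scores.append(base / value)   # reciprocal of relative time
        else:
            scores.append(value / base)   # relative throughput (RPS)
    return scores

def harmonic_mean(xs):
    return len(xs) / sum(1.0 / x for x in xs)

# Hypothetical per-benchmark results (RPS, except mapreduce in seconds):
srvr1_results = {"websearch": 100, "webmail": 80, "ytube": 200, "mapreduce": 50}
candidate     = {"websearch": 40,  "webmail": 30, "ytube": 180, "mapreduce": 60}
lower = {"mapreduce": True}  # mapreduce reports execution time

scores = normalized_scores(candidate, srvr1_results, lower)
print(round(harmonic_mean(scores), 2))  # 0.53: candidate at ~53% of baseline
```

The harmonic mean is the natural choice here because it penalizes a system that collapses on any one benchmark, rather than letting a single strong result dominate.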

Our desk configuration performs better than srvr1 and srvr2, validating current practices of using commodity desktops [4]. However, a key new interesting result from our benchmark study is that going to embedded systems has the potential to offer more cost savings at the same performance; but the choice of embedded platform is important (e.g., emb1 versus emb2). It must be noted that these results hold true for our workloads, but more study is needed before we can generalize them to all variations of internet sector workloads.

Studying embedded platforms with larger amounts of memory and disk added non-commodity costs to our model that can be further optimized (the memory blade proposed in Section 3.4 addresses this). Additionally, srvr1 consumes 13.6KW/rack while emb1 consumes only 2.7KW/rack (for a standard 42U rack). We do not factor this into our results, but this difference can either translate into simpler cooling solutions, or into smaller form factors and greater compaction. The next section addresses the latter.

3.3 Compaction and Aggregated Cooling

Our discussion in Section 3.1 identifies that, after the processor, inefficiencies in the cooling system are the next largest factor of cost. Lower-power systems offer the opportunity for smaller form factor boards, which in turn allow for optimizations to the cooling system. Below, we discuss two such optimizations. We use blade servers as the exemplar for the rest of our discussion, since they are well known in the market.

Dual-entry enclosures with directed airflow: Figure 3(a) shows how a server-level enclosure can be redesigned to enable blades to be inserted from front and back to attach to a midplane. The key intuition is to partition the air flow, and allow cold air to be directed vertically through the blades. This is done by increasing the volume of the enclosure to create an inlet and exhaust plenum, and directing the air flow in the directions indicated by the arrows in the picture. The air flow is maintained through all the blades in parallel, from intake plenum to exhaust plenum. (This is akin to a parallel connection of resistances versus a serial one.) Compared to conventional blade enclosures, which force air directly from front to back, this results in shorter flow length (distance traversed by the air), lower pre-heat (temperature of the air hitting the blades), and reduced pressure drop and volume flow. Our thermo-mechanical analysis of the thermal resistance and air flow improvements with this design (calculations omitted for space) shows significant improvements in cooling efficiencies (about 50%). Compared to the baseline, which can allow 40 1U "pizza box" servers per rack, our new design can allow 40 blades of 75W to be inserted in a 5U enclosure, allowing 320 systems per rack.

Board-level aggregated heat removal: Figure 3 also shows an even more radical packaging design. With low-power systems, one can consider blades of much smaller form factors that are integrated on conventional blades that fit into an enclosure. As shown in Figure 3(b), we propose an innovative packaging scheme that aggregates the power-dissipating components at the device and package level. The smaller form factor server modules are interspersed with planar heat pipes that transfer the heat, at an effective conductivity three times that of copper, to a central location. The aggregated heat is removed with a larger optimized heat sink that enables channeling the flow through a single heat sink, as opposed to multiple separate conventional heat sinks. The increased conductivity and the increased area for heat extraction lead to more effective cooling. The smaller blades can be connected to the larger blades through different interfaces – ATC or COMX interfaces are good candidates [29]. Figure 4 shows an example with eight such smaller 25W modules aggregated on a bigger blade. With higher power budgets, four such small modules can be supported on a bigger blade, allowing 1250 systems per rack.

Figure 3: New proposed cooling architectures. (a) Dual-entry enclosure with directed airflow; (b) aggregated cooling of microblades. Aggregated cooling and compaction can bring down total costs without affecting performance.

These cooling optimizations have the potential to improve efficiencies by 2X and 4X. Although they use specialized designs, we expect our cooling solutions to be effective in other enterprise environments. When combined with the significant and growing fraction of the market represented by warehouse computing environments, these designs should have enough volume to drive commoditization.

3.4 Memory sharing

Memory costs and power are an important part of the system-level picture, especially as the cost and power of other components are reduced. Yet at a datacenter level, it can be difficult to properly choose the amount of memory in each server. The memory demands across workloads vary widely, and past studies have shown that per-server sizing for peak loads can lead to significant ensemble-level overprovisioning [11,31]. To address memory overprovisioning, we provision memory at a coarser granularity (e.g., per blade chassis), sizing each larger unit to meet the expected aggregate peak demand. Our design provides a remote memory pool which is divided among all the attached servers. This allows us to provision memory more accurately across servers, and allows the attached servers to have smaller local memories by exploiting locality to maintain high performance. By right-provisioning memory in such a hierarchical environment, we obtain power and cost savings and enable further optimizations.

Figure 4: Proposed memory-blade architecture (server blades with CPUs, iLO, Southbridge, and disk, connected through a PCIe bridge to a memory blade with a memory controller hub (Northbridge) and DIMMs).

Basic architecture: Our design is illustrated in Figure 4(a). Each server blade has a smaller local memory, and multiple servers are connected to a memory blade, which provides the remote memory pool and handles accesses at a page-size granularity. Within a single enclosure, the server and memory blades are con
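As a toy illustration of the memory-blade idea (not the paper's implementation): a remote pool, managed at page granularity, statically divided among the attached server blades. All class names, policies, and sizes here are our own assumptions.

```python
class MemoryBlade:
    """Toy model of a remote memory pool shared by several server blades."""

    def __init__(self, total_pages, servers):
        # Split the remote pool into per-server quotas up front; a real design
        # could instead re-balance quotas as observed demand shifts.
        self.quota = {s: total_pages // len(servers) for s in servers}
        self.used = {s: 0 for s in servers}
        self.page_table = {}   # (server, remote page number) -> physical frame

    def allocate(self, server):
        # Hand `server` one remote page, if its share of the pool allows.
        if self.used[server] >= self.quota[server]:
            raise MemoryError(f"{server}: remote memory quota exhausted")
        frame = len(self.page_table)        # next free frame (never freed here)
        page_no = self.used[server]
        self.page_table[(server, page_no)] = frame
        self.used[server] += 1
        return page_no

blade = MemoryBlade(total_pages=1024, servers=["blade0", "blade1"])
first_remote_page = blade.allocate("blade0")
```

The point of the sketch is the provisioning math: sizing one shared pool for the aggregate peak lets each server's local DRAM shrink, while the mapping table gives each server the illusion of private remote memory.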

