Influence Of Technology Directions On System Architecture


Influence of Technology Directions on System Architecture
Dr. Randy Isaac
VP of Science and Technology
IBM Research Division
September 10, 2001

Moore's Law continues beyond conventional scaling
Power becomes the limiting metric
The integration focus moves from circuit to processor

[Figure: computations per second that $1000 buys (log scale, 1E-6 to 1E12) vs. year, 1900-2020, across mechanical, electro-mechanical, vacuum tube, discrete transistor, and integrated circuit technologies. After Kurzweil, 1999 & Moravec, 1998]

Integrated Circuit Performance Trends
[Figure: memory and logic density (# transistors per chip, Mb) and clock frequency (MHz) on a log scale vs. year, 1980-2000]

The Original Moore's Law Proposal
[Figure: after G. E. Moore, "Electronics," 1965]

A Decade of Agreement
[Figure: after G. E. Moore, Proc. IEDM, 1975]

Complexity's Influence
[Figure: after G. E. Moore, SPIE v. 2440, 1995]

Increased Integration
Function implemented with: many silicon components, few silicon components, one chip
[Figure: function, speed, and cost at each level of integration]

Partitioning the Improvement Rate
Improving Integration: Components per chip
  50% Gain from Lithography
  25% Gain from Device and Circuit Innovation
  25% Gain from Increased Chip Size (manufacturability)
Improving Performance:
  Transistor Performance Improvement
  Interconnect Density and Delay
  Packaging and Cooling
  Circuit-level and System-level Gains

Evolution of Memory Density
[Figure: megabits per chip (log scale, 10 to 100,000) vs. year, 1980-2015, with trend lines annotated "doubling time: 1.5 yr." and "doubling time: 3 yr."]
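The doubling times quoted on this chart can be converted into compound annual growth rates, which makes them directly comparable with the CAGR figures used later in the deck. The following is a minimal sketch of that arithmetic; the function names are illustrative and do not come from the slides.

```python
def doubling_time_to_cagr(doubling_years: float) -> float:
    """Annual growth rate implied by a doubling time given in years."""
    if doubling_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (1.0 / doubling_years) - 1.0


def growth_factor(doubling_years: float, span_years: float) -> float:
    """Total multiplication over span_years at the given doubling time."""
    if doubling_years <= 0:
        raise ValueError("doubling time must be positive")
    return 2.0 ** (span_years / doubling_years)


if __name__ == "__main__":
    # The two doubling times annotated on the memory-density chart.
    for t in (1.5, 3.0):
        print(f"doubling every {t} yr: "
              f"{doubling_time_to_cagr(t):.1%} per year, "
              f"{growth_factor(t, 15.0):.0f}x over 15 years")
```

A 1.5-year doubling time corresponds to roughly 59% growth per year and a 3-year doubling time to roughly 26%, so the gap between the two trend lines compounds quickly over a decade or more.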

ITRS Lithography Roadmap
[Figure: minimum feature size (DRAM half-pitch, nm; 500 down to 25) vs. year, 1995-2011, for the 1994 SIA NTRS, 1997 SIA NTRS, 1998 / 1999 ITRS, and ISMT Litho 2000 Plan, with an area for future acceleration]
Industry-Wide Lithography Technology Acceleration

Dimensions in Lithography
[Figure: feature sizes and exposure wavelengths in nanometers (log scale, 0.01 to 1000): 435 & 405, 365, 248, 193, and 157 nm (deep UV), extreme UV, X-ray proximity, electron beam]

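Device Scaling
Original Device
[Figure: MOSFET cross-section with voltage V, wiring width W, gate oxide thickness tox, gate dimension L, junction depth xd, n source and n drain in a p substrate with doping NA]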

Device Scaling
Scaled Device
[Figure: the same MOSFET cross-section with every dimension and the voltage divided by α, and the substrate doping multiplied by α]
SCALING:
  Voltage: V/α
  Oxide: tox/α
  Wire width: W/α
  Gate width: L/α
  Diffusion: xd/α
  Substrate: α*NA

Device Scaling
Scaled Device
SCALING:
  Voltage: V/α
  Oxide: tox/α
  Wire width: W/α
  Gate width: L/α
  Diffusion: xd/α
  Substrate: α*NA
RESULTS:
  Higher Density: α²
  Higher Speed: α
  Lower Power/ckt: 1/α²
  Power Density: Constant
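The results on this slide follow from the scaling rules once first-order device relations are applied. The sketch below assumes the usual textbook proportionalities (capacitance ~ area/tox, drive current ~ (W/L)·V²/tox, delay ~ C·V/I, power ~ V·I); those relations and the function name are assumptions added for illustration, since the slide states only the rules and the outcomes.

```python
def constant_field_scaling(alpha: float) -> dict:
    """Relative results of scaling a device by alpha (original device = 1.0).

    Assumes first-order relations that are not spelled out on the slide:
    C ~ area / tox, I ~ (W / L) * V**2 / tox, delay ~ C * V / I, P ~ V * I.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")

    # Scaling rules from the slide: voltage and all dimensions shrink by alpha.
    voltage = 1.0 / alpha
    length = 1.0 / alpha
    width = 1.0 / alpha
    tox = 1.0 / alpha

    area = length * width                      # footprint of one circuit
    capacitance = area / tox                   # scales as 1/alpha
    current = (width / length) * voltage ** 2 / tox  # scales as 1/alpha
    delay = capacitance * voltage / current    # scales as 1/alpha
    power = voltage * current                  # scales as 1/alpha**2

    return {
        "density": 1.0 / area,          # circuits per unit area
        "speed": 1.0 / delay,
        "power_per_circuit": power,
        "power_density": power / area,
    }


if __name__ == "__main__":
    for name, value in constant_field_scaling(2.0).items():
        print(f"{name}: {value:g}")
```

With α = 2 the sketch gives four times the density, twice the speed, one quarter of the power per circuit, and an unchanged power density, in line with the RESULTS list.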

Fundamental atomic limit to scaling recipe
[Figure: silicon bulk field effect transistor (FET), present and future, with a 1.2 nm oxynitride gate dielectric]
Oxide thickness is approaching a few atomic layers

Limit of Oxide Scaling
[Figure: gate current density (A/cm2, log scale, 1E-8 to 1E6) vs. gate oxide thickness (0 to 4 nm), for gate voltages from 0.9 to 2.0 V]

High Performance CMOS Logic Trend
[Figure: industry logic performance trends (axis label "FPG", log scale, 1 to 1000) vs. year of first production, 1994-2008]

Relative CMOS Device Performance
New structures are needed to maintain device performance.
[Figure: relative device performance vs. year of technology capability, 1986-2010, for bulk FETs, SOI FETs, low-temperature operation, and double-gate FETs]

MOSFET Device Structure (R)evolution?
[Figure: relative device performance vs. year]
New devices/materials support accelerated growth rate

Better Performance Without Scaling

Novel Devices
V-Groove Transistors
Carbon Nanotubes
Organic Transistors
Quantum Computing
Molecular Devices

64-bit S/390 Microprocessor
47 million transistors
Copper interconnect -- 7 layers
Size: 17.9 x 9.9 mm
Single scalar, in-order execution
Split L1 cache (256K I & D)
BTB 2K x 4, multiported
On-chip compression unit
1 GHz frequency on a 20-way system

Blue Pacific
3.9 trillion operations/sec
Can simulate nuclear devices
15,000 X speed of average desktop
80,000 X memory of average desktop
75 terabytes of disk storage capacity

System Level Performance Improvement
Overall System Level Performance Improvement Will Come From Many Small Improvements
[Figure: performance vs. year from 2000; overall performance grows at 60 to 90% CAGR, while traditional CMOS scaling alone contributes 20% CAGR]
Contributions, from the top of the stack down:
  Application tuning
  Middleware tuning
  OS: tuning/scalability
  Compilers
  Multi-way systems
  Motherboard design: electrical, debug
  Memory subsystem: latency/bandwidth
  Packaging: more pins, better electrical/cooling
  Tools / environment / designer productivity
  Architecture/Microarchitecture/Logic design
  Circuit design
  New device structures
  Other process technology
  Traditional CMOS scaling

Microprocessor Size Trends
[Figure: relative magnitude (1/4x to 256x) of # transistors and chip area vs. year, comparing Moore's Law with industry trends; legible slope annotations include 2x / 6 yrs., 1x flat, and ½x / 6 yrs.]

Microprocessor Performance and Power Trends
[Figure: relative performance and power (up to 16x) vs. year, through 1998-2000; legible slope annotations include 2x / 3 yrs. and 2x / 6 yrs.]

Microprocessor Scaling Trends
[Table: the 486DX (04/10/89) compared with the Pentium 4 (04/23/01) and with the values expected from device scaling and from Moore's Law. Rows: technology (um), voltage (V), frequency (MHz), SpecInt95, # transistors (M), chip size, power (W), power density]

Power Density: The Fundamental Problem
[Figure: power density (W/cm2, log scale, 1 to 1000) vs. process generation (1.5µ down to 0.07µ) for the i386, i486, Pentium, Pentium Pro, Pentium II, and Pentium III, with a hot plate and a nuclear reactor as reference levels]
Source: Fred Pollack, Intel. New Microprocessor Challenges in the Coming Generations of CMOS Technologies, Micro32

Power
IT electrical power needs are projected to reach crisis proportions
Server farm energy consumption is increasing exponentially.
  more Watts/sq. ft than semiconductor or automobile needs constitute 60% of cost
Interesting anecdotes
  The "2,400 megawatt problem": 27 farms proposed for South King County will require as much energy as Seattle (including Boeing)
  Exodus considering building power plant near its Santa Clara facility
  San Jose City Council approved 250 MW power plant for US DataPort server farm and installation of 80 back-up diesel generators

Server Farm Heat Density Trend
Highest: Communication, 28% AGR
Lowest: Tape storage, 7%
* Slower growth after 2005 due to improvement in semiconductor power consumption
Reprinted with permission of The Uptime Institute from a White Paper titled Heat Density Trends in Data Processing, Computer Systems, and Telecommunications Equipment, Version 1.0.

Energy Dissipated per Logic Operation
[Figure: energy (pJ, log scale, 1E-10 to 1E10) vs. year, 1940-2020, with kT at room temperature as the reference level]

Device Scaling
Scaled Device
SCALING:
  Voltage: V/α
  Oxide: tox/α
  Wire width: W/α
  Gate width: L/α
  Diffusion: xd/α
  Substrate: α*NA
RESULTS:
  Higher Density: α²
  Higher Speed: α
  Lower Power/ckt: 1/α²
  Power Density: Constant

MOSFET Device Parameter Trends
[Figure: Tox, Vdd (V), and Vt (V) on a log scale vs. gate length Lgate (0.1 to 1 um), compared with classic scaling]

Low Temperature CMOS
[Figure: source/drain current (A/cm, log scale, 1.0E-10 to 1.0E2) vs. gate voltage (-0.5 to 1 V) at T = 100 K, 200 K, and 300 K, for L = 25 nm and Vds = 1 V]
Subthreshold slope steepens as temperature is reduced

CMOS Performance Parameter Trends
[Figure: Cgate (fF/um), inverter delay (ps), NFET Id-sat (A/m), power density (W/cm2), and CV/I delay (a. u.) on a log scale vs. gate length LGATE (0.1 to 1 um)]

Relative Power Density in Scaled CMOS
[Figure: relative power density vs. channel length (0.05 to 1.0 µm) for high-performance and low-power supply-voltage paths, with supply voltages from 5.0 V down to 0.8 V marked along each path]
After B. Davari, et al., IEEE Proc. Vol. 83, p. 595, 1995.

CMOS Power Density Trends
[Figure: active power density and subthreshold power density (W/cm2, log scale, 1E-5 to 1000) vs. gate length (0.01 to 1 um)]

Microprocessor Power Draw vs. Frequency
[Figure: power (0 to 40 Watts) vs. operating frequency (200 to 700 MHz)]

We've been here before!
Heat Flux Explosion
[Figure: module heat flux (watts/cm2, 0 to 14) vs. year of announcement, 1950-2010, with the technology labels Vacuum, Bipolar, and CMOS. Systems plotted: IBM 360, IBM 370, IBM 3033, IBM 3081, IBM 4381, CDC Cyber 205, Fujitsu M380, Fujitsu M-780, IBM 3090, IBM 3090S, NTT, Fujitsu VP2000, IBM ES9000, IBM RY4, IBM RY5, IBM RY6, IBM RY7, IBM GP, Apache, Pulsar, Merced, Pentium II (DSIP). Reference level: steam iron (5 W/cm2)]

S/390 Mainframe CPU Performance
[Figure: relative performance (log scale, 1 to 1000) vs. year, 1970-2005, for bipolar and CMOS mainframes; labeled points include the 168, 9021-711, and S/390 G3 through G7]

S/390: Comparison of Bipolar and CMOS

                       ES9000 9X2    S/390 G5
  Technology           Bipolar       CMOS
  Total Chips          5000          29 (12 CPUs)
  Total Parts          6659          92
  Weight (lbs)         31.1 K        2.0 K
  Power Req (KW)       153           5
  Chips/processor      390           1
  Maximum Memory (GB)  10            24
  Space (sq ft)        672           52

Focus on massively parallel systems
Use slower processors with much greater power efficiency
Scale to desired performance with parallel systems
Workload scaling efficiency must sustain power efficiency
Physical distance must be small to keep communication power manageable.
Example: Processor A is slower than B by a factor S but more power efficient by E. Then MP System A at the same performance as MP System B has lower power by E/S.
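The E/S example can be checked with a few lines of arithmetic. The sketch below rests on an interpretation that the slide does not spell out: "more power efficient by E" is read as each A processor drawing 1/E of the power of a B processor, and the workload is assumed to scale across processors with a given efficiency (1.0 meaning ideal). The function and parameter names are hypothetical.

```python
def system_power_ratio(slowdown_s: float,
                       power_saving_e: float,
                       scaling_efficiency: float = 1.0) -> float:
    """Power of MP System A divided by power of MP System B at equal performance.

    Assumptions (not stated on the slide):
      - one A processor delivers 1/S of the performance of one B processor
      - one A processor draws 1/E of the power of one B processor
      - scaling_efficiency in (0, 1] is the fraction of ideal speedup that
        System A keeps when it uses more processors than System B
    """
    if slowdown_s <= 0 or power_saving_e <= 0:
        raise ValueError("S and E must be positive")
    if not 0.0 < scaling_efficiency <= 1.0:
        raise ValueError("scaling_efficiency must be in (0, 1]")

    # System A needs S times as many processors, more if scaling is imperfect.
    processor_count_ratio = slowdown_s / scaling_efficiency
    # Each of those processors draws 1/E of the power.
    return processor_count_ratio / power_saving_e


if __name__ == "__main__":
    s, e = 4.0, 20.0
    ratio = system_power_ratio(s, e)
    print(f"S={s:g}, E={e:g}: System A uses {ratio:.2f}x the power "
          f"(lower by a factor of {1.0 / ratio:g})")
    print(f"with 50% scaling efficiency: "
          f"{system_power_ratio(s, e, 0.5):.2f}x the power")
```

Under these assumptions the saving is exactly E/S with ideal scaling, and it shrinks in proportion to the scaling efficiency, which is why the slide pairs the power argument with the requirement that workload scaling efficiency be sustained.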

Microprocessor Efficiencies
[Figure: performance (DMIPS) vs. active power (mW, log scale, 100 to 100,000), with reference lines of constant efficiency at 0.1 MIPS/mW and 1 MIPS/mW]

Parallel Performance Scaling Model
[Figure: relative performance (0 to 100) vs. number of processors, comparing ideal scaling with real scaling; real scaling peaks at Pmax for Nmax processors]

Power/Bandwidth by Interconnect Length
[Figure: power per unit bandwidth vs. interconnect length (log scale, 0.001 to 100 meters), from on-chip to room-to-room links]

Supercomputer Peak Performance
[Figure: peak speed (flops, log scale, 1E2 to 1E16) vs. year introduced, 1940-2010, with a doubling time of 1.5 yr. Machines plotted, by era: ENIAC and UNIVAC (vacuum tubes); IBM 701, IBM 704, IBM 7090, and IBM Stretch (transistors); CDC 6600 and CDC 7600 (ICs); CDC STAR-100, CRAY-1, and Cyber 205 (vectors); X-MP2, X-MP4, Y-MP8, and CRAY-2 (parallel vectors); i860, Delta, Paragon, CM-5, NWT, CP-PACS, ASCI Red, Blue Pacific, and ASCI White (MPPs); BlueGene near the petaflop level]

ASCI White

Cellular Architecture
Computational efficiency 0.2 GFLOP/W

Example of a Cellular Node
[Figure: block diagram of a node built on the IBM PPC440 system-on-chip: 440 PowerPC core (1 Watt) with 32 kB I-cache and 32 kB D-cache, FP unit (2.8 GF), PLB interface, integrated memory usable as control or cache, DDR/DDR2 controller and buffers to 256-512 MB of DDR SDRAM, link DMA and buffers for six 2 Gb/sec serial links, global tree and global functions, 10/100 Mb Ethernet for boot, and 1 Gb Ethernet or Infiniband for I/O]

Cellular Communication Networks
65536 nodes interconnected with three integrated networks
Ethernet
  Incorporated into every node ASIC
  Disk I/O
  Host control, booting and diagnostics
3 Dimensional Torus
  Virtual cut-through hardware routing to maximize efficiency
  2.8 Gb/s on each of 12 node links (total 4.2 GB/s per node)
  Communication backbone
  134 TB/s total torus interconnect bandwidth
  1.4/2.8 TB/s bisectional bandwidth
Global Tree
  One-to-all or all-all broadcast functionality
  Arithmetic operations implemented in tree
  1.4 GB/s of bandwidth from any node to all other nodes
  Latency of tree less than 1 usec
  90 TB/s total binary tree bandwidth (64k machine)
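The aggregate bandwidth figures on this slide can be reproduced from the per-node numbers. The sketch below is a back-of-the-envelope check, not the designers' own accounting: it assumes that each torus link is shared by two nodes (so per-node totals are halved when summed over the machine) and that 1 TB is taken as 1024 GB. Both assumptions were chosen because they reproduce the quoted totals.

```python
NODES = 65536                 # from the slide
LINKS_PER_NODE = 12           # from the slide
LINK_GBIT_PER_S = 2.8         # from the slide
TREE_GBYTE_PER_S = 1.4        # from the slide
GB_PER_TB = 1024              # assumption: binary terabytes


def torus_gbytes_per_node() -> float:
    """Sum of all torus links at one node, in GB/s (8 bits per byte)."""
    return LINKS_PER_NODE * LINK_GBIT_PER_S / 8


def torus_total_tbytes() -> float:
    """Machine-wide torus bandwidth in TB/s.

    Assumption: every link joins two nodes, so summing per-node
    bandwidth over all nodes counts each link twice.
    """
    return NODES * torus_gbytes_per_node() / 2 / GB_PER_TB


def tree_total_tbytes() -> float:
    """Machine-wide tree bandwidth in TB/s, one 1.4 GB/s share per node."""
    return NODES * TREE_GBYTE_PER_S / GB_PER_TB


if __name__ == "__main__":
    print(f"torus per node: {torus_gbytes_per_node():.1f} GB/s")
    print(f"torus total:    {torus_total_tbytes():.1f} TB/s")
    print(f"tree total:     {tree_total_tbytes():.1f} TB/s")
```

Under these assumptions the sketch gives 4.2 GB/s per node, about 134 TB/s for the torus, and about 90 TB/s for the tree, matching the slide.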

Node Card and I/O Card Design
Compute cards
  8 processors, 2 x 2 x 2 (x,y,z)
  256 MB RAM each processor
  Redundant power supplies
  Fast Ethernet
I/O cards
  4 processors (no torus)
  512MB-1GB each processor
  Redundant power supplies
  Fast and 1Gb Ethernet
[Figure: an I/O node on Gb Ethernet connected through a 100Mb Ethernet switch to the compute nodes]

Rack Design
1024 compute nodes
  256 GB DRAM
  2.8 TF peak
16 I/O nodes
  8 GB DRAM
  16 Gb Ethernet
15 KW, air cooled
1+1 or 2+1 redundant power
2+1 redundant fans
[Figure: one compute node and one I/O node, each a BL ASIC (2 cores) surrounded by DRAM chips]

Building a Cellular System
Chip (2 processors): 5.6 GF/s, 4 MB
Board (8 chips, 2x2x2): 44.8 GF/s, 2.08 GB
Cabinet (128 boards, 8x8x16): 5.7 TF/s, 266 GB
System (64 cabinets, 32x32x64): 360 TF/s, 16 TB
[Figure: chip containing two 440 cores, EDRAM, and I/O]
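The packaging hierarchy can be rolled up from the chip level to check that the counts and peak rates are mutually consistent. This is a minimal sketch using only the numbers on the slide; the names are illustrative. Note that the straight product gives about 367 TF/s for the full system, slightly above the 360 TF/s quoted, and the sketch does not attempt to explain that difference.

```python
from math import prod

CHIP_GFLOPS = 5.6            # per chip (2 processors), from the slide
CHIPS_PER_BOARD = 8          # 2 x 2 x 2
BOARDS_PER_CABINET = 128     # 8 x 8 x 16 chips per cabinet
CABINETS = 64                # 32 x 32 x 64 chips in the system

# Torus dimensions quoted for each packaging level, in chips.
BOARD_DIMS = (2, 2, 2)
CABINET_DIMS = (8, 8, 16)
SYSTEM_DIMS = (32, 32, 64)


def rollup() -> dict:
    """Chip counts and peak rates at each level of the hierarchy."""
    chips_per_cabinet = CHIPS_PER_BOARD * BOARDS_PER_CABINET
    chips_in_system = chips_per_cabinet * CABINETS
    return {
        "chips_per_cabinet": chips_per_cabinet,
        "chips_in_system": chips_in_system,
        "board_gflops": CHIP_GFLOPS * CHIPS_PER_BOARD,
        "cabinet_tflops": CHIP_GFLOPS * chips_per_cabinet / 1000,
        "system_tflops": CHIP_GFLOPS * chips_in_system / 1000,
    }


if __name__ == "__main__":
    r = rollup()
    # The chip counts should equal the product of the torus dimensions.
    assert prod(BOARD_DIMS) == CHIPS_PER_BOARD
    assert prod(CABINET_DIMS) == r["chips_per_cabinet"]
    assert prod(SYSTEM_DIMS) == r["chips_in_system"]
    for name, value in r.items():
        print(f"{name}: {value:g}")
```

The chip count of the full system comes out at 65536, the same node count quoted for the communication networks.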

Moore's Law continues beyond conventional scaling
  Technology innovation will overcome limits
Power becomes the limiting metric
  Technology trend is to higher power density
The integration focus moves from circuit to processor
  Radical power reduction depends on efficient processors
  Massively parallel systems have great potential

( Hopefully Not ) The End!

