Chapter 4 Low-Power VLSI Design


Chapter 4: Low-Power VLSI Design
Jin-Fu Li
Advanced Reliable Systems (ARES) Lab.
Department of Electrical Engineering
National Central University
Jhongli, Taiwan

Outline
- Introduction
- Low-Power Gate-Level Design
- Low-Power Architecture-Level Design
- Algorithmic-Level Power Reduction
- RTL Techniques for Optimizing Power
National Central University, EE4012 VLSI Design

Introduction
- Most SOC design teams now regard power as one of their top design concerns
- Why low-power design?
  - Battery lifetime (especially for portable devices)
  - Reliability
- Power consumption
  - Peak power
  - Average power

Overview of Power Consumption
- Average power consumption
  - Dynamic power consumption
  - Short-circuit power consumption
  - Leakage power consumption
  - Static power consumption
- Dynamic power dissipation during switching
[Figure: a CMOS gate driving other gates through an interconnect; the load consists of C_drain, C_interconnect, and C_input]

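Overview of Power Consumption
- Generic representation of a CMOS logic gate for power calculation
[Figure: CMOS logic gate built from a pMOS network and an nMOS network, driving the output node V_out loaded by C_load]

$$C_{load} = C_{drain} + C_{interconnect} + C_{input}$$

$$P_{avg} = \frac{1}{T}\left[\int_{0}^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt + \int_{T/2}^{T}\left(V_{DD}-V_{out}\right)\left(C_{load}\frac{dV_{out}}{dt}\right)dt\right]$$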
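
The two integrals can be evaluated in closed form. The worked steps below are a sketch that assumes one full rail-to-rail transition per half-period: V_out falls from V_DD to 0 during the first half and rises from 0 back to V_DD during the second.

```latex
\begin{align*}
\int_{0}^{T/2} V_{out}\left(-C_{load}\frac{dV_{out}}{dt}\right)dt
  &= -C_{load}\int_{V_{DD}}^{0} V_{out}\,dV_{out}
   = \tfrac{1}{2}\,C_{load}V_{DD}^{2} \\
\int_{T/2}^{T}\left(V_{DD}-V_{out}\right)\left(C_{load}\frac{dV_{out}}{dt}\right)dt
  &= C_{load}\int_{0}^{V_{DD}}\left(V_{DD}-V_{out}\right)dV_{out}
   = \tfrac{1}{2}\,C_{load}V_{DD}^{2} \\
P_{avg} &= \frac{1}{T}\left[\tfrac{1}{2}C_{load}V_{DD}^{2} + \tfrac{1}{2}C_{load}V_{DD}^{2}\right]
   = \frac{1}{T}\,C_{load}V_{DD}^{2}
   = C_{load}V_{DD}^{2}\,f_{CLK}, \qquad f_{CLK} = \frac{1}{T}
\end{align*}
```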

Overview of Power Consumption
- The average power consumption can be expressed as

$$P_{avg} = \frac{1}{T}\,C_{load}V_{DD}^{2} = C_{load}V_{DD}^{2}\,f_{CLK}$$

- The node transition rate can be slower than the clock rate. To better represent this behavior, a node transition factor ($\alpha_T$) should be introduced:

$$P_{avg} = \alpha_T\,C_{load}V_{DD}^{2}\,f_{CLK}$$

- The switching power expressions above are derived by taking into account the output node load capacitance
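
A minimal numeric sketch of the expression with the node transition factor. The load capacitance, supply voltage, and clock frequency are illustrative assumptions, not values from the slides; the function name is likewise hypothetical.

```python
def switching_power(alpha_t, c_load, v_dd, f_clk):
    """Average switching power: alpha_T * C_load * V_DD^2 * f_CLK, in watts."""
    return alpha_t * c_load * v_dd ** 2 * f_clk


# Illustrative operating point (assumed, not from the slides).
C_LOAD = 50e-15   # 50 fF
V_DD = 1.8        # volts
F_CLK = 100e6     # 100 MHz

p_every_cycle = switching_power(1.0, C_LOAD, V_DD, F_CLK)    # node toggles every cycle
p_quarter = switching_power(0.25, C_LOAD, V_DD, F_CLK)       # node toggles 1 cycle in 4
p_half_voltage = switching_power(1.0, C_LOAD, V_DD / 2, F_CLK)

print(f"alpha_T = 1.00: {p_every_cycle * 1e6:.2f} uW")
print(f"alpha_T = 0.25: {p_quarter * 1e6:.2f} uW")
print(f"V_DD halved   : {p_half_voltage * 1e6:.2f} uW")
```

Power scales linearly with the transition factor but quadratically with the supply voltage, which is why the later architecture-level slides trade parallelism or pipelining for a lower supply.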

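Overview of Power Consumption
[Figure: CMOS gate with internal nodes V_A and V_B and output node V_out loaded by C_load]
- The generalized expression for the average power dissipation can be rewritten as

$$P_{avg} = \left(\sum_{i=1}^{\#\,\text{of nodes}} \alpha_{Ti}\,C_i\,V_i\right)V_{DD}\,f_{CLK}$$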
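
The generalized sum can be sketched as a short function. The node list below is hypothetical: each entry is (transition factor, capacitance, voltage swing), with two internal nodes assumed to swing less than the full supply.

```python
def total_switching_power(nodes, v_dd, f_clk):
    """P_avg = (sum over nodes of alpha_Ti * C_i * V_i) * V_DD * f_CLK."""
    return v_dd * f_clk * sum(alpha * c * v for alpha, c, v in nodes)


V_DD = 1.8      # volts (assumed)
F_CLK = 100e6   # hertz (assumed)

# (alpha_Ti, C_i in farads, V_i in volts); all values are illustrative.
nodes = [
    (0.50, 10e-15, 1.8),   # output node, full swing
    (0.25, 5e-15, 1.2),    # internal node V_A, reduced swing
    (0.25, 5e-15, 1.2),    # internal node V_B, reduced swing
]

p_total = total_switching_power(nodes, V_DD, F_CLK)
print(f"Total switching power: {p_total * 1e6:.2f} uW")
```

With a single node whose swing equals V_DD, the sum reduces to the single-node expression with the node transition factor.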

Gate-Level Design – Technology Mapping
- The objective of logic minimization is to reduce the Boolean function.
- For low-power design, the signal switching activity is minimized by restructuring a logic circuit.
- The power minimization is constrained by the delay; the area, however, may increase.
- During this phase of logic minimization, the function to be minimized is

$$\sum_{i} P_i\,(1 - P_i)\,C_i$$

Gate-Level Design – Technology Mapping
- The first step in technology mapping is to decompose each logic function into two-input gates
- The objective of this decomposition is to minimize the total power dissipation by reducing the total switching activity
[Figure: two decompositions of the four-input function ABCD into two-input gates, with input signal probabilities A = 0.2, B = 0.2, C = 0.5, D = 0.5 and the switching activity at each gate output]
  First decomposition:  AB = 0.0384, ABC = 0.0196, ABCD = 0.0099
  Second decomposition: AB = 0.0384, CD = 0.1875, ABCD = 0.0099
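
The activity values in the figure can be reproduced with the P(1 - P) term of the cost function. This sketch assumes the gates are two-input ANDs, the inputs are statistically independent, and every node has the same capacitance, so that only the activity term matters.

```python
def activity(p_one):
    """Switching activity of a node whose probability of being 1 is p_one."""
    return p_one * (1.0 - p_one)


p = {"A": 0.2, "B": 0.2, "C": 0.5, "D": 0.5}

p_ab = p["A"] * p["B"]
p_abc = p_ab * p["C"]
p_cd = p["C"] * p["D"]
p_abcd = p_ab * p_cd

# First decomposition: ((A.B).C).D
chain_activity = [activity(p_ab), activity(p_abc), activity(p_abcd)]
# Second decomposition: (A.B).(C.D)
tree_activity = [activity(p_ab), activity(p_cd), activity(p_abcd)]

print("chain:", [round(x, 4) for x in chain_activity], "total", round(sum(chain_activity), 4))
print("tree: ", [round(x, 4) for x in tree_activity], "total", round(sum(tree_activity), 4))
```

Under these assumptions the first decomposition has the lower total activity, because it keeps the high-activity product of C and D from appearing as an internal node.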

Gate-Level Design – Phase Assignment
[Figure: two phase assignments of the same logic with inputs A, B, and C; the high-activity node is marked in each]

Gate-Level Design – Pin Swapping
[Figure: a gate with inputs a, b, c, and d shown with two different pin orderings, together with the switching activity of each input]

Gate-Level Design – Glitching Power
- Glitches: spurious transitions due to imbalanced path delays
- A design that has more balanced delay paths has fewer glitches, and thus has less power dissipation
- Note that there will be no glitches in dynamic CMOS logic
[Figure: gate network with inputs A, B, and C and nodes D and E, with waveforms showing a glitch]

Gate-Level Design – Glitching Power
- A chain structure has more glitches
- A tree structure has fewer glitches
[Figure: chain structure and tree structure, each with inputs A, B, C, and D]
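
The chain-versus-tree claim can be checked with a small unit-delay simulation. This is a sketch under stated assumptions: the gate type is not recoverable from the slide, so two-input XOR gates are used (they propagate every input change), every gate has one unit of delay, and all four inputs switch at the same instant.

```python
from itertools import product


def steady_state(netlist, inputs):
    """Settled signal values; gates are listed in topological order."""
    vals = list(inputs)
    for a, b in netlist:
        vals.append(vals[a] ^ vals[b])
    return vals


def count_glitches(netlist, n_inputs):
    """Spurious gate-output toggles summed over all input transitions."""
    glitches = 0
    for old, new in product(product((0, 1), repeat=n_inputs), repeat=2):
        vals = steady_state(netlist, old)
        final = steady_state(netlist, new)
        # Toggles that are functionally required (steady-state value changes).
        needed = sum(x != y for x, y in zip(vals[n_inputs:], final[n_inputs:]))
        vals[:n_inputs] = new              # all inputs switch together at t = 0
        toggles = 0
        for _ in range(len(netlist)):      # enough steps for the circuit to settle
            nxt = vals[:n_inputs] + [vals[a] ^ vals[b] for a, b in netlist]
            toggles += sum(x != y for x, y in zip(vals[n_inputs:], nxt[n_inputs:]))
            vals = nxt
        glitches += toggles - needed
    return glitches


# Signals 0..3 are inputs A..D; gate k drives signal 4 + k.
CHAIN = [(0, 1), (4, 2), (5, 3)]   # ((A ^ B) ^ C) ^ D
TREE = [(0, 1), (2, 3), (4, 5)]    # (A ^ B) ^ (C ^ D)

chain_glitches = count_glitches(CHAIN, 4)
tree_glitches = count_glitches(TREE, 4)
print("chain glitches:", chain_glitches)
print("tree glitches: ", tree_glitches)
```

In this model the balanced tree produces no spurious transitions, because both inputs of the final gate arrive at the same time, while the chain does glitch because its later gates see new primary inputs before the earlier gate outputs have settled.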

Gate-Level Design – Precomputation
[Figure: the original circuit (register R1, combinational logic, register R2) and the same circuit with added precomputation logic]

Gate-Level Design – Precomputation
[Figure: precomputation example; inputs A<n-1> and B<n-1> go to register R1 and to the precomputation logic, whose Enable output controls register R2, which holds A<n-2:0> and B<n-2:0>]
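
A behavioral sketch of how such a precomputation scheme can save switching. It assumes the combinational block is an n-bit comparator computing A > B, which the slide does not state explicitly, and that the enable is the XNOR of the two most significant bits, so that register R2 only loads when the MSBs cannot decide the result.

```python
from itertools import product


def compare_with_precomputation(a, b, n):
    """Return (a > b, load_r2) for n-bit unsigned a and b (assumed comparator)."""
    msb_a, msb_b = a >> (n - 1), b >> (n - 1)
    load_r2 = msb_a == msb_b                 # enable = XNOR of the MSBs
    if load_r2:
        low_mask = (1 << (n - 1)) - 1
        result = (a & low_mask) > (b & low_mask)
    else:
        result = msb_a > msb_b               # MSBs alone decide the result
    return result, load_r2


N = 4
pairs = list(product(range(1 << N), repeat=2))
results = [compare_with_precomputation(a, b, N) for a, b in pairs]

all_correct = all(res == (a > b) for (a, b), (res, _) in zip(pairs, results))
r2_loads = sum(load for _, load in results)
print("all results correct:", all_correct)
print(f"R2 loaded in {r2_loads} of {len(pairs)} input pairs")
```

For uniformly distributed inputs, the lower-bit register is disabled in half of the cycles in this model; real input statistics would change that fraction.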

Gate-Level Design – Gating Clock
[Figure: a row of D flip-flops (D, Q) driven by a gated clock clk; caption: Fail DFT rule checking]
[Figure: the same flip-flops with a control pin T added to the clock gate; caption: Add control pin to solve DFT violation problem]

Gate-Level Design – Input Gating
[Figure: functional blocks f1 and f2 with their inputs gated by clk and select]

Clock-Gating in Low-Power Flip-Flop
[Figure: clock-gated flip-flop with data input D, clock CK, and output Q]
Source: Prof. V. D. Agrawal

Reduced-Power Shift Register
[Figure: shift register built from flip-flops clocked by CK(f/2)]
Flip-flops are operated at full voltage and half the clock frequency.
Source: Prof. V. D. Agrawal

Power Consumption of Shift Register
16-bit shift register, 2μ CMOS

Deg. of parallelism   Freq (MHz)   Power (μW)
1                     33.0         1535
2                     16.5         887
4                     8.25         738

$$P = C'\,V_{DD}^{2}\,f / n$$

[Figure: normalized power (0.0 to 1.0) versus degree of parallelism, n (1, 2, 4)]
C. Piguet, “Circuit and Logic Level Design,” pages 103-133 in W. Nebel and J. Mermet (ed.), Low Power Design in Deep Submicron Electronics, Springer, 1997.
Source: Prof. V. D. Agrawal
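
The normalized-power axis of the plot can be recomputed from the measured values on this slide (1535, 887, and 738 μW at parallelism 1, 2, and 4). The sketch below also prints the ideal 1/n scaling implied by the formula for comparison; variable names are illustrative.

```python
# (degree of parallelism n, clock frequency in MHz, measured power in uW)
measurements = [(1, 33.0, 1535.0), (2, 16.5, 887.0), (4, 8.25, 738.0)]

base_power = measurements[0][2]
normalized = {n: power / base_power for n, _freq, power in measurements}
ideal = {n: 1.0 / n for n, _freq, _power in measurements}

for n, freq, power in measurements:
    print(f"n = {n}: f = {freq:5.2f} MHz, measured = {normalized[n]:.2f}, ideal 1/n = {ideal[n]:.2f}")
```

The measured power falls with n but stays above the ideal 1/n curve in this data set.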

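Architecture-Level Design – Parallelism
[Figure: reference design, a 16x16 multiplier with 16-bit input registers and a 32-bit output register clocked at f_ref; parallel design, two 16x16 multipliers with registers clocked at f_ref/2 and a 32-bit output MUX running at f_ref]
Assume that, with the same 16x16 multiplier, the power supply can be reduced from V_ref to V_ref/1.83.

$$P_{parallel} = 2.2\,C_{ref}\left(\frac{V_{ref}}{1.83}\right)^{2}\frac{f_{ref}}{2} \approx 0.33\,P_{ref}$$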
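
The 0.33 figure follows directly from the three scaling factors on the slide. A minimal sketch, where `scaled_power` is a hypothetical helper that expresses power relative to P_ref = C_ref V_ref^2 f_ref:

```python
def scaled_power(cap_factor, voltage_divisor, freq_factor):
    """Power relative to P_ref when C, V, and f are scaled as given."""
    return cap_factor * (1.0 / voltage_divisor) ** 2 * freq_factor


# Parallel multiplier: 2.2x capacitance, V_ref / 1.83, f_ref / 2.
p_parallel = scaled_power(cap_factor=2.2, voltage_divisor=1.83, freq_factor=0.5)
print(f"P_parallel = {p_parallel:.2f} P_ref")
```

The capacitance more than doubles, but the quadratic dependence on the reduced supply voltage and the halved clock rate outweigh it.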

Architecture-Level Design – Pipelining
Because the hardware between the pipeline stages is reduced, the reference voltage V_ref can be lowered to V_new while maintaining the same worst-case delay. For example, let a 50 MHz multiplier be broken into two equal parts as shown below. The throughput of the pipelined design can remain at 50 MHz when the voltage V_new is equal to V_ref/1.83.

$$P_{pipeline} = 1.2\,C_{ref}\left(\frac{V_{ref}}{1.83}\right)^{2} f_{ref} \approx 0.36\,P_{ref}$$
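
The arithmetic behind the 0.36 figure, written out relative to P_ref = C_ref V_ref^2 f_ref. The clock frequency is unchanged, so only the capacitance factor of 1.2 and the voltage divisor of 1.83 enter:

```latex
\begin{align*}
\frac{P_{pipeline}}{P_{ref}}
  &= \frac{1.2\,C_{ref}\left(V_{ref}/1.83\right)^{2} f_{ref}}{C_{ref}\,V_{ref}^{2}\,f_{ref}}
   = \frac{1.2}{1.83^{2}}
   = \frac{1.2}{3.3489}
   \approx 0.358 \approx 0.36
\end{align*}
```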

Architecture-Level Design – Retiming
Retiming is a transformation technique used to change the locations of delay elements in a circuit without affecting the input/output characteristics of the circuit.
Two versions of an IIR filter.
[Figure: an IIR filter before and after retiming, with input x(n), output y(n), coefficients a and b, internal signals w(n), w1(n), and w2(n), and delay elements D and 2D]

Architecture-Level Design – Retiming
Retiming for pipeline
[Figure: registers (REG) clocked at f_ref around combinational blocks, including C1 (6 ns) and C3 (4 ns)]

Architecture-Level Design – Retiming
[Figure: the same circuit before and after retiming; in one version the clock cycle is 4 gate delays, in the other it is 2 gate delays]

Architecture-Level Design – Power Management
[Figure: modules C1 and C2 with their FREEZE control signals (C1 FREEZE, C2 FREEZE)]

Architecture-Level Design – Bus Segmentation
- Avoid the sharing of resources
  - Reduce the switched capacitance
- For example: a global system bus
  - A single shared bus is connected to all modules; this structure results in a large bus capacitance due to
    - The large number of drivers and receivers sharing the same bus
    - The parasitic capacitance of the long bus line
  - A segmented bus structure
    - Switched capacitance during each bus access is significantly reduced
    - Overall routing area may be increased

Architecture-Level Design – Bus Segmentation
[Figure: a single shared bus with capacitance C_bus, and a segmented bus whose segments (capacitance C_bus1 each) are joined by a bus interface]

Algorithmic-Level Design – Switching Activity Reduction
Minimizing the switching activity at a high level is one way to reduce the power dissipation of digital processors.
One method to minimize the switching of signals at the algorithmic level is to use an appropriate coding for the signals rather than straight binary code.
The table below compares the 3-bit representations of the binary and Gray codes.

Binary code   Gray code   Decimal equivalent
000           000         0
001           001         1
010           011         2
011           010         3
100           110         4
101           111         5
110           101         6
111           100         7
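
The binary and Gray columns are related by a simple bitwise rule. A minimal sketch of the standard reflected Gray code conversion (function names are illustrative):

```python
def binary_to_gray(value):
    """Reflected Gray code: each bit is XORed with its more significant neighbor."""
    return value ^ (value >> 1)


def gray_to_binary(gray):
    """Invert the conversion by XOR-accumulating all right shifts."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value


gray_codes = [binary_to_gray(d) for d in range(8)]
for d, g in enumerate(gray_codes):
    print(f"{d}  {d:03b}  {g:03b}")
```

Consecutive Gray codes differ in exactly one bit, which is the property the following counter slides exploit.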

State Encoding for a Counter
- Two-bit binary counter:
  - State sequence, 00 → 01 → 10 → 11 → 00
  - Six bit transitions in four clock cycles
  - 6/4 = 1.5 transitions per clock
- Two-bit Gray-code counter:
  - State sequence, 00 → 01 → 11 → 10 → 00
  - Four bit transitions in four clock cycles
  - 4/4 = 1.0 transition per clock
- Gray-code counter is more power efficient.
G. K. Yeap, Practical Low Power Digital VLSI Design, Boston: Kluwer Academic Publishers (now Springer), 1998.
Source: Prof. V. D. Agrawal

Binary Counter: Original Encoding

Present state   Next state
a b             A B
0 0             0 1
0 1             1 0
1 0             1 1
1 1             0 0

A = a'b + ab'
B = a'b' + ab'

[Figure: two flip-flops holding a and b, with next-state inputs A and B and shared CK and CLR]
Source: Prof. V. D. Agrawal

Binary Counter: Gray Encoding

Present state   Next state
a b             A B
0 0             0 1
0 1             1 1
1 0             0 0
1 1             1 0

A = a'b + ab
B = a'b' + a'b

[Figure: two flip-flops holding a and b, with next-state inputs A and B and shared CK and CLR]
Source: Prof. V. D. Agrawal
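
The next-state equations of the two counters can be simulated directly to confirm the state sequences and toggle counts. A minimal sketch, with the complement written as 1 - x and illustrative function names:

```python
def binary_next(a, b):
    """Original encoding: A = a'b + ab', B = a'b' + ab'."""
    na, nb = 1 - a, 1 - b
    return (na & b) | (a & nb), (na & nb) | (a & nb)


def gray_next(a, b):
    """Gray encoding: A = a'b + ab, B = a'b' + a'b."""
    na, nb = 1 - a, 1 - b
    return (na & b) | (a & b), (na & nb) | (na & b)


def run(next_state, cycles=4):
    """Clock the counter from state 00 and count state-bit toggles."""
    state = (0, 0)
    states = [state]
    toggles = 0
    for _ in range(cycles):
        nxt = next_state(*state)
        toggles += (state[0] ^ nxt[0]) + (state[1] ^ nxt[1])
        states.append(nxt)
        state = nxt
    return states, toggles


binary_states, binary_toggles = run(binary_next)
gray_states, gray_toggles = run(gray_next)
print("binary:", binary_states, "toggles:", binary_toggles)
print("gray:  ", gray_states, "toggles:", gray_toggles)
```

Both counters return to 00 after four clocks, with six state-bit toggles for the original encoding and four for the Gray encoding.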

Three-Bit Counters

Binary                     Gray-code
State   No. of toggles     State   No. of toggles
000     -                  000     -
001     1                  001     1
010     2                  011     1
011     1                  010     1
100     3                  110     1
101     1                  111     1
110     2                  101     1
111     1                  100     1
000     3                  000     1

Av. transitions/clock = 1.75 (binary)
Av. transitions/clock = 1 (Gray-code)
Source: Prof. V. D. Agrawal

N-Bit Counter: Toggles in Counting Cycle
- Binary counter: T(binary) = 2(2^N – 1)
- Gray-code counter: T(gray) = 2^N
- T(gray)/T(binary) = 2^(N-1)/(2^N – 1)

N     T(binary)   T(gray)   T(gray)/T(binary)
1     2           2         1.0
2     6           4         0.6667
3     14          8         0.5714
4     30          16        0.5333
5     62          32        0.5161
6     126         64        0.5079
∞     -           -         0.5000

Source: Prof. V. D. Agrawal
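
The two closed-form toggle counts can be checked by brute-force enumeration of one full counting cycle, including the wrap-around back to zero. A minimal sketch with illustrative names:

```python
def toggles_per_cycle(codes):
    """Total bit toggles over one full counting cycle, including wrap-around."""
    n = len(codes)
    return sum(bin(codes[i] ^ codes[(i + 1) % n]).count("1") for i in range(n))


table = []
for n_bits in range(1, 7):
    binary = list(range(2 ** n_bits))
    gray = [v ^ (v >> 1) for v in binary]
    t_binary = toggles_per_cycle(binary)
    t_gray = toggles_per_cycle(gray)
    table.append((n_bits, t_binary, t_gray, t_gray / t_binary))

for n_bits, t_binary, t_gray, ratio in table:
    print(f"N = {n_bits}: T(binary) = {t_binary:3d}, T(gray) = {t_gray:2d}, ratio = {ratio:.4f}")
```

The ratio approaches 0.5 as N grows, so a long Gray-code counter toggles about half as many state bits as a binary counter.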

FSM State Encoding
[Figure: the same state transition graph under two different state encodings, with edges labeled by transition probabilities based on the primary inputs (PI)]
Expected number of state-bit transitions:
- First encoding: 2(0.3 + 0.4) + 1(0.1 + 0.1) = 1.6
- Second encoding: 1(0.3 + 0.4 + 0.1) + 2(0.1) = 1.0
State encoding can be selected using a power-based cost function.
Source: Prof. V. D. Agrawal
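
A power-based cost function of this kind can be sketched as the probability-weighted Hamming distance over all state transitions. The three-state graph and the two encodings below are hypothetical: the slide's figure is not recoverable, so the edges were chosen only to be consistent with the two sums shown (1.6 and 1.0). Self-loops are omitted because they toggle no state bits.

```python
def hamming(x, y):
    """Number of bit positions in which two state codes differ."""
    return bin(x ^ y).count("1")


def expected_transitions(edges, encoding):
    """Sum over edges of transition probability times state-bit toggles."""
    return sum(p * hamming(encoding[src], encoding[dst]) for src, dst, p in edges)


# (source state, destination state, transition probability); assumed graph.
EDGES = [("S0", "S1", 0.3), ("S1", "S0", 0.4), ("S1", "S2", 0.1), ("S2", "S0", 0.1)]

encoding_1 = {"S0": 0b00, "S1": 0b11, "S2": 0b01}
encoding_2 = {"S0": 0b00, "S1": 0b01, "S2": 0b11}

cost_1 = expected_transitions(EDGES, encoding_1)
cost_2 = expected_transitions(EDGES, encoding_2)
print(f"encoding 1: {cost_1:.1f} expected state-bit transitions")
print(f"encoding 2: {cost_2:.1f} expected state-bit transitions")
```

The second encoding gives the most frequently used pair of states adjacent codes, which is what lowers the cost.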

FSM: Clock-Gating
- Moore machine: Outputs depend only on the state variables.
- If a state has a self-loop in the state transition graph (STG), then the clock can be stopped whenever a self-loop is to be executed.
[Figure: STG with states Si, Sj, and Sk and edges labeled Xi/Zk, Xj/Zk, and Xk/Zk; Xk/Zk is the self-loop on Sk]
The clock can be stopped when the (Xk, Sk) combination occurs.
Source: Prof. V. D. Agrawal

Clock-Gating in Moore FSM
[Figure: clock-gated Moore machine; recoverable labels: ...tion logic, CK, Latch, PO]
L. Benini and G. De Micheli, Dynamic Power Management, Boston: Springer, 1998.
Source: Prof. V. D. Agrawal

Bus Encoding for Reduced Power
- Example: Four-bit bus
  - 0000 → 1110 has three transitions.
  - If the bits of the second pattern are inverted, then 0000 → 0001 will have only one transition.
- Bit-inversion encoding for N-bit bus:
[Figure: number of bit transitions after inversion encoding (0 to N/2) versus number of bit transitions (0 to N)]
Source: Prof. V. D. Agrawal
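
A minimal sketch of bit-inversion encoding as described above: if more than half of the data lines would toggle, the inverted word is sent together with a polarity bit. The function names are hypothetical, and the sketch counts transitions on the N data lines only, ignoring toggles of the polarity line itself.

```python
def hamming(x, y):
    """Number of differing bit positions."""
    return bin(x ^ y).count("1")


def bus_invert_encode(prev_bus, data, width):
    """Return (bus_value, polarity) for sending `data` after `prev_bus`."""
    mask = (1 << width) - 1
    if hamming(prev_bus, data) > width / 2:
        return (~data) & mask, 1        # send inverted data, polarity = 1
    return data, 0                      # send data unchanged


def bus_invert_decode(bus_value, polarity, width):
    """Recover the original data word at the receiver."""
    mask = (1 << width) - 1
    return (~bus_value) & mask if polarity else bus_value


bus, polarity = bus_invert_encode(0b0000, 0b1110, width=4)
print(f"bus = {bus:04b}, polarity = {polarity}, "
      f"data-line transitions = {hamming(0b0000, bus)}")
print(f"decoded = {bus_invert_decode(bus, polarity, 4):04b}")
```

With this rule no transfer toggles more than N/2 data lines, which matches the saturating shape of the plot.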

Bus-Inversion Encoding Logic
[Figure: sent data passes through polarity decision logic into the bus register; the bus carries the data lines and a polarity bit, from which the receiver recovers the received data]
M. Stan and W. Burleson, “Bus-Invert Coding for Low Power I/O,” IEEE Trans. VLSI Systems, vol. 3, no. 1, pp. 49-58, March 1995.
Source: Prof. V. D. Agrawal

RTL-Level Design – Signal Gating

Simple decoder:

    // 2-to-4 one-hot decoder without an enable input
    module decoder (a, sel);
      input  [1:0] a;
      output [3:0] sel;
      reg    [3:0] sel;
      always @(a) begin
        case (a)
          2'b00: sel = 4'b0001;
          2'b01: sel = 4'b0010;
          2'b10: sel = 4'b0100;
          2'b11: sel = 4'b1000;
        endcase
      end
    endmodule

Decoder with enable:

    // Same decoder with an enable; sel is held at zero while en is low
    module decoder (en, a, sel);
      input        en;
      input  [1:0] a;
      output [3:0] sel;
      reg    [3:0] sel;
      always @({en, a}) begin
        case ({en, a})
          3'b100:  sel = 4'b0001;
          3'b101:  sel = 4'b0010;
          3'b110:  sel = 4'b0100;
          3'b111:  sel = 4'b1000;
          default: sel = 4'b0000;
        endcase
      end
    endmodule

RTL-Level Design – Datapath Reordering
[Figure: initial and reordered versions of a datapath with inputs A and B and two multiplexers; the stable and glitchy signals are connected in a different order in the two versions]

RTL-Level Design – Memory Partition
[Figure: memory partition with two 128x32 memory blocks; labels: din, addr[7:0], addr[7:1], addr0, pre_addr, write, noe, clk, and a 32-bit output Mux driving dout]

RTL-Level Design – Memory Partition
Application-driven memory partition
[Figure: an ARM core connected through Data, Addr, and R/W to a 64K-byte memory, with a plot of reads versus address range marked at 28K, 4K, 32K, and 64K]

RTL-Level Design – Memory Partition
A power-optimal partitioned memory organization
[Figure: an ARM core and a decoder driving three memory blocks, each with Data, Addr, R/W, and CS ports]
