ELEC 5200-001/6200-001 Computer Architecture and Design

Upload by : Mika Lloyd
Transcription

ELEC 5200-001/6200-001
Computer Architecture and Design
Fall 2013
Pipelining (Chapter 4.5, 4.6)
Vishwani D. Agrawal & Victor P. Nelson
Department of Electrical and Computer Engineering
Auburn University, Auburn, AL 36849
Fall 2013, ELEC 5200-001/6200-001 Pipelining

ILP: Instruction Level Parallelism
Single-cycle and multi-cycle datapaths execute one instruction at a time.
How can we get better performance?
Answer: Execute multiple instructions at a time:
    Pipelining – Enhance a multi-cycle datapath to fetch one instruction every cycle.
    Parallelism – Fetch multiple instructions every cycle.

Automobile Team Assembly
[Figure: one team performs four 1-hour tasks on a single car.]
1 car assembled every four hours
6 cars per day
180 cars per month
2,160 cars per year (12 months of 180 cars)

Automobile Assembly Line
Task 1: Mechanical, 1 hour
Task 2: Electrical, 1 hour
Task 3: Painting, 1 hour
Task 4: Testing, 1 hour
First car assembled in 4 hours (pipeline latency); thereafter, 1 car completed per hour
21 cars on first day, thereafter 24 cars per day
717 cars per month
8,637 cars per year
What gives 4X increase?

Throughput: Team Assembly
[Figure: timeline. The red car goes through Mechanical, Electrical, Painting, Testing and is completed; only then is the blue car started and taken through the same four tasks.]
Time of assembling one car = n hours,
where n is the number of nearly equal subtasks, each requiring 1 unit of time
Throughput = 1/n cars per unit time

Throughput: Assembly Line
[Figure: timeline. Cars 1 to 4 each pass through Mechanical, Electrical, Painting, Testing, with each car starting one time unit after the previous one; car 1 completes first, car 2 one time unit later.]
Time to complete first car = n time units (latency)
Cars completed in time T = T – n + 1
Throughput = 1 – (n – 1)/T cars per unit time
Throughput (assembly line) / Throughput (team assembly) = [1 – (n – 1)/T] / (1/n) = n – n(n – 1)/T → n as T → ∞
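The throughput formulas above can be checked numerically. The following is a minimal sketch, assuming n = 4 one-hour tasks as in the automobile example; the function names are illustrative and not part of the slides.

```python
def team_throughput(n):
    """Team assembly: one car every n time units."""
    return 1 / n


def line_cars_completed(n, T):
    """Assembly line: the first car takes n units, then one car per unit."""
    return max(0, T - n + 1)


def line_throughput(n, T):
    """Cars per unit time over a window of T time units."""
    return line_cars_completed(n, T) / T


n = 4  # four nearly equal tasks, as in the automobile example
for T in (24, 720, 8640):  # one day, 30 days, 360 days (in hours)
    ratio = line_throughput(n, T) / team_throughput(n)
    print(T, line_cars_completed(n, T), round(ratio, 4))
```

The car counts reproduce the 21, 717, and 8,637 figures quoted for the assembly line, and the ratio approaches n = 4 as T grows.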

Some Features of Assembly Line
[Figure: four-task assembly line (Task 1 Mechanical, Task 2 Electrical, Task 3 Painting, Task 4 Testing; 1 hour each) with the annotations listed here.]
Electrical parts delivered (JIT)
Defect found
Stall assembly line to fix the cause of defect
3 cars in the assembly line are suspects, to be removed (flush pipeline)

Pros and Cons
Advantages:
    Efficient use of labor.
    Specialists can do better job.
    Just in time (JIT) methodology eliminates warehouse cost.
Disadvantages:
    Penalty of defect latency.
    Lack of flexibility in production.
    Assembly line work is monotonous and boring.
http://www.youtube.com/watch?v c8LxscnmdNY&feature n with a spannerset modern times/
http://www.metacafe.com/watch/762944/crazy chaplin screwingup everything modern times

Pipelining in a Computer
Divide datapath into nearly equal tasks, to be performed serially and requiring non-overlapping resources.
Insert registers at task boundaries in the datapath; registers pass the output data from one task as input data to the next task.
Synchronize tasks with a clock having a cycle time that just exceeds the time required by the longest task.
Break each instruction down into the set of tasks so that instructions can be executed in a staggered fashion.

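Pipelining a Single-Cycle Datapath
Instruction class                   IF     ID     EX     MEM    WB     Total time
lw                                  2ns    1ns    2ns    2ns    1ns    8ns
sw                                  2ns    1ns    2ns    2ns    -      8ns
R-format: add, sub, and, or, slt    2ns    1ns    2ns    -      1ns    8ns
B-format: beq                       2ns    1ns    2ns    -      -      8ns
Columns: IF = instruction fetch; ID = instruction decode (also reg. file read); EX = execution (ALU operation); MEM = data access; WB = write back (reg. file write).
"-" = no operation on data; idle times equalize instruction lengths.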

Execution Time: Single-Cycle
[Figure: time axis in ns. Each instruction passes through IF, ID, EX, MEM, WB within one clock cycle; the next instruction starts when the previous one finishes.]
lw $1, 100($0)
lw $2, 200($0)
lw $3, 300($0)
Clock cycle time = 8 ns
Total time for executing three lw instructions = 24 ns

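Pipelined Datapath
[Table: stage times per instruction class (instruction fetch 2ns, decode and reg. file read 1ns, ALU execution 2ns, data access 2ns, reg. file write 1ns), with a total of 10ns listed for each instruction class, including R-format: add, sub, and, or, slt.]
No operation on data; idle time inserted to equalize instruction lengths.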

Execution Time: Pipeline
[Figure: time axis in ns, 2 ns per stage.]
lw $1, 100($0)    IF 0-2, ID 2-4, EX 4-6, MEM 6-8, RW 8-10
lw $2, 200($0)    IF 2-4, ID 4-6, EX 6-8, MEM 8-10, RW 10-12
lw $3, 300($0)    IF 4-6, ID 6-8, EX 8-10, MEM 10-12, RW 12-14
Clock cycle time = 2 ns, four times faster than single-cycle clock
Total time for executing three lw instructions = 14 ns
Performance ratio = Single-cycle time / Pipeline time = 24/14 = 1.7

Pipeline Performance
Clock cycle time = 2 ns
1,003 lw instructions:
    Total time for executing 1,003 lw instructions = 2,014 ns
    Performance ratio = Single-cycle time / Pipeline time = 8,024/2,014 = 3.98
10,003 lw instructions:
    Performance ratio = 80,024/20,014 = 3.998 → Clock cycle ratio (4)
Pipeline performance approaches clock-cycle ratio for long programs.
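The performance ratios above follow from a simple timing model. The sketch below assumes a five-stage pipeline with no stalls, an 8 ns single-cycle clock, and a 2 ns pipeline clock, as stated in the slides; the function names are illustrative.

```python
STAGES = 5              # IF, ID, EX, MEM, WB
SINGLE_CYCLE_NS = 8     # single-cycle clock period from the slides
PIPELINE_CYCLE_NS = 2   # pipelined clock period from the slides


def single_cycle_time(n_instr):
    """Each instruction takes one full 8 ns clock cycle."""
    return n_instr * SINGLE_CYCLE_NS


def pipeline_time(n_instr):
    """First instruction needs STAGES cycles; each later one adds a cycle."""
    return (STAGES + n_instr - 1) * PIPELINE_CYCLE_NS


def performance_ratio(n_instr):
    return single_cycle_time(n_instr) / pipeline_time(n_instr)


for n in (3, 1003, 10003):
    print(n, single_cycle_time(n), pipeline_time(n),
          round(performance_ratio(n), 3))
```

For 3, 1,003, and 10,003 instructions the pipeline times come out to 14, 2,014, and 20,014 ns, so the ratio climbs from about 1.7 toward the clock-cycle ratio of 4.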

Single-Cycle Datapath
[Figure: single-cycle datapath (PC, instruction memory, register file, sign extend, shift left 2, ALU, data memory, multiplexers, CONTROL and ALU control) divided into five regions: IF: instr. fetch; ID: instr. decode, reg. file read; EX: execute, address calc.; MEM; WB: write back. Instruction fields shown: opcode 26-31, 21-25, 16-20, 11-15, 0-15, 0-5. Control signals shown: RegDst, RegWrite, Branch, ALUOp.]

Pipelining of RISC Instructions
(From Lecture 3)
[Figure: five pipeline stages: fetch instruction; decode instruction and fetch operands; execute; memory operation; write back to regfile.]
Although an instruction takes five clock cycles, one instruction is completed every cycle.

Pipeline Registers
[Figure: datapath with pipeline registers IF/ID, ID/EX, EX/MEM, and MEM/WB inserted between the stages. Control signals shown: RegDst, ALUSrc, ALUOp, Branch, MemtoReg, RegWrite. Figure note, truncated in the transcription: "This requires a CONTROL not too different ..."]

Pipeline Register Functions
Four pipeline registers are added:
Register name   Data held
IF/ID           PC+4, Instruction word (IW)
ID/EX           PC+4, R1, R2, IW(0-15) sign ext., IW(11-15)
EX/MEM          PC+4, zero, ALUResult, R2, IW(11-15) or IW(16-20)
MEM/WB          M[ALUResult], ALUResult, IW(11-15) or IW(16-20)

Pipelined Datapath
[Figure: pipelined datapath with pipeline registers IF/ID, ID/EX, EX/MEM, MEM/WB. The destination register field is IW(11-15) for R-type and IW(16-20) for I-type lw.]

Five-Cycle Pipeline
[Figure: CC1: PC, IM; IF/ID; CC2: ID, reg. file read; ID/EX; CC3: ALU; EX/MEM; CC4: DM; MEM/WB; CC5: reg. file write.]

Add Instruction
add $t0, $s1, $s2
Machine instruction word:
000000 10001 10010 01000 00000 100000
opcode $s1   $s2   $t0         function
Stage actions:
CC1 IF
CC2 ID: read $s1, read $s2
CC3 EX: add $s1 + $s2
CC4 MEM
CC5 WB: write $t0
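The machine word for the add instruction can be reproduced by packing the R-format fields. This is a minimal sketch using the standard MIPS register numbers ($t0 = 8, $s1 = 17, $s2 = 18); the helper names are illustrative.

```python
# Standard MIPS register numbers for the registers used in the example.
REG = {"$t0": 8, "$s1": 17, "$s2": 18}


def encode_r(rs, rt, rd, funct, shamt=0, opcode=0):
    """Pack an R-format word: opcode(6) rs(5) rt(5) rd(5) shamt(5) funct(6)."""
    return ((opcode << 26) | (REG[rs] << 21) | (REG[rt] << 16)
            | (REG[rd] << 11) | (shamt << 6) | funct)


def show_fields(word):
    """Render a 32-bit word as its six R-format bit fields."""
    bits = format(word, "032b")
    cuts = [0, 6, 11, 16, 21, 26, 32]
    return " ".join(bits[a:b] for a, b in zip(cuts, cuts[1:]))


# add $t0, $s1, $s2: rd = $t0, rs = $s1, rt = $s2, funct = 100000
word = encode_r("$s1", "$s2", "$t0", 0b100000)
print(show_fields(word))
```

The printed fields match the instruction word on the slide: rs and rt (bits 21-25 and 16-20) are read in ID, and rd (bits 11-15) names the register written in WB.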

Pipelined Datapath Executing add
[Figure: pipelined datapath annotated for add: $s1 and $s2 are read from the register file (fields 21-25 and 16-20), added in the ALU, and the result is written to $t0 (field 11-15).]

Load Instruction
lw $t0, 1200($t1)
Machine instruction word:
100011 01001 01000 0000 0100 1011 0000
opcode $t1   $t0   1200
Stage actions:
CC1 IF
CC2 ID: read $t1, sign ext 1200
CC3 EX: add $t1 + 1200
CC4 MEM: read M[addr]
CC5 WB: write $t0
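The I-format word for the load can be packed the same way. The sketch below assumes the standard MIPS register numbers ($t0 = 8, $t1 = 9) and the lw opcode 100011 from the slide; note that 1200 decimal is 0x04B0, i.e. 0000 0100 1011 0000 as a 16-bit immediate. The helper names are illustrative.

```python
def encode_i(opcode, rs, rt, imm):
    """Pack an I-format word: opcode(6) rs(5) rt(5) immediate(16)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def sign_extend16(value):
    """Interpret the low 16 bits as a signed offset, as done in the ID stage."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def show_fields(word):
    """Render a 32-bit word as opcode, rs, rt, immediate bit fields."""
    bits = format(word, "032b")
    return " ".join([bits[0:6], bits[6:11], bits[11:16], bits[16:32]])


T0, T1 = 8, 9  # standard MIPS numbers for $t0 and $t1

# lw $t0, 1200($t1): rs = base register $t1, rt = destination $t0
lw_word = encode_i(0b100011, T1, T0, 1200)
print(show_fields(lw_word))
print(sign_extend16(lw_word & 0xFFFF))
```

For a load, the destination register sits in bits 16-20 (rt) rather than bits 11-15, which is why the pipelined datapath selects between the two fields.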

Pipelined Datapath Executing lw
[Figure: pipelined datapath annotated for lw: $t1 is read from the register file (field 21-25), the sign-extended offset 1200 (field 0-15) is added to it in the ALU to form addr, data memory is read at addr, and the data is written to $t0 (field 16-20).]

Store Instruction
sw $t0, 1200($t1)
Machine instruction word:
101011 01001 01000 0000 0100 1011 0000
opcode $t1   $t0   1200
Stage actions:
CC1 IF
CC2 ID: read $t1, sign ext 1200
CC3 EX: add $t1 + 1200 (addr)
CC4 MEM: write M[addr] ← $t0
CC5 WB: no operation

Pipelined Datapath Executing sw
[Figure: pipelined datapath annotated for sw: $t1 and $t0 are read from the register file, the sign-extended offset 1200 (field 0-15) is added to $t1 in the ALU to form addr, and $t0 is written to data memory at addr.]

Executing a Program
Consider a five-instruction segment:
lw  $10, 20($1)
sub $11, $2, $3
add $12, $3, $4
lw  $13, 24($1)
add $14, $5, $6

Program Execution
[Figure: pipeline diagram with time (CC1 to CC5) on the horizontal axis and program instructions on the vertical axis. lw $10, 20($1) starts in CC1; sub $11, $2, $3 in CC2; add $12, $3, $4 in CC3; lw $13, 24($1) in CC4; add $14, $5, $6 in CC5. Each instruction moves through IM, reg. file read, ALU, DM, reg. file write, separated by the IF/ID, ID/EX, EX/MEM, MEM/WB registers.]

CC5
[Figure: pipelined datapath during clock cycle 5, with one instruction in each stage.]
IF: add $14, $5, $6
ID: lw $13, 24($1)
EX: add $12, $3, $4
MEM: sub $11, $2, $3
WB: lw $10, 20($1)
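The stage occupancy in any clock cycle can be derived from the instruction order. This is a minimal sketch, assuming an ideal five-stage pipeline with no stalls in which instruction i (counting from 0) is fetched in cycle i + 1; the function name is illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

PROGRAM = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
    "add $14, $5, $6",
]


def stage_occupancy(program, cycle):
    """Map each stage to the instruction occupying it in a 1-based cycle."""
    occupancy = {}
    for index, instr in enumerate(program):
        stage_index = cycle - 1 - index  # instruction `index` enters IF in cycle index + 1
        if 0 <= stage_index < len(STAGES):
            occupancy[STAGES[stage_index]] = instr
    return occupancy


for stage, instr in stage_occupancy(PROGRAM, 5).items():
    print(f"{stage}: {instr}")
```

For cycle 5 this gives the same assignment as the CC5 snapshot: the first lw is in WB while the last add is being fetched.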

Advantages of Pipeline
After the fifth cycle (CC5), one instruction is completed each cycle; CPI = 1, neglecting the initial pipeline latency of 5 cycles.
– Pipeline latency is defined as the number of stages in the pipeline, or
– The number of clock cycles after which the first instruction is completed.
The clock cycle time is about four times shorter than that of single-cycle datapath and about the same as that of multicycle datapath.
For multicycle datapath, CPI = 3, . . .
So, pipelined execution is faster, but . . .

Science is always wrong. It never solves a problem without creating ten more.
George Bernard Shaw

Pipeline Hazards
Definition: Hazard in a pipeline is a situation in which the next instruction cannot complete execution one clock cycle after completion of the present instruction.
Three types of hazards:
– Structural hazard (resource conflict)
– Data hazard
– Control hazard

Structural Hazard
Two instructions cannot execute due to a resource conflict.
Example: Consider a computer with a common data and instruction memory. The fourth cycle of a lw instruction requires memory access (memory read) and at the same time the first cycle of the fourth instruction requires instruction fetch (memory read). This will cause a memory resource conflict.
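The memory conflict described above can be located mechanically. The sketch below assumes a single shared memory, an ideal five-stage pipeline with no stalls, and that only lw and sw access memory in their fourth cycle; the function name and the instruction sequence passed to it are for illustration.

```python
def memory_conflicts(program):
    """List (cycle, data_access_instr, fetched_instr) for a shared memory.

    Instruction i (0-based) is fetched in cycle i + 1 and reaches its
    memory-access stage in cycle i + 4.
    """
    conflicts = []
    for index, instr in enumerate(program):
        if instr.split()[0] in ("lw", "sw"):
            mem_cycle = index + 4
            fetch_index = mem_cycle - 1  # instruction fetched in that cycle
            if fetch_index < len(program):
                conflicts.append((mem_cycle, instr, program[fetch_index]))
    return conflicts


program = [
    "lw $10, 20($1)",
    "sub $11, $2, $3",
    "add $12, $3, $4",
    "lw $13, 24($1)",
]
print(memory_conflicts(program))
```

With this sequence the only conflict is in cycle 4, where the first lw reads data while the fourth instruction is being fetched.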

Example of Structural Hazard
[Figure: pipeline diagram (time CC1 to CC5 against program instructions) for lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; lw $13, 24($1), using a common data and instruction memory (IM/DM). In CC4 the memory is needed by two instructions.]

Possible Remedies for Structural Hazards
Provide duplicate hardware resources in datapath.
Control unit or compiler can insert delays (no-op cycles) between instructions. This is known as pipeline stall or bubble.

Stall (Bubble) for Structural Hazard
[Figure: pipeline diagram (time against program instructions) for lw $10, 20($1); sub $11, $2, $3; add $12, $3, $4; stall (bubble); lw $13, 24($1), using a common IM/DM memory. The stall delays the fetch of lw $13, 24($1) by one cycle.]

Data Hazard
Data hazard means that an instruction cannot be completed because the needed data, being generated by another instruction in the pipeline, is not available.
Example: consider two instructions:
add $s0, $t0, $t1
sub $t2, $s0, $t3    # needs $s0

Example of Data Hazard
[Figure: pipeline diagram (time CC1 to CC5 against program instructions) for add $s0, $t0, $t1 and sub $t2, $s0, $t3. Annotations: write $s0 in CC5; read $s0 and $t3 in CC3.]
We need to read $s0 from reg file in cycle 3
But $s0 will not be written in reg file until cycle 5
However, $s0 will only be used in cycle 4
And it is available at the end of cycle 3
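The read-before-write condition in this example can be detected from the instruction list alone. This is a minimal sketch, assuming an ideal five-stage pipeline without forwarding in which instruction i (0-based) reads the register file in cycle i + 2 and writes it in cycle i + 5, and assuming a read in the same cycle as the write would see the new value; the tuple format is illustrative.

```python
def raw_hazards(program):
    """Return (register, read_cycle, write_cycle) for read-before-write cases.

    Each instruction is a tuple (op, dest, src1, src2).
    """
    hazards = []
    for i, producer in enumerate(program):
        dest = producer[1]
        write_cycle = i + 5            # WB stage of the producer
        for j in range(i + 1, len(program)):
            read_cycle = j + 2         # ID stage of the consumer
            if dest in program[j][2:] and read_cycle < write_cycle:
                hazards.append((dest, read_cycle, write_cycle))
    return hazards


program = [
    ("add", "$s0", "$t0", "$t1"),
    ("sub", "$t2", "$s0", "$t3"),
]
print(raw_hazards(program))
```

The result reports $s0 being read in cycle 3 but not written until cycle 5, which is the gap the slide describes.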

Forwarding or Bypassing
Output of a resource used by an instruction is forwarded to the input of some resource being used by another instruction.
Forwarding can eliminate some, but not all, data hazards.

Forwarding for Data Hazard
[Figure: pipeline diagram (time CC1 to CC5 against program instructions) for add $s0, $t0, $t1 and sub $t2, $s0, $t3, with the ALU result of the first instruction forwarded to the ALU input of the second. Annotations: write $s0 in CC5; read $s0 and $t3 in CC3.]

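Forwarding Unit Hardware
[Figure: ID/EX, EX/MEM, and MEM/WB pipeline registers around the ALU and data memory. Two forwarding multiplexers (FORW. MUX) feed the ALU inputs. The Forwarding Unit takes the source register IDs from ID/EX and the destination registers from EX/MEM and MEM/WB, and controls the forwarding multiplexers. A final MUX selects the data sent to the reg. file.]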
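The comparison performed by a forwarding unit can be sketched as follows. The slide shows only the block diagram, so the exact conditions here are an assumption based on the usual textbook formulation: forward from EX/MEM if it will write the needed register, otherwise from MEM/WB, and never forward register 0. The dictionary keys and the function name are hypothetical.

```python
def forward_select(src_reg, ex_mem, mem_wb):
    """Choose the source for one ALU input.

    ex_mem and mem_wb are dicts with keys 'reg_write' (bool) and 'rd'
    (destination register number). Returns "EX/MEM", "MEM/WB", or "REG".
    """
    # Most recent result wins: check the EX/MEM register first.
    if ex_mem["reg_write"] and ex_mem["rd"] != 0 and ex_mem["rd"] == src_reg:
        return "EX/MEM"
    if mem_wb["reg_write"] and mem_wb["rd"] != 0 and mem_wb["rd"] == src_reg:
        return "MEM/WB"
    return "REG"  # no hazard: use the value read from the register file


S0, T3 = 16, 11  # standard MIPS numbers for $s0 and $t3

# sub $t2, $s0, $t3 is in EX while add $s0, $t0, $t1 is in MEM.
ex_mem = {"reg_write": True, "rd": S0}
mem_wb = {"reg_write": False, "rd": 0}
print(forward_select(S0, ex_mem, mem_wb), forward_select(T3, ex_mem, mem_wb))
```

Here the first ALU input of sub is taken from EX/MEM, while $t3 comes from the register file as usual.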

Forwarding Alone May Not Work
[Figure: pipeline diagram (time CC1 to CC5 against program instructions) for lw $s0, 20($s1) and sub $t2, $s0, $t3. Annotations: write $s0 in CC5; read $s0 and $t3 in CC3.]
Data needed by sub (data hazard)
Data available from memory only at the end of cycle 4

Use Bubble and Forwarding
[Figure: pipeline diagram (time against program instructions) for lw $s0, 20($s1); stall (bubble); sub $t2, $s0, $t3. The lw writes $s0 in CC5; after the one-cycle stall, the loaded data is forwarded to the ALU input of sub.]

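Hazard Detection Unit Hardware
[Figure: the Hazard Detection Unit receives the instruction from the IF/ID register and the register IDs from the previous instruction. It can disable writes to the PC and IF/ID and, through a NOP MUX, replace the control signals entering ID/EX with 0. The Forwarding Unit and forwarding multiplexers (FORW. MUX) remain around the ALU, followed by EX/MEM, data memory, and MEM/WB, whose output goes to the reg. file.]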
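The check made by a hazard detection unit for the load-followed-by-use case can be sketched as one comparison. The condition below is an assumption following the usual textbook formulation, since the slide shows only the block diagram; the dictionary keys and the function name are hypothetical.

```python
def must_stall(id_ex, if_id):
    """Return True when the instruction in ID needs data still being loaded.

    id_ex: {'mem_read': bool, 'rt': int} for the instruction now in EX
    if_id: {'rs': int, 'rt': int} for the instruction now in ID
    """
    return id_ex["mem_read"] and id_ex["rt"] in (if_id["rs"], if_id["rt"])


S0, S1, T2, T3 = 16, 17, 10, 11  # standard MIPS register numbers

# lw $s0, 20($s1) is in EX; sub $t2, $s0, $t3 is in ID.
lw_in_ex = {"mem_read": True, "rt": S0}
sub_in_id = {"rs": S0, "rt": T3}
print(must_stall(lw_in_ex, sub_in_id))
```

When the function returns True, the unit would hold the PC and IF/ID for one cycle and inject a no-op into ID/EX, which is the bubble shown in the figure.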

Resolving Hazards
Hazards are resolved by hazard detection and forwarding units.
Compiler's understanding of how these units work can improve performance.

Avoiding Stall by Code Reorder
C code:
A = B + E;
C = B + F;
MIPS code:
lw  $t1, 0($t0)         # $t1 written
lw  $t2, 4($t0)         # $t2 written
add $t3, $t1, $t2       # $t1, $t2 needed
sw  $t3, 12($t0)
lw  $t4, 8($t0)         # $t4 written
add $t5, $t1, $t4       # $t4 needed
sw  $t5, 16($t0)

Reordered Code
C code:
A = B + E;
C = B + F;
MIPS code:
lw  $t1, 0($t0)
lw  $t2, 4($t0)
lw  $t4, 8($t0)
add $t3, $t1, $t2       # no hazard
sw  $t3, 12($t0)
add $t5, $t1, $t4       # no hazard
sw  $t5, 16($t0)
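The benefit of the reordering can be counted directly. This is a minimal sketch, assuming a five-stage pipeline with forwarding in which the only remaining stall is one bubble when an instruction uses a register loaded by the immediately preceding lw; the tuple format and function names are illustrative.

```python
def load_use_stalls(program):
    """Count one-cycle bubbles: a lw followed directly by a user of its result.

    Each instruction is (op, dest, sources); dest is None for sw.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        op, dest, _ = prev
        if op == "lw" and dest in cur[2]:
            stalls += 1
    return stalls


def total_cycles(program, stages=5):
    """Pipeline fill plus one cycle per further instruction plus bubbles."""
    return stages + len(program) - 1 + load_use_stalls(program)


original = [
    ("lw", "$t1", ("$t0",)),
    ("lw", "$t2", ("$t0",)),
    ("add", "$t3", ("$t1", "$t2")),
    ("sw", None, ("$t3", "$t0")),
    ("lw", "$t4", ("$t0",)),
    ("add", "$t5", ("$t1", "$t4")),
    ("sw", None, ("$t5", "$t0")),
]
# Reordered: the third load is hoisted above the first add.
reordered = [original[i] for i in (0, 1, 4, 2, 3, 5, 6)]

print(load_use_stalls(original), total_cycles(original))
print(load_use_stalls(reordered), total_cycles(reordered))
```

Under these assumptions the original order incurs two bubbles (13 cycles) and the reordered code none (11 cycles).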

Control Hazard
Instruction to be fetched is not known!
Example: Instruction being executed is branch-type, which will determine the next instruction:
        add $4, $5, $6
        beq $1, $2, 40
        next instruction
        . . .
40      and $7, $8, $9

Stall on Branch
[Figure: pipeline diagram (time CC1 to CC3 and beyond against program instructions) for add $4, $5, $6; beq $1, $2, 40; stall (bubble); then either the next instruction or and $7, $8, $9.]

Why Only One Stall?
Extra hardware in ID phase:
    Additional ALU to compute branch address
    Comparator to generate zero signal
    Hazard detection unit writes the branch address in PC
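The cost of stalling on every branch can be estimated with a simple average. The sketch below assumes an ideal CPI of 1 and a hypothetical branch frequency of 20%, which is an illustrative number and not taken from the slides; the function name is also illustrative.

```python
def effective_cpi(branch_fraction, stall_cycles_per_branch, base_cpi=1.0):
    """Average cycles per instruction when every branch stalls the pipeline."""
    return base_cpi + branch_fraction * stall_cycles_per_branch


# Hypothetical workload: 20% of instructions are branches.
BRANCH_FRACTION = 0.2

# One stall per branch, as when the branch is resolved in the ID phase.
print(effective_cpi(BRANCH_FRACTION, 1))
```

Each additional stall cycle per branch adds the branch fraction to the CPI, which is why resolving the branch early in the ID phase pays off.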

Ways to Handle Branch
Stall or bubble
Branch prediction:
– Heuristics
    Next instruction
    Prediction based on statistics (dynamic)
    Hardware decision (dynamic)
– Prediction error: pipeline flush
Delayed branch

Delayed Branch Example
Stall on branch:
        add $4, $5, $6
        beq $1, $2, skip
        next instruction
        . . .
skip    or $7, $8, $9
Delayed branch:
        beq $1, $2, skip
        add $4, $5, $6      # instruction executed irrespective of branch decision
        next instruction
        . . .
skip    or $7, $8, $9

Delayed Branch
[Figure: pipeline diagram (time CC1 to CC4 against program instructions) for beq $1, $2, skip; add $4, $5, $6; then either the next instruction or skip: or $7, $8, $9. No stall is inserted because add fills the slot after the branch.]

Summary: Hazards
Structural hazards
– Cause: resource conflict
– Remedies: (i) duplicate hardware resources, (ii) stall (bubble)
Data hazards
– Cause: data unavailability
– Remedies: (i) forwarding, (ii) stall (bubble), (iii) code reordering
Control hazards
– Cause: out-of-sequence execution (branch or jump)
– Remedies: (i) stall (bubble), (ii) branch prediction/pipeline flush, (iii) delayed branch/pipeline flush

