FPGAs A Short Introduction - Agenda (Indico)


FPGAs: A short introduction
M. Beretta, LNF-INFN

THE EVOLUTION OF INTEGRATED CIRCUITS
- 1948: Invention of the transistor (Bell Labs). Source: Bell Labs.
- 1958: Invention of integrated circuits. The idea of making a whole circuit (transistors, wires, and everything else) was invented by Jack Kilby at Texas Instruments and Robert Noyce at Fairchild Semiconductor at almost the same time.
- 1965: Moore's Law. In 1965, Gordon Moore (then at Fairchild Semiconductor, later a co-founder of Intel) predicted that semiconductor technology would keep doubling its effectiveness at a regular pace, popularly quoted as every 18 months.
M. BERETTA – G. FELICI – P. ALBICOCCO, EDIT 2015, FRASCATI, OCTOBER 20-29

MOORE'S LAW
[Figure: Microprocessor Transistor Counts 1971-2011 & Moore's Law. Transistor count (logarithmic axis, from 2,300 for the 4004 in 1971 up to 2,600,000,000 in 2011) against date of introduction. The curve shows transistor count doubling every two years.]
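
The doubling period can be checked against the two endpoints legible in the figure. The sketch below is only a back-of-the-envelope calculation: it assumes the 2,300-transistor point belongs to 1971 and the 2,600,000,000 point to 2011, and the function name is made up for this example.

```python
import math

# Endpoints read off the transistor-count figure (assumed pairing of
# counts and years; see the lead-in).
start_year, start_count = 1971, 2_300
end_year, end_count = 2011, 2_600_000_000

# Number of doublings needed to get from the first count to the last.
doublings = math.log2(end_count / start_count)
doubling_period_years = (end_year - start_year) / doublings


def projected_count(year, period_years):
    """Transistor count projected from the 1971 point for a given doubling period."""
    return start_count * 2 ** ((year - start_year) / period_years)


print(f"doublings 1971-2011: {doublings:.1f}")
print(f"implied doubling period: {doubling_period_years:.2f} years")
print(f"2011 projection, doubling every 24 months: {projected_count(2011, 2.0):.2e}")
print(f"2011 projection, doubling every 18 months: {projected_count(2011, 1.5):.2e}")
```

With these endpoints the implied period comes out close to two years, which matches the curve in the figure; an 18-month period would overshoot the 2011 count by roughly two orders of magnitude.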

MICROPROCESSOR CLOCK FREQUENCIES
[Figure]

CLASSIFICATION OF INTEGRATED CIRCUITS
- Microprocessors
- Memory chips (SRAM, DRAM, Flash, ROM, PROM)
- Standard components (74LS series)
- Application-Specific Integrated Circuits (ASICs)
  - Widely used in communication, network, and multimedia systems
  - For a given application, ASIC solutions are normally more effective than solutions based on running software on microprocessors
  - Many chips in cellular phones, network routers, and game consoles are ASICs
  - Most SoC (System-on-a-Chip) chips are ASICs
- Programmable devices (PLA/PAL, CPLD, FPGA)

ASIC DESIGN METHODOLOGIES
[Figure: design methodologies, split into ASIC design and FPGA design.]

FULL-CUSTOM DESIGN METHODOLOGY
Design flow:
1. Function partition
2. Schematic design, including transistor sizing
3. Function and timing verification (repeat on fail)
4. Layout design, including placement & routing
5. Post-layout simulation (repeat on fail)
6. Go to fabrication: ASIC chips
It is a time-consuming manual process. No pre-developed libraries are required.
Pros: complete flexibility, high degree of optimization in performance, power consumption and application area
Cons: large amount of design effort, expensive, long time to market

CELL-BASED DESIGN
Design flow:
1. High-level (RTL or behavioral-level) design: VHDL or Verilog coding
2. High-level verification: VHDL or Verilog simulation (repeat on fail)
3. Logic synthesis, using a logic gate library
4. Gate-level verification (repeat on fail)
5. Placement & routing, using a cell layout library
6. Post-layout verification (repeat on fail)
7. Go to fabrication
It is highly automated but needs pre-developed libraries.
Pros: saves design time and money; reduces risk compared to a full-custom design
Cons: still incurs high non-recurring-engineering (NRE) cost and long manufacturing time

GATE-ARRAY BASED DESIGN METHODOLOGY
Design flow:
1. Generate the schematic (netlist). The netlist can be designed using the full-custom or standard-cell-based design method.
2. Placement & routing, using the cell layout library and the pre-fabricated gate array template, which contains transistors without connections
3. Post-layout verification
4. Make the final connections for the pre-fabricated gate array base
5. ASIC chips
This approach is faster than the standard-cell-based approach because part of the fabrication process has already been completed.
Pros: cost saving (the fabrication cost of a large number of identical template wafers is amortized over different customers), shorter manufacturing lead time
Cons: performance not as good as full-custom or standard-cell-based ICs

FPGA-BASED DESIGN METHODOLOGY
Design flow:
1. HDL coding & logic synthesis, or schematic capture, producing a netlist
2. FPGA technology mapping
3. Placement & routing
4. Timing verification
5. Generate the FPGA bitstream
6. Download to the FPGA
This approach has an extremely fast turnaround time since the FPGA devices have already been fabricated.

[Table: full-custom design, standard-cell-based design, gate-array-based design, and FPGA-based design compared on speed, time to market, risk reduction, device cost, and custom mask layers. Entries are rated from desirable to not desirable.]

FPGA ADVANTAGES AND APPLICATIONS
FPGAs
Pros: fast turnaround time, re-programming capability, dynamic reconfiguration capability
Cons: performance and integration are not as good as full-custom or standard-cell-based ICs; higher power consumption
NB: the integration issue is mitigated by SoC technology (microprocessors and FPGA in the same device)
FPGA APPLICATIONS
- Ideal platform for prototyping
- Providing fast implementation to reduce time-to-market
- Cost-effective solutions for products with small volumes on demand
- Implementing hardware systems requiring re-programming flexibility
- Implementing dynamically re-configurable systems
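
The claim that FPGAs are cost-effective at small volumes follows from trading one-time (NRE) cost against per-unit cost. The sketch below illustrates that trade-off only; every cost figure in it is hypothetical and chosen for round numbers, and the function names are not from any library.

```python
def total_cost(nre, unit_cost, volume):
    """One-time engineering cost plus per-device cost for a production run."""
    return nre + unit_cost * volume


def break_even_volume(asic_nre, asic_unit, fpga_nre, fpga_unit):
    """Volume above which the ASIC becomes cheaper than the FPGA.

    Returns None when the FPGA is not more expensive per unit, in which
    case the ASIC never catches up on its higher NRE.
    """
    if fpga_unit <= asic_unit:
        return None
    return (asic_nre - fpga_nre) / (fpga_unit - asic_unit)


# Hypothetical figures, for illustration only.
ASIC_NRE, ASIC_UNIT = 1_000_000, 5   # mask set and design effort, then cheap parts
FPGA_NRE, FPGA_UNIT = 0, 50          # no fabrication NRE, but pricier parts

volume = break_even_volume(ASIC_NRE, ASIC_UNIT, FPGA_NRE, FPGA_UNIT)
print(f"break-even volume: about {volume:.0f} units")

for n in (1_000, 100_000):
    asic = total_cost(ASIC_NRE, ASIC_UNIT, n)
    fpga = total_cost(FPGA_NRE, FPGA_UNIT, n)
    print(f"{n:>7} units: ASIC {asic:>9}, FPGA {fpga:>9}")
```

Below the break-even volume the FPGA wins because there is no NRE to amortize; above it the lower ASIC unit cost dominates.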

FPGA MARKET
PLD market segment share, calendar year 2011 (source: iSuppli):
- Xilinx: 47%
- Altera: 41%
- Others: 12%

FPGAs AND MOORE'S LAW
- 10,000x more logic, plus embedded IP: memory, microprocessor, DSP, gigabit serial I/O
- 100x faster
- 5,000x lower power
- 10,000x lower cost

TYPICAL FPGA ARCHITECTURE - 1
[Figure: FPGA floorplan with labeled configurable logic blocks, I/O blocks, block RAM, digital clock manager, and horizontal and vertical routing channels.]

TYPICAL FPGA ARCHITECTURE - 2
[Figure: basic FPGA architecture.]

CONFIGURABLE LOGIC BLOCKS (CLBs)
[Figure: basic configurable logic block structure.]
- The CLB is the basic logic unit in an FPGA
- Every CLB consists of a configurable switch matrix with 4 or 6 inputs, some selection circuitry (MUX, etc.), and flip-flops
- The switch matrix is highly flexible and can be configured to handle combinatorial logic, shift registers, or RAM

CLBs - DETAILS
Configurable Logic Block:
- CLBs contain RAM memory cells
- CLBs can be configured to realize any function of 4 or 5 variables
- Functions are stored in truth-table form in the memory
- Trapezoidal blocks represent multiplexers
- Multiplexers can be programmed to select one input
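
Storing a function in truth-table form means the inputs act as an address into a small memory. The sketch below is a software model of that idea, not a description of any particular device; the class and function names are made up for this example.

```python
class LUT:
    """Look-up table: a 2**n-entry memory addressed by n input bits."""

    def __init__(self, n_inputs, truth_table):
        if len(truth_table) != 2 ** n_inputs:
            raise ValueError("truth table must have 2**n_inputs entries")
        self.n_inputs = n_inputs
        self.bits = list(truth_table)   # the configuration memory cells

    @classmethod
    def from_function(cls, n_inputs, fn):
        """Fill the memory by evaluating fn on every input combination."""
        table = []
        for address in range(2 ** n_inputs):
            inputs = [(address >> i) & 1 for i in range(n_inputs)]
            table.append(int(bool(fn(*inputs))))
        return cls(n_inputs, table)

    def __call__(self, *inputs):
        if len(inputs) != self.n_inputs:
            raise ValueError("wrong number of inputs")
        # Input i is bit i of the address into the configuration memory.
        address = sum(bit << i for i, bit in enumerate(inputs))
        return self.bits[address]


def mux(select, a, b):
    """2-to-1 multiplexer: the select bit is itself a configuration choice."""
    return b if select else a


# A 4-input LUT configured as a parity (XOR) function.
parity4 = LUT.from_function(4, lambda a, b, c, d: a ^ b ^ c ^ d)
print(len(parity4.bits))        # number of configuration bits
print(parity4(1, 0, 1, 1))      # evaluation is a single memory read

# Reconfiguring means rewriting the memory, not rewiring: same LUT size, AND function.
and4 = LUT.from_function(4, lambda a, b, c, d: a & b & c & d)
print(and4(1, 1, 1, 1), and4(1, 0, 1, 1))
```

Any function of n variables fits because the table enumerates all 2^n input combinations; the price is that the memory doubles with every added input.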

INTERCONNECT
Programmable interconnect points (PIPs) shown in the figure:
- Cross-point: connects vertical or horizontal wire segments
- Break-point: connects or isolates 2 wire segments
- Decoded mux: 2^n cross-points connected to a single output
- Six break-point PIPs
Notes:
- Flexible interconnect routing routes the signals between CLBs and to and from I/Os
- Routing comes in several flavors, from that designed to interconnect between CLBs, to fast horizontal and vertical long lines spanning the device, to global low-skew routing for clocking and other global signals
- The design software hides the interconnect routing task from the user unless specified otherwise, thus significantly reducing design complexity
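
The PIP types above can be modeled as switches that either join wire segments or select among them. The sketch below is a toy model under stated assumptions: a closed cross-point or break-point PIP is treated as an ideal connection between two named segments, the segment names are invented, and nothing here reflects a real device's routing database.

```python
class RoutingFabric:
    """Tracks which wire segments are joined by closed PIPs (union-find)."""

    def __init__(self):
        self._parent = {}

    def add_segment(self, name):
        self._parent.setdefault(name, name)

    def _root(self, name):
        self.add_segment(name)
        while self._parent[name] != name:
            # Path halving keeps later lookups short.
            self._parent[name] = self._parent[self._parent[name]]
            name = self._parent[name]
        return name

    def close_pip(self, segment_a, segment_b):
        """Program a cross-point or break-point PIP to connect two segments."""
        self._parent[self._root(segment_a)] = self._root(segment_b)

    def connected(self, segment_a, segment_b):
        return self._root(segment_a) == self._root(segment_b)


def decoded_mux(inputs, select_bits):
    """Decoded-mux PIP: n configuration bits pick one of 2**n inputs."""
    if len(inputs) != 2 ** len(select_bits):
        raise ValueError("need 2**n inputs for n select bits")
    index = sum(bit << i for i, bit in enumerate(select_bits))
    return inputs[index]


fabric = RoutingFabric()
fabric.close_pip("h0_west", "v3")            # cross-point PIP closed
print(fabric.connected("v3", "h0_east"))     # False: break-point still open
fabric.close_pip("h0_west", "h0_east")       # break-point PIP closed
print(fabric.connected("v3", "h0_east"))     # True: path now exists
print(decoded_mux(["a", "b", "c", "d"], [0, 1]))   # c: select bits are LSB first
```

Place-and-route software solves the inverse problem: given the connections a design needs, it decides which PIPs to close, which is why the task is normally hidden from the user.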

SELECTIO (IOBs)
[Figure: basic SelectIO (IOB) structure.]
- Today's FPGAs provide support for dozens of I/O standards, thus providing the ideal interface bridge in your system
- I/O in FPGAs is grouped in banks, with each bank independently able to support different I/O standards
- Today's leading FPGAs provide over a dozen I/O banks, thus allowing flexibility in I/O support

MEMORY AND CLOCK MANAGEMENT
MEMORY
- Embedded block RAM is available in most FPGAs, which allows for on-chip memory in your design
- Xilinx FPGAs provide up to 10 Mbits of on-chip memory in 36 kbit blocks that can support true dual-port operation
COMPLETE CLOCK MANAGEMENT
- Digital clock management is provided by most FPGAs in the industry (all Xilinx FPGAs have this feature)
- The most advanced FPGAs from Xilinx offer both digital clock management and phase-locked loops, which provide precision clock synthesis combined with jitter reduction and filtering
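
Clock synthesis in a clock manager is commonly expressed as an output frequency of f_in x M / D for integer multiply and divide settings. The sketch below searches for the closest setting to a target; the allowed ranges for M and D are assumptions made for illustration (real limits are device-specific), and the function name is hypothetical.

```python
from fractions import Fraction


def best_multiply_divide(f_in_hz, f_target_hz,
                         m_range=range(2, 33), d_range=range(1, 33)):
    """Return (M, D, f_out) minimizing |f_in * M / D - f_target|.

    The default ranges are illustrative assumptions, not datasheet values.
    """
    best = None
    for m in m_range:
        for d in d_range:
            f_out = Fraction(f_in_hz) * m / d     # exact rational arithmetic
            error = abs(f_out - f_target_hz)
            if best is None or error < best[0]:
                best = (error, m, d, f_out)
    _, m, d, f_out = best
    return m, d, f_out


# 50 MHz reference, 125 MHz wanted: exact with M/D = 5/2.
m, d, f_out = best_multiply_divide(50_000_000, 125_000_000)
print(m, d, float(f_out))

# 50 MHz reference, 33.333 MHz wanted: only an approximation is possible.
m, d, f_out = best_multiply_divide(50_000_000, 33_333_000)
print(m, d, float(f_out))
```

Using exact fractions avoids floating-point ties when several M/D pairs give nearly the same frequency; the search keeps the first pair with the smallest error.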

XC2064 - THE FIRST FPGA (1985)
- 64 flip-flops
- 128 3-LUTs
- 58 I/O pins
- 18 MHz (toggle)
- 2 µm, 2LM
Copyright Xilinx 2014.

2015 - XILINX FPGAs
Xilinx offers the broadest lineup of FPGAs providing advanced features, low power, high performance, and high value for any FPGA design. Below is an overview of Xilinx leading FPGA families.
Feature                                    Artix-7   Kintex-7  Virtex-7   Spartan-6  Virtex-6
Block RAM                                  13Mb      34Mb      68Mb       4.8Mb      38Mb
DSP Slices                                 740       1,920     3,600      180        2,016
Transceiver Count                          16        32        96         8          72
Transceiver Speed                          6.6Gb/s   12.5Gb/s  28.05Gb/s  3.2Gb/s    11.18Gb/s
Total Transceiver Bandwidth (full duplex)  211Gb/s   800Gb/s   2,784Gb/s  50Gb/s     536Gb/s
PCI Express Interface                      Gen2 x4   Gen2 x8   Gen3 x8    Gen1 x1    Gen2 x8
Analog Mixed Signal (AMS)/XADC             Yes       Yes       Yes        -          Yes
Configuration AES                          Yes       Yes       Yes        Yes        Yes
I/O Pins                                   500       500       1,200      576        1,200
I/O Voltage: 1.2V, 1.35V, 1.5V, 1.8V, 2.5V, 3.3V (Artix-7, Kintex-7, Virtex-7); 1.2V, 1.5V, 1.8V, 2.5V, 3.3V (Spartan-6); 1.2V, 1.5V, 1.8V, 2.5V (Virtex-6)
EasyPath Cost Reduction Solution: Yes, Yes, -, Yes (four entries listed for the five families)
Rows listed without recoverable values: Logic, DSP Performance (symmetric ...), Memory Interface
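
The "full duplex" total bandwidth figure can be reproduced for two of the families as transceiver count times line rate times two directions. The sketch below checks only Artix-7 and Kintex-7 with the values listed above; the function name is made up, and the simple product is an assumption that does not reproduce the Virtex-7 and Virtex-6 entries, possibly because those families combine transceiver types with different line rates.

```python
def full_duplex_bandwidth_gbps(transceiver_count, line_rate_gbps):
    """Aggregate bandwidth counting both transmit and receive directions."""
    return 2 * transceiver_count * line_rate_gbps


# (count, line rate in Gb/s, listed total in Gb/s) as given in the overview.
families = {
    "Artix-7": (16, 6.6, 211),
    "Kintex-7": (32, 12.5, 800),
}

for name, (count, rate, listed) in families.items():
    computed = full_duplex_bandwidth_gbps(count, rate)
    print(f"{name}: computed {computed:.1f} Gb/s, listed {listed} Gb/s")
```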

FPGA APPLICATIONS
- Aerospace and Defense: Avionics/DO-254, Communications, Missiles & Munitions, Secure Solutions, Space
- Medical Electronics
- ASIC Prototyping
- Audio: Connectivity Solutions, Portable Electronics, Radio, Digital Signal Processing (DSP)
- Automotive: High Resolution Video, Image Processing, Vehicle Networking and Connectivity, Automotive Infotainment
- Broadcast: Real-Time Video Engine, EdgeQAM, Encoders, Displays, Switches and Routers
- Consumer Electronics: Digital Displays, Digital Cameras, Multi-function Printers, Portable Electronics, Set-top Boxes
- High Performance Computing: Servers, Super Computers, SIGINT Systems, High-end RADARs, High-end Beam Forming Systems, Data Mining Systems
- Industrial: Industrial Imaging, Industrial Networking, Motor Control
- Medical: Ultrasound, CT Scanner, MRI, X-ray, PET, Surgical Systems

FPGA APPLICATIONS (continued)
- Scientific Instruments: Lock-in amplifiers, Boxcar averagers, Phase-locked loops
- Security: Industrial Imaging, Secure Solutions, Image Processing
- Video & Image Processing: High Resolution Video, Video Over IP Gateway, Digital Displays, Industrial Imaging
- Wired Communications: Optical Transport Networks, Network Processing, Connectivity Interfaces
- Wireless Communications: Baseband, Connectivity Interfaces, Mobile Backhaul, Radio

HARDWARE AND SOFTWARE PROGRAMMABILITY: ASSP SoC
[Figure. Copyright Xilinx 2014.]

HARDWARE/SOFTWARE AND I/O PROGRAMMABILITY: ASSP SoC
Zynq All-Programmable SoC
Processor System (PS)
- 2x ARM Cortex-A9, 866 MHz to 1 GHz, 32K/32K I/D caches
- 512KB shared L2 cache
- 256KB on-chip memory
- Memory controller
- Bus interfaces, timers
- Libraries, OSs, middleware
Programmable Logic (PL)
- 28K to 440K LCs
- 240K to 3MB RAM
- 80 to 2020 DSP blocks
- I/O, transceivers, PCIe, Ethernet
Programmable ADC
- Inputs from voltage and temperature sensors
AMBA AXI bus fabric
Copyright Xilinx 2014.

ALL-PROGRAMMABLE PROGRAMMING
[Figure: an application in C/C++/OpenCL, platform information, and kernels using HLS feed the SW-centric design environment, which produces a binary for the CPU and a bitstream for the PL fabric. The Zynq device is shown with ARM CPUs, Kernel1 to Kernel3 in the FPGA fabric, a data movement interconnect, and memory. Copyright Xilinx 2014.]

A NICE ZYNQ APPLICATION
Phenox
- 2 cameras
- Microphone
- 4 motors
- Autonomous
- Avoids obstacles
- Responds to audio signals and hand gestures
- Programmable aerial platform
- Programmed with OpenCL
http://phenoxlab.com
Copyright Xilinx 2014.

BIBLIOGRAPHY
[1] Programmable ASIC Design - Haibo Wang - ECE Department, Southern Illinois University
[2] Three Ages of FPGAs - Steve Trimberger - Fellow, Xilinx Research Labs
[3] Introduction to Field Programmable Gate Arrays - CERN Accelerator School on Digital Signal Processing
