RT Kintex UltraScale FPGAs For Ultra High Throughput And .


WP523 (v1.0) May 19, 2020    www.xilinx.com

RT Kintex UltraScale FPGAs for Ultra High Throughput and High Bandwidth Applications

The Xilinx Radiation Tolerant (RT) Kintex UltraScale XQRKU060 FPGA enables the next generation of high-throughput satellite services, allowing OEMs to offer reconfigurable payloads with unprecedented levels of on-board processing across all radiation orbits.

ABSTRACT

Xilinx's UltraScale architecture extends FPGA capability for space applications, delivering a step-function increase in I/O and memory bandwidth, capacity, performance, and in-orbit re-configurability. For the first time, the RT Kintex UltraScale XQRKU060 FPGA enables the satellite industry to access ultra high-throughput on-board processing of hundreds of Gb/s. This capability allows spacecraft operators to offer new applications such as real-time streaming of Earth-Observation remote sensing in super high-resolution, space-based Internet, and broadband mobile telecommunication with the ability to optimize and re-deploy in-orbit payload resources in response to real-time user needs.

Space-grade FPGAs from other vendors suffer from architectural bottlenecks severely limiting their use for ultra high-throughput on-board processing. Xilinx's RT Kintex UltraScale fabric has innovative on-chip communication, I/O and memory bandwidth, DSP capability, clocking, critical paths, and interconnect, using 20nm technology to deliver best-in-class, ASIC-level system performance for the most demanding of satellite applications.

Copyright 2020 Xilinx, Inc. Xilinx, the Xilinx logo, Alveo, Artix, Kintex, Spartan, UltraScale, Versal, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. AMBA, AMBA Designer, Arm, Arm1176JZ-S, CoreSight, Cortex, and PrimeCell are trademarks of Arm in the EU and other countries. PCI, PCIe, and PCI Express are trademarks of PCI-SIG and used under license. All other trademarks are the property of their respective owners.

Market Challenges & Trends

Starting from 2020, it is predicted that almost 1,000 satellites will be launched every year for the next decade to provide telecommunication, television-broadcasting, Earth-observation remote sensing, space-based Internet, and navigation services.

The demand for high-throughput services is growing 26% annually and is forecast to reach 8,000Gb/s by 2028, generating $15 billion in revenue from a diverse range of applications in geosynchronous equatorial orbit (GEO) and non-GEO orbits.

The majority of the satellites will be launched on behalf of Space 2.0 operators planning constellations of small spacecraft targeting the lucrative space-based Internet and Earth-Observation data-analytics markets. Typically, these reside in low Earth orbit (LEO) for three to five years and are constrained by power consumption and cost. Space 2.0 operators are becoming more ambitious, starting to diversify and challenge traditional providers by offering competing services in other orbits.

The defense industry is also exploiting cheaper, smaller satellites and lower launch costs, and is planning constellations in all orbits to provide the military with high-throughput, low-latency communication.

To deliver higher performance and added value over competitors, traditional and Space 2.0 operators are seeking ultra high-throughput payloads to deliver the next generation of satellite services.

On-Board Processing Limitations of Existing Space-Grade FPGAs

To address real-time, high-throughput system performance needs in the range of hundreds of Gb/s, a new architectural approach to programmable logic is required. Regardless of whether the application is a telecommunication satellite with a broadband digital payload, an Earth-Observation spacecraft streaming live remote-sensing data, or a Martian rover beaming back images in super high-resolution, a large amount of information needs to be processed.

The problem of moving vast quantities of information through a system is illustrated in Figure 1: data streams on the order of hundreds of Gb/s enter and exit from the left and right using I/O banks and/or high-speed serial transceivers. Future payloads are required to process these in real time, and as soon as the data enters the FPGA, it must fan out to match the data flow, routing, and processing capabilities of the on-chip resources.

Figure 1: High-Performance Systems Require Massive Bandwidth

As an example, assume the I/O bandwidth for the ports on the left and right is 400Gb/s. This means that the FPGA's logic, arithmetic, and memory resources must also process at least 400Gb/s traffic. Design engineers typically use a wide bus ranging in size from 512 to 2,048 bits to manage this throughput.

Designs with narrow datapaths operating at high frequencies often suffer from performance-compromising clock skew, which in extreme cases can approach 50% of the total clock period. This leaves little time to perform actual computation, leading designers to heavily pipeline their designs. Beyond consuming large amounts of register resources, extensive pipelining has a significant impact on overall system latency, which is unacceptable in today's high-performance systems.

While a wider bus implementation might lead to the need for a lower system clock frequency, significant timing-closure challenges now arise due to a lack of routing resources required to support large buses. The situation is further aggravated by the fact that some FPGA vendors use antiquated place-and-route algorithms based on simulated annealing, which are blind to global design metrics, such as the level of congestion or the total wire length. Thus, designers are forced to consider trade-offs that require lowering the performance of the system (typically not an option), extensive pipelining at the expense of latency, or gross underutilization of the available device resources. In all cases, these solutions prove to be inferior or inadequate.

The fundamental limitation in routing resources required to address applications on the order of hundreds of Gb/s exists with all current space-grade FPGAs. This all but guarantees that addressing the next generation of ultra high-throughput satellite applications will be out of the realm of possibility, or will come at the expense of very poor device utilization or latency.
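The trade-off between bus width and clock frequency in the 400Gb/s example can be made concrete with a little arithmetic. The following is a minimal sketch, assuming one bus word is transferred per clock cycle with no protocol overhead; the function names and the 50% skew figure applied to the narrowest bus are illustrative choices, not part of the white paper.

```python
def required_clock_mhz(throughput_gbps, bus_width_bits):
    """Clock (MHz) needed to sustain throughput_gbps, assuming one bus word per cycle."""
    bits_per_second = throughput_gbps * 1e9
    return bits_per_second / bus_width_bits / 1e6


def usable_time_ns(clock_mhz, skew_fraction):
    """Time per cycle (ns) left for computation after skew consumes skew_fraction of the period."""
    period_ns = 1e3 / clock_mhz
    return period_ns * (1.0 - skew_fraction)


if __name__ == "__main__":
    # Bus widths taken from the 512 to 2,048 bit range quoted in the text.
    for width in (512, 1024, 2048):
        clock = required_clock_mhz(400, width)
        print(f"{width:5d}-bit bus needs {clock:8.2f} MHz")

    # Extreme case from the text: skew approaching 50% of the clock period.
    fast_clock = required_clock_mhz(400, 512)
    print(f"usable time at 50% skew: {usable_time_ns(fast_clock, 0.5):.2f} ns")
```

Under these assumptions a 512-bit bus needs a clock of roughly 781 MHz, while a 2,048-bit bus needs roughly 195 MHz, which is why wider buses shift the difficulty from clock frequency to routing capacity.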

The challenge is how to reliably manage huge dataflows: the incoming high-speed information needs to be fanned out and routed with low clock skew to processing logic and, to handle the large data rates, computed in real time by massively wide functional blocks, e.g., high-throughput arithmetic or DSP. Incoming data or intermediate results must be stored quickly within the system, close to the processing elements, or in fast, external bulk memory located next to the payload using interfaces with huge memory bandwidths. After processing, the data must be routed to I/O banks or the high-speed output transceivers to be passed along as illustrated in Figure 1.

As designs become more complex with wider internal data buses and more physical signals to process (often brought on-chip by the dramatically increasing number of high-speed serial transceivers), three major challenges become clear:

1. Routing dominates overall delay in the system
2. Clock skew consumes a greater proportion of the available timing margin
3. Sub-optimal logic placement reduces system performance

RT Kintex UltraScale FPGA for Space Applications

Xilinx's RT Kintex UltraScale FPGA is designed to address next-generation system-level performance requirements associated with applications such as high-throughput, massive-bandwidth satellites. Many innovations and enhancements went into developing the new fabric to solve all of these issues.

To efficiently receive, buffer, process, and transmit the vast amounts of data required by the next generation of ultra high-throughput satellite payloads, Xilinx's UltraScale FPGA fabric fundamentally improves on-chip communication, I/O and memory bandwidth, DSP capability, clocking, critical path optimization, and interconnect to address massive data flow and real-time packet and image processing. Innovations include:

• Massive data flow optimized for wide buses supporting hundreds of Gb/s throughput with low latency
• Highly optimized critical paths and built-in high-speed memory, cascading to remove bottlenecks in DSP and packet processing
• Enhanced DSP slices, incorporating 27x18-bit multipliers and dual adders that enable a massive jump in fixed-point and IEEE Std 754 floating-point arithmetic performance and efficiency
• Massive I/O and memory bandwidth, including support for DDR3 and DDR4 interfacing with dramatic reduction in latency
• Multi-region ASIC-like clocking, delivering low-power clock networks with extremely low clock skew and high-performance scalability
• Power management with significant static- and dynamic-power gating across a wide range of functional elements, yielding significant power savings
• Massive routing capacity while intelligently resolving typical bottlenecks in ways never before possible. This significantly mitigates routing congestion, allowing for greater utilization with little or no performance degradation.

• Latency-producing pipelining is virtually unnecessary in systems with massively parallel bus architectures, increasing system speed and capability
• Potential timing-closure problems and interconnect bottlenecks are eliminated, even in systems requiring 90% or more resource utilization
• Next-generation security with advanced approaches to AES bitstream decryption and authentication, key-obfuscation, and secure device programming

These enhancements synergistically combine to enable design teams to create systems that have greater functionality, run faster, and deliver greater performance per watt than ever before.

Figure 2: RT Kintex UltraScale Platform Block Diagram

Next-Generation Routing for Utilization, Performance, and Run Time

With conventional FPGA architectures, logical resources are laid out in a matrix with rows and columns of interconnect. As FPGA device density increases into the multi-million logic cell capacity (multi-tens-of-millions of equivalent ASIC gates), the disparity between the logic (increasing by a factor of N²) and the number of interconnect tracks (increasing by a factor of N) becomes a limiting factor in successfully routing a design at the required system performance level.

The UltraScale architecture addresses this challenge by increasing the interconnect track count in all devices, providing more direct routes from A to B and giving the software tools more options to connect logic resources in the fastest, lowest-power configuration.
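The widening gap between logic and interconnect can be illustrated numerically. The sketch below assumes an N x N grid in which logic grows as N² and tracks grow as 2(N - 1); that linear expression is only a fit to the three data points shown in Figure 3 (logic 4, 9, 16 against tracks 2, 4, 6), not a formula given by Xilinx, and the function names are hypothetical.

```python
def logic_blocks(n):
    """Logic resources in an n x n grid: grows as O(N^2)."""
    return n * n


def routing_tracks(n):
    """Interconnect tracks: grows as O(N).

    Assumption: 2 * (n - 1) is a fit to the Figure 3 data points only.
    """
    return 2 * (n - 1)


def logic_per_track(n):
    """Logic blocks that must share each routing track."""
    return logic_blocks(n) / routing_tracks(n)


if __name__ == "__main__":
    # n = 2, 3, 4 reproduce the small, medium, and large devices of Figure 3;
    # larger n extrapolates the same trend.
    for n in (2, 3, 4, 8, 16, 32):
        print(f"N={n:2d}  logic={logic_blocks(n):4d}  "
              f"tracks={routing_tracks(n):2d}  "
              f"logic per track={logic_per_track(n):5.2f}")
```

Under this model the number of logic blocks competing for each track keeps rising with device size, which is the congestion pressure that adding interconnect tracks is meant to relieve.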

Figure 3: Adding Routing in the UltraScale Architecture (logic grows as O(N²) while interconnect tracks grow as O(N): small device, logic 4, tracks 2; medium device, logic 9, tracks 4; large device, logic 16, tracks 6; more and faster paths plus analytical placement close the gap and deliver full routability)

ASIC-Like Clocking Maximizes Performance

FPGA architectures prior to the UltraScale architecture relied on a geometrically centered clocking scheme with global resources in the middle of the device. These were fanned out to the extremities of the FPGA, accumulating skew along the way. With increasing capacity and system performance, chip-wide clock skew can have a detrimental impact on the overall timing budget of a design.

The clock routing and buffers within the UltraScale architecture have been entirely redesigned to provide vastly more flexibility than competing FPGA fabrics. With an abundance of clock routing and distribution tracks in both the horizontal and vertical direction, the UltraScale architecture provides 20X as many global clock buffers as previous generations. In essence, the center of the clock network, i.e., from where skew starts to accumulate, can be placed in any clock region within the FPGA. This enables clock networks to span only where they are needed, consuming only the power needed to get clock signals from their source to all their destinations, just like an ASIC.

From the system designer's perspective, the placement of a large number of independent, high-performance clock sources eliminates the skew problem. This removes the need for extensive pipelining and the associated latency that comes with it. See Figure 4.

Figure 4: UltraScale Clocking Architecture

Designs Use Fewer CLBs Resulting in Shorter Wire Length

After the clock and data signals arrive at the logic resources, the UltraScale architecture provides an enhanced CLB to make the most efficient use of its available resources, with the goal of reducing interconnect, i.e., the total wire length. Every aspect of the existing CLB structure was analyzed to explore how its components can be used more efficiently. The resulting enhancements collectively enable the Vivado Design Suite to place many more, often unrelated, components in a CLB to achieve tighter placement. Operating at high performance, such designs consume the lowest possible power by achieving the best overall device utilization.

Numerous changes within the CLB structure have added flexibility to the possible packing options: every 6-input LUT is combined with two flip-flops, each having dedicated inputs and outputs, enabling all the components to be used together or completely independently of one another. The flip-flops benefit from the increased quantity and flexibility of their control signals, with double the number of available clock-enable signals, optional "ignore" on the clock enable and reset ports, reset inversion allowing both active-High and active-Low reset flip-flops in the same CLB, and an additional clock signal for shift registers and distributed-RAM functions.

Together with the UltraScale architecture's improvement in routing resources and a highly flexible clocking architecture, the dramatic increase in CLB connectivity enables high-performance designs that are tightly packed together, improving FPGA utilization and lowering total device power. See Figure 5.

Figure 5: Efficient Placement of Logic Resources (optimal versus sub-optimal CLB packing)

Vivado Design Suite

The RT Kintex UltraScale XQRKU060 FPGA implements IP using the Vivado Design Suite, which has been developed to optimize the physical realization of large, Tb/s, low-latency, ultra high-throughput I/O, wide bus, massive memory-bandwidth applications. Competing place-and-route tools use simulated-annealing algorithms, which do not scale for million-LUT designs nor account for total wire length or congestion. The Vivado Design Suite uses analytical place-and-route technology to find a routable solution at device utilizations of greater than 90% without impacting performance.

The Vivado Design Suite was created specifically to analyze designs to determine where bottlenecks and problems can occur and solve these issues before they arise. By packing logic close together, the wire length between elements is reduced, resulting in shorter routing delays and lower power consumption. Additionally, the clock signals driving these closer elements have less distance to travel to span the design, yielding less clock skew. The Vivado Design Suite provides higher device utilization and improves user productivity.

The XQRKU060 is supported in Vivado Design Suite 2019.1 (or later), which is capable of programming the FPGA and also storing device configuration in supported, external nonvolatile memories.

Xilinx's XQRKU060 UltraScale Radiation-Tolerant FPGA

Compared to previous space-grade FPGAs from Xilinx, the RT Kintex UltraScale XQRKU060 FPGA offers a major increase in processing resources.
For the first time, the space industry can exploit the advantages of 20nm fabrication and implement logic optimized for the highest on-board performance together with low power consumption. See Table 1.

Table 1: Comparison of Xilinx Space-Grade FPGAs

                       | Virtex-4QV XQRV4QV | Virtex-5QV XQRV5QV | RT Kintex UltraScale XQRKU060
Radiation Hardness     | Tolerant           | Hard               | Tolerant
Process (nm)           | 90                 | 65                 | 20
Memory (Mb)            | 4.1 to 9.9         | 12.3               | 38
System Logic Cells (K) | 55 to 200          | 131                | 726
CLB Flip-Flops (K)     | 49.1 to 178.1      | 81.9               | 663
CLB LUTs (K)           | 49.1 to 178.1      | 81.9               | 331
Transceivers           | None               | 18 at 3.125Gb/s    | 32 at 12.5Gb/s
User I/O               | 640 to 960         | 836                | 620
DSP Slices             | 32 to 192          | 320                | 2,760

In comparison with previous generation 65nm FPGAs, the XQRKU060 has a reduced power budget of 70%, while delivering a 12X increase in transceiver capability and 5X increase in logic cells. This decrease in dynamic and static power dissipation is achieved by applying power reduction strategies at every level.

Radiation-Effects Mitigation and Hardness

CMOS scaling of planar transistor technology has intrinsically made the XQRKU060 FPGA less susceptible to total-dose and latch-up effects. The layout of the configuration memory cells has been optimized using SEU design rules to protect against multiple-bit upsets, together with the use of innovative circuit techniques to reduce soft-error rates. Users can triplicate logic and add EDAC to the FPGA's memory to bolster overall radiation hardness, manually or automatically, using industry-standard tools from Mentor Graphics or Synopsys. Xilinx's SEM IP offers further mitigation and can be used to detect, correct, and classify SEUs in configuration memory. For upsets, the SEM IP uses the Readback CRC feature to locate and correct errors. For SEU classification, the SEM IP uses Xilinx's Essential Bits technology to further increase system availability, allowing users to manage a system-level response to reduce downtime.

Scrubbing can also be used to improve reliability, ranging from periodic device re-configuration during each orbit for LEO spacecraft to transparently checking and re-writing individual frames in the background throughout FPGA operation for GEO missions. Xilinx is providing an external RTL solution to configure the XQRKU060 FPGA, scrub the device to prevent the accumulation of SEUs, as well as detect and correct upsets. See Figure 6.
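Two of the mitigation ideas above, triplicated logic and frame-by-frame scrubbing, can be sketched in software to show the principle. This is a toy model under stated assumptions: it is not the SEM IP, the Readback CRC hardware, or Xilinx's RTL scrubber. The names `majority_vote` and `scrub_frames` are hypothetical, frames are modeled as byte arrays, and `zlib.crc32` stands in for whatever check the real configuration logic uses.

```python
import zlib


def majority_vote(a, b, c):
    """Bitwise 2-of-3 majority vote over three redundant copies of a value."""
    return (a & b) | (a & c) | (b & c)


def scrub_frames(frames, golden_frames):
    """Check each frame against its golden copy by CRC and rewrite any that differ.

    frames is a list of mutable bytearrays (the live configuration, in this model);
    golden_frames is a list of bytes (the reference image).
    Returns the indices of the frames that were repaired.
    """
    repaired = []
    for index, (frame, golden) in enumerate(zip(frames, golden_frames)):
        if zlib.crc32(frame) != zlib.crc32(golden):
            frame[:] = golden  # rewrite only the corrupted frame in place
            repaired.append(index)
    return repaired


if __name__ == "__main__":
    # One of three redundant copies has a flipped bit; the vote masks it.
    print(bin(majority_vote(0b1010, 0b1010, 0b0010)))

    # Inject a single-bit upset into frame 2, then scrub.
    golden = [bytes([i] * 8) for i in range(4)]
    live = [bytearray(g) for g in golden]
    live[2][3] ^= 0x10
    print("repaired frames:", scrub_frames(live, golden))
```

The voter masks an upset immediately without repairing it, while the scrubber repairs upsets so they cannot accumulate; the text describes using both approaches together.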

Figure 6 (labels only; the transcription is cut off here): XQRKU060 FPGA, Configuration Engine (Scrubber), "Go", "Reset", "Status" Byte, "Force Prog B", "SEFI Alarm"
