NP-Separate: A New VLSI Design Methodology for Area, Power, and Performance Optimization


IEEE TRANSACTIONS ON COMPUTER-AIDED DESIGN OF INTEGRATED CIRCUITS AND SYSTEMS, VOL. 39, NO. 12, DECEMBER 2020, p. 5111

NP-Separate: A New VLSI Design Methodology for Area, Power, and Performance Optimization

Monzurul Islam Dewan and Dae Hyun Kim, Member, IEEE

Abstract—Use of standard cells in very-large-scale integration (VLSI) design enables very short time to market even for complex microprocessors. Thus, most VLSI layouts are designed using standard cells. In this article, we propose a new design methodology, namely, NP-Separate, to reduce the power consumption and area and increase the performance of a VLSI layout more effectively than the standard-cell-based design methodology. NP-Separate uses N cells and P cells composed of NFETs and PFETs only, respectively, thereby providing a higher degree of flexibility than using standard cells. Our simulation results for several benchmark circuits show that NP-Separate reduces the layout area by 9%, power consumption by 10%, power-delay product by 18%, and energy-delay product by 26% on average while satisfying given timing constraints compared to standard-cell-based designs.

Index Terms—Physical layout design, standard cells, very-large-scale integration (VLSI).

Manuscript received February 25, 2019; revised June 11, 2019 and October 1, 2019; accepted December 12, 2019. Date of publication January 13, 2020; date of current version November 20, 2020. This article was recommended by Associate Editor C. Zhuo. (Corresponding author: Dae Hyun Kim.)
The authors are with the School of Electrical Engineering and Computer Science, Washington State University, Pullman, WA 99164 USA (e-mail: monzurulislam.dewan@wsu.edu; daehyun@eecs.wsu.edu).
Digital Object Identifier 10.1109/TCAD.2020.2966551
© 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

I. INTRODUCTION

Standard-cell-based design methodologies provide numerous advantages in the design of very-large-scale integration (VLSI) layouts. For example, drawing long horizontal lines at the top and bottom of the standard cell rows in a layout connects all the power and ground pins of all the standard cell instances in the rows to the main power/ground rings drawn around the core area of the layout. Thus, power/ground network design is greatly simplified [1], [2]. Standard cell placement easily optimizes the locations of all the transistors in the layout by optimizing the locations of the standard cell instances [3]–[5]. Timing and power optimization satisfies given timing constraints and minimizes the power consumption of the design by manipulating (inserting, removing, and relocating) repeaters, upsizing and downsizing standard cell instances, and replacing a set of instances by a different set of instances [6]–[8]. For these reasons, most VLSI layouts are designed using standard cells.

Each standard cell design is optimized to minimize the area of the cell and satisfy constraints such as a target output resistance. For example, when the smallest inverter is designed, the width of the NFET of the inverter is set to the minimum transistor width and that of the PFET is optimized so that the rise and fall times of the inverter are equal. Library characterization performs SPICE simulations to characterize the standard cells and generate a standard cell library. All the synthesis, placement, and routing software use the standard cell library to design VLSI layouts.

TABLE I. Notations and terminologies used in this article.

One limitation of the standard-cell-based VLSI design methodology is that all the design and optimization steps operate on standard cells, so the size of each transistor cannot be controlled individually for further optimization. For example, suppose an optimization algorithm inserts an inverter into a net. Assuming the optimal sizes of the NFET and the PFET of the inverter are $3w_{\min}$ and $8k_\mu w_{\min}$, respectively, where $w_{\min}$ and $k_\mu$ are defined in Table I, the algorithm will likely insert an 8× inverter whose NFET and PFET widths are $8w_{\min}$ and $8k_\mu w_{\min}$, respectively. In this case, the NFET is unnecessarily upsized, which leads to a larger area and higher power consumption. Although the standard cell library might have an inverter cell with the optimal NFET and PFET widths in this case, it would be practically impossible to design and use standard cells having many different combinations of NFET and PFET widths. In general, we cannot avoid overoptimizing some parts of a layout unless we can finely control the sizes of the transistors in the standard cells.

In this article, we propose a new VLSI design methodology, namely, NP-Separate, to optimize area, power, and performance of a layout by fine-tuning transistor sizes. NP-Separate designs a layout with N cells and P cells composed of NFETs and PFETs only, respectively, thereby providing a higher degree of flexibility. We design several layouts using NP-Separate and show that it reduces the layout area by 9%, power consumption by 10%, power-delay product by 18%, and energy-delay product by 26% on average with shorter critical path delays compared to standard-cell-based designs.

The rest of this article is organized as follows. In Section II, we review the conventional standard-cell-based physical VLSI layout design and transistor sizing. In Section III, we present the motivation leading to the NP-Separate design methodology.

Section IV explains the details of NP-Separate. In Section V, we show case studies and compare the two design methodologies. We discuss future work to improve NP-Separate in Section VI, then we conclude in Section VII.

II. STANDARD-CELL-BASED VLSI DESIGN

In this section, we review the conventional standard-cell-based physical VLSI layout design and optimization process, transistor sizing, and multiheight standard cells.

A. Standard Cell Libraries and Standard Cells

A standard cell library for automatic placement and routing generally consists of physical libraries such as library exchange format (LEF) files, timing and power libraries such as Liberty format files, and interconnect technology files. The physical libraries contain physical information of the standard cells (such as the pin locations of a cell) and interconnect layers (such as the minimum width of a metal layer) in the standard cell library. The timing and power libraries contain timing, power, and functional information of the standard cells (such as the delay of a cell for various input slews and output loads). The interconnect technology files contain detailed information for interconnect resistance and capacitance (RC) extraction. In general, PFETs and NFETs of a standard cell are placed in the upper and lower regions of the cell layout, respectively, as shown in Fig. 1. Placing all the NFETs (or PFETs) in the same area enables the transistors to share their diffusion regions, which helps reduce the cell area.

Fig. 1. Three layouts for two-input NAND cells.

B. Physical Design of VLSI Layout

Fig. 2 shows a simplified standard-cell-based physical VLSI layout design process. Physical design (including netlist synthesis) of a VLSI layout synthesizes a netlist from given hardware description language (HDL) source codes, places the standard cell instances in the netlist on a layout, performs clock-tree synthesis (CTS), and routes the instances. Timing and power optimization is performed before placement, CTS, and routing, and again after routing. Design-rule violations, such as maximum capacitance violations, are also fixed during the physical design. All of these steps use standard cells. For example, gate sizing upsizes or downsizes instances for area, power, and performance optimization. Upsizing or downsizing an instance replaces it by a new standard cell instance having the same function with a different size (e.g., a NAND2_X4 instance is replaced by a NAND2_X1 instance). Similarly, repeater insertion inserts repeater instances for delay minimization.

Fig. 2. Simplified standard-cell-based VLSI design process.

C. Transistor Sizing

The sizes of the transistors in a standard cell are optimized for various purposes, such as delay minimization and fall/rise time matching. Since different transistor sizes have different input capacitances and output resistances, standard cell libraries generally have multiple cell sizes for each cell. For instance, a two-input NAND cell has three definitions, NAND2_X1, NAND2_X2, and NAND2_X4, in Fig. 1. The sizes of the transistors in the NAND2_X2 and NAND2_X4 cells are two and four times as large as those in the NAND2_X1 cell. Thus, NAND2_X2 and NAND2_X4 have lower output resistance and larger input and output capacitance than NAND2_X1. However, they do not necessarily occupy a larger cell area than NAND2_X1, because increasing the sizes of the transistors does not always lead to a larger cell area, as shown in Fig. 1.

Suppose the resistance of an NFET whose width is $w_{\min}$ is $R_n$. In this case, the resistance of a PFET whose width is $w_{\min}$ is $k_\mu \cdot R_n$. For an inverter, if the load capacitance is $C_L$ and a given timing constraint is $\tau = R_n C_L$, the widths of the NFET and the PFET of the inverter are set to $w_{\min}$ and $k_\mu \cdot w_{\min}$, respectively. Similarly, if the timing constraint is $\tau = R_n C_L / r$, the widths of the NFET and the PFET of the inverter are set to $r \cdot w_{\min}$ and $r \cdot k_\mu \cdot w_{\min}$, respectively.

D. Multiheight Standard Cells

Recently, Baek et al. [9] proposed designing VLSI layouts using multiheight standard cells. The multiheight-standard-cell-based design methodology (MHSC) uses single-height standard cells for simple cells, such as inverters, and multiheight standard cells for complex cells, such as flip-flops. MHSC minimizes the layout area by reducing the height of the single-height cells and designing complex cells in double-height cells. The restriction of using the metal 1 layer only in the standard cell design unnecessarily increases the standard cell height and the area of complex cells. Thus, designing complex cells across two rows and using the metal 2 layer for power and ground routing helps reduce the layout area [9], [10]. To support the placement of mixed-height standard cells, several placement legalization algorithms have been proposed [11], [12]. In this article, we reduce the layout area by using NFETs and PFETs whose sizes are optimized separately. We also demonstrate how to incorporate optimal
transistor sizes in the automatic layout generation to optimize area, power, and performance. Thus, we can apply NP-Separate to MHSC to reduce the layout area further.

III. MOTIVATION

This section shows the motivation of this article with an example. Fig. 3(a) shows a signal path composed of a two-input NOR instance (NOR2), a three-input NOR instance (NOR3), an inverter instance (INV), and some load capacitors. Some parasitic capacitances are not shown in the figure. We assume that all the NFETs (or PFETs) of each cell have the same width as shown in the figure. For example, the width of all the PFETs in the NOR2 instance is labeled $a_1$, which denotes $a_1 \cdot w_{\min}$. We also assume, for simplicity, that the load capacitance of each instance is $C_L$. $k_\mu = \mu_n/\mu_p$ is set to 1.8. The red arrows show the signal flow of NOR2 = 1, NOR3 = 0, and INV = 1, which means that the outputs of the NOR2, NOR3, and INV instances are 1, 0, and 1, respectively. Similarly, the green arrows show the signal flow of NOR2 = 0, NOR3 = 1, and INV = 0. The given timing constraint is $\tau = R_n C_L$.

Fig. 3. (a) Single path transistor sizing example for a NOR2-NOR3-INV configuration. (b) Brute-force. (c) Heuristic. (d) Optimal.

Fig. 3(b) shows the result of a brute-force transistor sizing algorithm by which each instance is upsized to 3× so that the path delay is evenly distributed throughout the three instances. Thus, the delay of each instance is $(1/3)\tau$ and the total transistor width is $93.6w_{\min}$.

Fig. 3(c) shows the result of a heuristic transistor sizing algorithm by which NOR2, NOR3, and INV are upsized to $a\times$, $b\times$, and $c\times$, respectively. The algorithm minimizes the total transistor width as follows:

$$\text{Minimize } W = (2 + 4k_\mu)\,a + (3 + 9k_\mu)\,b + (1 + k_\mu)\,c \tag{1}$$

$$\text{Subject to Rising: } \frac{2k_\mu \cdot R_n}{2k_\mu a} C_L + \frac{R_n}{b} C_L + \frac{k_\mu \cdot R_n}{k_\mu c} C_L \le \tau \tag{2}$$

$$\text{Falling: } \frac{R_n}{a} C_L + \frac{3k_\mu \cdot R_n}{3k_\mu b} C_L + \frac{R_n}{c} C_L \le \tau. \tag{3}$$

Solving the above problem leads to $(a, b, c) = (3\times, 2.1\times, 5.4\times)$. The delays of the NOR2, NOR3, and INV instances are $0.33\tau$, $0.48\tau$, and $0.19\tau$, respectively. The total transistor width is $82.6w_{\min}$, which is approximately 11.8% smaller than that of the brute-force algorithm.

Fig. 3(d) shows the result of an optimal algorithm by which the PFETs and NFETs of NOR2, NOR3, and INV are upsized to $(a_1\times, a_2\times)$, $(b_1\times, b_2\times)$, and $(c_1\times, c_2\times)$, respectively. The following formulates the problem:

$$\text{Minimize } W = 2(a_1 + a_2) + 3(b_1 + b_2) + (c_1 + c_2) \tag{4}$$

$$\text{Subject to Rising: } \frac{2k_\mu \cdot R_n}{a_1} C_L + \frac{R_n}{b_2} C_L + \frac{k_\mu \cdot R_n}{c_1} C_L \le \tau \tag{5}$$

$$\text{Falling: } \frac{R_n}{a_2} C_L + \frac{3k_\mu \cdot R_n}{b_1} C_L + \frac{R_n}{c_2} C_L \le \tau. \tag{6}$$

Solving the above problem leads to $(a_1, a_2) = (7.7\times, 4.6\times)$, $(b_1, b_2) = (8.7\times, 3.3\times)$, and $(c_1, c_2) = (7.7\times, 6.4\times)$. The total transistor width is $74.7w_{\min}$, which is 20.2% and 9.6% smaller than the total widths obtained by the brute-force and heuristic algorithms, respectively. For the rising path, the delays of the NOR2, NOR3, and INV instances are $0.47\tau$, $0.3\tau$, and $0.23\tau$, respectively. For the falling path, the delays are $0.22\tau$, $0.63\tau$, and $0.15\tau$, respectively. The delays are unevenly distributed among the three instances, and even the PFETs and NFETs of the same instance have different delay values.

Table II also compares the three transistor sizing algorithms for various paths. For example, the total transistor width of the NOR4-NOR4-NOR4-NOR4 path optimized by the brute-force and heuristic algorithms is $524.80w_{\min}$, whereas that optimized by the optimal sizing is $434.13w_{\min}$. Thus, the optimal sizing algorithm achieves 17.28% smaller transistor width than the other two algorithms. Note that optimal transistor sizing has been proposed in [13]–[18], some of which used more complicated but more accurate delay models such as the Elmore delay model. In addition, they also minimized the total area or power consumption, and the internal capacitances can also be taken into account [15], [18].

IV. NP-SEPARATE: NEW VLSI DESIGN METHODOLOGY

In this section, we propose a new VLSI design methodology, namely, NP-Separate, to minimize the layout area and power consumption of a given design.
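The single-path sizing problems (1)–(6) of Section III, on which the flow in this section relies, can be explored numerically with a short sketch. It is not the authors' solver: it assumes the simple resistance model of Section III with delays normalized so that $R_n C_L = 1$, and it uses the closed-form minimizer that follows from the Lagrangian conditions of "minimize $\sum_i w_i x_i$ subject to $\sum_i r_i / x_i \le \tau$". The function name `size_path` and its arguments are hypothetical.

```python
from math import sqrt

K_MU = 1.8  # k_mu = mu_n / mu_p, the value used in the Section III example


def size_path(width_costs, resistances, tau=1.0):
    """Minimize sum(w_i * x_i) subject to sum(r_i / x_i) <= tau, x_i > 0.

    Closed form from the Lagrangian conditions (constraint active at optimum):
        x_i = sqrt(r_i / w_i) * sum_j sqrt(w_j * r_j) / tau
    Delays are normalized so that Rn * CL = 1 (an assumption of this sketch).
    """
    s = sum(sqrt(w * r) for w, r in zip(width_costs, resistances))
    sizes = [sqrt(r / w) * s / tau for w, r in zip(width_costs, resistances)]
    total = sum(w * x for w, x in zip(width_costs, sizes))
    return sizes, total


# Heuristic sizing, (1)-(3): one scale factor per cell (a, b, c).
# Both constraints reduce to 1/a + 1/b + 1/c <= 1 after normalization.
heur_sizes, heur_total = size_path(
    [2 + 4 * K_MU, 3 + 9 * K_MU, 1 + K_MU], [1.0, 1.0, 1.0])

# Optimal sizing, (4)-(6): the rising and falling constraints share no
# variables, so each path can be solved on its own.
rise_sizes, rise_total = size_path([2, 3, 1], [2 * K_MU, 1.0, K_MU])  # a1, b2, c1
fall_sizes, fall_total = size_path([2, 3, 1], [1.0, 3 * K_MU, 1.0])   # a2, b1, c2
opt_total = rise_total + fall_total

print([round(x, 1) for x in heur_sizes], round(heur_total, 1))
print([round(x, 1) for x in rise_sizes],
      [round(x, 1) for x in fall_sizes], round(opt_total, 1))
```

Under these assumptions the computed totals should land close to the $82.6w_{\min}$ and $74.7w_{\min}$ quoted in Section III, with small differences attributable to rounding of the reported sizes. The separable structure of (4)–(6) is what lets the PFET and NFET widths of one instance differ.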

TABLE II. Total areas (unit: $w_{\min}$) of single path circuits optimized by the brute-force, heuristic, and optimal sizing algorithms for different logic gate combinations.

A. Overview

Fig. 4 shows an overview of the NP-Separate design methodology we propose. First, we begin the design from the synthesis of a given HDL source code using a plain standard cell library. The synthesis generates a netlist composed of standard cell instances. Then, we size the transistors of the instances using the optimal transistor sizing explained in the previous section. When we optimally size the transistors, we estimate the load capacitance of each instance using the standard cell library and add the capacitance to the timing constraints. The optimal transistor sizing gives us the size of each transistor of each standard cell instance. Then, we create N and P cells having the sizes found by the optimal sizing. The creation of the N and P cells includes layout drawing, design-rule check (DRC), and layout-versus-schematic (LVS). Then, we create an NP cell corresponding to each standard cell instance by merging the N and P cells. The creation of an NP cell includes physically placing an N cell and a P cell in a layout editor and routing the input and output ports. The next step is to replace the standard cell instances with the NP cell instances in the layout. We also prepare a physical library in LEF for the NP cells and use the library and a commercial router to perform CTS and route the NP cells. The following sections show more details of each step.

Fig. 4. Standard-cell-based and NP-Separate VLSI design flows.

B. N Cells and P Cells

N and P cells are similar to standard cells. However, N cells have NFETs only and P cells have PFETs only. Although the transistors in an N or P cell can have different widths, we apply the same width to all the transistors in an N or P cell for the following reasons. First, applying different widths to the transistors in an N or P cell leads to too many N or P cell designs, which will increase the overall design time significantly. Second, timing constraints are greatly simplified if all the transistors in a cell have the same width.

An N cell is named Func_N_sW, where Func is the function of the cell such as NAND2, N denotes the type of the cell (N cell), and sW is the size of each NFET in the cell. For example, NOR4_N_2W is a four-input NOR N cell and the width of each NFET in the cell is $2 \cdot w_{\min}$. Fig. 5 shows our layouts for NAND2_N_1W and NOR4_N_2W. Notice that the two output ports in the NOR4 cell are not routed yet, although they could be prerouted in the N cell. In our methodology, they are routed after the creation of an NP cell. A P cell is named similarly, as Func_P_sW. Once we design N and P cells, we can reuse them to create NP cells. Thus, the "create N and P cells" step in Fig. 4 creates only the N and P cells missing in the NP cell library.

Fig. 5. NAND2_N_1W and NOR4_N_2W cells.

C. NP Cell Creation

Once the size of each transistor is optimized in the transistor sizing step, we create all the required N and P cells. Then, we create NP cells by merging and routing the N and P cells as follows. First, each standard cell instance in the synthesized netlist is replaced by an N cell instance and a P cell instance as shown in Fig. 6. For example, the optimal sizes of the NFETs and PFETs of the NAND2_X1 instance in Fig. 6(a) are $2w_{\min}$ and $2w_{\min}$, respectively. Thus, we create an NP cell NAND2_N_2W_P_2W by combining a NAND2_N_2W instance and a NAND2_P_2W instance as shown in Fig. 6(b).
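The naming scheme and the reuse of existing N and P cells can be illustrated with a short sketch. It assumes the underscore-separated names Func_N_sW, Func_P_sW, and Func_N_sW_P_sW, integer widths in units of $w_{\min}$, and a library represented as a plain set of cell names; the helper names (`n_cell`, `p_cell`, `np_cell`, `cells_to_create`) and the sample data are hypothetical and are not part of the published flow.

```python
def n_cell(func, width):
    """Name of an N cell, e.g. NOR4_N_2W: four-input NOR, NFET width 2*w_min."""
    return f"{func}_N_{width}W"


def p_cell(func, width):
    """Name of a P cell, e.g. NAND2_P_2W: two-input NAND, PFET width 2*w_min."""
    return f"{func}_P_{width}W"


def np_cell(func, n_width, p_width):
    """Name of the NP cell obtained by merging one N cell and one P cell."""
    return f"{func}_N_{n_width}W_P_{p_width}W"


def cells_to_create(sized_instances, library):
    """Return the N and P cells the sized netlist needs but the library lacks."""
    required = set()
    for func, n_width, p_width in sized_instances:
        required.add(n_cell(func, n_width))
        required.add(p_cell(func, p_width))
    return sorted(required - set(library))


# Hypothetical sizing result: (function, NFET width, PFET width) in w_min units.
sized = [("NAND2", 2, 2), ("NOR4", 2, 4), ("NAND2", 2, 2)]
existing = {"NAND2_N_2W", "NOR4_N_2W"}  # cells already drawn and verified

missing = cells_to_create(sized, existing)
np_names = sorted({np_cell(f, n, p) for f, n, p in sized})
print(missing)
print(np_names)
```

Because each N or P cell is keyed only by function and a single width, repeated instances with the same sizes map to the same cells, and only the cells absent from the library need new layout, DRC, and LVS work.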

Fig. 6. (a) Two standard cells NAND2_X1 and NOR4_X2. (b) NP cell creation. (c) Input and output port routing. (d) Abstraction.

Fig. 7. Optimization in an NP-Separate design. (a) Two NP cell instances placed by a placement software. (b) Area minimization by instance overlap. (c) Abstracted view of the instances.

Similarly, the optimal sizes of the NFETs and PFETs of the NOR4_X2 instance are $6w_{\min}$ and $4w_{\min}$, respectively. Thus, we create an NP cell NOR4_N_6W_P_4W by combining a NOR4_N_6W instance and a NOR4_P_4W instance. The PFETs in the NOR4_P_4W instance are separated into two diffusion regions to reduce the complexity of input/output port routing. Notice that the centers of the NOR4_N_6W and NOR4_P_4W instances a

… the maximum width of the N and P cells in the NP cell and the sum of the heights of the N and P cells. Thus, the shape of an NP cell is always a rectangle, like standard cells. The abstraction of the NP cells creates a physical library in LEF format so that the NP cell instances can be routed automatically using commercial tools.

D. Placement, CTS, and Routing
