Tutorial On VLSI Partitioning

2y ago
33 Views
2 Downloads
540.22 KB
43 Pages
Last View : 1m ago
Last Download : 3m ago
Upload by : Oscar Steel
Transcription

I207T001015 . 207T001015d.207VLSI DESIGN2000, Vol. 00, No. 00, pp. 1 43Reprints available directly from the publisherPhotocopying permitted by license only# 2000 OPA (Overseas Publishers Association) N.V.Published by license underthe Gordon and Breach SciencePublishers imprint.Printed in Malaysia.Tutorial on VLSI PartitioningSAO-JIE CHENa, y and CHUNG-KUAN CHENG b,*aDept. of Electrical Engineering, National Taiwan University, Taipei, Taiwan 10764; bDept. of Computer Scienceand Engineering, University of California, San Diego, La Jolla, CA 92093-0114(Received 1 March 1999; In nal form 10 February 2000)The tutorial introduces the partitioning with applications to VLSI circuit designs. Theproblem formulations include two-way, multiway, and multi-level partitioning,partitioning with replication, and performance driven partitioning. We depict themodels of multiple pin nets for the partitioning processes. To derive the optimumsolutions, we describe the branch and bound method and the dynamic programmingmethod for a special case of circuits. We also explain several heuristics including thegroup migration algorithms, network ow approaches, programming methods,Lagrange multiplier methods, and clustering methods. We conclude the tutorial withresearch directions.Keywords: Partitioning, clustering, network ow, hierarchical partitioning, replication, performance driven partitioningity of the circuit has become so high that it is verydi cult to design and simulate the whole systemwithout decomposing it into sets of smaller subsystems. This divide and conquer strategy relies onpartitioning to manipulate the whole system intohierarchical tree structure.Partitioning is also needed to handle engineeringchange orders. For huge systems, design iterationsrequire very fast turn around time. A hierarchicalpartitioning methodology can localize the modi cations and reduce the complexity.Furthermore, a good partitioning tool candecrease the production cost and improve the1. INTRODUCTIONAutomatic partitioning [5, 61, 78, 72] is becomingan important topic with the advent of deep submicron technologies. An e cient and e ective partitioning [12, 17, 19, 48, 69, 70, 81, 94, 105, 77] toolcan drastically reduce the complexity of the designprocess and handle engineering change orders in amanageable scope. Moreover, the quality of thepartitioning di erentiates the nal product in termsof production cost and system performance.The size of VLSI designs has increased to systemsof hundreds of millions of transistors. The complex-*Corresponding author. Tel: (858)534-6184, Fax: (858)534-7029, e-mail: kuan@cs.ucsd.eduyTel: (8862)2363-5251 ext. 417, e-mail: csj@cc.ee.ntu.edu.tw1

I207T001015 . 207T001015d.2072S.-J. CHEN AND C.-K. CHENGsystem performance. With the advance of fabrication technologies, the cost of a transistor dropswhile the cost of input/output pads remains fairlyconstant. Consequently, the size of the interfacebetween partitions, e.g., between chips, determinesa signi cant portion of the manufacturing expenses.And the quality of the partitioning has strong e ecton production cost. Furthermore, in submicrondesigns, interconnection delays tend to dominategate delays [8]; therefore system performance isgreatly in uenced by the partitions.Partitioning has been applied to solve thevarious aspects of VLSI design problems [5, 36]: Physical packaging Partitioning decomposesthe system in order to satisfy the physicalpackaging constraints. The partitioning conforms to a physical hierarchy ranging fromcabinets, cases, boards, chips, to modular blocks. Divide and conquer strategy Partitioning is usedto tackle the design complexity with a divide andconqure strategy [21]. This strategy is adopted todecompose the project between team members,to construct a logic hierarchy for logic synthesis,to transform the netlist into physical hierarchyfor oorplanning, to allocate cells into regionsfor placement and RLC extraction, and manipulate hierarchies between logic and layout forsimulation. System emulation and rapid prototyping Oneapproach for system emulation and prototypingis to construct the hardware with eld programmable gate arrays. Usually, the capacity of these eld programmable gate arrays is smaller thancurrent VLSI designs. Thus, these prototypingmachines are composed of a hierarchical structure of eld programmable gate arrays. Apartitioning tool is needed to map the netlist intothe hardware [110]. Hardware and software codesign For hardwareand software codesign, partitioning is used to decompose the designs into hardware and software. Management of design reuse For huge designsespecially system-on-a-chip, we have to managedesign reuse. Partitioning can identify clusters ofthe netlist and construct functional modules outof the clusters.While partitioning is a tool required to managehuge systems in many elds such as e cientstorage of large databases on disks, data mining,and etc., in this tutorial, we focus our e orts onpartitioning with applications to VLSI circuitdesigns. In the next section, we describe thenotations for the tutorial. In section three, theformulations of the partitioning problems arestated. Section four covers the models for multiplepin nets. Section ve depicts the partitioningalgorithms. The tutorial is concluded with researchdirections.2. PRELIMINARIESIn this section, we establish notations used andformulate the partitioning problems addressed inour approaches. A circuit is represented by ahypergraph, H(V, E ), where the vertex setV {vi j i 1, 2, . . . , n} denotes the set of modulesand the hyperedge set E {ej j j 1, 2, . . . , m} denotes the set of nets. Each net ej is a subset of Vwith cardinality jejj 2. The modules in ej arecalled the pins of ej.The hypergraph representation for a circuit with9 modules and 6 signal nets is shown in Figure 1,where nets e1, e3 and e5 are two-pin nets, net e6 is athree-pin net, and nets e2 and e4 are four-pin nets.When the circuit has only two pin nets, we cansimplify the representation to a graph G(V, E ). Anet connecting modules vi and vj is represented byeij with a connectivity cij. We set cij 0 if there is nonet connecting modules vi and vj. We shall showlater that for certain formulations we replacemultiple pin nets with models of two pin nets.The replacement is performed when the partitioning algorithm is devised for graph models.(i) Module Size and Net Connectivity Each module vi is attached with a size si in R , positive realPnumbers. We de ne S Vj vi 2Vj si to be the sizeof a partition Vj. Each net ei is attached with a

I207T001015 . 207T001015d.207VLSI PARTITIONING3FIGURE 1 Hypergraph example.connectivity ci in R . By default, ci 1. For a busof multiple signal lines, we can represent the buswith a net ei of connectivity ci equal to the numberof lines. We can also assign higher weights forsome important nets, this will enable us to keep themodules of these nets in the same partition.In this tutorial, we will assume that circuits arerepresented as hypergraphs except when statedotherwise, hence, the terms circuit, netlist, andhypergraph are used interchangeably throughoutthe tuorial.(ii) Partitions and Cuts The set of hyperedgesconnecting any two-way partition (V1, V2) of twodisjoint vertex sets V1 and V2 is denoted by a cutE(V1, V2) {ej 2 E j 0 jej \ V1j and 0 jej \ V2j},i.e., ej 2 E(V1, V2) if there exist some pins of ej in V1and some di erent pins of ej in V2. We de nePC V1 ; V2 ei 2E V1 ;V2 ci to be the cut count ofthe partition (V1, V2).For a multiway partition (V1, V2, . . . , Vk)where k 2, a cut E(V1, V2, . . . , Vk) {ej 2 Ej 9 is.t. 0 jej \ Vij jejj}. For each subset Vi, wedenote its external cut set E(Vi) {ej 2 E j0 jej \ Vij jejj}. We denote its adjacent net set to bethe nets with some pin contained in Vi, i.e.,I(Vi) {ei j jei \ Vij 0}.(iii) Replication Cuts and Directed Cuts Forreplication cuts and performance driven partitioning, the direction of the nets makes a di erence inthe process. We characterize the pins of each netinto two types: source and sink. A directed net ei isdenoted by (ai, bi) where ai V are the source pinsof the net and bi V are the sink pins of the net.We assume that jai [ bij 2, jaij 1 and jbij 1.Usually, each net has one source pin and multiplesink pins. However, some nets may have multiplesources which share the same interconnect line.Furthermore, one pin can be both a source pin andsink pin of the same net. Therefore, ai and bi mayhave a nonempty intersection.For two disjoint vertex sets X and Y, we shall useE(X ! Y ) to denote the directed cut set from X toY. Net set E(X ! Y ) contains all the nets ei (ai, bi)such that X intersects the source pin set ai and Yintersects the sink pin set bi, i.e., E(X ! Y ) {ei j ei (ai, bi), ai \ X 6 ;, bi \ Y 6 ;}. We use thefunction C(X ! Y ) to denote the total cut countof the nets in E(X ! Y ), i.e., C X ! Y Pei 2E X!Y ci .(iv) Performance Driven Partitioning In performance driven partitioning [106], modules aredistinguished into two types: combinational elements and globally clocked registers. In illustration, we shall use circles to represent the combinational elements and rectangles to represent theregisters in gures (Fig. 13). Each module vi has anassociated delay di.A path of length k from a module vi to a modulevj is a sequence hvi0 ; vi1 ; . . . ; vik i of modules suchthat vi vi0 , vj vik and for each l 2 {1, 2, . . . , k},modules vilÿ1 and vil are a souce pin and a sink pinof a net in E, respectively.(v) Clustering Given a hypergraph H(V, E ),highly connected modules in V can be grouped

I207T001015 . 207T001015d.2074S.-J. CHEN AND C.-K. CHENGtogether to form some single supermodules calledclusters. After this process, a clustering ÿ {V1,V2, . . . , Vk} of the original hypergraph H isobtained and a contracted (i.e., coarser) hypergraphHÿ(Vÿ, Eÿ ) is induced, where Vÿ fv ÿ1 ; v ÿ2 ; . . . ;v ÿk g. For every ej 2 E, the contracted net e ÿj 2 Eÿ ifje ÿj j 2, where e ÿj fv ÿi jej \ Vi 6 ;g, that is, e ÿjspans the set of clusters containing modules of ej. Acontracted hypergraph, of course, can be used toinduce another coarser contracted hypergraphbased on the same clustering process. On the otherhand, a contracted hypergraph Hÿ(Vÿ, Eÿ ) can beunclustered to return to a ner hypergraph H(V, E ).jV j equally spaced slots on a striaght line (Fig. 2).Modules vs and vt are xed at the two extremeends, i.e., vs on the rst slot (left end ) and vt on thelast slot (right end ). The goal is to assign allmodules to distinct slots to minimize the total wirelength. Let us use xi to denote the coordinate ofmodule vi after it is assigned to the slot. The lengthof a net ei can be expressed as the di erence of themaximum coordinate and the minimum coordinate of the modules in the net, i.e., maxvj 2ei xj ÿminvk 2ei xk . The total wire length can be expressedas follows.Xmaxvj 2ei xj ÿ minvj 2ei xj 2 ei 2E3. PROBLEM FORMULATIONSIn this section, we describe di erent formulationsof the partitioning problems addressed in thistutorial. We will cover two-way partitioning,multiway partitioning, multiple level partitioning,partitioning with replication, and performancedriven partitioning.3.1. Two-way Partitioning or BipartitioningWe consider several possible variations on the sizeconstraints and cost functions in the formulation.Additionally, in certain formulations, we x twomodules vs and vt to be on the opposite sides of thecut as two seeds.The relation between partitioning and placement can be derived under the assumption that allnets are two pin nets [50].THEOREM 3.1 Given a graph G(V, E ) with modulesvs and vt in V, let (V1, V2) be a min-cut partitionseparating modules vs and vt. Let vs and vt be the twomodules locating at the two extreme ends of a linearplacement. Then, there exists an optimal linearplacement solution such that all modules in V2 areon the slots right of all modules in V1 (Fig. 2).Thus, we can use the min-cut to partition a linear3.1.1. Min-cut Separating Two Modulesvs and vtGiven a hypergraph, we x two modules denotedas vs and vt at two sides. A min-cut is a partition(V1, V2), vs 2 V1 and vt 2 V2 such that the cut countC (V1, V2) is minimized, i.e.,minvs 2V1 ;vt 2V2 C V1 ; V2 1 where V1 and V2 are disjoint and the union of thetwo sets is equal to V.This partitioning is strongly related to a linearplacement problem. In a linear placement, we haveFIGURE 2 Suppose partition (V1, V2) is a min-cut separatingmodules vs and vt. There exists an optimal linear placement thatmodules in V2 are at the right side of modules in V1.

I207T001015 . 207T001015d.207VLSI PARTITIONINGplacement into two smaller problems and stillmaintain optimality. Conceptually, we can conceivethat modules in V1 or V2 have stronger internalconnection within the set than its mutual connection to the other set. Thus, if the span of modules inV1 and in V2 are mixed in a linear placement, we canslide all modules in V1 to the left and all modules inV2 to the right to reduce the total wire length. Infact, this is the procedure to prove the theorem.The min-cut with no size constraints can befound in polynomial time using classical maximum ow techniques [1]. However, it may happen thatthe optimal solution separates only vs or vt fromthe rest of the modules, i.e., V1 {vs} or V2 {vt}.This result is very likely to happen because mostVLSI basic modules have very small degrees ofconnecting nets (e.g., the degree of a 3-inputNAND gate 4).3.1.2. Minimum Cost Ratio CutThe cost ratio cut formulation supplies a partitiondi erent from the min-cut that separates two xedmodules. Thus, if the min-cut cannot provide anynontrivial solution, we may adopt the cost ratiocut to perform another trial.In cost ratio cut, we x two modules vs and vt attwo di erent sides. Our objective is to nd a vertexset A to minimize a cost ratio function:5C A; V ÿ A ÿ fvs g ÿ C A; fvs g S A 3 where vertex set A does not contain vs and vt.Vertex set A is non-empty, i.e., S(A) 0.Cost ratio cut is also strongly related to a linearplacement. Assuming that all nets are two pin nets,we can derive the following theorem [22]:THEOREM 3.2 Given a graph G(V, E ) with modulesvs and vt in V, let (V1, V2) be an optimal cost ratiocut partition. There exists an optimal linearplacement solution such that all modules in A areon the slots left of all modules in V ÿ A ÿ {vs}.Conceptually, we can conceive that C(A, V ÿA ÿ {vs}) is the force to pull A to the right andC (A, {vs}) is the force to push A to the left. Thedenominator S(A) is the inertia of the set A. A set Awith the minimum cost ratio moves with the fastestacceleration toward left end of the slotsExample In Figure 3, the circuit contains sixmodules. The optimum cost ratio cut solution hasA {v1, v2, v3} The cost ratio value isC A; V ÿ A ÿ fvs g ÿ C A; fvs g 4 ÿ 3 1 :S A 334 The cost ratio value of any other choice of set A islarger than expression 4.FIGURE 3 A six module circuit to illustrate the cost ratio cut.

I207T001015 . 207T001015d.2076S.-J. CHEN AND C.-K. CHENGThe cost ratio cut solution can be found in polynomial time for a special case of serial parallelgraphs [22]. We are unaware of algorithms forgeneral cases. Note that, the solution may haveV ÿ A ÿ {vs} equal to set {vt}. In such case, thepartitioning result is not useful for decomposing thecircuit.3.1.3. Min-cut with Size ConstraintsFor min-cut with size constraints, we have lowerand upper bounds on the partition size Sl and Su,where 0 Sl Su S(V ) and Sl Su S(V ). Thebipartitioning problem is to divide vertex set Vinto two nonempty partitions V1, V2, whereV1 \ V2 ; and V1 [ V2 V, with the objective ofminimizing cut count C (V1, V2) and subject to thefollowing size constraints:S l S Vb S ufor b 1; 25 The min-cut problem with size constraints is NPcomplete [43]. However, because of the importanceof the problem in many applications, manyheuristic algorithms have been developed.Random Partitioning We use a random partition estimation of min-cut with size constraints todemonstrate that the quality variation of partitioning results can be signi cant. Let us simplifythe case by assigning the modules with uniformsize, i.e., si 1 for all vi in V, and the nets withuniform connectivity, i.e., ci 1 for all ei in E.Let us assume that the modules are partitionedinto two sets V1, V2 with equal sizes: S(V1) S(V2).The partition is performed with an independentrandom process [10] so that each module has a50% chance to go to either side. For a net ei of twopins, we can derive that net ei belongs to the cut setE(V1, V2) with a 0.5 probability (Fig. 4). Similarly,we can derive that for a net ei of k pins (k 2), theprobability that net ei belongs to cut set E(V1, V2)is 2k ÿ 2 2k . This probability is larger than 0.5and approaches one as k increases. In other words,the expected cut count C (V1, V2) is equal to orlarger than half the number of nets. For example, aFIGURE 4 Four possible con gurations of net ei {a, b} in arandom placement.circuit of one million modules usually has anasymptotic number of nets, i.e., jEj O(jV j ) 1,000,000. The expected cut count would beC (V1, V2) 500,000. This number is much worsethan the results we can achieve. In practice, the cutcounts on circuits of a million of modules areusually no more than several thousands [34, 36]. Inother words, the probability that a net belongs to acut set is small, below one percent for a circuit ofone million gates.Suppose the two bounds of partitioned sizes arenot equal, Sl 6 Su. Using the proposed randomgraph model, the expected cut count C (V1, V2) isproportional to the product of two sizes, i.e.,S(V1) S(V2). Consequently, the expected cutcount is smallest if the size of one partition approaches the upper bound S(Vi) Su and the size ofanother partition approaches the lower boundS(Vj) Sl. In practice, we do observe this behavior.One partition is fully loaded to its maximumcapacity, while another partition is under utilizedwith a large capacity left unused. This phenomena isnot desirable for certain applications.3.1.4. Ratio CutRatio cut formulation integrates the cut count anda partition size balance criterion into a singleobjective function [87, 109]. Given a partition(V1, V2) where V1 and V2 are disjoint andV1 [ V2 V, the objective funtion is de ned as

I207T001015 . 207T001015d.207VLSI PARTITIONINGC V1 ; V 2 S V1 S V 2 6 The numerator of the objective function minimizesthe cut count while the denominator avoidsuneven partition sizes. Like many other partitioning problems, nding the ratio cut in a generalnetwork belongs to the class of NP-completeproblems [87].Example Figure 5 shows a seven module example.The modules are of unit size and the nets are of unitconnectivity. Partition (V1, V2) has a cost CV1 ; V2 S V1 S V2 2 4 3 1 6 . Anyother partition corresponds to a much larger cost.The Clustering Property of the Ratio Cut Theclustering property of the ratio cut can beillustrated by a random graph model. Let usassume that the circuit is a uniformly distributedrandom graph. with uniform module sizes, i.e.,si 1. We construct the nets connecting each pairof modules with identical independent probabilityf. Consider a cut which partitions the circuit intotwo subsets V1 and V2 with comparable sizes jV j and (1 ÿ ) jV j respectively, where 1.The expected cut count equals the probability fmultiplied by the number of possible nets betweenV1 and V2.Expec C V1 ; V2 f jV1 j jV2 j 1 ÿ jVj2 f :7 On the other hand, if another cut separates onlyone module vs from the rest of the modules, theexpected cut count is7Expec C fvs g; V ÿ fvs g jVj ÿ 1 f8 As jV j approaches in nity, the value of Eq. (7)becomes much larger than 8.This derivation provides another explanationwhy the min-cut separating two xed modules tendsto generate very uneven sized subsets. The veryuneven sized subsets naturally giv

VLSI DESIGN # 2000 OPA (Overseas Publishers Association) N.V. 2000, Vol. 00, No. 00, pp. 1–43 Published by license under Reprints available directly from the publisher the Gordon and Breach Science Photocopying permitted by license only Publishers imprint. Printed in Malaysia. Tutorial on VLSI

Related Documents:

VLSI Design 2 Very-large-scale integration (VLSI) is the process of creating an integrated circuit (IC) by combining thousands of transistors into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device.

VLSI IC would imply digital VLSI ICs only and whenever we want to discuss about analog or mixed signal ICs it will be mentioned explicitly. Also, in this course the terms ICs and chips would mean VLSI ICs and chips. This course is concerned with algorithms required to automate the three steps “DESIGN-VERIFICATION-TEST” for Digital VLSI ICs.

VL2114 RF VLSI Design 3 0 0 3 VL2115 High Speed VLSI 3 0 0 3 VL2116 Magneto-electronics 3 0 0 3 VL2117 VLSI interconnects and its design techniques 3 0 0 3 VL2118 Digital HDL Design and Verification 3 0 0 3 VL2119* Computational Aspects of VLSI 3 0 0 3 VL2120* Computational Intelligence 3 0 0 3

Dr. Ahmed H. Madian-VLSI 3 What is VLSI? VLSI stands for (Very Large Scale Integrated circuits) Craver Mead of Caltech pioneered the filed of VLSI in the 1970’s. Digital electronic integrated circuits could be viewed as a set

15A04604 VLSI DESIGN Course Objectives: To understand VLSI circuit design processes. To understand basic circuit concepts and designing Arithmetic Building Blocks. To have an overview of Low power VLSI. Course Outcomes: Complete Knowledge about Fabrication process of ICs Able to design VLSIcircuits as per specifications given.

55:131 Introduction to VLSI Design 10 . Simplified Sea of Gates Floorplan 55:131 Introduction to VLSI Design 11 . SoG and Gate Array Cell Layouts 55:131 Introduction to VLSI Design 12 . SoG and Gate Array 3-in NAND 55:131 Introdu

VLSI Fabrication Process Om prakash 5th sem ASCT, Bhopal omprakashsony@gmail.com Abstract VLSI stands for "Very Large Scale Integration". This is the field which involves packing more and more logic devices into smaller and smaller areas. Thanks to VLSI, circuits that would have

10 Annual Book of ASTM Standards, Vol 02.02, in the Related Material section (gray pages). 11 Available from Standardization Documents, Order Desk, Bldg. 4, Section D, 700 Robbins Ave., Philadelphia, PA 19111-5094, Attn: NPODS. TABLE 1 Chemical Composition Limits NOTE 1—When single units are shown, these indicate the maximum amounts permitted. NOTE 2—Analysis shall be made for the elements .