Technology Mapping Using Flow Networks

Supervised Research Exposition, Autumn 2011

By Uttam Sikaria (08D07042) Fourth Year Undergraduate, Microelectronics
Guide: Prof. Sachin Patkar Department of Electrical Engineering Indian Institute of Technology Bombay
Acknowledgment
I would like to thank my guide, Prof. Sachin Patkar, for supporting me on the academic and personal fronts during my seminar work. His constant encouragement, valuable suggestions and patience have played an instrumental role in this area of research. -Uttam Sikaria
Contents
Acknowledgment
Contents
List of Figures
Introduction
1 Technology Mapping
1.1 Technology Mapping in Lookup-Table based FPGA Architecture
1.2 Algorithms for LUT-based FPGA Mapping
2 FlowMap
2.1 Problem Formulation
2.2 Assumptions and Preliminaries
2.3 Difficulties
2.3.1 Monotone Clustering Constraint
2.4 The Algorithm
2.4.1 Labeling Phase
2.4.2 Mapping Phase
2.5 Area optimization
2.5.1 Maximising the Cut Volume During Mapping
2.5.2 Post processing Operations for K-LUT Reduction
3 Flow Networks
3.1 Flow
3.1.1 Capacity Rule
3.1.2 Conservation Rule
3.2 Maximum Flow
3.3 Flow and Cut
3.4 Augmenting Path and Flow Augmentation
3.5 Ford-Fulkerson's Algorithm for computing maximum flow
4 Conclusion
4.1 Complexity Analysis
4.2 Further Scope
Appendix I - Terminologies
Bibliography
List of Figures
Figure 1: Mapping a Boolean network to a 3-LUT network
Figure 2: Constraint on the number of inputs to LUT is not monotone (K=3)
Figure 3: Computing the label l(t) of node t (K=3). (a) partial network (b) construction of $N_t$ and the highest 3-feasible cut. (c) Determining l(t)
Figure 4: Network transformations in computing a minimum height K-feasible cut in $N_t$ (K=3)
Figure 5: Predecessor Packing
Figure 6: Gate decomposition
Figure 7: the flow-pack operation
Figure 8: A Flow Network. Numbers along the edges are their respective capacities
Figure 9: A Flow in a network. $|f| = 6$
Figure 10: Maximum Flow in a network
Figure 11: The flow across two different cuts is the same and equal to the flow in the network
Figure 12: Augmenting Path
Figure 13: Flow Augmentation
Introduction
The seminar work consists of two aspects: implementation and reading. As part of the implementation, a compilation of various VLSI CAD algorithms has been created to serve as a demonstration object and to help understand the algorithms better through actual execution on varied inputs. It covers calculation of the minimum sum of products for a Boolean function, conversion between the SoP and PoS forms of a Boolean function, creation of a BDD for a given Boolean function, complementing an SoP, a stuck-at-fault detector, and static hazards. As part of the reading work, I read about flow networks and their application in technology mapping. Specifically, Jason Cong's work in the field of delay-optimal mapping for FPGA based Boolean networks was explored. FlowMap, an algorithm for delay optimization in Lookup-Table based FPGA designs, was studied thoroughly. The rest of the report highlights the reading work carried out. The implementation work is software and needs no explanation herein.
1 Technology Mapping
Logic synthesis is often taken as a two-step process: technology independent optimization of a set of logic equations, followed by technology dependent mapping into a feasible circuit. The latter, technology mapping, caters to two essential aspects of logic implementation: minimizing the area of the circuit and satisfying the maximum critical path delay. It finishes the synthesis of the circuit by performing the final gate selection from a particular library. Technology mapping doesn't change the structure of the circuit radically; that is achieved by the preceding technology independent step. This separation simplifies the process of logic synthesis considerably. In general, a good technology mapping algorithm must:
1. Adapt easily to different libraries
2. Support irregular collections of logic functions
3. Handle detailed technology-dependent cost functions
4. Be time efficient
With the advent of FPGAs and their popularity in VLSI ASIC designs, the library of logic functions available is often well known. For instance, one popular FPGA design is based on Lookup-Tables (LUTs) and is used by several FPGA manufacturers including Xilinx and AT&T. This makes it advantageous to look for efficient and fast technology mapping algorithms meant specifically for LUT-based FPGAs. It also renders the first two requirements of a good technology mapping algorithm less important, at least for certain specific purposes, leaving us to concentrate on the other two.
1.1 Technology Mapping in Lookup-Table based FPGA Architecture

In an LUT-based FPGA chip, the basic programmable logic block is a K-input look-up table (K-LUT) which can implement any Boolean function of up to K variables. The technology mapping problem in LUT-based FPGA designs is to cover a general Boolean network (obtained by technology independent synthesis) using K-LUTs to obtain a functionally equivalent K-LUT network. Taking into consideration the requirements of a technology mapping algorithm, we need an algorithm which optimizes the critical path delay as well as the area of the chip. The first objective translates to obtaining a minimum depth K-LUT network, while the second translates to minimizing the number of K-LUTs used.
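To make the notion of a K-LUT concrete, the following is a minimal Python sketch that models a K-LUT as a stored truth table with 2^K entries, which is why it can implement any Boolean function of up to K variables. The names `make_lut` and `lut_eval`, and the choice of the first input as the most significant address bit, are illustrative assumptions and do not come from any FPGA vendor tool.

```python
from itertools import product

def make_lut(func, k):
    """Tabulate a Boolean function of k inputs into a 2**k-entry table."""
    return [int(bool(func(*bits))) for bits in product((0, 1), repeat=k)]

def lut_eval(table, bits):
    """Look up the entry addressed by the input bits (first input is the MSB)."""
    index = 0
    for b in bits:
        index = (index << 1) | b
    return table[index]

# A 3-LUT configured as a majority gate (hypothetical example function).
majority = make_lut(lambda a, b, c: (a & b) | (b & c) | (a & c), 3)
print(majority, lut_eval(majority, (1, 0, 1)))
```

Reconfiguring the block only changes the table contents, not its delay, which is consistent with treating every K-LUT as contributing the same delay regardless of the function it implements.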
1.2 Algorithms for LUT-based FPGA Mapping

For LUT-based FPGA mapping, there are broadly three approaches:
1. Minimize the number of LUTs in the mapping solution
2. Minimize the delay of the mapping solution
3. Maximize the routability of the mapping solution
Each of these approaches has seen the development of a good number of algorithms [see References]. Most of them are heuristic in nature, and how far they are from the optimal solution is difficult to determine. An important theoretical work in this field is the combined effort of Jason Cong and Yuzheng Ding, which resulted in an algorithm named FlowMap that solves the LUT-based FPGA technology mapping problem for depth minimization optimally, in polynomial time, for general K-bounded Boolean networks. Additionally, the algorithm effectively reduces the number of K-LUTs used in the solution. The conventional technology mapping problem in library-based designs is NP-hard for general Boolean networks. Due to this inherent difficulty, most conventional technology mapping algorithms decompose the input network into a forest of trees and then map each tree optimally. This methodology was also used in some LUT mapping algorithms before FlowMap, which instead carries out the K-LUT mapping directly on general K-bounded Boolean networks to achieve depth-optimal solutions.
2 FlowMap
FlowMap is arguably a major accomplishment in the field of technology mapping: while conventional library-based technology mapping is NP-hard for general Boolean networks, FlowMap finds a depth-optimal K-LUT mapping of a general K-bounded Boolean network in polynomial time. This chapter discusses the algorithm in detail.
2.1 Problem Formulation
The problem of technology mapping in K-LUT based FPGA design can best be described as covering a given K-bounded Boolean network with K-feasible cones, or K-LUTs. The solution is a Directed Acyclic Graph (DAG) where:
- Each node is a K-feasible cone (or K-LUT)
- An edge $(C_u, C_v)$ exists only if $u$ is in $input(C_v)$, where $C_v$ is the K-LUT rooted at $v$
The primary objective is to minimize the critical path delay by minimizing the depth of the resultant DAG. A secondary goal is to minimize the chip area of the solution by minimizing the number of K-LUTs used in the solution.
Figure 1: Mapping a Boolean network to a 3-LUT network
2.2 Assumptions and Preliminaries

The first and foremost assumption is that each programmable logic block in an FPGA is a K-input, 1-output Lookup Table. This is equivalent to saying that the technology library consists solely of K-LUTs. A K-LUT can implement any logic function of up to K inputs. Thus this is not a restriction on the library but rather a relaxation.
Next, the sources of delay are assumed to be two, viz. the propagation time of K-LUTs and the delay in interconnection paths. To simplify the model, a reasonable unit delay model is assumed:
- Each K-LUT contributes a constant delay, equal to its propagation time, independent of the function implemented by it
- Each edge or interconnection path contributes a constant delay, irrespective of how it is routed
Under this model, the delay of the circuit, determined by the critical path, depends solely on the depth of the mapped solution. Before discussing the algorithm, please note that all terminologies and definitions thereof (other than the most obvious ones in networks) have been included in Appendix I. The FlowMap algorithm is applicable only to K-bounded Boolean networks. This however is not a constraint, as any Boolean network can be transformed into a K-bounded Boolean network using Roth-Karp decomposition [2]. Alternatively, a network can always be transformed into a simple gate network by representing each complex gate in the sum-of-products form. Thereon, DMIG [3] can be used to decompose each multiple-input simple gate into a tree of two-input simple gates. Such a transformation arguably enables the mapping algorithm to pack more gates along critical paths into one K-LUT, resulting in smaller depths in the solution. Henceforth in the discussion, we shall assume the network to be K-bounded. Although a transformation into a network of 2-input simple gates is performed, the optimality of the solution doesn't rely on it. The solution is optimal as long as the network is K-bounded.
2.3 Difficulties
2.3.1 Monotone Clustering Constraint

A clustering constraint $\Gamma$ is monotone if a network H satisfying $\Gamma$ implies that any sub-network of H also satisfies $\Gamma$. For example, a constraint on the maximum number of gates in a programmable logic block is a monotone clustering constraint. Unfortunately, limiting the number of inputs of each logic block is not a monotone constraint (see Figure 2). For a monotone clustering constraint, clustering is easier. Lawler's labeling algorithm [4] produces a minimum depth clustering solution for a monotone constraint in polynomial time. The DAG-Map algorithm developed by Cong et al. [3] applied a modified version of Lawler's algorithm to K-LUT based FPGA mapping, achieving significant results, but it was not optimal.
Figure 2: Constraint on the number of inputs to LUT is not monotone (K=3)
2.4 The Algorithm
The FlowMap algorithm runs in two phases. In the first phase, it computes a label for each node reflecting the level of the K-LUT that implements it in an optimal solution. In the second phase, the mapping solution is generated based on the node labels computed in the first phase.
2.4.1 Labeling Phase

Let $N$ be a K-bounded Boolean network and $t$ the node to be labeled. Let $N_t$ denote the subnetwork consisting of $t$ and all its predecessors. The label of node $t$, denoted $l(t)$, is the depth of an optimal K-LUT mapping solution of $N_t$. The K-LUT covering $t$ (rooted at $t$ or otherwise) in an optimal solution of $N$ will have a level of $l(t)$ or higher. Moreover, the maximum of the labels of the primary outputs (POs) is the optimal depth for $N$. This phase computes the labels of all nodes in topological order, from the primary inputs (PIs) to the POs, so a node is processed only after all its predecessors have been. The labeling problem thus reduces to labeling the root node $t$ of the subnetwork $N_t$, given that all its other nodes are already labeled.

Figure 3: Computing the label l(t) of node t (K=3). (a) partial network (b) construction of $N_t$ and the highest 3-feasible cut. (c) Determining l(t)

In Figure 3(a), node $t$ is to be labeled. We modify $N_t$ by including an auxiliary node $s$ and connecting it to all PIs, so that it serves as the only source. $N_t$ now has one source $s$ and one sink $t$. Figure 3(b) shows the construction of the network $N_t$ rooted at $t$. Let LUT(t) in Figure 3(c) be the 3-LUT implementing $t$ in an optimal solution of $N_t$. Let $\bar{X}$ be the set of nodes in LUT(t) and $X$ the set of remaining nodes. Then $(X, \bar{X})$ forms a 3-feasible cut between $s$ and $t$. Let $u$ be the node with maximum label in $X$. The level of LUT(t) is then $l(u) + 1$ in the optimal mapping solution of $N_t$. Now, the height of the cut is $h(X, \bar{X}) = l(u)$. Therefore, minimizing the level of LUT(t) requires finding a minimum height K-feasible cut $(X, \bar{X})$ in $N_t$, and so

$$l(t) = \min_{(X, \bar{X})\ \text{K-feasible}} h(X, \bar{X}) + 1$$

The label computed thus is the minimum depth of any mapping solution of $N_t$. In Figure 3(b), we have a minimum height 3-feasible cut in $N_t$ of height 1, and so we have $l(t) = 2$. In the preceding discussion, we have essentially reduced the problem of labeling the nodes to that of finding a minimum height K-feasible cut. An important contribution of Cong et al. through FlowMap is an $O(Km)$ time algorithm for finding the minimum height K-feasible cut in $N_t$, where $m$ is the number of edges in $N_t$.

An important property of the label thus obtained is that $p \le l(t) \le p + 1$, where $p$ is the maximum label of the nodes in $input(t)$. A rigorous proof can be found in [1]. Intuitively, for every input, node $t$ can either belong to the same LUT as the input or to the successive LUT. Hence the label of a node is greater than or equal to the maximum label of its inputs. The worst case is when $t$ is implemented in the LUT following that of the maximum-label input, hence the upper bound on $l(t)$ is $p + 1$.

So, to find the minimum height K-feasible cut, it suffices to check for the existence of a K-feasible cut of height $p - 1$ in $N_t$. If there is such a cut $(X, \bar{X})$, we assign $l(t) = p$, and one K-LUT is used for the entire $\bar{X}$. Otherwise we assign $l(t) = p + 1$; the minimum height cut (of height $p$) is then $(N_t - \{t\}, \{t\})$, and a new K-LUT is used for node $t$ alone.

Whether or not there is a K-feasible cut of height $p - 1$ can be tested as follows. Recall that $p$ is the maximum label among the nodes of $N_t - \{t\}$ (which, since labels never decrease along an edge, is also the maximum label in $input(t)$). We apply a network transformation on $N_t$ that collapses all the nodes in $N_t$ with label $p$, together with $t$, into a single sink $t'$. This gives a modified network $N'_t$. Now, it can be seen that $N_t$ has a K-feasible cut of height $p - 1$ iff $N'_t$ has a K-feasible cut. The transformation ensures that $N'_t$ admits only those cuts in which all nodes with label $p$ are grouped into $\bar{X}$. Thus any cut in $N'_t$ has a corresponding cut of height at most $p - 1$ in $N_t$. Figures 4(a) and 4(b) respectively show $N_t$ and $N'_t$.

Further, to determine a K-feasible cut in $N'_t$, we apply another standard network transformation, called the node-splitting transformation, to obtain $N''_t$, thus reducing the node cut-size constraint to an edge cut-size constraint. For each node $v$ in $N'_t$ other than $s$ or $t'$, we introduce two nodes $v_1$ and $v_2$ connected by an edge of capacity 1. All other edges are given a capacity of $\infty$. Figure 4(c) shows $N''_t$ corresponding to $N_t$ and $N'_t$ in Figures 4(a) and 4(b). Now, a K-feasible cut in $N'_t$ corresponds to a cut in $N''_t$ with edge cut-size no more than K. In fact, with edge capacities as mentioned above, this further reduces to whether the maximum flow from $s$ to $t'$ in $N''_t$ is of value K or smaller.
Figure 4: Network transformations in computing a minimum height K-feasible cut in (K=3)
Thus the labeling phase in the FlowMap algorithm reduces to the well-established problem of finding the maximum flow in a flow network. FlowMap uses the augmenting path algorithm to compute the maximum flow.
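The K-feasible cut test described in this section can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation from [1]: the function name `has_k_feasible_cut`, the edge-list input format, and the example network are hypothetical, and the infinite capacities are replaced by K+1, which is sufficient because we only need to know whether the maximum flow exceeds K.

```python
from collections import deque

def has_k_feasible_cut(edges, source, sink, k):
    """Return True if some set of at most k nodes separates source from sink.

    `edges` lists the directed edges (u, v) of the collapsed network.
    """
    big = k + 1      # stands in for infinite capacity when testing flow <= k
    cap, adj = {}, {}

    def add_edge(a, b, c):
        cap[(a, b)] = cap.get((a, b), 0) + c
        cap.setdefault((b, a), 0)          # reverse residual edge
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def node_in(v):
        return v if v in (source, sink) else (v, "in")

    def node_out(v):
        return v if v in (source, sink) else (v, "out")

    # Node splitting: each internal node becomes a bridging edge of capacity 1.
    for v in {x for e in edges for x in e} - {source, sink}:
        add_edge((v, "in"), (v, "out"), 1)
    for u, v in edges:
        add_edge(node_out(u), node_in(v), big)

    flow = 0
    while flow <= k:
        # Breadth-first search for an augmenting path in the residual graph.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            a = queue.popleft()
            for b in adj.get(a, ()):
                if b not in parent and cap[(a, b)] > 0:
                    parent[b] = a
                    queue.append(b)
        if sink not in parent:
            return True                    # maximum flow reached, value <= k
        path, b = [], sink
        while parent[b] is not None:
            path.append((parent[b], b))
            b = parent[b]
        delta = min(cap[e] for e in path)
        for a, b in path:
            cap[(a, b)] -= delta
            cap[(b, a)] += delta
        flow += delta
    return False                           # more than k units can be pushed

# Hypothetical collapsed network: PIs a, b, c, d feed gates g1 and g2, which feed t.
example = [("s", "a"), ("s", "b"), ("s", "c"), ("s", "d"),
           ("a", "g1"), ("b", "g1"), ("c", "g2"), ("d", "g2"),
           ("g1", "t"), ("g2", "t")]
print(has_k_feasible_cut(example, "s", "t", 2))
```

In this example the nodes g1 and g2 form a node cut of size 2, so the test succeeds for K=2 and fails for K=1. Stopping as soon as the flow exceeds K is what bounds the number of augmentations by K+1.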
2.4.2 Mapping Phase

This phase generates the K-LUTs in the optimal mapping solution from the node labels calculated in the labeling phase. A set $L$ of nodes of the network which are to be implemented using K-LUTs is maintained. Initially, $L$ contains all the PO nodes. The nodes in $L$ are processed one by one in the following manner. For each node $v \in L$, let $(X_v, \bar{X}_v)$ be the minimum height K-feasible cut in $N_v$ found in the labeling phase. A K-LUT $v'$ is generated to implement the function of gate $v$, using as inputs the nodes of $X_v$ that feed $\bar{X}_v$, i.e. $input(v') = input(\bar{X}_v)$. That is, the K-LUT $v'$ includes all nodes in $\bar{X}_v$. Finally, we update $L$ to be $(L - \{v\}) \cup input(v')$. This process is repeated for each node in $L$ until $L$ consists of only the PI nodes.
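The worklist structure of the mapping phase can be sketched as below. This is only an illustrative Python sketch: it assumes the cuts from the labeling phase are already available in a dictionary `cut_of` (a hypothetical name) mapping each node to the set of nodes packed with it and the set of inputs of the resulting K-LUT, and `generate_luts` is likewise a name chosen here for illustration.

```python
def generate_luts(primary_outputs, primary_inputs, cut_of):
    """Generate K-LUTs from the outputs back towards the inputs.

    cut_of[v] = (x_bar, inputs): the nodes covered by the K-LUT rooted at v
    and the nodes feeding that K-LUT (assumed precomputed during labeling).
    """
    pending = [v for v in primary_outputs if v not in primary_inputs]
    luts = {}
    while pending:
        v = pending.pop()
        if v in luts:
            continue                      # a K-LUT rooted at v already exists
        x_bar, inputs = cut_of[v]
        luts[v] = {"covers": set(x_bar), "inputs": set(inputs)}
        # Every non-PI input of the new K-LUT must itself be implemented.
        for u in inputs:
            if u not in primary_inputs and u not in luts:
                pending.append(u)
    return luts

# Hypothetical cuts for K=3: t is packed with g2, while g1 gets its own K-LUT.
cuts = {
    "t": ({"t", "g2"}, {"g1", "c", "d"}),
    "g1": ({"g1"}, {"a", "b"}),
    "g2": ({"g2"}, {"c", "d"}),
}
luts = generate_luts(["t"], {"a", "b", "c", "d"}, cuts)
print(sorted(luts))
```

Note that no K-LUT is generated for g2 here, because it is absorbed into the K-LUT rooted at t and never appears as an input of another K-LUT.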
2.5 Area optimization
The secondary objective of the algorithm is area optimization i.e. to minimize the number of K-LUTs in the mapping solution. This is done by maximizing the volume of each cut during the mapping process and by post-processing operations for K-LUT reduction.
2.5.1 Maximising the Cut Volume During Mapping

Suppose $(X_v, \bar{X}_v)$ is the minimum height K-feasible cut in $N_v$. Then the nodes in $\bar{X}_v$ will be packed into a K-LUT $v'$ if a K-LUT is generated to implement $v$. In general, this minimum height K-feasible cut is not unique. Intuitively, the larger $|\bar{X}_v|$ is, the more nodes we can pack into the K-LUT, and hence fewer K-LUTs are used in total. It can be easily deduced that the volume $|\bar{X}_v|$ is maximized when the volume $|\bar{X}''_v|$ is maximized, where $(X''_v, \bar{X}''_v)$ is the corresponding cut in $N''_v$. Thus we need to find a cut $(X''_v, \bar{X}''_v)$ in $N''_v$ whose edge cut-size is at most K and whose volume is maximum. In other words, we want a min-cut in $N''_v$ of the maximum volume. It turns out that there is a unique maximum volume min-cut in any network, and this shall be our choice of $(X''_v, \bar{X}''_v)$. A proof of its existence and uniqueness can be found in [1].
2.5.2 Post processing Operations for K-LUT Reduction

Two depth-preserving operations have been used in [3] to minimize the number of K-LUTs in the mapping solutions of DAG-Map. One is called predecessor packing. If a K-LUT $v$ has a fanin K-LUT $u$, $u$ is fanout-free and $|input(u) \cup (input(v) - \{u\})| \le K$, then $u$ can be merged into $v$. See Figure 5 for an example. Another operation is gate decomposition. If a K-LUT $v$ has two fanin K-LUTs $u_1$ and $u_2$, both are fanout-free and $|input(u_1) \cup input(u_2)| \le K$, then $u_1$ and $u_2$ can be merged into one K-LUT $u$ that has a single fanout to $v$, by carrying out the Roth-Karp decomposition on $v$ w.r.t. its inputs $u_1$ and $u_2$. See Figure 6 for an example.
A more general form of predecessor packing is that, instead of packing $v$ with one of its fanins into a K-LUT, we try to pack a set of its predecessors (including $v$), denoted $P_v$, into a single K-LUT. This is called flow-pack and is illustrated in Figure 7. Let $N$ be the current mapping solution and $N_v$ be the subnetwork of $N$ consisting of $v$ and its predecessors. Then $P_v$ can be packed into a single K-LUT iff $(N_v - P_v, P_v)$ forms a K-feasible cut in $N_v$. The larger $|P_v|$ is, the more K-LUTs are saved. Hence, a maximum volume K-feasible cut is wanted in $N_v$. However, using the maximum volume min-cut is less effective here, as the K-LUT network produced by FlowMap is much denser than the original 2-input simple gate network, so $N_v$ has a unique min-cut in many cases. Hence a larger K-feasible cut, not necessarily a min-cut, is required; [1] provides a heuristic algorithm to find one. Thus FlowMap can be enhanced further to reduce the number of K-LUTs used and thereby achieve area optimization.
Figure 5: Predecessor Packing
Figure 6: Gate decomposition
Figure 7: the flow-pack operation
3 Flow Networks
A flow network is a directed graph where each edge has a capacity and each edge receives a flow. The amount of flow on an edge cannot exceed the capacity of the edge. A flow network consists of:
- A weighted directed graph G with non-negative integer weights, where the weight c(e) of an edge e is called its capacity
- Two distinct vertices S and T, the source and sink of the network respectively
Figure 8: A Flow Network. Numbers along the edges are their respective capacities
3.1 Flow
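A flow $f$ for a network is an integer assignment $f(e)$ to each edge $e$ such that it satisfies the following two rules.
3.1.1 Capacity Rule

For each edge $e$: $0 \le f(e) \le c(e)$
3.1.2 Conservation Rule

For each vertex $v$ other than the source and the sink: $\sum_{e \in E^-(v)} f(e) = \sum_{e \in E^+(v)} f(e)$, where $E^-(v)$ and $E^+(v)$ are the sets of incoming and outgoing edges of $v$ respectively.
The value of a flow $f$, denoted $|f|$, is the total flow from the source or, equivalently, the total flow into the sink.
Figure 9: A Flow in a network. $|f| = 6$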
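The two rules above can be checked mechanically. The following Python sketch does so for a small hypothetical network (it is not the network of Figure 9, whose edge list is not reproduced here); the function names `is_valid_flow` and `flow_value` and the dictionary representation keyed by `(u, v)` edges are assumptions made for illustration.

```python
def is_valid_flow(capacity, flow, source, sink):
    """Check the capacity rule and the conservation rule for a flow."""
    # Capacity rule: 0 <= f(e) <= c(e) on every edge, and no flow off-network.
    if any(e not in capacity for e in flow):
        return False
    if any(not 0 <= flow.get(e, 0) <= c for e, c in capacity.items()):
        return False
    # Conservation rule: inflow equals outflow at every internal vertex.
    balance = {}
    for (u, v), f in flow.items():
        balance[u] = balance.get(u, 0) - f
        balance[v] = balance.get(v, 0) + f
    return all(b == 0 for x, b in balance.items() if x not in (source, sink))

def flow_value(flow, source):
    """Net flow leaving the source, i.e. the value |f| of the flow."""
    out_flow = sum(f for (u, _), f in flow.items() if u == source)
    in_flow = sum(f for (_, v), f in flow.items() if v == source)
    return out_flow - in_flow

# Hypothetical network and flow.
capacity = {("s", "u"): 3, ("s", "v"): 2, ("u", "v"): 1,
            ("u", "t"): 2, ("v", "t"): 3}
flow = {("s", "u"): 3, ("s", "v"): 1, ("u", "v"): 1,
        ("u", "t"): 2, ("v", "t"): 2}
print(is_valid_flow(capacity, flow, "s", "t"), flow_value(flow, "s"))
```

In this example the total flow leaving s (3 + 1) equals the total flow entering t (2 + 2), which illustrates the equivalence stated in the definition of the value of a flow.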
3.2 Maximum Flow
A flow for a network N is said to be maximum if its value is the largest of all flows for N. Figure 9 shows an example of a flow while Figure 10 gives maximum flow of the same flow network.
Figure 10: Maximum Flow in a network
3.3 Flow and Cut

Across any cut $\chi$, the flow $f(\chi)$ is the same as the flow in the network, i.e. $f(\chi) = |f|$. Also, the capacity of the cut is given by $c(\chi)$, the total capacity of its forward edges. By the fundamental requirement of a flow (the capacity rule):
$$f(\chi) \le c(\chi)$$
Thus we have the following theorem: the value of any flow, $|f|$, is less than or equal to the capacity of any cut of the network.
Figure 11: The flow across two different cuts is the same and equal to the flow in the network
3.4 Augmenting Path and Flow Augmentation

Consider a flow $f$ and an edge $e$ from $u$ to $v$.
- Residual capacity of edge $e$ from $u$ to $v$: $\Delta_f(u, v) = c(e) - f(e)$
- Residual capacity of $e$ from $v$ to $u$: $\Delta_f(v, u) = f(e)$
- Residual capacity of a path $\pi$ from $s$ to $t$: $\Delta_f(\pi) = \min \{ \Delta_f(u, v) : (u, v) \text{ is traversed by } \pi \}$, the smallest residual capacity of its edges in the direction from $s$ to $t$
A path $\pi$ from $s$ to $t$ is an augmenting path if $\Delta_f(\pi) > 0$.
Figure 12: Augmenting Path
Let $\pi$ be an augmenting path for flow $f$ in network N. There exists a flow $f'$ for N of value
$$|f'| = |f| + \Delta_f(\pi)$$
Figure 13 shows the flow augmentation of the example in Figure 12.
Figure 13: Flow Augmentation
3.5 Ford-Fulkerson's Algorithm for computing maximum flow

The algorithm calculates the maximum flow by repeatedly carrying out flow augmentation until there is no augmenting path left. The algorithm outline is:
- Initially, $f(e) = 0$ for each edge $e$
- Repeatedly:
  - Search for an augmenting path $\pi$
  - Augment the flow along the edges of $\pi$ by its residual capacity $\Delta_f(\pi)$
A specialization of DFS (or BFS) searches for an augmenting path: an edge $e$ is traversed from $u$ to $v$ provided its residual capacity $\Delta_f(u, v)$ is greater than 0.
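The outline above can be sketched in Python as follows. This is a minimal illustration, assuming integer capacities given as a dictionary keyed by `(u, v)` edges; the function name `max_flow` and the example network are hypothetical, and the augmenting path search uses BFS (one of the two options mentioned above).

```python
from collections import deque

def max_flow(capacity, source, sink):
    """Ford-Fulkerson with BFS path search; returns the maximum flow value."""
    residual, neighbours = {}, {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)     # residual capacity from v back to u
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    value = 0
    while True:
        # Search for an augmenting path, using only edges with residual > 0.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbours.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return value                   # no augmenting path is left
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        # Residual capacity of the path: its smallest edge residual capacity.
        delta = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= delta
            residual[(v, u)] += delta
        value += delta

# Hypothetical network.
capacity = {("s", "u"): 3, ("s", "v"): 2, ("u", "v"): 1,
            ("u", "t"): 2, ("v", "t"): 3}
print(max_flow(capacity, "s", "t"))
```

For this network the cut that isolates s has capacity 3 + 2 = 5, and the algorithm finds a flow of the same value, so by the theorem of Section 3.3 that flow is maximum.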
4 Conclusion
FlowMap achieves technology mapping for depth minimization in LUT-based FPGA designs quite efficiently, in polynomial time. To get a fair idea of how fast the algorithm works, we carry out a brief complexity analysis.
4.1 Complexity Analysis
Let the network have $m$ edges and $n$ nodes.
Ford-Fulkerson's algorithm takes $O(|f^*| \cdot m)$ time, where $|f^*|$ is the value of the maximum flow.
Since we need to find a flow of no more than K, labeling of each node takes $O(Km)$ time.
Since there are $n$ nodes, the FlowMap algorithm arrives at the optimal technology mapping in $O(Kmn)$ time. Thus the FlowMap algorithm gives an optimal mapping for delay optimization in LUT-based FPGAs in polynomial time.
4.2 Further Scope
In 2.2 we assumed a unit delay model for the Boolean Network. One immediate extension of the FlowMap algorithm could be to use a more general delay model other than the unit delay model. For example [5] exhibits use of nominal delay model in FPGA designs where the interconnection delay of a single net is estimated by the number of fanouts of the net. The publication shows that this model estimates the interconnection delay quite well. Another possible extension is to combine area and depth optimization in the mapping procedure. In a FlowMap solution, depth of every node is minimum while only depths of nodes on critical path need to be minimized. This slack of non-critical node depths can be exploited for area minimization without affecting the depth optimality through certain delay relaxation operations on non-critical nodes [6].
Appendix I - Terminologies
- K-feasible cone at a node $v$, $C_v$, is a subgraph containing $v$ and some of its predecessors such that $|input(C_v)| \le K$. Also, any path connecting a node in $C_v$ to $v$ should lie entirely in $C_v$
- Level of a node is the length of the longest path to the node from any primary input
- Depth of a network is the largest node level
- K-bounded Boolean network is one where $|input(v)| \le K$ for every node $v$
- Height of a cut $(X, \bar{X})$ is the maximum node label in $X$
- Node cut-size of a cut $(X, \bar{X})$ is the number of nodes in $X$ that are adjacent to some node in $\bar{X}$, i.e. $|\{ x : (x, y) \in E(N), x \in X, y \in \bar{X} \}|$, where $E(N)$ is the edge set of the network N
- K-feasible cut is a cut with node cut-size less than or equal to K
- Edge cut-size of a cut is the sum of the capacities of its forward edges
- Volume of a cut $(X, \bar{X})$ is the number of nodes in $\bar{X}$
Bibliography
[1] J. Cong and Y. Ding (1994). FlowMap: An Optimal Technology Mapping Algorithm for Delay Optimization in Lookup-Table Based FPGA Designs. IEEE Trans.
[2] J. P. Roth and R. M. Karp (1962). Minimization over Boolean Graphs. IBM J. Res. Devel.
[3] K. C. Chen, J. Cong, Y. Ding, A. B. Kahng, and P. Trajmar (1992). DAG-Map: Graph-based FPGA technology mapping for delay optimization. IEEE Design and Test of Computers
[4] E. L. Lawler, K. N. Levitt, and J. Turner (1969). Module clustering to minimize digital networks. IEEE Trans. Computers
[5] M. Schlag, P. Chan and J. Kong (1991). Empirical evaluation of multilevel logic minimization tools for a field programmable gate array technology. Proc. 1st Int. Workshop on Field Programmable Logic and Applications
[6] J. Cong and Y. Ding (1993). On area/depth trade-off in LUT-based FPGA technology mapping. Proc. 30th ACM/IEEE Design Automation Conf.
