CHIP: Clustering Hotspots in Layout Using Integer Programming

Clustering algorithms have been explored in recent years to solve hotspot clustering problems in integrated circuit design. With various applications in the design for manufacturability flow such as hotspot library generation, systematic yield optimization, and design space exploration, generating good quality clusters along with their representative clips is of utmost importance. With several generic clustering algorithms at our disposal, hotspots can be clustered based on a defined distance metric while satisfying some tolerance conditions. However, the clusters generated by generic clustering algorithms need not be optimal. In this paper, we introduce two optimal integer linear programming formulations based on the triangle inequality to solve the problem of minimizing cluster count while satisfying given constraints. Apart from minimizing cluster count, we generate representative clips that best represent the clusters formed. For both formulations, we achieve a better cluster count in most test cases compared to the results published in the literature on the ICCAD 2016 contest benchmarks as well as the reference results reported on the ICCAD 2016 contest website.


CHAPTER 1. OVERVIEW
As the feature size decreases rapidly, manufacturability problems in integrated circuits increase because of the limited lithographic wavelength used during the fabrication stage. These problems, identified as hotspots, are sets of problematic patterns in the layout that have printing issues. They are detected either with traditional lithographic simulations or with the machine learning based detection methods proposed in recent years. When such defects are found, finding patterns of a similar kind is of high interest. It becomes useful to cluster these clips of interest into groups and process them together. This is called layout pattern classification Topaloglu (2016).
There are several other works such as Ding et al. (2009); Wuu et al. (2011); Yu et al. (2012, 2013) which focus on hotspot detection frameworks, whereas hotspot clustering has rarely been explored, although it plays an important role in various applications in the design for manufacturability flow. In previous work Chang et al. (2017); Ma (2009); Tam and Blanton (2015), a few generic clustering algorithms such as k-means Tam and Blanton (2015), hierarchical and incremental clustering Ma (2009), and Markov clustering v. Dongen (2000); Chang et al. (2017) were explored to solve this problem. In k-means clustering, the value of k needs to be provided by the user, but the user may not know the cluster count a priori. Therefore it does not serve the purpose of finding good quality clusters automatically. In the hierarchical clustering algorithm, starting from each data point as a cluster, the data is hierarchically grouped together based on different types of linkages. Since hierarchical clustering finds groups of data in a hierarchical manner, obtaining the final clusters again depends on the user. In incremental clustering, in the order of processing data, either new clusters are created or existing clusters are grown incrementally.
This algorithm depends on the order of processing data, and therefore doesn't produce good quality solutions. Markov Clustering v. Dongen (2000) is known to find good quality clusters in a short time, but the clustering depends on fine tuning several parameters in the algorithm.
There are several other clustering algorithms in the literature that can cluster any kind of data; however, the problem formulations of those algorithms differ from that of the hotspot clustering problem. Therefore, a post-processing step is required when using those algorithms to regroup clusters so that the given constraints are satisfied.
In this report, we discuss our tool, called CHIP, which solves the given hotspot clustering problem optimally. We formulate two integer linear programs to solve for the optimal number of clusters, i.e., the objective of both formulations is to minimize cluster count. With some tolerance given by the area constraint or the edge constraint, our tool classifies the given clips into clusters without assuming that the representative clips must come from the given data set. Since the representative clip is not required to be one of the given clips, we generate the representative clip based on the cluster data and the tolerance provided. This framework can achieve optimal cluster count while satisfying the constraints, as per the results on ICCAD 2016 Contest Problem C: Pattern Classification for Integrated Circuit Design Space Analysis Topaloglu (2016).
The report is further organized as follows: In chapter 2 we describe the problem, define the terminology and elaborate the two modes in clustering. In chapter 3 we discuss the overview of our tool flow. In chapter 4 we define our integer linear programming formulations which exactly represent the problem statement and in chapter 5, the framework to generate representative clips is elaborated. Further, in chapter 6 we report the results of the formulation and compare with existing algorithms. We conclude the work in chapter 7.
CHAPTER 2. PROBLEM DESCRIPTION

Overview
This problem is taken from the ICCAD 2016 Contest, Problem C. Given a GDS file with markers, the clip size, and the constraints as inputs, the hotspot classification tool has to cluster the clips formed around the markers and output the corresponding cluster identities and a set of representative clips which represent the clusters. There are two types of constraints given to the tool, i.e., the area constraint (a) and the edge constraint (e). Based on the type of constraint, the tool has to perform clustering in the respective mode. The tool takes either the area constraint or the edge constraint as input, but not both. The representative clip of a cluster is similar to all its clips, where the degree of similarity is constrained by a tolerance parameter given as input. For practical purposes, a representative clip can be chosen from the existing clips of each cluster, but it need not necessarily exist in the layout.
Additional Specifications: Mirroring of clips is allowed, i.e., 180° rotation about the axes passing through the clip's center. Therefore, there are 4 possible orientations for each clip. This is depicted in Figure 2.3. Also, since the clip's center need not be at the center of the marker, clip shifting can be performed to generate a set of clips for one marker. A sample set of possible clips obtained by shifting a clip's center is depicted in Figure 2.4. In this work, for simplicity, we consider the center of the clip to be at the center of the marker, i.e., we only consider clip (a) in Figure 2.4.
In general, clustering algorithms require a pairwise similarity relation between data points in order to group the data into clusters. The pairwise distance between data points is one way to establish the similarity measure, i.e., the greater the distance, the greater the dissimilarity. Different types of distances are used for different applications, such as the L1-norm for images, the L2-norm for any d-dimensional set of points, the Hamming distance between two strings, etc.
In hotspot classification, for any two clips $x_1$ and $x_2$, XOR($x_1$, $x_2$) produces a clip which depicts the dissimilarity between the two given clips. Further, based on the two constraints (area constrained clustering and edge constrained clustering), the distance metric is defined for each mode by imposing the respective constraint on the resultant clip. In the following sections, the two constraint based clustering modes are explained in detail.

Area Constrained Clustering
In area constrained clustering (ACC), the distance metric is computed based on the area of the clip resulting from the exclusive OR operation applied to two clips $x_1$ and $x_2$, i.e.,
$$D(x_1, x_2) = \mathrm{Area}(\mathrm{XOR}(x_1, x_2))$$
Given this distance function between a pair of clips, ACC constrains the distance between any clip S in a cluster and its representative clip R as follows:
$$D(S, R) \le (1 - a) \times w \times h$$
where $w \times h$ is the area of the clip and $0 \le a \le 1$. Here, a is the parameter given to the tool which constrains the distance between the clips.
If a = 1, the tool has to perform exact clip matching. For practical purposes a is close to 1.
This constraint does not force two clips to be clustered together if they satisfy it. However, if two clips do not satisfy the constraint, then they cannot be clustered together.
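The ACC test can be illustrated with a minimal sketch. It assumes, purely for illustration, that each clip is rasterized into a binary pixel grid with unit-area pixels; the function names `xor_area` and `satisfies_acc` are hypothetical and are not part of the tool described in this report.

```python
def xor_area(clip_a, clip_b):
    """Count pixels filled in exactly one of two equally sized binary rasters."""
    return sum(pa != pb
               for row_a, row_b in zip(clip_a, clip_b)
               for pa, pb in zip(row_a, row_b))


def satisfies_acc(clip, representative, a):
    """Check D(S, R) <= (1 - a) * w * h, assuming unit-area pixels."""
    h = len(clip)
    w = len(clip[0])
    return xor_area(clip, representative) <= (1 - a) * w * h


# Two 4 x 2 clips that differ in a single pixel (illustrative data).
S = [[1, 1, 0, 0],
     [1, 1, 0, 0]]
R = [[1, 1, 1, 0],
     [1, 1, 0, 0]]

print(xor_area(S, R))             # one differing pixel
print(satisfies_acc(S, R, 0.85))  # tolerance is 0.15 * 8 = 1.2 pixels
print(satisfies_acc(S, R, 0.90))  # tolerance is 0.10 * 8 = 0.8 pixels
```

With a = 1 the tolerance is zero, so only an identical clip passes the check, which matches the exact-matching case described above.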

Edge Constrained Clustering
In edge constrained clustering (ECC), the distance between two clips $x_1$ and $x_2$ is given by the maximum shift along an edge, either inward or outward, in clip $x_1$ with respect to clip $x_2$, i.e., if $e_i$ is the $i$-th shift along one edge out of all possible edge shifts in clip $x_1$ with respect to clip $x_2$, then
$$D(x_1, x_2) = \max(e_1, e_2, \ldots)$$
For any clip S in a cluster and its representative clip R, ECC requires the following to be satisfied:
$$D(S, R) \le e$$
where e is given as a parameter. Here e is a nonnegative real number. For practical purposes e is close to 0. Similar to ACC, ECC does not force the clips to be clustered together if they satisfy the constraint. If the clips do not satisfy the constraint, they cannot be clustered together.

CHAPTER 3. OVERVIEW OF THE TOOL
The proposed tool flow is discussed in this chapter. Figure 3.1 shows our proposed tool flow with the steps. In layout data processing step, we convert all the polygons into rectangles for easier data processing. We then handle the layout data (in rectangles) using a grid structure in order to speed up the process of clip extraction. In distance computation step, we reorient all clips in a canonical way to consider mirroring of the clips. Exact pattern matching is performed to reduce data size and therefore redundant computations are avoided in the subsequent steps. Then we compute the pairwise distances between these reduced data according to the constraint type. Using this distance matrix (D) and given tolerance (D c , which is determined by either a or e depending on the constraint type), in clustering step, an optimizer is called to solve the optimization problem based on the formulations discussed in Chapter 4, and arrive at an optimal solution along with the cluster indices. Further, since we assume each cluster need not have its representative amongst given data, we use ILP formulation again to search feasible solution space to generate the representative clip. The following sections and chapters discuss each step in our proposed flow in detail.

Layout Data Processing
In this step, the polygons are first converted into rectangles using a standard algorithm. Note that this conversion need not be optimal in nature. Then, the entire layout is divided into a grid structure where each grid is of width w and height h, as shown in Figure 3.2. With this grid structure, the rectangles overlapping each grid are stored in a data structure. While extracting the clip for a given marker, we use the information stored in the data structure to take the relevant rectangles to form the clip. This avoids scanning all rectangles and finding the intersection between them and the clips of interest. This is illustrated in Figure 3.2. Since a clip has the same size as a grid, at most 4 grids, and correspondingly the rectangles present in them, are scanned for any clip to be extracted.
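The grid lookup described above can be sketched as follows. This is a simplified illustration, assuming integer layout coordinates and rectangles given as (x1, y1, x2, y2) tuples; the names `cells_for_window`, `build_grid_index` and `extract_clip` are hypothetical and do not come from the tool itself.

```python
from collections import defaultdict


def cells_for_window(x1, y1, x2, y2, w, h):
    """Grid cells (col, row) overlapped by the half-open box [x1, x2) x [y1, y2)."""
    return [(c, r)
            for c in range(x1 // w, (x2 - 1) // w + 1)
            for r in range(y1 // h, (y2 - 1) // h + 1)]


def build_grid_index(rects, w, h):
    """Store every rectangle under each w x h grid cell it overlaps."""
    index = defaultdict(list)
    for rect in rects:
        for cell in cells_for_window(*rect, w, h):
            index[cell].append(rect)
    return index


def extract_clip(index, cx, cy, w, h):
    """Collect rectangles inside the w x h window centered at (cx, cy),
    cropped to the window and expressed in window-local coordinates."""
    wx1, wy1 = cx - w // 2, cy - h // 2
    wx2, wy2 = wx1 + w, wy1 + h
    seen, clip = set(), []
    for cell in cells_for_window(wx1, wy1, wx2, wy2, w, h):  # at most 4 cells
        for rect in index.get(cell, []):
            if rect in seen:
                continue
            seen.add(rect)
            x1, y1 = max(rect[0], wx1), max(rect[1], wy1)
            x2, y2 = min(rect[2], wx2), min(rect[3], wy2)
            if x1 < x2 and y1 < y2:  # keep only a non-empty overlap
                clip.append((x1 - wx1, y1 - wy1, x2 - wx1, y2 - wy1))
    return clip


rects = [(0, 0, 5, 5), (8, 8, 12, 12), (25, 25, 30, 30)]
index = build_grid_index(rects, 10, 10)
print(extract_clip(index, 10, 10, 10, 10))
```

Because the window has the same size as a grid cell, it can straddle at most two columns and two rows, which is why only four buckets are ever visited per marker.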

Reorientation
Since we consider reflections along the x-axis, the y-axis, or both, in this step, before computing distances between the clips based on the area or edge constraint, we reorient the clips in a canonical way. We compute the center of mass (COM) of a given clip and divide the clip into 4 quadrants. Here the center of mass metric is defined as follows: Let $a_i$ be a clip which is mapped to an $\mathbb{R}^2$ space with $w \times h$ data points, with the range $-w/2$ to $w/2$ on the x-axis, $-h/2$ to $h/2$ on the y-axis, and the center of the clip at (0,0). With this mapping, if there is a pixel at (x,y) then its value is 1, i.e., $a_i(x, y) = 1$, and 0 otherwise.
Let $(x_c, y_c)$ represent the center of mass in this notation. Therefore,
$$x_c = \frac{\sum_{x,y} x \, a_i(x, y)}{\sum_{x,y} a_i(x, y)}, \qquad y_c = \frac{\sum_{x,y} y \, a_i(x, y)}{\sum_{x,y} a_i(x, y)}$$
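A small sketch of COM-based reorientation on a binary raster is given below. The canonical rule used here, mirroring the clip until its center of mass lies in the quadrant with non-negative x and y, is an assumption made for illustration; the text above does not state which quadrant is targeted, and the names `center_of_mass` and `reorient` are hypothetical.

```python
def center_of_mass(clip):
    """COM of the filled pixels, with the clip center at (0, 0)."""
    h, w = len(clip), len(clip[0])
    total = sx = sy = 0.0
    for row in range(h):
        for col in range(w):
            if clip[row][col]:
                total += 1
                sx += col + 0.5 - w / 2  # pixel-center x coordinate
                sy += row + 0.5 - h / 2  # pixel-center y coordinate
    if total == 0:
        return 0.0, 0.0
    return sx / total, sy / total


def reorient(clip):
    """Mirror the clip so that its COM has non-negative coordinates
    (assumed canonical quadrant)."""
    xc, yc = center_of_mass(clip)
    if xc < 0:
        clip = [row[::-1] for row in clip]  # mirror about the y-axis
    if yc < 0:
        clip = clip[::-1]                   # mirror about the x-axis
    return clip


base = [[1, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]]
variants = [base,
            [row[::-1] for row in base],
            base[::-1],
            [row[::-1] for row in base[::-1]]]
print(all(reorient(v) == reorient(base) for v in variants))
```

All four mirrored variants of a clip map to the same raster, so later comparisons need only one orientation per clip. A COM coordinate of exactly zero leaves the orientation ambiguous; this sketch does not resolve such ties.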

Clip matching
Once the clips are reoriented in a canonical way, the clip matching step is performed in order to merge identical clips in the given data. In an IC with millions of gates, identical patterns are very likely to occur in the layout, and hence this step reduces the amount of data to be processed. Exact clip matching can be performed with pattern matching algorithms or by string comparison if each clip is encoded into a string as proposed in Yu et al. (2015). In this work, exact clip matching is performed in two levels. First, the given data is divided into different bins, where a bin contains all the clips of the same area. Then, the clips in each of the bins are iterated through, with new clusters formed whenever there is a mismatch with the existing clusters in the bin, i.e., incremental clustering is performed, where two or more clips are clustered together if the pairwise distance between them is zero.
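The two-level matching can be sketched as below, assuming for illustration that the clips are already reoriented binary rasters, so that a zero XOR distance is the same as raster equality. The function name `deduplicate` is hypothetical.

```python
from collections import defaultdict


def deduplicate(clips):
    """Return (unique_clips, group_of), where group_of[i] is the index in
    unique_clips of the pattern that clip i matches exactly."""
    bins = defaultdict(list)  # filled area -> indices into unique_clips
    unique_clips, group_of = [], []
    for clip in clips:
        area = sum(map(sum, clip))      # level 1: bin by filled area
        for u in bins[area]:            # level 2: compare within the bin
            if unique_clips[u] == clip:  # equal rasters have zero distance
                group_of.append(u)
                break
        else:
            # No exact match in this bin: start a new group.
            bins[area].append(len(unique_clips))
            group_of.append(len(unique_clips))
            unique_clips.append(clip)
    return unique_clips, group_of


A = [[1, 0], [0, 0]]
B = [[0, 1], [0, 0]]  # same area as A, different pattern
C = [[1, 1], [0, 0]]
unique_clips, group_of = deduplicate([A, B, A, C])
print(len(unique_clips), group_of)
```

Binning by area keeps the expensive comparisons local: clips with different filled areas can never be identical, so they are never compared.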
To compute the distance between two clips, each clip is divided into a non-uniform grid where the grid lines run along the boundaries of the polygons of the two clips. Therefore each grid in a clip is either completely covered by a polygon or completely empty, and hence can be represented by a binary value. As a result, the distance between the two clips can be easily computed from the binary value of each grid and its corresponding area.

Distance Computation
In this final step, pairwise distances are computed between the reduced set of clips. To compute the distance between the two clips, each clip is divided into non uniform grid where the grid lines are along the boundaries of the polygons on the two clips as discussed in section 3.2.2. Therefore, the distance between a pair of clips can be easily computed based on the binary values for each grid and its corresponding area, as shown in Figure 3.4.
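The non-uniform grid computation can be sketched for the area-based distance as follows. The sketch assumes clips given as lists of axis-aligned rectangles (x1, y1, x2, y2) in clip-local coordinates; the names `covered` and `grid_xor_area` are hypothetical.

```python
def covered(rects, x, y):
    """True if point (x, y) lies inside any half-open rectangle."""
    return any(x1 <= x < x2 and y1 <= y < y2 for x1, y1, x2, y2 in rects)


def grid_xor_area(rects_a, rects_b, w, h):
    """Area of XOR(a, b) for two w x h clips, using grid lines placed on
    every rectangle boundary of both clips."""
    both = rects_a + rects_b
    xs = sorted({0, w} | {x for r in both for x in (r[0], r[2])})
    ys = sorted({0, h} | {y for r in both for y in (r[1], r[3])})
    area = 0
    for x_lo, x_hi in zip(xs, xs[1:]):
        for y_lo, y_hi in zip(ys, ys[1:]):
            # Each cell is entirely filled or entirely empty in each clip,
            # so testing its lower-left corner is sufficient.
            if covered(rects_a, x_lo, y_lo) != covered(rects_b, x_lo, y_lo):
                area += (x_hi - x_lo) * (y_hi - y_lo)
    return area


a = [(0, 0, 4, 4)]
b = [(2, 0, 6, 4)]
print(grid_xor_area(a, b, 10, 10))
```

In this example the two 4 x 4 squares overlap in a 2 x 4 strip, so the XOR area is 16 + 16 - 2 * 8 = 16. The number of grid cells depends on the number of polygon edges rather than on the pixel resolution of the clip.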

CHAPTER 4. ILP FORMULATIONS
One of the objectives of the problem is to minimize the cluster count while satisfying the tolerance in terms of ACC/ECC. In the following formulations, we define the objective of the ILP as, minimizing the number of clusters. Therefore the optimizer solves for optimal number of clusters for a given constraint. Also, we leverage the idea of triangle inequality, as defined in 4.1, in order to generate minimal cluster count, i.e., the representative clip need not be chosen from the given clips and therefore we explore the solution space without unnecessary restrictions while satisfying the given constraints. We formulate two integer linear programming approaches describing the given problem in different ways. Both these formulations are described in the sections below.

CHIP-Node
In this formulation, we describe the clustering problem using nodes as variables, where each node is assigned a cluster identity based on the distance metric and the constraints. We define $C_i$ as the variable representing each data point i; its value indicates the cluster index of that data point, i.e.,
$$C_i = C_j \iff i, j \in \text{same cluster}, \quad \forall i, j = 1, 2, \ldots, n$$
where n is the number of data points. Here, the variables $C_i$ are upper bounded by another variable, K, representing the cluster count, i.e., $1 \le C_i \le K$, $\forall i = 1, 2, \ldots, n$, and $K \ge 1$. With this setup, the objective of minimizing the cluster count becomes minimizing K in our formulation.
Let D(i,j) be the distance between i th clip and j th clip. And the constrained distance be D c .
Definition 4.1 Triangle Inequality for clustering: Given a cluster of clips and the distance constraint D c , if D(i, j) ≤ 2 × D c , ∀i, j ∈ same cluster, then ∃ r such that D(i, r) ≤ D c ∀i.

ILP Formulation:
Objective: minimize K
The objective is subject to two constraints for each pair of clips i and j, in which H is a huge constant, $C_i$ is an integer $\forall i$, and $S_{ij}$ is 0 or 1 $\forall i, j$. These two constraints enforce the condition that if the distance between two clips i and j satisfies $D(i, j) > 2D_c$, then the two clips (nodes) cannot be clustered together, i.e., $C_i \ne C_j$. However, the constraints can be ignored whenever the distance constraint is satisfied, i.e., the clips can be either clustered together or not. There are therefore two cases: $D(i, j) > 2D_c$, where the constraints are active, and $D(i, j) \le 2D_c$, where they are not. The constraints also involve a small constant.
Note that the preprocessing step elaborated in Section 3.2.2 is applied to eliminate exactly matched patterns. Hence D(i,j) will never be zero in this formulation.
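One standard way to express "$C_i \ne C_j$" with a huge constant H and a binary $S_{ij}$ is the big-M pair $C_i - C_j + H S_{ij} \ge 1$ and $C_j - C_i + H (1 - S_{ij}) \ge 1$. This pair is an assumption used here for illustration and is not necessarily the exact form of the constraints in this formulation (in particular, it uses a right-hand side of 1 rather than a small constant). The sketch below checks the pair by enumeration and brute-forces the minimum K on a tiny instance; all names are hypothetical, and a real run would hand the model to an ILP solver instead.

```python
from itertools import product


def pair_feasible(ci, cj, H):
    """True if some binary S satisfies both assumed big-M constraints."""
    return any(ci - cj + H * s >= 1 and cj - ci + H * (1 - s) >= 1
               for s in (0, 1))


def separated_pairs(D, Dc):
    """Pairs (i, j), i < j, that must not share a cluster: D(i, j) > 2 * Dc."""
    n = len(D)
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if D[i][j] > 2 * Dc]


def min_cluster_count(D, Dc):
    """Smallest K (with one labeling) by exhaustive search; tiny n only."""
    n = len(D)
    H = n + 1  # exceeds any possible difference between two labels
    pairs = separated_pairs(D, Dc)
    for K in range(1, n + 1):
        for labels in product(range(1, K + 1), repeat=n):
            if all(pair_feasible(labels[i], labels[j], H) for i, j in pairs):
                return K, labels
    return n, tuple(range(1, n + 1))


# Illustrative 4-clip distance matrix with Dc = 1, so the threshold is 2.
D = [[0, 1, 5, 5],
     [1, 0, 1, 5],
     [5, 1, 0, 1],
     [5, 5, 1, 0]]
print(min_cluster_count(D, 1))
```

For pairs below the threshold no constraint is generated at all, which corresponds to the statement that such clips may or may not end up in the same cluster.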

Area Constrained Clustering
In case of area constrained clustering, $D(i, j) = \mathrm{Area}(\mathrm{XOR}(x_i, x_j))$ as defined in section 2.2.1 and $D_c = w \times h \times (1 - a)$, where a is the area constraint ranging between 0 and 1. Notice that, for a = 1, $D_c = 0 \implies C_i \ne C_j$, $\forall i \ne j$, since D(i,j) is never zero after exact pattern matching.

Edge Constrained Clustering
In case of edge constrained clustering, D(i, j) = max(e 1 , e 2 , ..) as defined in section 2.2.2 and D c = e, where e is the given edge constraint (in nm).

CHIP-Edge
In the 2nd formulation, we describe the clustering problem using edges, where two nodes connected by an edge are clustered together. We define the objective of the ILP as, to minimize the number of clusters. Similar to the previous formulation, we leverage the idea of triangle inequality in order to generate minimal cluster count i.e., the representative clip need not be chosen from the given clips and therefore we explore the solution space without unnecessary restrictions while satisfying the given constraints.
We define a graph where the nodes are clips and the edges between them indicate whether the clips can be clustered together. We define $s_{ij}$ as a variable indicating whether two clips i and j are clustered together, i.e., $s_{ij} = 1$ if i, j are clustered together and 0 otherwise, $\forall i, j$.
These $s_{ij}$ variables are given as input (constant value = 0) if two clips cannot be clustered together. Otherwise, they can take either 0 or 1 (variable in the formulation). In other words,
$$s_{ij} = 0 \ \text{if } D(i, j) > 2D_c, \qquad s_{ij} \in \{0, 1\} \ \text{otherwise.}$$
This is based on the condition that two clips cannot be clustered together if the distance constraint is not satisfied. However, they can either be clustered or not if the distance constraint is satisfied.

ILP Formulation:
Objective:
$$\text{minimize } n - \sum_{i<j} t_{ij}$$
Constraints:
$$t_{ij} \le s_{ij}, \quad \forall i < j \qquad (4.3)$$
$$t_{ij} \le 2 - s_{ki} - s_{kj}, \quad \forall k < i < j \qquad (4.4)$$
$$s_{ij} + s_{jk} - s_{ik} \le 1, \quad \forall i, j, k \text{ where } 1 \le i, j, k \le n,\ i \ne j \ne k \qquad (4.5)$$
Here constraint (4.5) enforces the condition that if i, j are in the same cluster and j, k are in the same cluster, then i, k have to be in the same cluster.
Apart from $s_{ij}$, binary variables $t_{ij}$ are introduced.
Constraint (4.3) implies that $t_{ij}$ must be 0 if $s_{ij}$ is 0. Even if $s_{ij} = 1$, if there exists a k such that k < i < j and both $s_{ki}$ and $s_{kj}$ are 1, then $t_{ij}$ must be 0 too, by constraint (4.4). Therefore, $t_{ij}$ can be 1 iff i (< j) is the node with the smallest index in the cluster, defined by s = 1, that contains the edge ij.
As the sum $\sum_{i<j} t_{ij}$ is maximized, the edges with $t_{ij} = 1$ define a spanning forest (i.e., a collection of trees) which is a subgraph of the graph defined by the edges with $s_{ij} = 1$.
Here the summation $\sum_{i<j} t_{ij}$ equals the sum of [number of cluster members - 1] over all the clusters. Therefore, it can be observed that the objective $n - \sum_{i<j} t_{ij} = K$, where K is the number of clusters as per the 1st formulation.
Example for CHIP-Edge: Let there be 9 clips (nodes). Given a pairwise distance relation amongst these 9 clips, the graph formed (with $s_{ij}$ as edges) at an instance during the optimization is shown in Figure 4.1.
According to the constraints, the variables $t_{ij}$ take the values 0 or 1. The resultant graph with $t_{ij}$ as edges is shown in Figure 4.2. From this figure, the objective value can be computed, which is 9 − (2 + 3 + 1) = 9 − 6 = 3 (= number of clusters).
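The counting argument behind the CHIP-Edge objective can be checked with a short sketch. It assumes a clustering given as a list of labels and builds $t_{ij}$ as the edges from the smallest-index node of each cluster to its other members. The cluster sizes 3, 4 and 2 are chosen only to match the arithmetic 9 − (2 + 3 + 1) of the example; the actual graph of the figures is not reproduced, and the linear forms checked for the three constraints are the usual linearizations of the conditions described in the text, which may differ from the exact ones used by the authors. All function names are hypothetical.

```python
from itertools import permutations


def spanning_star_edges(labels):
    """Edges (i, j) with t_ij = 1: i is the smallest index of its cluster
    and j is another member of the same cluster."""
    first, edges = {}, []
    for j, lab in enumerate(labels):
        if lab in first:
            edges.append((first[lab], j))
        else:
            first[lab] = j
    return edges


def objective(labels):
    """n minus the number of edges with t_ij = 1."""
    return len(labels) - len(spanning_star_edges(labels))


def check_constraints(labels):
    """Verify t_ij <= s_ij, t_ij <= 2 - s_ki - s_kj for k < i < j,
    and transitivity of s, with s derived from the labels."""
    n = len(labels)
    s = [[int(labels[i] == labels[j]) for j in range(n)] for i in range(n)]
    t = set(spanning_star_edges(labels))
    for i in range(n):
        for j in range(i + 1, n):
            t_ij = int((i, j) in t)
            if t_ij > s[i][j]:
                return False
            if any(t_ij > 2 - s[k][i] - s[k][j] for k in range(i)):
                return False
    return all(s[i][j] + s[j][k] - s[i][k] <= 1
               for i, j, k in permutations(range(n), 3))


labels = [0, 0, 0, 1, 1, 1, 1, 2, 2]  # assumed clusters of sizes 3, 4, 2
print(len(spanning_star_edges(labels)), objective(labels))
```

Each cluster contributes (size - 1) star edges, so the objective value equals the number of distinct labels.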

Area Constrained Clustering
In case of area constrained clustering, D(i, j) = Area(XOR(x i , x j )) as defined in section 2.2.1 and D c = w × h × (1 − a), where a is the area constraint ranging between 0 and 1.

Edge Constrained Clustering
In case of edge constrained clustering, D(i, j) = max(e 1 , e 2 , ..) as defined in section 2.2.2 and D c = e, where e is the given edge constraint (in nm).

CHAPTER 5. REPRESENTATIVE CLIP GENERATION
In this chapter, we discuss the framework to generate representative clips for the clusters formed in the clustering step. First, for each cluster we check whether any clip among the cluster members satisfies the constraints to be a representative clip. If no representative clip exists amongst the given clips, then we proceed to the following steps: 1. Data Preprocessing 2. MILP Formulation 3. Representative Clip Generation. These steps are discussed in the following sections.

Data Preprocessing
In this step, we build a grid data structure formed along the edges of polygons of all the clips in the cluster. This structure is similar to that used in distance computation in Section 3.2.2, where only two clips are used to form the grid data structure as compared to considering all the clips in the cluster in this step. Using this structure, we can represent each clip in the cluster using a vector where each dimension represents the area covered by a polygon in a particular grid. Each grid data structure is unique with respect to the clusters.

MILP Formulation
Using the grid data structure, we formulate a mixed integer linear program to find a feasible solution that satisfies the given constraints. This feasible solution is then used to generate the representative clip.

Formulation:
Let $c_1, c_2, c_3, \ldots, c_q$ be the set of clips which belong to a cluster and $c_r$ be its representative clip. As per the clustering formulations, $\exists\, c_r$ such that $D(c_r, c_i) \le D_c$ $\forall i = 1, 2, \ldots, q$, where $D_c$ is the given constraint.
Let the number of grids in the grid data structure of a particular cluster be d, i.e., the given clips and the representative clip of the cluster can each be represented by a d-dimensional vector with the corresponding areas ($A_j$) as the upper bound for each dimension. Let the vector be represented by $c_i = (c_{i1}, c_{i2}, \ldots, c_{id})$. For a cluster, we define another d-dimensional vector called the area vector, $A = (A_1, A_2, \ldots, A_d)$, where each $A_j$ is the area of a grid in the grid data structure.
Based on the grid structure formulation, each given clip in the cluster takes, in every dimension j, either 0 (empty) or $A_j$ (filled). Therefore, the distance between $c_r$ and $c_i$ can be written as a linear function.
Constraints: $D(c_r, c_i) \le D_c$, $\forall i = 1, 2, \ldots, q$
Bounds: $0 \le c_{rj} \le A_j$, $\forall j = 1, 2, \ldots, d$
As per the constraints and bounds, each variable ($c_{rj}$) takes values from 0 to $A_j$ $\forall j$, i.e., it takes continuous values rather than discrete ones. These values are then used to fill the grids using the heuristic discussed in the next section.
Finding feasible solution step can be further sped up by removing redundant dimensions (grids) which are either always empty or always filled in all the clips of a cluster.
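For the area-based distance, the linearity can be made concrete: a dimension j contributes $c_{rj}$ to $D(c_r, c_i)$ when clip $c_i$ is empty there, and $A_j - c_{rj}$ when it is filled. The sketch below builds these linear rows, checks a candidate representative, and drops redundant dimensions. It is an illustration under that area-based assumption only, it does not call a solver, and the names `distance_rows`, `is_feasible` and `informative_dims` are hypothetical.

```python
def distance_rows(clips):
    """For each clip return (coeffs, const) such that
    D(c_r, c_i) = const + sum(coeffs[j] * c_r[j])."""
    rows = []
    for clip in clips:
        # Empty dimension: |c_rj - 0| = c_rj, coefficient +1.
        # Filled dimension: |c_rj - A_j| = A_j - c_rj, coefficient -1.
        coeffs = [1 if v == 0 else -1 for v in clip]
        const = sum(clip)  # sum of A_j over the filled dimensions
        rows.append((coeffs, const))
    return rows


def is_feasible(c_r, clips, areas, Dc):
    """Check the bounds 0 <= c_rj <= A_j and D(c_r, c_i) <= Dc for all i."""
    if any(not 0 <= v <= a for v, a in zip(c_r, areas)):
        return False
    return all(const + sum(k * v for k, v in zip(coeffs, c_r)) <= Dc
               for coeffs, const in distance_rows(clips))


def informative_dims(clips):
    """Dimensions that are neither always empty nor always filled."""
    d = len(clips[0])
    return [j for j in range(d) if len({clip[j] for clip in clips}) > 1]


areas = [4, 2, 2, 3]              # illustrative grid areas A_j
clips = [[4, 2, 0, 0],
         [4, 0, 2, 0]]            # pairwise distance 4 = 2 * Dc
Dc = 2
print(is_feasible([4, 1, 1, 0], clips, areas, Dc))  # half-filled grids 1 and 2
print(is_feasible(clips[0], clips, areas, Dc))      # a given clip as candidate
print(informative_dims(clips))
```

In this example neither given clip can represent the cluster, but a vector with two partially filled grids can, which illustrates why the representative is allowed to lie outside the given data. In the redundant dimensions the representative simply copies the common value, so they add nothing to any distance.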

Representative Clip Generation
In this section, a heuristic is proposed to generate the representative clip, as described in Algorithm 1.

Algorithm 1
    for each grid j of c_r do
        if c_rj = A_j then
            fill the grid entirely
        if c_rj < A_j then
            if preference(l, r, t, b) = x then
                while fill < c_rj do
                    fill the grid with horizontal rows of pixels
            if preference(l, r, t, b) = y then
                while fill < c_rj do
                    fill the grid with vertical columns of pixels

In Algorithm 1, $c_{r_l}$, $c_{r_r}$, $c_{r_t}$, $c_{r_b}$ represent the neighboring grids (left, right, top and bottom respectively) of a grid in $c_r$. In this algorithm, if $c_{rj} = A_j$, we fill the grid entirely. If $c_{rj} < A_j$, then the grid has to be filled partially. This can be done either along the x-axis or along the y-axis, until the condition is satisfied. For uniformity, we design a heuristic (function PREFERENCE in Algorithm 1) to capture the local neighborhood and fill the grid accordingly.
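The partial-fill step can be sketched for a single grid as follows. The sketch assumes a grid of width x height unit pixels and takes the fill direction as an argument; the PREFERENCE heuristic that chooses the direction from the neighboring grids is not reproduced here, and the name `fill_cell` is hypothetical.

```python
def fill_cell(width, height, target_area, direction):
    """Fill a height x width binary raster with whole rows ('x') or whole
    columns ('y') of pixels until the filled area reaches target_area."""
    cell = [[0] * width for _ in range(height)]
    filled = 0
    if direction == "x":
        for row in range(height):
            if filled >= target_area:
                break
            cell[row] = [1] * width      # one horizontal row of pixels
            filled += width
    else:
        for col in range(width):
            if filled >= target_area:
                break
            for row in range(height):    # one vertical column of pixels
                cell[row][col] = 1
            filled += height
    return cell


print(fill_cell(4, 3, 5, "x"))
print(fill_cell(4, 3, 5, "y"))
```

Because whole rows or columns are added while the filled area is below the target, the final area can exceed the target by less than one row or column, so the filled clip may need to be rechecked against the tolerance.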

CHAPTER 6. EXPERIMENTAL RESULTS
We implemented our approach in the C++ programming language with the STL and Boost libraries. We use IBM's CPLEX Optimizer (IBM CPLEX) to solve the integer linear programs.
We performed the experiments on the benchmarks provided by the ICCAD 2016 Contest, where each iteration of the optimization is limited by a time threshold. From Table 6.3, it can be observed that the ILP formulations which solve the constrained clustering problem scale well for the test cases, due to the reduction in data size after exact pattern matching is performed in the prior steps. Also, we observe that the default case (exact pattern matching) takes the majority of the runtime (from Table 6.2). It can be reduced by parallelizing the exact pattern matching tasks. In future work, the preprocessing steps could be further optimized in order to reduce this bottleneck and thereby achieve a faster overall runtime for the tool.
We achieve better results in most of the test cases in terms of cluster count as compared to previous work. Even though we do not adopt clip shifting, we achieve results as good as the results in Chen et al. (2017), which is best in terms of cluster count so far but employs clip shifting. Also, clip shifting could be easily added to our formulations to further reduce the cluster count.

CHAPTER 7. CONCLUSION
In this report, we introduce the problem of layout pattern classification in integrated circuit design. With several applications in design for manufacturability flow such as hotspot library generation, hierarchical data storage and systematic yield optimization, clustering the hotspots optimally with good quality representative hotspots is important.
We formally introduce the hotspot clustering problem and briefly discuss the overview of our proposed tool. Then, we introduce the two integer linear program formulations which solve for optimal clusters for the pattern classification problem in IC layout, subject to constraints given by ACC or ECC. Apart from minimizing cluster count, we generate representative clips that best represent the clusters.
We achieve better results in the majority of the test cases as compared to the existing results published in the literature and the reference results reported on the ICCAD 2016 contest website.
Although the runtime of the ILP is higher than that of the other methods, our main focus in this work is to develop a generic framework to cluster the hotspots in a layout optimally. These formulations describe the given problem exactly, unlike other works in the literature which adapt existing clustering algorithms to this problem with some post-processing steps. In future work, clip shifting can be incorporated into the tool flow to enlarge the solution space and thereby further reduce the cluster count.