# Electrical–Thermal Cosimulation With Nonconformal Domain Decomposition Method for Multiscale 3-D Integrated Systems

Jianyong Xie and Madhavan Swaminathan, *Fellow, IEEE*

*Abstract***— Because of the multiple scales in 3-D integrated systems, numerical simulation methods that are able to handle multiscale problems efficiently are strongly required. In this paper, electrical–thermal cosimulation of multiscale integrated systems using Mortar finite element-based domain decomposition method is proposed. Using the nonconformal domain decomposition approach, integrated systems can be divided into separate subdomains. Individual subdomains can be discretized independently based on its detailed feature size and formulated using the finite element method with associated boundary conditions. The coupling between domains is captured using Lagrange multipliers. For large multiscale 3-D problems, the cascadic multigrid method combined with the subspace confined preconditioned conjugate gradient method is used to accelerate the convergence of the solution with hierarchical meshing grids. Several examples are simulated and the results validate the accuracy and efficiency of the electrical–thermal cosimulation using nonconformal domain decomposition.**

*Index Terms***— Cascadic multigrid (CMG) method, domain decomposition, Joule heating, Lagrange multipliers, Mortar finite element method (FEM), multiscale, preconditioned conjugate gradient (PCG) method, through-silicon via (TSV), voltage drop.**

## I. INTRODUCTION

**3**-D INTEGRATION technology provides the capability to continuously miniaturize integrated systems using advanced interconnect schemes such as through-silicon vias (TSVs). With the TSV fabrication process maturating and IC continuing to scale toward 22-nm node and beyond, the power density of 3-D integrated systems are expected to increase dramatically as compared with a 2-D approach [1]. In addition, due to stacking of dies vertically using TSVs and interposers, thermal coupling can cause sharp chip temperature increase and the thermal effect on electrical performance such as voltage drop can become important. Coupled thermal-DC voltage drop analysis has been proposed in [2]–[4]. Moreover, due to the temperature-dependent electrical resistivity, locationdependent Joule heating effect is becoming an essential contributor for the temperature increase in the package [11].

Manuscript received June 29, 2012; revised April 9, 2013; accepted October 4, 2013. Date of publication November 8, 2013; date of current version March 28, 2014. This work was supported by the National Science Foundation under Grant CMMI-1129918. Recommended for publication by Associate Editor K. Ramakrishna upon evaluation of reviewers' comments.

The authors are with the School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, GA 30332 USA (e-mail: jianyong.xie@gatech.edu; madhavan.swaminathan@ece.gatech.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCPMT.2013.2286403



Fig. 1. Multiscale 3-D integrated system.

For on-chip power distribution, its temperature mainly depends on the transistor power dissipation (chip power map) due to the shorter heat transfer path to a heat sink. However, 3-D integrated systems are expected to comprise of stacked dies, interposer, TSVs, microbumps, multilayer package, and printed circuit board (PCB), as shown in Fig. 1. The dies are stacked on top of each other using thousands of TSVs and microbumps for power delivery and chip-to-chip communication. The TSV and microbump diameters are in the range  $2-50 \mu$ m, depending on the process used. However, the size of the interposer, package, and PCB is usually in the scale of centimeters. Because of the large-size PCB and package, small size TSVs and microbumps, the scale ratio can reach 1:50 000 and beyond. Thus, millions of cells need to be discretized using conventional finite element or finite volume-based approaches and thus long simulation time is required for a single DC IR drop and thermal simulation [5]. Therefore, to obtain accurate voltage drop with temperature effect and system temperature increase due to Joule heating effect, fast electrical–thermal cosimulation method, which can handle complex multiscale structures efficiently, is required. It is noteworthy that the focus of this paper is to determine the IR drop and temperature gradient within a system as opposed to computing these effects within the chip, which provides insight into these effects within the interposer, package and PCB. At these levels, the Joule heating effect of the interconnections becomes an important determining factor for obtaining the necessary accuracy.

In the past, for thermal modeling of IC chip and package, finite element-based method [6], [7], and finite difference-based method [8], [9] have been used. To capture

2156-3950 © 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications\_standards/publications/rights/index.html for more information.

thermal effect on voltage drop as well as Joule heating effect on thermal profile, the electrical–thermal cosimulation method with Joule heating, heat conduction, and air convection effects has been developed [3]. This was extended to include microfluidic cooling effect for 3-D integrated systems in [4]. However, for thermal simulation, [3]–[5] and [6]–[9] employ conventional finite volume, finite difference or finite element discretization approaches. This causes the fine meshing grids to propagate through the 3-D multiscale structure. As a result, to capture all the geometric details for solving the multiscale 3-D problem, millions of unknowns are required and leading to long simulation time. Though methods without volumetric meshing such as Green's function-based thermal simulation [10] can be applied, the accuracy is limited due to 3-D inhomogeneous problems containing complex materials and structures including stacking chip using TSV, microbumps, interposer, package, and PCB. To circumvent these problems, multigrid-based solution can be adopted to accelerate thermal [9] and electrical–thermal cosimulation [11]. However, due to the large-scale difference arising in 3-D integration, a large number of meshed cells is required even for the initial coarsest mesh used and thus is inefficient for simulating 3-D structures, as shown in Fig. 1.

Domain decomposition method (DDM) allows the dividing of a large-complex problem into many subdomains, which are smaller and thus easier to handle. For nonoverlapping domain decomposition with geometrical conformal meshing grid at the interface, the coupling between domains can be captured using the relationship between interface nodes and interior nodes [14]. However, due to the conformal mesh used, the meshed cells cannot be reduced effectively. The nonconformal DDM has therefore been proposed for eddycurrent calculation in [24], electromagnetic simulations [25], transient thermal simulation in [27], and thermal-aware IR drop analysis using the interior penalty formulation and Robin transmission condition in [31].

This paper focuses on electrical–thermal modeling of a multiscale 3-D system with emphasis on the interposer, package and PCB using Mortar finite element method (FEM)-based domain decomposition with nonconformal gridding and cascadic multigrid (CMG) solving approach. The effect of extra unknowns due to Lagrange multiplier on simulation efficiency is addressed both theoretically and experimentally. Since the on-chip power grid itself has a complex structure with millions of nodes, special consideration and simulation algorithms are required to handle them [26], [28], [29]. Thus, analyzing the on-chip power grid is not the focus of this paper. Instead the goal of this paper is to analyze a 3-D system holistically to capture the IR drop and thermal gradients across the system, which in turn can provide better boundary conditions for chip level simulations. Hence, the chip in this paper is considered as a heat dissipater where uneven power maps exist. In this paper, to handle the multiscale 3-D problem efficiently and also to ensure the capturing of all the geometric details, the integrated system is divided into separate subdomains and each subdomain is meshed independently based on the detailed features in it. Thus, the resulting system has much fewer meshed cells and unknowns as compared with using FEM or FVM.



Fig. 2. Flowchart of electrical–thermal cosimulation.

For both DC voltage drop and thermal simulations, individual domains can be formulated based on FEM with associated boundary conditions and the coupling between domains is captured using Lagrange multipliers. To ensure solution accuracy, the hierarchical mesh refinement in subdomains is required. Because of the interaction between DC voltage drop and thermal characteristics, the additional system variables namely Joule heating and temperature require special consideration and treatment for the joint DC voltage drop and thermal problem to be solved efficiently using the CMG method.

The organization of this paper is as follows. In Section II, an overview of the iterative electrical–thermal cosimulation method is presented. The finite element formulations for DC voltage simulation and thermal simulation with air convection boundary condition are explained in detail. The nonconformal DDM with Mortar FEM formulations using Lagrange multipliers is discussed in Section III. Section IV presents the electrical–thermal coanalysis using the CMG approach with special consideration for Joule heating and temperature gradient. The effect of extra interface unknowns on simulation cost is discussed. In Section V, several experimental examples are simulated and discussed. Finally, the conclusion is summarized in Section VI.

## II. COSIMULATION METHOD OVERVIEW

Because of the temperature-dependent electrical resistivity and Joule heating generated in the power delivery network (PDN), the electrical and thermal characteristics couple to each other and form a nonlinear system at the interposer, package, and PCB level. To consider thermal effect on DC voltage drop and Joule heating effect on thermal profile, an iterative electrical–thermal cosimulation is required [2]–[4].

# *A. Cosimulation Flow*

The flowchart of electrical–thermal cosimulation procedure [4] is shown in Fig. 2. The input includes geometric information including chip, package and PCB layout parameters and material parameters including thermal conductivities and temperature sensitive electrical resistivity parameters. To perform electrical–thermal cosimulation, the electrical and thermal excitations including voltage, current, chip power map, and corresponding boundary conditions need to be assigned. In each iteration, the corresponding Joule heating and temperature-dependent resistivity are updated for thermal and dc IR drop simulation, respectively, as shown in Fig. 2. The iteration between the dc IR drop and the thermal simulation is continued until convergence is reached, which corresponds to steady-state conditions. In addition to estimating voltage drop and temperature, the electrical–thermal cosimulation also allows estimating and identifying Joule heating effect caused by PDN current crowding at an early design stage. The convergence criterion is met when the maximum temperature variation is less than 0.1% in our simulation. The proposed method solves coupled dc voltage drop-thermal problem iteratively as the classic Newton's method for a nonlinear problem, which has been shown in [4]. In general, the Joule heating generated by the PDN in an electronic system can cause limited temperature increases and convergence can be achieved. However, for designs without careful considerations, the Joule heating can cause sharp temperature increases that lead to nonconvergence, which can also be captured using the proposed method.

The details of the governing equations and boundary conditions for dc IR drop and thermal simulations are explained in this section.

# *B. DC Voltage Drop Analysis*

In the steady state, the governing equation for voltage distribution can be expressed as

$$
\nabla \cdot \left( \frac{1}{\rho(x, y, z, T)} \nabla \phi(x, y, z) \right) = 0 \tag{1}
$$

where  $\phi(x, y, z)$  represents voltage distribution and  $\rho(x, y, z, T)$  is the temperature-dependent electrical resistivity described by

$$
\rho = \rho_0 [1 + \alpha (T - T_0)] \tag{2}
$$

where  $\rho_0$  is the electrical resistivity at  $T_0$ , which is 20 °C, and  $\alpha$  is the temperature coefficient of the electrical resistance.

By solving (1) with the boundary conditions given below in (3), the voltage distribution of the PDN can be computed [5]

$$
\phi|_{\Gamma_1} = V_{\text{input}}
$$
  
\n
$$
\frac{1}{\rho} \frac{\partial \phi}{\partial n}|_{\Gamma_2} = I_{\text{output}}
$$
  
\n
$$
\frac{\partial \phi}{\partial n}|_{\Gamma_3} = 0
$$
 (3)

where  $\Gamma_1$  and  $\Gamma_2$  represent the voltage supply boundary and current source boundary,  $\Gamma_3$  represents all the other homogenous Neumann boundaries in the PDN, and *V*input and *I*output represent the voltage excitation and output current density, respectively.

#### *C. Thermal Analysis*

For steady-state thermal analysis, the governing heat equation can be expressed as

$$
\nabla \cdot [k(x, y, z) \nabla T(x, y, z)] = -P(x, y, z) \tag{4}
$$

where  $k(x, y, z)$  and  $T(x, y, z)$  represent the thermal conductivity and temperature distribution, respectively;  $P(x, y, z)$ denotes the total heat excitation, which arises from the chip power map and Joule heating from the PDN. The Joule heating can be expressed as

$$
P_{\text{Joule}}(x, y, z) = \vec{J} \cdot \vec{E}(x, y, z) = -\vec{J} \cdot \nabla \phi \tag{5}
$$

where  $\vec{J}$  is the current density and  $\vec{E}$  (*x*, *y*, *z*) is the electrical field distribution in the PDN. It is important to note that the chip power map can be temperature dependent due to leakage [30]. However, in this paper, the power map of the chip is assumed to be temperature independent and a known quantity, to be able to estimate the IR drop and temperature gradients in the rest of the system. A temperature-dependent power map for the chip can also be used in the formulation presented, which has not been included in this paper.

To obtain the temperature distribution, the following boundary conditions:

$$
T|\mathbf{r}_1 = T_{\text{contant}}
$$
  
\n
$$
k \frac{\partial T}{\partial n} |\mathbf{r}_2 = P_{\text{input}}
$$
  
\n
$$
k \frac{\partial T}{\partial n} \Big|_{\mathbf{r}_3} = -h_c (T - T_a)
$$
  
\n
$$
\frac{\partial T}{\partial n} |\mathbf{r}_4 = 0
$$
 (6)

need to be considered. In (6),  $T_a$  and  $h_c$  represent the ambient temperature and air convection coefficient, respectively.  $\Gamma_1$ ,  $\Gamma_2$ , and  $\Gamma_3$  represent the constant temperature boundary, power source excitation boundary, and air convection boundary, respectively; and  $\Gamma_4$  represents all the other homogenous Neumann boundaries. Since both dielectric and conductors need to be considered in thermal simulation, the thermal simulation usually requires more unknowns than dc voltage drop analysis.

#### *D. Handling Inhomogeneous Material*

3-D systems contain inhomogeneous materials, as shown in Fig. 3(a). To handle the inhomogeneity, cell-based FEM formulation needs to be used [12]. With nonuniform meshing, each hexahedral cell only contains a single material with eight nodes, as shown in Fig. 3(b). Because of nonuniform meshing, the cell length is different in *x*-, *y*- and *z*-direction. The elementary stiffness matrix due to conduction can be formed, described by

$$
K_D^{(e)} = \frac{\Delta y \Delta z}{2\Delta x} K_x^{(e)} + \frac{\Delta x \Delta z}{2\Delta y} K_y^{(e)} + \frac{\Delta x \Delta y}{2\Delta z} K_z^{(e)} \tag{7}
$$

where  $K_x^{(e)}$ ,  $K_y^{(e)}$ , and  $K_z^{(e)}$  represent the 8 × 8 standard elementary stiffness matrix in *x*-, *y*-, and *z*-direction with cell thermal conductivity *k* and  $\Delta x = \Delta y = \Delta z = 2$ .



 $(b)$ 

Fig. 3. (a) Layer stacking with inhomogeneous material and (b) eight-node hexahedral element (cell) and trilinear basis functions.

 $\Delta$ 

 $n1$ 

With the relationship between the elementary cell node number and its global node order, the global stiffness matrix can be formed using the superposition rule, described as

$$
K_D = \sum_{e=1}^{n} K_D^{(e)}.
$$
 (8)

# *E. Finite Element Formulation for Single Domain*

In this section, the finite element formulation is explained for 3-D thermal simulation with air convection boundary conditions with 3-D nonuniform rectangular grids. By multiplying testing function *N* at both sides of (4) and integrating over the volume, after using the divergence theorem, the weak form [12] of the heat equation can be obtained as

$$
\iint_{\Omega} k \nabla N \cdot \nabla T \, dx dy dz - \int_{S} k N \frac{\partial T}{\partial n} ds = \iint_{\Omega} -N P dx dy dz.
$$
\n(9)

Using the convection boundary condition in (6), (9) can be converted as

$$
\iint_{\Omega} k \nabla N \cdot \nabla T \, dx \, dy \, dz + \int_{S} h_{c} N T \, ds
$$
\n
$$
= \iint_{\Omega} -N P \, dx \, dy \, dz + \int_{S} N h_{c} T_{a} \, ds. \tag{10}
$$

For 3-D simulations, 3-D nonuniform rectangular meshes are adopted and eight-node (trilinear) hexahedral elements are used as basis functions [Fig. 3(b)]. For the testing function,



Fig. 4. (a) 3-D integrated system and (b) nonconformal mesh of chip and package domains.

the same basis function can be used for simplicity. Thus, with *n* meshed cells, the system equation can be written as

$$
\sum_{e=1}^{n} \left( K_D^{(e)} + K_g^{(e)} \right) \varphi = \sum_{e=1}^{n} \left( f_P^{(e)} + b^{(e)} \right) \tag{11}
$$

where

$$
K_D^{(e)} = \iint_e k \nabla N \cdot \nabla N dx dy dz
$$
  
\n
$$
K_g^{(e)} = \int_{\Gamma e} h_c N N ds
$$
  
\n
$$
f_P^{(e)} = \iint_e -N P dx dy dz
$$
  
\n
$$
b^{(e)} = \int_{\Gamma e} N h_c T_a ds.
$$
\n(12)

In (11) and (12),  $K_D^{(e)}$  and  $K_g^{(e)}$  represent the elementary stiffness matrix for each element due to heat conduction and heat convection.  $f_P^{(e)}$  and  $b^{(e)}$  represent the external heat excitation and temperature gradient due to convection, respectively. For homogenous Neumann boundary condition (natural boundary condition) [13], we can simply let  $b^{(e)}$  and  $K_g^{(e)}$  equal to zero in (11). Since the voltage distribution equation and the heat equation share the same form except the air convection boundary condition, the same finite element formulation can be used.

# III. NONCONFORMAL DOMAIN DECOMPOSITION

In this section, the Mortar finite element-based nonoverlapping DDM with geometrical nonconformal meshing grid is explained.

# *A. Mortar FEM Formulation*

An integrated system includes 3-D stacked dies using TSVs, thermal interface material (TIM), microbumps and package, as shown in Fig. 4(a). Because of the feature scale difference in chip and package regions, millions of meshing cells are

required. To alleviate this problem, the integrated system can be divided into separate subdomains including chip and package domains, as shown in Fig. 4(b). The chip domain and package domain can be meshed independently using 3-D nonuniform grids. Thus, the meshing grids from chip domain do not overlap with grids from package domain and therefore the required meshing cells are greatly reduced. For simplicity, the DDM is explained with 2-D rectangular grids, as shown in Fig. 4(b).

At the interface, the continuity of electrical current and heat transfer needs to be ensured for dc voltage drop and thermal analysis, respectively. For two subdomains with a common interface [Fig. 4(b)], by assuming  $\lambda^{(i)} = k \partial T^{(i)}/\partial n_i (i = 1, 2)$ , where  $\lambda^{(i)}$  denotes a function from Lagrange multiplier space, we have the relationship of  $-\lambda^{(1)} = \lambda^{(2)} = \lambda$  [15], [18], then the weak continuity across the interface can be established and the following equations for domains and interface can be derived as:

$$
\iint_{\Omega_1} k \nabla N_1 \cdot \nabla T_1 dx dy - \int_{\Gamma_1} k N_1 \frac{\partial T_1}{\partial n} dt + \int_{\Gamma_{\text{inter}}} \lambda N_1 dt
$$

$$
= \iint_{\Omega_1} -N_1 P_1 dx dy \qquad (N_1 \in V^{(1)})
$$

$$
\iint_{\Omega_2} k \nabla N_2 \cdot \nabla T_2 dx dy - \int_{\Gamma_2} k N_2 \frac{\partial T_2}{\partial n} dt - \int_{\Gamma_{\text{inter}}} \lambda N_2 dt
$$

$$
= \iint_{\Omega_2} -N_2 P_2 dx dy \qquad (N_2 \in V^{(2)})
$$

$$
\int_{\Gamma_{\text{inter}}} (T_1 - T_2) \psi dt = 0 \qquad (\psi \in \Lambda) \qquad (13)
$$

where  $N_1, N_2$ , and  $\psi$  represent the basis functions for domain1, domain2, and the Lagrange multiplier, respectively [18], [27]. With temperature *T* being expressed as a linear combination of basis functions and  $\lambda = \sum_{i=1}^{N_{\lambda}} b_i \psi_i$ , the system equation for the problem with two subdomains [Fig. 4(b)] can be written as

$$
Kx = \begin{bmatrix} A_1 & 0 & B_1^T \\ 0 & A_2 & -B_2^T \\ B_1 & -B_2 & 0 \end{bmatrix} \begin{bmatrix} u_1 \\ u_2 \\ b \end{bmatrix} = \begin{bmatrix} f_1 \\ f_2 \\ 0 \end{bmatrix} = f \qquad (14)
$$

where the matrix entries for *k*th domain can be expressed as

$$
A_{k(ij)} = \iint_{\Omega_k} k \nabla N_i^{(k)} \cdot \nabla N_j^{(k)} dx dy - \int_{\Gamma_{\text{inter}}} k N_i^{(k)} \frac{\partial N_j^{(k)}}{\partial n} dt
$$
  
\n
$$
B_{k(ij)} = \int_{\Gamma_{\text{inter}}} \psi_i N_j^{(k)} dt \qquad (k = 1, 2)
$$
  
\n
$$
f_{k(j)} = \iint_{\Omega_i} -N_j^{(k)} P dx dy. \qquad (15)
$$

In (14) and (15),  $A_1$ ,  $A_2$ ,  $f_1$ , and  $f_2$  represent the stiffness matrix and excitation for domain1 and domain2, respectively.  $B_1$  and  $B_2$  represent the coupling matrix between the two domains. To obtain the stiffness matrix for each domain, the associated boundary conditions need to be used for the corresponding subdomains. In addition, the homogeneous Neumann boundary condition needs to be assigned at the common interface. For 3-D problem, the interface becomes a surface. Since the interface surface can have several thousands



Fig. 5. Basis functions for 1-D interface.



Fig. 6. (a) 2-D interface for 3-D problem and (b) interface basis functions in two directions.

of nodes, the four-point Gaussian quadrature for rectangular element is used to calculate *B* matrix effectively.

For integrated system, which is divided into *N* subdomains, the generalized system equation can be written as

$$
Kx = \begin{bmatrix} A & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} x_d \\ x_{\text{inter}} \end{bmatrix} = f \tag{16}
$$

where *A* is the block diagonal matrix described by

$$
A = \begin{bmatrix} A_1 & & \\ & A_2 & \\ & & \ddots & \\ & & & A_N \end{bmatrix} . \tag{17}
$$

#### *B. Interface Basis Function*

For Lagrange multipliers of the interface, the basis function can be constructed base on the interface grids from either side. To reduce the number of unknowns for the interface, it can be constructed based on a coarser meshing grid. However, in order to satisfy the inf-sup condition [15], [16] and thus the coupling matrix *B* for the interface has full rank, the basis function for the interface cannot be randomly selected. For 2-D problem with four-node (bilinear) elements [Fig. 4(b)], since the interface becomes a line, the interface basis function (Fig. 5) can be constructed based on linear shape functions, expressed as

$$
\psi_i = \begin{cases} \varphi_1 + \varphi_2, & (i = 1) \\ \varphi_{i+1}, & (1 < i < n-2) \\ \varphi_{n-1} + \varphi_n, & (i = n-2) \end{cases} \tag{18}
$$

where  $\varphi_i$  is the linear shape function associated with node *i*. Therefore, for 1-D interface with n nodes, the total number of basis functions for the interface is  $(n - 2)$ .

For 3-D problems, the interface becomes a surface connecting two subdomains, as shown in Fig. 6(a). Since adjacent domains are usually meshed independently, the meshing grid does not overlap at the common interface. For 2-D interface with  $N_x \times N_y$  nodes, the interface basis function can be

obtained based on 2-D bilinear shape functions. For easier representation, it can be described based on 1-D basis function in two directions [Fig. 6(b)] as follows:

$$
\psi_{ij} = \psi_{ix} \psi_{jy} \quad (1 \le i \le N_x - 2, 1 \le j \le N_y - 2). \tag{19}
$$

For domain decomposition with total of  $n_{\text{inter}}$  interfaces, assuming each interface has  $M_i$  basis functions, the dimension of *B* matrix is of  $N_B \times N_A$ .  $N_B$  can be expressed as

$$
N_B = \sum_{i=1}^{n_{\text{inter}}} M_i.
$$
 (20)

Using nonconformal gridding, meshing cells for subdomains can be greatly reduced. However, extra interface unknowns are added to the system (16). The additional computational cost due to the introduced interface unknowns is explained in Section IV.

For electrical–thermal cosimulation, the stiffness matrix *A* and *B* do not change with iterations for thermal simulation. However, for dc voltage simulation, due to the temperaturedependent resistivity, the stiffness matrix *A* varies with iterations while *B* stays the same. To reduce the simulation cost, the stiffness matrix *A* and *B* for thermal simulation and *B* matrix for dc voltage simulation are only calculated once and stored. The Joule heating for thermal simulation and stiffness matrix *A* for dc voltage simulation are updated with iterations.

# IV. CMG SOLVER

With nonconformal domain decomposition, the system unknowns can be greatly reduced. However, for a complex multiscale system, with matrix size approaching millions, the matrix condition number can increase dramatically and therefore fast iterative methods with preconditioner are required.

## *A. Hierarchical Meshing*

For simulation of multiscale systems, the meshing grids usually need to be refined in order to improve the solution accuracy. One way is to use hierarchical meshing grids. First, each subdomain can be meshed using coarse grids. To reduce the error by choosing smaller mesh size, finer mesh grids can be obtained using mesh refinement. Since the gridding of each domain is independent, different level of mesh refinement can be used by each domain. With systematic mesh refinement, a series of hierarchical meshing grids are obtained. For 3-D structures, mesh refinement can be carried out in *x*-, *y*and *z*-direction. For package and PCB board, since the lateral dimension is usually much larger than the vertical dimension, the mesh refinement is carried out only in *x*- and *y*-direction for both dc IR drop and thermal simulations.

# *B. CMG Method*

For thermal and dc IR drop simulations using the nonconformal domain decomposition formulation, the system matrix *K* becomes symmetric indefinite. Therefore, standard multigrid method cannot be applied directly. In this paper, instead of using standard multigrid method [17], the CMG method [18] is used to solve the linear system (16). It is important to note



Fig. 7. CMG solving flow.

that for the CMG to be applied successfully to the electrical– thermal iteration (Fig. 2), due to the coupling between the voltage drop and thermal characteristics, special consideration and treatment of the Joule heating and temperature are required considering the multilevel grids, which will be addressed in the following section.

The CMG solving flow is shown in Fig. 7. As shown in Fig. 7, the problem on the coarsest grid with fewer unknowns is solved exactly. Then, the solution is interpolated to next mesh level. For each mesh level except the initial mesh level, the iterative subspace confined conjugate gradient (CG) method is used as a smoother to accelerate the convergence of the solution before it is interpolated to finer grids. Since the initial approximation is interpolated from the previous level, the starting residual is small and thus the convergence can be reached efficiently. Since the stiffness matrix  $K$  is symmetric indefinite, a constraint preconditioner *M* needs to be used to accelerate the convergence of the CG method [19], [20]

$$
M = \left[ \begin{array}{cc} D & B^T \\ B & 0 \end{array} \right] \tag{21}
$$

where *D* is a positive definite matrix that satisfies the inequality of  $(Dv, v) > (Av, v)$ .

The pseudoalgorithm of CMG solving method is shown in Fig. 8. Since the subspace confined PCG method is used for each mesh level, the stop criteria  $\varepsilon$  needs to be used to check the convergence. Instead of using the energy norm-based error stop criteria as in [18], the L2 residual norm-based criteria is used as in standard PCG methods [5], which is described by

$$
||r_{ut}|| < \varepsilon ||r_{u0}|| \tag{22}
$$

where  $\|r_{u0}\|$  and  $\|r_{ut}\|$  represent the L2 norm of the residual for initial and *t*th PCG iteration, respectively. Since the residual is already calculated in each PCG iteration, no extra matrix–vector multiplication is needed; therefore, the computational cost is reduced.

Since *M* is used as the preconditioner in PCG iteration, the following equation needs to be solved:

$$
\begin{bmatrix} D & B^T \\ B & 0 \end{bmatrix} \begin{bmatrix} s_{11} \\ s_{12} \end{bmatrix} = \begin{bmatrix} r_{11} \\ r_{12} \end{bmatrix} . \tag{23}
$$

Input: K, A, B, f, M of each mesh level,  $\varepsilon$ ,  $n_{max}$ ,  $n_{direct}$ Output :  $u$ 

Do  $i = 0$ : level<sub>max</sub> if  $(i = 0)$ , solving on coarsest grid if size(K) <  $\overline{n}_{\text{direct}}$ <br>x =  $\left[\overline{u}^0, \lambda^0\right]$  = K<sup>-1</sup>f else PCG iteration with  $v_0$  set to be 0

end else

1) In each domain, interpolate from level  $(i-1)$  to level i  $\widetilde{u}^i = I u^{i-1}, \ \widetilde{\lambda}^i = I \lambda^{i-1}, \ \widetilde{v} = [\widetilde{u}^i, \widetilde{\lambda}^i]$ 2) PCG iteration for level i  $V_{00} = [u_{00}^i, \lambda_{00}^i] = \tilde{v} + M^{-1}(f - A\tilde{v})$  $v_0 = [u_0^i, \lambda_0^i] = [u_{00}^i, \tilde{\lambda}^i]$  $r_0^i = [r_{u0}, r_{\lambda 0}] = (f - Av_0)$ <br>  $s_0^i = [s_{u0}^i, s_{\lambda 0}^i] = M^{-1}r_0^i$  $p_0^1 = s_{u0}^1, \sigma_0 = (s_0^1, r_0^1)$  $\alpha_0 = \sigma_0 / (Ap_0^i, p_0^i)$ <br>for(t = 1 : n<sub>max</sub>)  $u_t^i = u_{t-1}^{i \text{ max.}} + \alpha_{t-1} p_{t-1}^i$ <br>  $\lambda_t^i = \lambda_{t-1}^{i-1} + s_{\lambda(t-1)}^{i-1}, v_t^i = [u_t^i, \lambda_t^i]$  $r_t^i = [r_{ut.}r_{\lambda t}] = f - Kv_t^i$  $s_t^i = [s_{ut}^i, s_{\lambda t}^i] = M^{-1}r_t^i$  $\sigma_t = (s_t^i, r_t^i)$  $\gamma_{t-1} = \sigma_t / \sigma_{t-1}$  $p_t^1 = s_{ut}^1 + \gamma_{t-1} p_{t-1}^1$  $\alpha_{t} = \sigma_{t} / (Ap_{t}^{i}, p_{t}^{i})$ check convergence if  $(\Vert r_{ut} \Vert < \varepsilon \Vert r_{u0} \Vert)$ break endif end for end

Fig. 8. Pseudocode for CMG method.

Instead of solving it directly, the following algorithm is used:

1) 
$$
C = -BD^{-1}B^{T}
$$
  
\n2)  $s_{12} = C^{-1}(r_{12} - BD^{-1}r_{11})$   
\n3)  $s_{11} = D^{-1}(r_{11} - B^{T}s_{12}).$  (24)

where *C* is the Schur complement associated with the interface Lagrange multiplier variables. Since the inverse of *D* needs to be used to calculate the Schur complement, *D* matrix that has simpler structure is preferred. In this paper, the diagonal matrix  $\alpha$  diag(A) is used for *D* matrix, where  $\alpha$  is a positive number, a scaling factor. As a result, the inverse of *D* becomes trivial and *C* is also a sparse matrix. Using a diagonal preconditioner is not necessarily robust but is a good compromise between engineering efficiency and good performance. For dc voltage simulation and thermal simulation, it is found out that the choosing  $\alpha$  value between 1 and 2 can benefit the convergence. Note that for the initial coarsest mesh which has thousands of unknowns, direct solving approach cannot be used due to finite memory. Instead of using direct solving approach, the subspace confined PCG method can also be used for the coarsest level, as shown in Fig. 8.



Fig. 9. (a) DC voltage drop and thermal iteration flow using CMG and (b) temperature averaging and Joule heating lumping from level-*n* to level-  $(n - 1)$ .

# *C. CMG for Electrical–Thermal Iteration*

Because of the interaction between voltage drop and thermal characteristics, Joule heating and temperature become additional variables, which need to be updated in each iteration. For CMG to be applied successfully to the electrical–thermal iteration with multilevel grids, special consideration and treatment are required. The dc voltage drop and thermal iteration flow using CMG is shown in Fig. 9(a).

It is assumed the thermal simulation is carried out first without considering Joule heating. For thermal simulation, only temperature distribution at the finest mesh level is obtained using CMG solver. However, due to the multilevel meshing grid used, the temperature profile at other coarser level needs to be calculated. Thus, the temperature profile on multilevel grids can be used to update the temperature-dependent stiffness matrices for dc voltage drop at different meshing level. On the other side, the Joule heating at the finest mesh level is obtained from dc voltage drop simulation. Similarly, the Joule heating at other coarser mesh levels also needs to be formed. Thus, the heat excitation vectors at different mesh level can be updated accordingly for CMG to be applied to thermal simulation. The temperature and Joule heating profile calculation from mesh level-*n* to level- $(n - 1)$  is shown in Fig. 9(b) using cell-based temperature averaging and Joule heat lumping.

#### *D. Computational Cost for Interface Unknowns*

Using nonconformal meshing, unknowns for subdomains can be reduced compared with conventional FEM method. Due to the introduced interface basis functions used to ensure the weak continuity between domains, extra nonzero entries of *B* matrix and unknowns for interfaces are added to the system. The effect of the extra unknowns on system computational cost needs to be investigated for CMG method. For 3-D system, assuming the total number of unknowns for domain and interface are  $N_A$  and  $N_B$ , it can be categorized into two cases based on the size of *NB*.

*1) Case A:* When *NB* is much smaller than *NA*, direct sparse solving method can be used to solve  $s_{12} = C^{-1}(r_{12} - r_{12})$  $BD^{-1}r_{11}$ ). The total computational cost for each subspace confined PCG iteration is of  $O(\alpha N_A + \beta N_B + (N_B)^p)$ . Since matrix  $B$  is for 2-D interface, the estimated order  $p$  is between 1.5 and 2.  $\alpha$  and  $\beta$  are scaling factors for matrix–vector multiplication depending on the matrix nonzero entries. Since  $N_B$  is much smaller compared with  $N_A$ , smaller fraction of computational cost is added due to introduced interface unknowns.

*2) Case B:* When *NB* is larger and comparable with *NA*, direct solving cannot be used due to finite computer memory. To solve  $s_{12} = C^{-1}(r_{12} - BD^{-1}r_{11})$ , iterative solving approach such as PCG method is required. For each subspace confined PCG iteration, the estimated computational cost is of  $O(\alpha N_A + \beta N_B + N_B \log(N_B))$ . Since *N<sub>B</sub>* is comparable with *NA*, large amount of computational overhead is added for each iteration. As a result, the system cannot be solved efficiently.

For 3-D system including die, package and PCB, system is divided vertically into domains based on scale difference. In general, since each domain is meshed using a 3-D grid and the interface is meshed using a 2-D grid, the number of interface unknowns is much smaller than that for the subdomains. Thus, using the CMG method provides an effective solution in terms of memory and computational complexity. The efficiency of the nonconformal domain decomposition with CMG approach is demonstrated by using experiments and comparison with conventional FEM in Section V.

## V. EXAMPLES AND DISCUSSION

# *A. Model Verification Examples*

To verify the correctness and accuracy of the nonconformal DDM for dc IR drop and thermal simulations as described in Section III, two model verification examples have been simulated.

*1) Multilayer PCB Example:* A three-layer PCB board with size of 9 cm  $\times$  9 cm is shown in Fig. 10(a). The thicknesses of the copper plane and FR-4 dielectric layer are 30 and 350  $\mu$ m, respectively. As shown in Fig. 10(a), the three-layer copper planes are shunted together using a  $40 \times 40$  via array. The dimension of via is 0.3 mm  $\times$  0.3 mm. In this example, the minimum and maximum scales in the lateral direction are 300  $\mu$ m and 9 cm, respectively. Using DDM, the PCB board is divided into 9 subdomains in *x*- and *y*-directions, as shown in Fig. 10(b). The fifth subdomain contains the via array. Since the coupling between



Fig. 10. (a) Three-layer and (b) domain decomposition of the PCB board.



Fig. 11. Voltage distribution of the PCB board with (a) DDM and (b) FEM method.

domains is captured using Lagrange multipliers, each domain can be meshed independently and thus the fine mesh does not project from the fifth domain to other adjacent domains.

This example is simulated using DDM. The voltage distribution on the first layer of the PCB board is shown in Fig. 11(a). This example is also simulated using FEM without domain decomposition. For comparison, the maximum mesh size is set to be the same for the two methods. Using FEM, the voltage distribution on the first layer is shown in Fig. 11(b). The voltage at the current source location is 2.4811 V. With domain decomposition approach, the voltage at the



Fig. 12. (a) Package example and (b) nonuniform chip power map (unit: W).

current source location is 2.4816 V. The 0.5 mV discrepancy comes from the different meshing grids adopted for the two methods. Due to the mesh projection from the via array, 60 K unknowns are required for the FEM method. However, only 49.2 K domain unknowns and 1.7 K interface unknowns are needed for DDM.

*2) Package Example:* To verify the accuracy of the DDM for thermal simulation, a package example is simulated, as shown in Fig. 12(a). The package includes 5 metal layers, TIM, 1600 package vias and  $20 \times 20$  microbump array. The package size is 30 mm  $\times$  30 mm and the chip is 10 mm  $\times$  10 mm. In this example, the minimum and maximum scales in the lateral direction are 200  $\mu$ m and 3 cm, respectively. The total power consumption of chip is 50 W and its nonuniform power map is shown in Fig. 12(b). The thermal conductivity of TIM is  $2 W/(mK)$ . The heat sink is modelled as an ideal heat sink with constant room temperature of 25 °C. This example has been simulated with convection coefficient of 5  $W/(m^2K)$  on both sides of the package. The material thickness and thermal conductivities are listed in Table I.

To simulate this package example effectively, it is divided into two subdomains including chip domain and package domain. The chip domain has a meshing grid of  $70 \times 70 \times 6$ and package domain has a meshing grid of  $80 \times 80 \times 10$ . The total number of unknowns is 99.6 K. The total number of interface unknowns is 4.9 K. Compared with the FEM method which requires 183.2 K unknowns, the number of unknowns is greatly reduced due to the nonconformal domain decomposition approach used. The generated system equations for FEM and DDM are all solved using direct sparse solver. The total solving time for DDM is 22.3 s, ∼34% reduction compared with the FEM, which takes 33.6 s. The simulated chip and package temperature distribution is shown in Fig. 13.

The chip temperature distribution at the location of  $y =$ 12.75 mm with and without domain decomposition is shown in Fig. 14. The maximum difference is ∼0.4°, which is due to the different meshing grids used for the two methods. The good agreement between the results from two methods validates the accuracy of the DDM.

# *B. 3-D Integration Example*

To demonstrate the capability of handling multiscale problems, a 3-D integration example, as shown in Fig. 15, is simulated. This example includes stacked dies, eight-layer

TABLE I MATERIAL THICKNESS AND THERMAL CONDUCTIVITY

|                    | Material         | Thermal Conductivity |
|--------------------|------------------|----------------------|
|                    | <b>Thickness</b> | (W/mK)               |
|                    | (mm)             |                      |
| Package dielectric | 0.35             | 0.8                  |
| Copper Plane       | 0.03             | 400                  |
| Chip               | 0.3              | 110                  |
| Underfill          | 0.2              | 0.4                  |
| C <sub>4</sub>     | 0.2              | 174                  |
| Package via        | 0.35             | 400                  |
| <b>TIM</b>         | 0.2              | 2.0                  |



Fig. 13. (a) Chip and (b) package temperature distribution.

package and 10-layer PCB board. The die size is 12 mm  $\times$ 12 mm and the package size is 30 mm  $\times$  30 mm. The PCB board size is 10 cm  $\times$  10 cm. The dies are stacked together using 400 TSVs ( $20 \times 20$  array). To reduce the IR drop, two PCB metal layers are shunted together using 100 PCB vias. In this example, the minimum and maximum scales in the lateral direction are 200  $\mu$ m and 10 cm, respectively. The material layer thickness and thermal conductivity are listed in Table II. Air convection with convection coefficient of  $15 W/(m^2 K)$  is applied to both sides of the PCB board. In this example, on-chip power grid is not included. The power consumption of stacked dies is 80 W and uniform power map is used. Note that in practical design, the power maps of dies need to be extracted using chip CAD tools based on chip layout design. Due to the scale difference between the die, package and PCB board, this example is divided



Fig. 14. Chip temperature distribution comparison (at *y* = 12.75 mm).



Fig. 15. 3-D integration example.

TABLE II MATERIAL THICKNESS AND THERMAL CONDUCTIVITY

|                      | Thickness (mm) | Thermal Conductivity |
|----------------------|----------------|----------------------|
|                      |                | (W/mK)               |
| PCB dielectric       | 0.35           | 0.8                  |
| PCB copper plane     | 0.03           | 400                  |
| Package dielectric   | 0.35           | 5                    |
| Package copper plane | 0.02           | 400                  |
| Die                  | 0.15           | 110                  |
| Underfill            | 0.2            | 0.4                  |
| C <sub>4</sub>       | 0.2            | 174                  |
| Solder bump          | 0.3            | 174                  |
| via                  | 0.35           | 400                  |
| <b>TIM</b>           | 0.2            | 1.6                  |
| <b>TSV</b>           | 0.15           | 400                  |





vertically into three domains including chip domain, package domain and PCB board domain. Therefore, two interfaces are needed to capture the coupling between chip-package and package-board. In this example, the basis functions for Lagrange multiplier for chip–package interface are selected from the package side while the basis functions for package– board interface are selected from the board side.

The subdomains of die, package and PCB board are meshed independently. For the initial (level-0) mesh for thermal simulation, the meshing grid arrays for die, package and PCB domains are  $42 \times 42 \times 8$ ,  $44 \times 44 \times 19$  and  $24 \times 24 \times 21$ , respectively. The total number of domain unknowns is 63.0 K and the number of interface unknowns is 0.4 K, which can be solved exactly using a direct sparse solver in 15.4 s. Without domain decomposition, the meshed cell number in *x*, *y* and *z* direction are 106, 106, and 46, respectively, resulting in about 402 K unknowns for FEM, which cannot be solved directly. FEM requires 398.9 s iterative solving time using CG method with a diagonal preconditioner.

For the level-2 mesh refinement, it has 968 K unknowns for the thermal simulation. However, using FEM with similar mesh size, the total number of unknowns is about 6.3 million, which requires long simulation time using iterative solving method such as preconditioned conjugate gradient (PCG) method [5]. Based on the hierarchical meshing grids using domain decomposition, the CMG solving algorithm can be applied. Since the initial solution is interpolated from previous level, the norm of initial residual is very small and the stop criteria  $\varepsilon$  is set to be 1E-2 for both dc voltage and thermal simulations. For both the problems

with level-1 and level-2 mesh, iterative solving is used for domain decomposition approach. For the level-1 and level-2 mesh, each IR drop simulation requires 2911 and 5343 iterations while each thermal simulation requires 2217 and 4526 iterations, respectively.

For comparison purpose, this example has also been simulated using FEM with the CG method and diagonal preconditioner. The number of unknowns and solution time using DDM and FEM with different mesh level for both dc IR drop and thermal simulations are listed in Table III. Note that the unknowns for DDM listed in Table III denote the number of unknowns for domain and interface. As listed from Table III, the total number of unknowns using DDM is reduced by 72%–84% for dc IR drop and thermal simulation compared with FEM. The total simulation time using DDM is reduced by 64%–88% for both dc IR drop and thermal simulations compared with FEM. For FEM simulation with 6.3 million unknowns, it cannot be solved due to finite memory in our simulation.

The simulated dc voltage and temperature with iterations are shown in Fig. 16(a) and (b), respectively. The thermal simulation is carried out first in the cosimulation. It shows that the voltage and temperature both converge in four iterations. The total simulation time is 9785 s for 4 iterations. Note that the calculated chip IR drop is 30.6 mV at room temperature. As shown in Fig. 16(a), the final IR drop becomes 36.6 mV. Therefore, the thermal effect increases the IR drop by 19.6%. Since the thermal simulation is carried out first, the temperature increase and extra voltage drop due to Joule heating effect



Fig. 16. Simulated (a) die voltage and (b) die and PCB temperature with iterations.

can be studied. The Joule heating effect on IR drop is ∼2% in this example due to shunted power planes. As observed from Fig. 16(b), the Joule heating increases the PCB hotspot temperature ∼8°. Since the on-chip power grid and joule heating is not considered, the Joule heating only increases the chip temperature by 0.8°. To illustrate the independent meshing grids and scale difference for chip, package and board regions, the top overview of the final temperature distribution of this example is shown in Fig. 17.

# *C. 2-D Integration Example*

A 10 cm  $\times$  5-cm PCB board with two chips is also simulated. The whole system view is shown in Fig. 18(a). As shown in Fig. 18(b), one PCB metal layer is used as the power plane with 1.8 V voltage supply. In this example, the equivalent thermal conductivity is used for the C4 layer and chip. The size of chip 1 is 12 mm  $\times$  12 mm and the size of chip 2 is 10 mm  $\times$  10 mm. The PCB substrate via size is 0.5 mm  $\times$  0.5 mm. In this example, the minimum and maximum scales in the lateral direction are 500  $\mu$ m and 10 cm, respectively. The detailed geometry and material parameters are listed in Table IV. In this example, the power consumption of chips 1 and 2 are 64 and 40 W, respectively. Uniform power maps are adopted for both chips. This example has been simulated with convection coefficient of 100  $W/(m^2K)$  on both sides of the PCB board.



Fig. 17. Top overview of final temperature distribution of (a) chip, package, and board and (b) enlarged chip and package domain.

This example is divided into four domains including two separate chip domains and two PCB domains, as shown in Fig. 18(b). Since the equivalent thermal conductivity is used for C4 layer and chip, to reduce the number of unknowns for the chip–package interface, the basis functions for Lagrange multipliers are chosen from the chip side. Due to the independent meshing grids used for each domain, the required total meshed cells and unknowns are reduced dramatically as compared with conventional FEM. For the initial mesh without domain decomposition, the FEM requires 121 K unknowns and 22.53 s solving time for thermal simulation. However, using the DDM, it only requires 48 K unknowns for domain and 0.6 K unknowns for interface in thermal simulation, which can be solved using the direct sparse solver in 3.77 s. For both the problems with level-1 and level-2 mesh, iterative solving is used for the DDM approach. For the level-1 and level-2 mesh, each IR drop simulation requires 1644 and 2521 iterations while each thermal simulation requires 869 and 1314 iterations, respectively.

The number of unknowns and solution time using DDM and FEM with different mesh level for both dc IR drop and thermal simulations are listed in Table V. Note that the unknowns for DDM listed in Table V also denote the number of unknowns for domain and interface. As listed from Table V, the total number of unknowns using nonconformal DDM is reduced by



Fig. 18. (a) Integration system and (b) power plane.

TABLE IV MATERIAL THICKNESS AND THERMAL CONDUCTIVITY

|                      | <b>Material Thickness</b><br>(mm) | Thermal Conductivity<br>(W/mK) |
|----------------------|-----------------------------------|--------------------------------|
| PCB dielectric       | 0.35                              | 0.8                            |
| Copper               | 0.036                             | 400                            |
| Chip                 | 0.3                               | 110                            |
| C <sub>4</sub> layer | 0.1                               | 10                             |
| PCB via              | 0.35                              | 400                            |
| <b>TIM</b>           | 0.2                               | 1.0                            |

TABLE V NUMBER OF UNKNOWNS AND SOLVING TIME USING DDM AND FEM



57% for dc IR drop and 60% for thermal simulation compared with FEM. The total simulation time using DDM is reduced by 60%–75% for dc IR drop simulation and 42%–83% for thermal simulation compared with FEM, which uses PCG method.

The simulated voltage and temperature with iterations are shown in Fig. 19(a) and (b), respectively. Thermal simulation



Fig. 19. Simulated (a) dc voltage and (b) temperature with iterations.

is also carried out first in the cosimulation flow. It shows that the voltage and temperature both converge in 4 iterations. The total simulation time is 2749 s. For chips 1 and 2, compared with the IR drop of 78.8 and 86.4 mV at room temperature, the final IR drops become  $95.8$  and  $104.1$  mV [Fig. 19(a)], respectively. Therefore, the thermal effect increases the IR drop by 21.6% and 20.4%, respectively. Since the thermal simulation is carried out first, the variations of voltage drop and temperature beyond the first iteration are caused by Joule heating. As observed from Fig. 19(a), the Joule heating effect causes about 10-mV voltage drop and therefore the Joule heat effect on voltage drop is ∼11% in this example. As observed from Fig. 19(b), the Joule heating only increases the temperature of chips 1 and 2 by 1.6° and 0.5°, respectively. However, the Joule heating increases PCB temperature  $\sim$ 25° [Fig. 19(b)]. This is caused by the current crowding in the irregular power plane (Fig. 20). To reduce the effect of Joule heating, more power plane layers need to be used to reduce the current crowding effect. The final temperature and voltage distributions of the power plane layer are shown in Fig. 20.



Fig. 20. Final (a) voltage distribution and (b) temperature distribution of power plane.



Fig. 21. CPU runtime and memory usage.

# *D. CPU Runtime and Memory Usage*

The proposed domain decomposition-based electrical– thermal cosimulation solver using CMG solving approach has been implemented using MATLAB and executed on a PC with a 3.2-GHz Xeon(TM) CPU and 3.0-GB memory. The CPU runtime and memory usage have been tested for multiscale 3-D problems. The CPU runtime and memory usage for single thermal simulation or dc IR drop simulation with number of unknowns is shown in Fig. 21. For unknowns <100 K,

direct sparse solver is used. As shown in Fig. 21, due to the CMG solving approach used, the memory usage of the CMG solver is nearly linearly proportional to the number of unknowns up to 1.5 million. This shows the good scalability of the nonconformal DDM.

# VI. CONCLUSION

In this paper, the electrical–thermal coanalysis based on nonconformal DDM is presented and the thermal effect on dc IR drop of large-multiscale 3-D integrated system is studied. It shows that the temperature effect can increase the dc IR drop by 20%. It demonstrates that the system unknowns can be greatly reduced using domain decomposition with nonconformal meshing grids. Moreover, the simulation also shows that the multiscale 3-D integrated system can be solved effectively using CMG solving approach.

#### **REFERENCES**

- [1] *International Technology Roadmap for Semiconductors*. New York, NY, USA: Wiley, 2006.
- [2] J. Xie, D. Chung, M. Swaminathan, M. Mcallister, A. Deutsch, L. Jiang, *et al.*, "Electrical-thermal co-analysis for power delivery networks in 3D system integration," in *Proc. IEEE 3DIC*, Sep. 2009, pp. 1–4.
- [3] J. Xie and M. Swaminathan, "Simulation of power delivery networks with Joule heating effects for 3D integration," in *Proc. ESTC*, Sep. 2010, pp. 1–6.
- [4] J. Xie and M. Swaminathan, "Electrical-thermal co-simulation of 3D integrated systems with micro-fluidic cooling and Joule heating effects," *IEEE Trans. Compon., Packag. Manuf. Technol.*, vol. 1, no. 2, pp. 234–246, Jan. 2011.
- [5] J. Xie and M. Swaminathan, "DC IR drop solver for large scale 3D power delivery networks," in *Proc. 19th Conf. Electr. Perform. Electron. Syst.*, Oct. 2010, pp. 217–220.
- [6] S. Rzepka, K. Banerjee, E. Meusel, and C. Hu, "Characterization of selfheating in advanced VLSI interconnect lines based on thermal finite element simulation," *IEEE Trans. Compon., Packag., Manuf. Technol. Part A*, vol. 21, no. 3, pp. 406–411, Sep. 1998.
- [7] L. Chee-Hoe, W. Wui-Weng, and L. Bao-Min, "Current density and hot spot prediction using electrical-thermal co-simulation for multi-layer IC packages," in *Proc. 8th EPTC*, 2006, pp. 849–852.
- [8] C.-H. Tsai and S.-M. Kang, "Cell-level placement for improving substrate thermal distribution," *IEEE Trans. Comput. Aided Design Integr. Circuits Syst.*, vol. 19, no. 2, pp. 253–266, Feb. 2000.
- [9] P. Li, L. T. Pileggi, M. Asheghi, and R. Chandra, "IC thermal simulation and modeling via efficient multigrid-based approaches," *IEEE Trans. Comput. Aided Design Integr. Circuits*, vol. 25, no. 9, pp. 1763–1776, Sep. 2006.
- [10] Y. Zhan and S. S. Sapatnekar, "High-efficiency Green function-based thermal simulation algorithms," *IEEE Trans. on Comput. Aided Design Integr. Circuits Syst.*, vol. 26, no. 9, pp. 1661–1675, Sep. 2007.
- [11] J. Xie and M. Swaminathan, "Fast electrical-thermal co-simulation with multigrid method for 3D integration," in *Proc. ECTC*, Jun. 2012, pp. 651–657.
- [12] A. F. Peterson, S. L. Ray, and R. Mittra, *Computational Method for Electromagnetics*. Piscataway, NJ, USA: IEEE Press, 1998.
- [13] J.-M. Jin, *The Finite Element Method in Electromagnetics*. New York, NY, USA: Wiley, 2002.
- [14] Y. Saad, *Iterative Methods for Sparse Linear Systems*. Bangkok, Thailand: SIAM Press, 2000.
- [15] F. Ben Belgacem, "The mortar finite element method with Lagrange multipliers," *Numeric Math.*, vol. 84, no. 2, pp. 173–199, Dec. 1999.
- [16] D. Braess, W. Dahmesn, and C. Wieners, "A multigrid algorithm for the mortar finite element method," *SIAM J. Numer. Anal.*, vol. 37, no. 1, pp. 48–69, 1999.
- [17] *Ulrich Trottenberg*, C. W. Oosterlee and A. Schuller, Eds. San Diego, CA, USA: Academic Press, 2001.
- [18] D. Braess, P. Deuflhard, and K. Lipnikov, "A subspace cascadic multigrid method for mortar elements," *Computing*, vol. 69, no. 3, pp. 205–225, Nov. 2002.
- [19] M. Benzi and A. J. Wathen, "Some preconditioning techniques for saddle point problems," in *Model Order Reduction: Theory, Research Aspects and Applications* (Mathematics in Industry), vol. 13. New York, NY, USA: Springer-Verlag, 2008, pp. 195–211.
- [20] D. Braess and R. Sarazin, "An efficient smoother for the stokes problem," *Appl. Numer. Math.*, vol. 23, no. 1, pp. 3–19, Feb. 1997.
- [21] T.-Y. Wang and C. Chung-Ping Chen, "3-D thermal-ADI: A linear time chip level transient thermal simulator," *IEEE Trans. Comput. Aided Design Integr. Circuits Syst.*, vol. 21, no. 12, pp. 1434–1445, Dec. 2002.
- [22] M. Swaminathan, D. Chung, S. Grivet-Talocia, K. Bharath, V. Laddha, and J. Xie, "Designing and modeling for power integrity," *IEEE Trans. Electromagn. Compat.*, vol. 53, no. 2, pp. 288–310, May 2010.
- [23] L. Jiang, C. Xu, B. J. Rubin, A. J. Weger, A. Deutsch, *et al.*, "A thermal simulation process based on electrical modeling for complex interconnect, packaging, and 3DI structures," *IEEE Trans. Adv. Packag.*, vol. 33, no. 4, pp. 777–786, Nov. 2010.
- [24] F. Rapetti, E. Bouillault, L. Santandrea, A. Buffa, Y. Maday, and A. Razek, "Calculation of eddy currents with edge elements on nonmatching grids in moving structures," *IEEE Trans. Magn.*, vol. 36, no. 4, pp. 1351–1355, Jul. 2000.
- [25] M. N. Vouvakis, Z. Cendes, and J.-F. Lee, "A FEM domain decomposition method for photonic and electromagnetic band gap structures," *IEEE Trans. Antennas Propag.*, vol. 54, no. 2, pp. 721–733, Feb. 2006.
- [26] Y. Zhong and M. D. F. Wong, "Fast algorithms for IR drop analysis in large power grid," in *Proc. IEEE/ACM ICCAD*, Nov. 2005, pp. 351–357.
- [27] J. Xie and M. Swaminathan, "3D transient thermal solver using nonconformal domain decomposition approach," in *Proc. ICCAD*, 2012, pp. 333–340.
- [28] J. Yang, Y. Cai, Q. Zhou, and J. Shi, "Fast poisson solver preconditioned method for robust power grid analysis," in *Proc. IEEE/ACM ICCAD*, Nov. 2011, pp. 531–536.
- [29] Y. Zhong and M. D. F. Wong, "Thermal-aware IR drop analysis in large power grid," in *Proc. 9th ISQED*, 2008, pp. 194–199.
- [30] Y. Liu, R. P. Dick, L. Shang, and H. Yang, "Accurate temperaturedependent integrated circuit leakage power estimation is easy," in *Proc. DATE*, Apr. 2007, pp. 1–6.
- [31] Y. Shao, Z. Peng, and J.-F. Lee, "Thermal-aware DC IR-drop co-analysis using non-conformal domain decomposition methods," in *Proc. R. Soc. A*, 2012, pp. 1–21.



**Jianyong Xie** received the B.S. degree in telecommunication engineering from Jilin University, Changchun, China, and the M.S. degree in electrical engineering from Shanghai Jiao Tong University, Shanghai, China, in 2005 and 2008, respectively. He is currently pursuing the Ph.D. degree in electrical and computer engineering with the Georgia Institute of Technology, Atlanta, GA, USA.

His current research interests include computational/numerical methods, signal and power integrity

analysis, thermal modeling, and electromagnetic modeling and simulations.



**Madhavan Swaminathan** (A'91–M'95–SM'98– F'06) received the M.S. and Ph.D. degrees in electrical engineering from Syracuse University, Syracuse, NJ, USA, in 1989 and 1991, respectively.

He is the John Pippin Chair of electromagnetics with the School of Electrical and Computer Engineering and the Director of the Interconnect and Packaging Center, Georgia Institute of Technology (Georgia Tech), Atlanta, GA, USA, and the Founder and CTO of E-System Design, a company focusing on the development of CAD tools for

achieving signal and power integrity in integrated 3-D micro and nanosystems. He is the co-founder of Jacket Micro Devices, a company that specialized in integrated RF modules and substrates for wireless applications that was acquired by AVX Corporation. He formerly held the position of Joseph M. Pettit Professor of electronics in ECE and the Deputy Director of the NSF Microsystems Packaging Center, Georgia Tech. Prior to joining Georgia Tech, he was with IBM involved in research on packaging for supercomputers. He is the author of more than 400 technical articles, holds 27 patents, is the author of three book chapters, primary author and co-editor of three books *Power Integrity Modeling and Design for Semiconductors and Systems* (Prentice Hall, 2007), *Introduction to System on Package* (McGraw Hill, 2008), and *Design and Modeling for 3D ICs and Interposers* (World Scientific Publishers Corporation, 2013) all in the field of packaging. He has served as the Distinguished Lecturer for the IEEE EMC Society. He has supervised 35 Ph.D. and 16 M.S. students in electrical engineering.