# Multiscale EM and Circuit Simulation Using the Laguerre-FDTD Scheme for Package-Aware Integrated-Circuit Design

A Thesis
Presented to
The Academic Faculty
by
Krishna Srinivasan

In Partial Fulfillment
of the Requirements for the Degree
Doctor of Philosophy in the
School of Electrical and Computer Engineering

Georgia Institute of Technology August 2008 Copyright © Krishna Srinivasan 2008

# Multiscale EM and Circuit Simulation Using the Laguerre-FDTD Scheme for Package-Aware Integrated-Circuit Design

#### Approved by:

Prof. Madhavan Swaminathan, Advisor School of ECE Georgia Institute of Technology

Prof. Andrew Peterson School of ECE Georgia Institute of Technology

Prof. Sung Kyu Lim School of ECE Georgia Institute of Technology Prof. George Riley School of ECE Georgia Institute of Technology

Prof. Yogendra Joshi School of Mechanical Engineering Georgia Institute of Technology

Date Approved: 21 April 2008

To Suresh, Shanthi, and Prof. Swaminathan

## Acknowledgements

#### My sincere thanks to

- Prof. Madhavan Swaminathan for providing me with this wonderful opportunity and fulfilling my dream of doing graduate studies. It was great working in the EPSILON lab. His technical advice and feedback made the goal achievable. I greatly appreciate his financial support. With deepest respect and gratitude I dedicate this thesis to him.
- Dr. Ege Engin, post-doctoral researcher, for the valuable technical discussions, feedback, and patiently answering my infinite number of questions.
- Research engineers Hideki san for his guidance when I first started with research work and Takayuki san for mentoring my study.
- The committee members for the time that they have taken to read through this thesis and offer their suggestions to improve this work.
- The students of the EPSILON group for the valuable technical discussions and advice.
- My family and relatives, especially Suresh and Shanthi, without whom this work would not have been possible. I also dedicate this thesis to Suresh and Shanthi.

# Contents

| A              | ckno  | wledgements                                                                | III |
|----------------|-------|----------------------------------------------------------------------------|-----|
| Li             | st of | Tables                                                                     | III |
| Li             | st of | Figures                                                                    | ΙX  |
| $\mathbf{S}$ ι | ımm   | ary                                                                        | 1   |
| 1              | Inti  | roduction                                                                  | 2   |
|                | 1.1   | Development of CAD Tools                                                   | 4   |
|                | 1.2   | Common Signal and Power-Integrity Problems Present in a Chip and a Package | 5   |
|                | 1.3   | Proposed Research                                                          | 8   |
|                | 1.4   | Dissertation Outline                                                       | 11  |
| 2              | Tra   | nsient Simulation Using Laguerre Polynomials                               | 13  |
|                | 2.1   | Introduction                                                               | 13  |
|                | 2.2   | The SLeEC Alogrithm                                                        | 15  |
|                |       | 2.2.1 Transformation from time domain to Laguerre domain                   | 16  |
|                |       | 2.2.2 Companion Models                                                     | 18  |
|                |       | 2.2.3 DC Analysis                                                          | 18  |
|                |       | 2.2.4 Transformation from Laguerre domain to time domain                   | 20  |
|                | 2.3   | Advantages of Laguerre Polynomials                                         | 20  |
|                | 2.4   | Summary                                                                    | 21  |
| 3              | Sim   | ulation For Any Length of Time                                             | 22  |
|                | 3.1   | Introduction                                                               | 22  |
|                | 3.2   | An Example to Demonstrate the Limitation                                   | 22  |
|                | 3.3   | A Solution to Overcome the Limitation                                      | 23  |
|                | 3.4   | Computing the Final Values at the End of an Interval                       | 25  |
|                | 3.5   | Examples of Simulation Using Initial Conditions                            | 25  |
|                | 3 6   | Summary                                                                    | 26  |

| 4 | Cor | mpanion Models for Circuit Simulation                        | 29         |
|---|-----|--------------------------------------------------------------|------------|
|   | 4.1 | Introduction                                                 | 29         |
|   | 4.2 | The Companion Model of an Inductor                           | 29         |
|   | 4.3 | The Companion Model of a Capacitor                           | 32         |
|   | 4.4 | Transient Simulation of Inductor and Capacitor (LC) Circuits | 34         |
|   | 4.5 | The Companion model for Mutual Inductance                    | 36         |
|   | 4.6 | Summary                                                      | 40         |
| 5 | Cor | mpanion Models of the FDTD grid for EM Simulation            | 41         |
|   | 5.1 | Introduction                                                 | 41         |
|   | 5.2 | 1D FDTD                                                      | 42         |
|   | 5.3 | 2D FDTD                                                      | 45         |
|   | 5.4 | 3D FDTD                                                      | 48         |
|   | 5.5 | Boundary Conditions                                          | 50         |
|   |     | 5.5.1 Perfect Electric Conductor (PEC) Boundary              | 52         |
|   |     | 5.5.2 Perfect Magnetic Conductor (PMC) Boundary              | 52         |
|   |     | 5.5.3 Absorbing Boundary Condition (ABC)                     | 53         |
|   | 5.6 | Summary                                                      | 53         |
| 6 | Cho | posing the Correct Number of Laguerre Basis Coefficients     | <b>5</b> 4 |
|   | 6.1 | Introduction                                                 | 54         |
|   | 6.2 | Methodology                                                  | 55         |
|   |     | 6.2.1 Energy analysis to find $q_{knee}$ (Step 1)            | 58         |
|   |     | 6.2.2 Finding the right value for $q$ (Step 2)               | 59         |
|   | 6.3 | Improved Methods to Calculate Energy                         | 59         |
|   | 6.4 | Summary                                                      | 63         |
| 7 | 3D  | EM Simulation Results                                        | 64         |
|   | 7.1 | Introduction                                                 | 64         |
|   | 7.2 | Node-Numbering Schemes                                       | 64         |
|   | 73  | EM Tost cases                                                | 65         |

|   | 7.4  | A Split Power-Ground Plane                                              |
|---|------|-------------------------------------------------------------------------|
|   | 7.5  | An EBG Structure                                                        |
|   | 7.6  | On-chip Coupled Lines                                                   |
|   | 7.7  | A Chip-Package Structure                                                |
|   | 7.8  | Summary                                                                 |
| 8 | Tim  | ne-domain to Frequency-domain Transformation 82                         |
|   | 8.1  | Introduction                                                            |
|   | 8.2  | A Test Case to Illustrate the Transformation                            |
|   | 8.3  | Summary                                                                 |
| 9 | Effi | cient Use of Full-Wave Solvers For Chip-Package Cosimulation 89         |
|   | 9.1  | Introduction                                                            |
|   | 9.2  | SDN-PDN Cosimulation                                                    |
|   | 9.3  | Frequency-Domain Parameters of PDN and SDN                              |
|   | 9.4  | Integration of PDN and SDN                                              |
|   |      | 9.4.1 Microstrip Configuration                                          |
|   |      | 9.4.2 Coplanar-Waveguide Configuration                                  |
|   |      | 9.4.3 A test structure to verify the coplanar-waveguide model 10        |
|   | 9.5  | Port Reduction                                                          |
|   | 9.6  | Causality Enforcement Through Delay Extraction                          |
|   | 9.7  | Transient Simulation                                                    |
|   |      | 9.7.1 Transient Simulation Using Signal-Flow Graphs                     |
|   |      | 9.7.2 Transient Simulation Using S-Parameters in MNA Framework 10       |
|   | 9.8  | Results                                                                 |
|   |      | 9.8.1 Transmission Line Simulation                                      |
|   |      | 9.8.2 Step Response of an Interconnect With Causality Enforcement 11    |
|   |      | 9.8.3 Sixty-Four Bit Bus Referenced to a Nonideal Power-Ground Plane 11 |
|   | 9.9  | Speed And Memory Optimization                                           |
|   |      | 9.9.1 Methodology                                                       |
|   |      | 9.9.2 Enhancement of Power Integrity Using Embedded Capacitors 11       |

|           | 9.10 Summary                                                       | 121        |
|-----------|--------------------------------------------------------------------|------------|
| 10        | Future Work: Alternate Schemes for DC Analysis of the FDTD Lattice | <b>123</b> |
|           | 10.1 Introduction                                                  | 123        |
|           | 10.2 1D Grid                                                       | 124        |
|           | 10.3 2D Grid                                                       | 125        |
|           | 10.4 3D Grid                                                       | 128        |
|           | 10.5 Summary                                                       | 128        |
| 11        | Conclusions                                                        | 130        |
| <b>12</b> | Appendix A: Derivation of the Courant Condition                    | 131        |
| 13        | Publications                                                       | 133        |
|           | 13.1 Journals and Conference Papers                                | 133        |
|           | 13.2 Invention Disclosures                                         | 135        |
| Re        | eferences                                                          | 136        |

# List of Tables

| 1 | The probe locations of the electric and magnetic fields for the three test cases. | 68 |
|---|-----------------------------------------------------------------------------------|----|
| 2 | The memory and simulation time comparison for Test case 1                         | 69 |
| 3 | The memory and simulation time comparison for Test case 2                         | 71 |
| 4 | The memory and simulation time comparison for Test case 3                         | 7: |

# List of Figures

| 1  | A generic Intel PC system                                                         | 3  |
|----|-----------------------------------------------------------------------------------|----|
| 2  | A multichip module                                                                | 3  |
| 3  | The Intel processor family and CAD tools                                          | 4  |
| 4  | Traditional sequential design flow [3]                                            | 6  |
| 5  | Package-aware design flow. [3]                                                    | 6  |
| 6  | Common signal and power-integrity problems present in a package                   | 7  |
| 7  | Simultaneous switching noise (SSN)                                                | 8  |
| 8  | A microstrip and a conductor-backed coplanar-waveguide configuration              | 11 |
| 9  | Multiscale features in a chip-package structure                                   | 14 |
| 10 | The flowchart of the SLeEC methodology                                            | 15 |
| 11 | Weighted Laguerre polynomials for orders $p=0$ to $p=4$                           | 17 |
| 12 | The original (dots) and reconstructed (solid) triangular waveform using La-       |    |
|    | guerre basis functions                                                            | 19 |
| 13 | The coefficients of basis functions for the triangular waveform in Figure $12.$ . | 19 |
| 14 | The companion model for a unit cell in a 1D FDTD grid                             | 20 |
| 15 | A 2D box with PEC boundary                                                        | 23 |
| 16 | The transient simulation response from $15ns$ to $20ns$ . Solid: the Laguerre-    |    |
|    | FDTD scheme and Dots: FDTD                                                        | 24 |
| 17 | The total simulation time is divided into different intervals                     | 24 |
| 18 | The circuit for transient simulation with initial conditions                      | 25 |
| 19 | The voltage across the capacitor $V(t)$ with initial conditions (200 basis coef-  |    |
|    | ficients). Dots: WinSpice and Solid: SLeEC                                        | 27 |
| 20 | The voltage across the capacitor $V(t)$ with initial conditions (400 basis coef-  |    |
|    | ficients). Dots: WinSpice and Solid: SLeEC                                        | 27 |
| 21 | The simulation results from 0ns to 5ns. Solid: SLeEC and Dots: FDTD               | 28 |
| 22 | The simulation results from 5ns to 6.5ns using new interval length and time-      |    |
|    | scale factor. Solid: SLeEC and Dots: FDTD                                         | 28 |

| 23 | The Thevenin and Norton forms of the companion model for an inductor or              |    |
|----|--------------------------------------------------------------------------------------|----|
|    | a capacitor                                                                          | 29 |
| 24 | A series LC circuit                                                                  | 34 |
| 25 | The companion model for the circuit in Figure 24                                     | 34 |
| 26 | Voltage $V^C$ in the circuit shown in Figure 24 using 200 basis coefficients.        |    |
|    | Solid: SLeEC and Dots: WinSpice                                                      | 35 |
| 27 | Voltage $V^C$ in the circuit shown in Figure 24 using 400 basis coefficients.        |    |
|    | Solid: SLeEC and Dots: WinSpice                                                      | 35 |
| 28 | An LC network                                                                        | 37 |
| 29 | The transient voltage waveform at Node 102 in Figure 28. Solid: SLeEC and            |    |
|    | Dots: WinSpice                                                                       | 37 |
| 30 | Two inductors with mutual coupling                                                   | 38 |
| 31 | The companion model for mutual inductance in the Thevenin form                       | 38 |
| 32 | The companion model for mutual inductance in the Norton form                         | 39 |
| 33 | The companion model for a unit cell in an FDTD grid                                  | 43 |
| 34 | The companion model for the 2D FDTD grid                                             | 45 |
| 35 | The standard Yee cell                                                                | 48 |
| 36 | The ×Sections of the Yee cell, which are marked by the dotted lines in Figure        |    |
|    | 35, parallel to the $xz$ , $yz$ and $xy$ planes, respectively. The dots indicate the |    |
|    | direction of the fields pointing out of the page                                     | 51 |
| 37 | The companion model of the 3D FDTD grid in the Laguerre domain                       | 51 |
| 38 | The PEC boundary condition                                                           | 52 |
| 39 | The PMC boundary condition                                                           | 53 |
| 40 | The absorbing boundary condition                                                     | 53 |
| 41 | A 2D power-ground plane structure                                                    | 54 |
| 42 | The time-domain waveform generated using 179 basis functions                         | 56 |
| 43 | The time-domain waveform generated using 362 basis functions                         | 56 |
| 44 | If a small value for $q$ is chosen, then the time-domain waveform does not have      |    |
|    | sufficient energy content                                                            | 57 |
| 45 | A planar structure with multiscale dimensions                                        | 57 |

| 46 | Energy as a function of the number of basis coefficients using Scheme 1                | 58 |
|----|----------------------------------------------------------------------------------------|----|
| 47 | The normalized error at time $t=0$ as a function of the number of basis                |    |
|    | coefficients using Scheme 1                                                            | 60 |
| 48 | The time-domain $E_z$ field obtained using Scheme 1                                    | 60 |
| 49 | The energy profile as a function of the number of basis coefficients using             |    |
|    | Scheme 2                                                                               | 61 |
| 50 | The normalized error as a function of the number of basis coefficients using           |    |
|    | Scheme 2                                                                               | 62 |
| 51 | The time-domain $E_z$ field obtained using Scheme 2                                    | 62 |
| 52 | Planes 1, 2, and 3 in the inefficient node numbering scheme                            | 65 |
| 53 | The sparsity pattern of the A matrix from an inefficient node-numbering                |    |
|    | scheme.                                                                                | 65 |
| 54 | The sparsity pattern of the A matrix that is suitable for LU decomposition             | 66 |
| 55 | The bird's eye view of an EM test case that is enclosed within a PEC box               | 67 |
| 56 | The top view of an EM test case                                                        | 67 |
| 57 | Microsoft Excel® is used as a GUI for the layout of the test cases                     | 67 |
| 58 | A split power-ground plane                                                             | 68 |
| 59 | The contour maps of the split-plane test case                                          | 69 |
| 60 | $E_x$ (126 basis coefficients) and $E_y$ (208 basis coefficients) fields in the split- |    |
|    | plane test case. Solid: SLeEC and Dots: FDTD                                           | 69 |
| 61 | $E_z$ (490 basis coefficients) and $H_x$ (226 basis coefficients) fields in the split- |    |
|    | plane test case. Solid: SLeEC and Dots: FDTD                                           | 70 |
| 62 | $H_y$ (259 basis coefficients) and $H_z$ (398 basis coefficients) fields in the split- |    |
|    | plane test case. Solid: SLeEC and Dots: FDTD                                           | 70 |
| 63 | A 2D EBG test case                                                                     | 71 |
| 64 | The 2D EBG contour map of the $ (E_x, E_y) $ field after 950ps                         | 72 |
| 65 | The 2D EBG contour map of the $ (E_x, E_y) $ field after 1200ps                        | 72 |
| 66 | The 2D EBG contour map of the $ (E_x, E_y) $ field after 1300ps                        | 73 |
| 67 | $E_x$ (413 basis coefficients) and $E_y$ (413 basis coefficients) fields in the EBG    |    |
|    | test case. Solid: SLeEC and Dots: FDTD                                                 | 73 |

| 68 | $E_z$ (413 basis coefficients) and $H_x$ (141 basis coefficients) fields in the EBG          |    |
|----|----------------------------------------------------------------------------------------------|----|
|    | test case. Solid: SLeEC and Dots: FDTD                                                       | 74 |
| 69 | $H_y$ (171 basis coefficients) and $H_z$ (141 basis coefficients) fields in the EBG          |    |
|    | test case. Solid: SLeEC and Dots: FDTD                                                       | 74 |
| 70 | Three on-chip coupled transmission lines                                                     | 75 |
| 71 | The contour map of the $E_z$ field                                                           | 76 |
| 72 | $E_x$ (49 basis coefficients) and $E_y$ (446 basis coefficients) fields of the transmission  | -  |
|    | lines test case. Solid: SLeEC and Dots: FDTD                                                 | 76 |
| 73 | $E_z$ (409 basis coefficients) and $H_x$ (188 basis coefficients) fields of the transmission | n- |
|    | lines test case. Solid: SLeEC and Dots: FDTD                                                 | 77 |
| 74 | $H_y$ (437 basis coefficients) and $H_z$ (411 basis coefficients) fields of the transmission | n- |
|    | lines test case. Solid: SLeEC and Dots: FDTD                                                 | 77 |
| 75 | A chip-package structure with multiscale features                                            | 78 |
| 76 | Non-uniform mesh dimensions simulated using SLeEC                                            | 79 |
| 77 | The cross section of the different metal layers                                              | 79 |
| 78 | The 3D Layout of the chip-package structure                                                  | 80 |
| 79 | SLeEC and FDTD results of the chip-package structure                                         | 80 |
| 80 | A power-ground plane test case                                                               | 82 |
| 81 | Voltage and current definitions                                                              | 84 |
| 82 | The time domain $E_z$ waveform between $0 - 1\mu s$ at the location marked $Src$             |    |
|    | in Figure 80                                                                                 | 84 |
| 83 | $Z_{11}$ , 0-10GHz, without windowing and the PMC boundary condition has been                |    |
|    | used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD                                  | 85 |
| 84 | The second half of the time-domain Kaiser windowing function                                 | 85 |
| 85 | The time domain $E_z$ waveform in Figure 82 with windowing                                   | 86 |
| 86 | $Z_{11}$ , 0-10GHz, with windowing and the PMC boundary condition has been                   |    |
|    | used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD $\ \ldots \ \ldots$              | 86 |
| 87 | $Z_{12}$ , 0-10GHz, without windowing and the PMC boundary condition has been                |    |
|    | used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD                                  | 87 |

| 88  | $Z_{12}$ , 0-10GHz, with windowing and the PMC boundary condition has been    |     |
|-----|-------------------------------------------------------------------------------|-----|
|     | used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD                   | 87  |
| 89  | The traditional and proposed methodologies for chip-package cosimulation      | 90  |
| 90  | The SDN-PDN cosimulation methodology                                          | 91  |
| 91  | A power-ground plane                                                          | 92  |
| 92  | The common types of interconnect configurations in a package                  | 93  |
| 93  | A microstrip line over a power-ground plane                                   | 93  |
| 94  | The circuit model for a microstrip line referenced to a power-ground plane    | 95  |
| 95  | N coupled lines referenced to a power-ground plane                            | 95  |
| 96  | The Y-parameter model for an M-port network referenced to a power-ground      |     |
|     | plane                                                                         | 96  |
| 97  | The cross section of a coplanar-waveguide structure                           | 97  |
| 98  | The coupling between the transmission lines is captured using controlled      |     |
|     | sources                                                                       | 97  |
| 99  | N coupled transmission lines and modal transmission lines                     | 98  |
| 100 | The E-field patterns for the coplanar-waveguide mode and the parallel-plate   |     |
|     | mode                                                                          | 99  |
| 101 | The circuit model for a coplanar-waveguide structure                          | 100 |
| 102 | The modal transmission lines replaced with two-port frequency parameters      | 101 |
| 103 | A test structure to verify the CPW model                                      | 102 |
| 104 | $S_{13}$ magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW     |     |
|     | Model                                                                         | 103 |
| 105 | $S_{13}$ phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model . | 103 |
| 106 | $S_{14}$ magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW     |     |
|     | Model                                                                         | 104 |
| 107 | $S_{14}$ phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model . | 104 |
| 108 | $S_{34}$ magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW     |     |
|     | Model                                                                         | 105 |
| 109 | $S_{34}$ phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model . | 105 |
| 110 | A two port signal-flow graph [25].                                            | 108 |

| 111 | A two port S-parameter network with sources and terminations                | 109 |
|-----|-----------------------------------------------------------------------------|-----|
| 112 | The convolution of the impulse response $s_{ij}$ and constant $a_{j,DC}$    | 110 |
| 113 | The values of $a_j[n]$ for $n < 0$                                          | 111 |
| 114 | Transient simulation of a transmission line with DC analysis using MNA      |     |
|     | formulation                                                                 | 112 |
| 115 | A typical time-varying resistor waveform                                    | 113 |
| 116 | Rpush(t) and $Rpull(t)$ have opposite polarities.<br>                       | 113 |
| 117 | The voltage waveform at the far end of the interconnect. Solid: S-Parameter |     |
|     | simulation with DC analysis and Dots: ADS                                   | 113 |
| 118 | The simulation set up for the step response of an interconnect              | 114 |
| 119 | The step response of a long interconnect from $0-20ns$                      | 115 |
| 120 | The zoom of Figure 119 from $0-6ns$                                         | 115 |
| 121 | The sixty-four bit bus simulation set up                                    | 116 |
| 122 | The eye-diagram simulation results without and with causality enforcement.  | 116 |
| 123 | A transmission line referenced to a power-ground plane                      | 118 |
| 124 | The charging and discharging currents that model the source have opposite   |     |
|     | polarities                                                                  | 118 |
| 125 | A typical package connected to a PCB                                        | 120 |
| 126 | The simulation set up for SSN simulation                                    | 121 |
| 127 | Twenty-five 100nF SMTs and power-ground plane substrate $\epsilon_r=3.8.$   | 122 |
| 128 | Hundred 100nF SMTs and power-ground plane substrate $\epsilon_r=3.8.$       | 122 |
| 129 | Twenty-five 1nF embedded discrete capacitors and power-ground plane sub-    |     |
|     | strate $\epsilon_r = 11$                                                    | 122 |
| 130 | The flowchart of the SLeEC methodology                                      | 123 |
| 131 | The Norton companion model for a 1D FDTD grid terminated with PEC           |     |
|     | boundary                                                                    | 124 |
| 132 | A cascade of ABCD matrices reduced to a single block by multiplying the     |     |
|     | parameters of individual blocks                                             | 124 |
| 133 | The reduced FDTD grid network                                               | 125 |

| 134 | All the field coefficients in the entire 1D grid can be calculated using $E_1$ alone |     |
|-----|--------------------------------------------------------------------------------------|-----|
|     | for the given boundary conditions                                                    | 125 |
| 135 | A 2D FDTD grid                                                                       | 126 |
| 136 | The calculation of the fields in the entire grid using the E-field values in         |     |
|     | Column 1 alone                                                                       | 127 |
| 137 | A 2D FDTD grid with metallization                                                    | 127 |
| 138 | A T-parameter matrix with asymmetric input-output ports                              | 127 |
| 139 | FDTD cells in a 3D grid                                                              | 128 |

## Summary

The objective of this research work is to develop an efficient methodology for chip-package cosimulation. In the traditional design flow, the integrated circuit (IC) is first designed followed by the package design. The disadvantage of the conventional sequential design flow is that if there are problems with signal and power integrity after the integration of the IC and the package, it is expensive and time consuming to go back and change the IC layout for a different input/output (IO) pad assignment. To overcome this limitation, a concurrent design flow, where both the IC and the package are designed together, has been recommended by researchers to obtain a fast design closure. The techniques from this research work will enable multiscale cosimulation of the chip and the package making the concurrent design flow paradigm possible.

Traditional time-domain techniques, such as the finite-difference time-domain method, are limited by the Courant condition and are not suitable for chip-package cosimulation. The Courant condition gives an upper bound on the time step that can be used to obtain stable simulation results. The smaller the mesh dimension the smaller is the Courant time step. In the case of chip-package cosimulation the on-chip structures require a fine mesh, which can make the time step prohibitively small. An unconditionally stable scheme using Laguerre polynomials has been recommended for chip-package cosimulation. Prior limitations in this method have been overcome in this research work. The enhanced transient simulation scheme using Laguerre polynomials has been named SLeEC, which stands for simulation using Laguerre equivalent circuit. A full-wave EM simulator has been developed using the SLeEC methodology.

A scheme for efficient use of full-wave solver for chip-package cosimulation has been proposed. Simulation of the entire chip-package structure using a full-wave solver could be a memory and time-intensive operation. A more efficient way is to separate the chip-package structure into the chip, the package signal-delivery network, and the package power-delivery network; use a full-wave solver to simulate each of these smaller subblocks and integrate them together in the following step, before a final simulation is done on the integrated network. Examples have been presented that illustrate the technique.

#### 1 Introduction

The consumer demand for electronics products with more functionality, better performance, smaller size, less weight, and lower cost has given rise to numerous issues in signal integrity and power integrity. As more functionality is integrated in a package, there is more communication between the chips, resulting in larger number of input/output (IO) pins and more interconnects to be routed. With smaller spacing between the interconnects, there can be significant crosstalk, causing the product to fail. The endless requirement for faster speed has created smaller rise times and fall times on the order of picoseconds. This has pushed the frequency spectrum into the GHz range. Faster signaling creates voltage fluctuations on the power-distribution network that can cause false switching of logic circuits. The current in interconnects and on its return path creates regions of small and large electromagnetic (EM) fields. Chips placed at the locations of high EM field experience loss in signal quality due to EM coupling. With every next generation integrated circuit (IC), the voltage levels are scaled down to reduce power dissipation and transistor failure. As a result, the noise tolerance is becoming smaller.

ICs making up a system, together with passive components and the power-distribution network, are interconnected together in a package. A layout of a generic Intel PC system is shown in Figure 1 [1]. The chip marked 82975X MCH, which is known as the Northbridge, is connected to the graphics, the processor and the memory chips. The chip marked 82801GR ICH7R, which is known as the Southbridge, is connected to the slower IO chips. The various ICs making up the Intel system can be efficiently organized as a multichip module, similar to the configuration shown in Figure 2. The IC and the package do not exist independently, and therefore, in order to be able to evaluate the performance of the system, cosimulation of the chip and the package is needed. For example, the noise generated on the package affect the ICs and vice versa.

According to the International Technology Roadmap for Semiconductors (ITRS) 2005 [2], one of the challenges in future packages is to develop a chip-package cosimulation tool to analyze signal integrity and power integrity. It is becoming difficult to predict failures due to the lack of tools capable of chip-package cosimulation for accurate evaluation of



Figure 1: A generic Intel PC system.

system performance. Signal and power-integrity problems have resulted in a longer design cycle time. Failures that can be detected at the simulation level rather than at the product prototype level can save cost and time. Accurate modeling methods and tools play a key role in noise prediction. With tools that perform a system-level simulation, the number of design iterations that are needed to successfully create a working prototype, can be drastically reduced. Smaller design cycle time reduces cost, as well as decreasing the time



Figure 2: A multichip module.

required to deliver a product to the consumer. The objective of this research is to develop an accurate, time-efficient, and memory-efficient technique for chip-package cosimulation.

#### 1.1 Development of CAD Tools

CAD tools are indispensable in the development of any electronic system for today's market. The Intel microprocessor family from the years 1970 to 2005 is shown in Figure 3 [1]. What is often neglected is the importance played by CAD tools in the progress. Initial processors contained limited number of transistors and were hand crafted. With increasing number of transistors and more integration of digital and analog, modern electronic design without the aid of CAD tools is unthinkable.



Figure 3: The Intel processor family and CAD tools.

CAD tools play different roles such as optimization of logic, automatic placement and routing of transistors, time/frequency-domain simulation for noise prediction. They reduce the product development time and help create better designs that work more efficiently. Most of the CAD tools have a certain domain where they can be applied. Some tools operate purely at the chip level, while others only at the package level. A drawback of this approach is that there is no feedback between the two. The noise generated by the parasitics of the package structures can cause the chip to fail. The objective of this thesis is to develop simulation methodologies in the time and frequency domain to enable cosimulation at the chip and the package levels.

A traditional sequential design flow of an IC and a package is shown in Figure 4 [3]. In the sequential design flow, the chip is first designed followed by the package design based on the IO pad assignments of the IC. A disadvantage of this approach is that the problems that occur due to the integrated chip and the package go undetected until the final stage. The reason for using this type of design flow is because of the lack of CAD tools that are available in order to be able to design both the chip and the package in parallel.

The recommended package-aware design flow is shown in Figure 5 [3]. In this paradigm, both the package and the chip are planned concurrently to ensure signal and power-integrity closure. The advantage of this approach is that potential signal and power-integrity issues that can occur within the chip, the package, or as a result of the integrated chip and the package, are detected early in the design stage. Making changes to the design in the early stages is much easier, faster, and more cost effective. Package-aware integrated-circuit design results in a faster product turnaround time. The simulation methodologies developed in this thesis can be applied to perform chip-package cosimulation at the design stage, thereby making the package-aware design-flow possible.

The multiscale feature of the chip-package structure makes it difficult to use conventional tools for simulation. The on-chip structures require a very fine mesh, while a coarse mesh can be used for the package structures. The large variation in the mesh dimensions, as well as the fine mesh, make the simulation time and the memory requirement prohibitively large. In this thesis, an efficient simulation methodology that uses Laguerre polynomials for simulation has been developed. Several test cases have been simulated that show the advantage of using the Laguerre polynomials based scheme for multiscale simulation over the conventional methods.

# 1.2 Common Signal and Power-Integrity Problems Present in a Chip and a Package

A list of some of the parasitics at the chip and the package levels is shown in Figure 6.

1. Non-ideal power-ground structures:

The transient current that is drawn by switching logic circuits from a power-ground

# Design Convergence (Traditional Sequential Timeline)



Figure 4: Traditional sequential design flow [3].

#### Design Convergence (Concurrent Design Approach) Power-Ground I/O SI Closure Early planning and prototype Planning, Physical Synthesis Routing, Finishing Tape out IC Flow Release to MFG (3-6 weeks) Net list I/O drivers constraints Design verification Techfiles Final Bump Pattern and 1st pass acceptance and Design layout SPG Ratios IC Package Package Exploration Design complete, and Optimization Release to MFG (8-12 weeks)

Figure 5: Package-aware design flow. [3].

plane produces simultaneous switching noise (SSN) due to the non-ideal planes in the power-distribution network. A cross section of a power-ground plane is shown in Figure 7. The transient current drawn from a chip is modeled by a current source. An ideal voltage supply is also connected to the power-ground plane. A typical waveform at some point on the power-ground plane marked *probe* is shown in the figure. Rather than an ideal constant voltage, noise voltage on the order of hundreds of millivolts, known as simultaneous switching noise (SSN), is typically present for chips operating in the GHz range. Power-supply noise can cause problems such as false switching in logic circuits [4]. Power grids on the chip are also non ideal and increase the noise voltage [5]. On-chip and package decoupling capacitors are placed to minimize SSN [6]. Simulators should also be capable of including passive components such as capacitors, inductors and resistors, which are almost always present.

#### 2. Reflections due to imperfect terminations:

Terminations in interconnects that are not matched to their charactersitic impedances will result in problems such as ringing and reflections [4]. These reflections degrade the performance of the driver and the receiver ICs that are attached to the net.



Figure 6: Common signal and power-integrity problems present in a package.



Figure 7: Simultaneous switching noise (SSN).

#### 3. Parasitics of via, solder-bump, package leads, wirebonds:

Solder bumps, leads, and wirebonds are interfaces between the chip and the package. In addition to carrying signals between the chip and the package domains, their parasitics generate significant noise degrading the performance [7] [8]. Via parasitics are also critical to accurately estimate the noise voltage levels. The parasitics may cause reflections, which can introduce ringing in the waveform.

#### 4. Interconnect parasitics:

Interconnects are modeled using cascaded lumped elements composed of inductors (L), capacitors (C), resistors (R, G) and mutual inductance. Cascaded lumped element model for transmission lines based on R,L,G,C per unit length matrices for multiconductor transmission lines is given in [9]. The loss in the interconnects and substrate attenuate the signal as it propagates in the channel.

## 1.3 Proposed Research

The objective of the proposed research is to develop a transient simulation methodology for chip-package cosimulation. The solver should be capable of solving large practical problems; it must be fast and accurate; the techniques should be robust to model structures of different types of configurations. Based on the proposed research the following research work has been completed:

#### 1. Transient simulation using Laguerre polynomials

Transient simulation using Laguerre polynomials is unconditionally stable and therefore, has the advantage of not being limited by any time step. Laguerre FDTD has shown to be  $70-80\times$  faster than the conventional FDTD scheme. For chip-package cosimulation, the on-chip structures require a very small mesh making the time step prohibitively small for simulation using the conventional finite-difference time-domain scheme. Since Laguerre FDTD is unconditionally stable, it is ideally suited for chip-package cosimulation. Since its introduction, several modifications have been made to the algorithm. The new methodology has been named SLeEC and stands for simulation using Laguerre equivalent circuit.

#### 2. Simulation for any length of time

The limited time duration for which Laguerre FDTD could be simulated has been resolved, so that Laguerre FDTD can now be done for any length of time.

#### 3. Companion model of the FDTD grid

An equivalent circuit model of the FDTD grid has been developed, reducing the number of unknowns to be solved without the use of long cumbersome equations.

#### 4. Transient circuit simulation using Laguerre polynomials

Laguerre FDTD has been applied to circuit problems consisting of passive circuit components such as inductors with mutual inductance, resistors, and capacitors.

#### 5. Choosing the correct number of basis coefficients

Transient simulation using Laguerre polynomials requires finding the correct number of basis coefficients to accurately represent the time-domain waveform. A numerical way by which the correct number of basis functions are chosen has been proposed.

#### 6. Full-wave EM simulator using the SLeEC methodology

A 3D time-domain EM simulator that uses Laguerre polynomials for transient simulation has been developed. A variety of test cases have been simulated to demonstrate the advantage of using this tool for chip-package cosimulation.

#### 7. Obtaining frequency-domain parameters through time-domain simulation

A way by which frequency-domain parameters can be obtained from time-domain simulation has been proposed. Results show that time-domain windowing is necessary before conversion to frequency-domain parameters to obtain the right results.

#### 8. Efficient use of full-wave solvers for chip-package cosimulation

For chip-package cosimulation, rather than using a full-wave solver to simulate the entire structure, the structure can be partitioned into different blocks and a full-wave solver can be applied to each of these blocks separately. Results from each of these blocks can be integrated together to model the complete structure. The on-chip structures have been simplified to a great extent. The proposed technique has been demonstrated for package power-ground planes and package interconnects. The following tasks have been completed:

#### (a) Modeling of microstrip lines referenced to a power-ground plane

A microstrip-line configuration is shown in Figure 8. Given two-port frequency-domain parameters of a microstrip line, which has ports located at the near end and the far end of the microstrip, as well as two-port frequency parameters of the power plane, which has ports located at the near-end reference and the far-end reference of the microstrip, an admittance matrix model to integrate the interconnect and the power plane has been developed. The two-port admittance matrix model has been generalized to an N-port model that can be used to model N coupled microstrip lines referenced to a power-ground plane. To demonstrate scalability, a 64-bit bus referenced to a power-ground plane has been simulated.

#### (b) Modeling of a conductor-backed coplanar-waveguide structure

It is common for interconnects to be routed on the same layer as the power or a ground plane, by creating a slot on the plane and routing the interconnect in the slot, as shown in Figure 8. The interconnect and the power-ground plane-pair form a conductor-backed coplanar-waveguide structure. Given such a configuration, the frequency parameters of the interconnect are obtained separate from those of the power plane. The two sets of frequency parameters are integrated together using multiconductor transmission line theory. A good correlation between Sonnet<sup>®</sup> and the proposed model has been obtained over a wide bandwidth of 8GHz.

- (c) DC Analysis with frequency-domain parameters

  A method has been developed to include DC sources along with frequency-domain parameters in transient cosimulation of package interconnects and package power-ground planes. Augmenting the transient simulation method to include DC operating point has been completed. A transmission line example, showing a good match with ADS® has been accomplished.
- (d) Memory optimization for linear transient simulation with current sources Memory can be reduced significantly for linear transient simulation with transient current sources. The memory complexity can be reduced from  $O(N^2)$  to O(N), where N is the number of ports in the frequency-domain parameter block.

#### 1.4 Dissertation Outline

Conventional time-domain solvers are limited by an upper bound on the time step that can be used to obtain stable and accurate simulation results. This limit in the time step, known as Courant time step, is a major bottleneck for chip-package cosimulation. With small mesh dimensions required for on-chip structures, the time step can become prohibitively small. Transient simulation using Laguerre polynomials is unconditionally stable and is not limited by the Courant time step. Prior limitations have been overcome and the enhanced methodology is called SLeEC and stands for simulation using Laguerre equivalent circuit. SLeEC can be applied to both 3D EM simulation and linear transient circuit simulation. Circuits composed of resistors, inductors (with mutual inductance), capacitors, and linear voltage/current sources can be simulated using SLeEC. Transient simulation results show excellent correlation between the proposed technique and the traditional EM/circuit



Figure 8: A microstrip and a conductor-backed coplanar-waveguide configuration.

simulators.

For chip-package cosimulation, rather than using a full-wave solver to simulate the entire structure, the structure can be partitioned into different blocks and full-wave solver can be applied to each of these blocks separately. Results from each of these blocks can be integrated together to model the complete structure. The on-chip structures have been simplified to a great extent. The proposed technique has been demonstrated for package power-ground planes and package interconnects. The methodology is memory efficient and scalable to large problems. The technique can be used for frequency-domain and time-domain simulation of package structures. Examples showing the scalability of this technique to realistic test cases are given. The technique permits the use of complex non-linear driver models in the simulation. A memory optimization technique for linear systems, which also results in faster simulation, has been proposed.

The remainder of this thesis is organized as follows: The SLeEC methodology is given in Chapters 2-6; 3D EM test cases showing good correlation between the conventional FDTD scheme and the SLeEC methodology is presented in Chapter 7; transformation from time-domain to frequency-domain parameters is given in Chapter 8, followed by efficient use of full-wave solvers for chip-package cosimulation in Chapter 9.

## 2 Transient Simulation Using Laguerre Polynomials

#### 2.1 Introduction

The finite-difference time-domain (FDTD) scheme has been a ubiquitous method for transient electromagnetic (EM) analysis [10] and circuit simulation [11]. The main drawback of FDTD is the Courant-Friedrich-Levy (CFL) condition, which will be referred to as the Courant condition, that limits the time step that can be used to obtain stable and accurate simulation results. In EM analysis, the smaller the mesh dimension, the smaller is the Courant time step [10]. In mathematical form, the Courant condition for EM simulation is

$$\Delta t < \frac{1}{v_{max}} \left( \left( \frac{1}{\Delta x} \right)^2 + \left( \frac{1}{\Delta y} \right)^2 + \left( \frac{1}{\Delta z} \right)^2 \right)^{-\frac{1}{2}},$$
 (1)

where  $v_{max}$  is the maximum phase velocity of the wave propagation,  $\Delta x$ ,  $\Delta y$ , and  $\Delta z$  are the smallest mesh dimensions in the x, y, and z directions [10]. The time-step limit for numerical stability can be derived using dispersion analysis [10]. A summary of the derivation in [10] is given in the appendix in Chapter 12.

In transient circuit simulation of passives such as resistors, inductors, and capacitors, the maximum allowable time step is a function of the smallest inductor and capacitor values [12].

$$\Delta t < \sqrt{L_{min}C_{min}} \tag{2}$$

Courant-like condition for stability in the circuit domain is given in Equation 2, where  $L_{min}$  and  $C_{min}$  are the smallest inductor and capacitor values in the circuit.

The Courant condition is a major bottleneck in using FDTD for chip-package cosimulation. Multiscale dimensions in a chip-package structure is shown in Figure 9. The on-chip structures are in the *nanometer* scale, the solder pads typically have a diameter of  $50\mu m$ , the package interconnects are in the  $100\mu m$  range, and the package structures, such as the power-ground planes, are in the mm scale. The on-chip structures that are in the nm range would require a fine mesh for simulation, making the time step prohibitively small.

An unconditionally stable implicit-FDTD scheme using Laguerre polynomials has been proposed in [13]. The method presented in [13] will be referred to as the Laguerre-FDTD scheme in the rest of this document. Laguerre FDTD is unconditionally stable and therefore,



Figure 9: Multiscale features in a chip-package structure.

the time step is not limited by the Courant condition. It has been shown in [13] that Laguerre FDTD can be  $80 - 100 \times$  faster than the conventional FDTD scheme. Since transient simulation using Laguerre polynomials is unconditionally stable, it is ideally suited for chip-package cosimulation.

The following modifications and additions to the original Laguerre-FDTD scheme in [13] have been made in this research work:

- 1. The limited time duration for which Laguerre FDTD could be simulated has been resolved, so that Laguerre FDTD can now be done for all time duration.
- 2. An equivalent circuit model, which is also known as a *companion model*, of the FDTD grid has been developed, reducing the number of unknowns to be solved without the use of long cumbersome equations.
- 3. Laguerre FDTD has also been applied in transient simulation of circuits consisting of passive circuit components, such as inductors with mutual inductance, resistors, and capacitors. The companion models for these components, which allow easier implementation, have also been developed.
- 4. In Laguerre FDTD, the time-domain source waveforms are represented in the Laguerre domain by a set of Laguerre coefficients. The source coefficients are used to solve for the unknown values in the Laguerre domain. The output of interest is converted back to the time domain from the Laguerre domain to obtain the transient waveform. To

obtain maximum accuracy, the right number of coefficients has to be used to generate the time-domain waveform. A numerical way by which the correct number of basis coefficients are chosen has been proposed.

Each of these modifications are explained in detail after the algorithm has been presented. The new and improved algorithm has been named SLeEC, which stands for simulation using Laguerre equivalent circuit.

#### 2.2 The SLeEC Alogrithm

SLeEC can be applied to linear transient circuit simulation, as well as time-domain electromagnetic simulation. In circuit simulation, SLeEC can be used in transient analysis of linear passive components such as resistors, capacitors, inductors, mutual inductance, voltage sources, and current sources. In electromagnetic simulation, SLeEC can be used for transient analysis instead of the traditional leap-frog scheme.

The flowchart of the SLeEC methodology is shown in Figure 10. A summary of the



Figure 10: The flowchart of the SLeEC methodology.

methodology is given in this paragraph, followed by a detailed explanation of each of the steps. The first step is to convert the input source waveforms from time domain to Laguerre domain. A time-domain waveform can be represented in the Laguerre domain by a set of coefficients. The next step is to replace the (1) capacitors, inductors, mutual inductance, in the

case of circuit simulation and (2) the FDTD grid, in the case of electromagnetic simulation, with their respective companion models. The companion models in the time domain for an inductor and a capacitor is given in [14]. The companion models in the Laguerre domain for the FDTD grid, resistors, capacitors, and mutual inductance are derived in this thesis. The companion models in the Laguerre domain are made up of resistors, current sources, voltage sources, and controlled sources. A DC analysis is done once for each of the basis coefficients that represents the input waveforms. Although multiple input waveforms maybe present, they can all be taken into account in a single DC analysis. At the end of each of the DC analyses the companion models are updated before the next DC analysis. The DC solution represent the Laguerre basis coefficients of its corresponding time-domain waveform. In the companion model of the FDTD grid for EM simulation, the nodal voltages are mapped to electric-field coefficients and the branch currents to magnetic-field coefficients. The final step is to convert the DC values for the output of interest to the time domain. Detailed explanation of each of these steps is given in the following subsections.

It is worth mentioning earlier that the DC values do not have to be saved at each iteration. Once the companion models are updated at the end of each iteration, there is no need to save the DC solution. Only the DC values for the output of interest needs to be saved at the end of each iteration, making the algorithm memory efficient.

#### 2.2.1 Transformation from time domain to Laguerre domain

Laguerre polynomials are defined recursively as follows [13]:

$$L_0(t) = 1, (3)$$

$$L_1(t) = 1 - t, (4)$$

$$pL_p(t) = (2p - 1 - t)L_{p-1}(t) - (p - 1)L_{p-2}(t); \text{ for } p \ge 2.$$
 (5)

Laguerre polynomials satisfy the relationships

$$\int_0^\infty \varphi_u(\bar{t})\varphi_v(\bar{t})d\bar{t} = \delta_{uv},\tag{6}$$

$$\varphi_u(\bar{t}) = e^{-\bar{t}/2} L_u(\bar{t}), \tag{7}$$

$$\bar{t} = s \cdot t. \tag{8}$$

In Equations 6 - 7,  $\bar{t}$  is the real time multiplied by a scaling factor s, as shown in Equation 8. The actual time scale for which the simulation is run is very small. To make the basis function work, the real time is multiplied by s to scale the magnitude to the order of seconds.  $\varphi(\bar{t})$  is Laguerre polynomial L weighted by the exponential function  $e^{-\bar{t}/2}$ , as given in Equation 7. A Laguerre polynomial weighted by the exponential function satisfies the orthonormal property of basis functions given in Equation 6.  $\delta_{uv}$  in Equation 6 is the Kronecker delta function. The weighted Laguerre polynomials for orders p = 0 to p = 4 are shown in Figure 11 [13].



Figure 11: Weighted Laguerre polynomials for orders p = 0 to p = 4.

A triangular waveform with a rise/fall time of 10 ps, and a delay of 10 ps is shown in Figure 12. The triangular waveform, W(t), can be represented as a sum of weighted Laguerre polynomials as

$$W(t) = \sum_{p=1}^{p=N} W_p \varphi_p(\bar{t}). \tag{9}$$

In Equation 9,  $W_p$  represents the  $p^{th}$  coefficient of the  $p^{th}$  basis function  $\varphi_p$ .  $W_p$  can be obtained using the orthonormal property of basis functions and is given in Equation 10.

$$W_p = \int_0^\infty W(t)\varphi_p(\bar{t})d\bar{t} \tag{10}$$

The dotted curve in Figure 12 is the original triangular waveform and the solid curve is the waveform reconstructed using 200 coefficients of the Laguerre basis functions. The two waveforms in Figure 12 are indistinguishable. The first 200 coefficients of the basis functions are shown in Figure 13. The scaling factor used to construct the waveform is  $s = 3.0 \times 10^{12}$ .

#### 2.2.2 Companion Models

The second step is to replace the FDTD grid, or the passive components in the case of circuit simulation, with their respective Laguerre companion models. SLeEC requires solving a system of linear equations of the form Ax = b to obtain the unknown  $p^{th}$  Laguerre basis coefficients. These equivalent circuit models enable the use of stamp rule in modified nodal analysis to set up and solve the matrix, thereby making the implementation easier [14]. The popular Spice simulator uses modified nodal analysis for simulation. Therefore SLeEC can be seamlessly integrated with Spice to do transient circuit/EM simulation using Laguerre polynomials. In addition, as explained in Chapter 5, the companion models help reduce the dimension of the matrix to be solved without the use of long and cumbersome equations.

Companion models are composed of resistors, independent and dependent voltage/current sources. Detailed derivation of the models for circuit/EM simulation are given in Chapter 4 and Chapter 5. As a preview, the companion model for a 1D FDTD grid is shown in Figure 14. Vertical bars in the grid represent  $E_z$  fields and  $\times$  represent  $H_y$  fields. The grid is terminated using the perfect electric conductor (PEC) boundary condition. Nodal voltages represent the  $E_z$  fields and branch currents represent the  $H_y$  fields. The PEC boundary condition can be represented in the companion model by a short circuit, as shown in the figure.

#### 2.2.3 DC Analysis

To obtain the  $p^{th}$  Laguerre basis coefficients of the unknown values, a DC analysis is done once on the companion circuit model. The DC solution represent the Laguerre basis coefficients of their corresponding time-domain waveform. At the end of the DC analysis, the solution is used to update the companion model before solving for the  $(p+1)^{th}$  basis coefficients. The two-step process of updating the companion model and solving the matrix is repeated until enough coefficients have been obtained to accurately represent the time-domain waveform for the output of interest.



Figure 12: The original (dots) and reconstructed (solid) triangular waveform using Laguerre basis functions.



Figure 13: The coefficients of basis functions for the triangular waveform in Figure 12.



Figure 14: The companion model for a unit cell in a 1D FDTD grid.

Two important points to be noted are (1) There is no need to save the entire DC solution at every iteration. Once the companion model is updated, there is no need to store the DC solution for the next iteration. Only the coefficients for the output of interest need to be saved. (2) The A matrix when solving Ax = b stays constant throughout the iterations. Only the b matrix is updated at every iteration. Therefore, LU decomposition for solving Ax = b needs to be done only once.

#### 2.2.4 Transformation from Laguerre domain to time domain

The final step is to convert the Laguerre-domain coefficients to time domain for the output of interest. This is done using Equation 9. In order to maximize the accuracy, the right number of basis coefficients must be used to generate the time-domain waveform. The methodology for choosing the correct number of basis coefficients is given in Chapter 6.

#### 2.3 Advantages of Laguerre Polynomials

There are several reasons why Laguerre polynomials is attractive over other orthogonal polynomials:

- 1. Transient simulation using Laguerre polynomials is unconditionally stable.
- 2. Laguerre polynomials allows a simple method for choosing the correct number of basis coefficients to obtain maximum accuracy when generating its corresponding time-domain waveform.

- 3. When solving for N Laguerre basis coefficients  $\{W_0, W_1, \dots, W_{N-1}\}$ , the dimension of the matrix to be solved is independent of N.
- 4. Although a matrix of the form Ax = b is solved to obtain the  $p^{th}$  Laguerre basis coefficients, the A matrix is independent of p and LU-decomposition has to be done only once when solving for the N coefficients  $\{W_0, W_1, \ldots, W_p, \ldots, W_{N-1}\}$ .
- 5. There is no need to save the DC solution at every iteration. Once the companion model is updated, there is no need to store the DC solution for the next iteration. Only the coefficients for the output of interest need to be saved. In this respect, Laguerre polynomials makes the algorithm memory efficient.
- 6. And most important of all, it works well. A good correlation has been obtained with FDTD for all of the test cases that have been simulated.

## 2.4 Summary

Transient simulation using Laguerre polynomials is unconditionally stable. The Laguerre-FDTD scheme was proposed in [13]. Several modifications to the Laguerre-FDTD scheme have been made in this research work. The improved simulation methodology has been named SLeEC, which stands for simulation using Laguerre equivalent circuit.

# 3 Simulation For Any Length of Time

### 3.1 Introduction

A major drawback of the Laguerre-FDTD methodology in [13] is that the transient simulation can be performed only for a limited time duration and cannot be done for all time. There are two reasons for this limitation: the first reason is the nature of the Laguerre basis functions and the second reason is the finite precision of the computer making it impossible to represent very large numbers or very small numbers. Elaborate explanations of the reasons for the limitation will be followed by a solution that allows simulation for all time duration.

The first reason for the limitation is due to the nature of the basis functions. The Laguerre basis functions for order p = 0-4 are plotted in Figure 11. As shown in the figure, the basis functions approach 0 as t tends to  $\infty$ . Therefore, any time-domain waveform that is spanned by these set of basis functions also goes to 0 as t tends to  $\infty$ . Structures that are lossless or have a low loss cannot be simulated accurately because the fields can be nonzero for a long period of time.

The second reason for the limitation is the finite precision of the computer. A Laguerre basis function of order p is an exponentially decaying function multiplied by the  $p^{th}$  Laguerre polynomial. The exponential function quickly decays to 0 and beyond a certain time the exponential function is treated exactly as 0. Laguerre polynomials become very large with increasing time. Beyond a certain time, the numbers become very large to be represented with the limitation of finite precision and is represented as Inf in the IEEE 754 floating-point standard. Consequently, beyond a certain time point, the basis function is represented as  $0 \times Inf$  or NaN, not a number.

## 3.2 An Example to Demonstrate the Limitation

An example where Laguerre FDTD is unable to capture the transient response beyond some time is presented in this paragraph. A lossless resonant cavity containing the fields  $E_x$ ,  $E_y$ , and  $H_z$  that is terminated with PEC boundary is shown in Figure 15. A modulated

Gaussian source waveform with the same parameters as [13] is used as the  $J_y$  current source and is placed along the vertical dashed line in Figure 15. The number of cells used to mesh the structure is  $10 \times 10$ . The  $E_y$  field at the location marked probe between 15ns and 21ns is plotted in Figure 16. Theoretically, since the cavity is lossless, the fields must never decay to 0. The solid waveform has been obtained using the Laguerre-FDTD scheme and the dots by the conventional FDTD scheme. Since the basis functions go to 0 as t tends to  $\infty$ , the solid waveform starts to decay to 0, as shown in the figure. The abrupt termination of the solid waveform, which is indicated by the box, occurs due to the limitation of finite precision, as explained in the previous paragraph.



Figure 15: A 2D box with PEC boundary.

### 3.3 A Solution to Overcome the Limitation

The solution to overcome this limitation is to divide the total simulation time into different intervals. Let Interval I span from time  $t = t_0$  to  $t = t_1$ , Interval II span from time  $t = t_1$  to  $t = t_2$ , and so on, as shown in Figure 17. The length of each interval is chosen such that simulation can be accurately performed in that time duration. The final values at the end of Interval i are used as initial conditions to simulate in Interval (i + 1). This process is repeated until the time duration for which the simulation needs to be done is completed. The companion models for circuit simulation and EM FDTD simulation, which will be presented in Chapters 4 and 5, include initial conditions to enable restarting a



Figure 16: The transient simulation response from 15ns to 20ns. Solid: the Laguerre-FDTD scheme and Dots: FDTD



Figure 17: The total simulation time is divided into different intervals.

simulation. The differential equations that describe the transient behavior of a system have been modified to explicitly include initial conditions that will permit simulation for all time duration. The SLeEC algorithm that is presented in Chapter 2.2 is applied in each of the time intervals.

It must be noted that Laguerre-MNA does not require storing all nodal voltages and all branch currents from the series of DC analysis that has been performed. At the end of each DC analysis, once the companion models have been updated, there is no need for saving the solution. The only solution that needs to be stored at the end of each DC analysis is the solution of the output for which the transient waveform is to be observed, which is a constant amount of memory.

## 3.4 Computing the Final Values at the End of an Interval

The final values at the end of an interval, e.g. Interval i, must be computed in order to use these values as the initial conditions in the next time interval, Interval (i+1). For example, the initial condition for a capacitor is the initial voltage across the capacitor and the initial condition for an inductor is the initial current through the inductor. Not all the coefficients, i.e. the DC solution for the voltage across a capacitor and the current through an inductor need to be saved to compute the final value at the end of a time interval. At the end of each DC analysis, the contribution of  $p^{th}$  Laguerre basis coefficient  $(W_p)$  to the final value of the transient waveform at the end of a time-interval  $(t_f)$  can be computed by using Equation 11.

$$value(t_f) = value(t_f) + W_p \varphi_p(st_f)$$
(11)

 $value(t_f)$  is first initialized to 0, before using Equation 11. By using Equation 11, the coefficients of the DC solution need not be saved in order to compute the final value of a quantity at the end of a time-interval.

## 3.5 Examples of Simulation Using Initial Conditions

An LC circuit is shown in Figure 18. The values for L and C are 1nH and 1pF, respectively.



Figure 18: The circuit for transient simulation with initial conditions.

The initial conditions are the voltage across a capacitor and the current through an inductor at time t = 0. The initial current through the inductor, i(0), is -8.12 mA and the initial voltage across the capacitor, V(0), is 0.18 V. The transient simulation waveforms of V(t) generated using 200 Laguerre basis coefficients and 400 coefficients are shown in Figure 19

and Figure 20, respectively. Note the abrupt termination of the waveform around 0.5ns in Figure 20. Simulation beyond this time requires restarting the simulation again with the new initial conditions.

As a second example, for the structure shown in Figure 15, the simulation time of 7.5ns is divided into two intervals, 5ns and 1.5ns.  $E_y(t)$  at the location marked probe in Figure 15 for Interval I is shown in Figure 21. The solid line has been obtained using SLeEC and the dots is from the conventional FDTD scheme. The time-scale factor used in Interval I is  $s = 7.0 \times 10^{10}$ . The final values of the fields at the end of Interval I are used as initial conditions for simulation in Interval II. The transient waveform for Interval II is plotted in Figure 22. The value of the time-scale factor used for the smaller Interval II is  $s = 7.56 \times 10^{11}$ . The number of basis coefficients used is 400 for Intervals I and II.

## 3.6 Summary

The Laguerre-FDTD scheme proposed in [13] has the bottleneck of being able to simulate only for a limited time duration. This limitation has been overcome in SLeEC. The total simulation time is divided into different intervals. At the end of an interval, the simulation is restarted using the final values in the previous interval as initial conditions.



Figure 19: The voltage across the capacitor V(t) with initial conditions (200 basis coefficients). Dots: WinSpice and Solid: SLeEC



Figure 20: The voltage across the capacitor V(t) with initial conditions (400 basis coefficients). Dots: WinSpice and Solid: SLeEC



Figure 21: The simulation results from 0ns to 5ns. Solid: SLeEC and Dots: FDTD



Figure 22: The simulation results from 5ns to 6.5ns using new interval length and time-scale factor. Solid: SLeEC and Dots: FDTD

# 4 Companion Models for Circuit Simulation

### 4.1 Introduction

SLeEC can be used for linear transient simulation of circuits made up of inductors with mutual inductance, capacitors, and resistors. The advantage of SLeEC is the unconditional stability by which significant speed up can be obtained over the conventional time-domain schemes that are limited by the Courant condition. In the second step of the SLeEC algorithm, shown in Figure 10, the circuit components are replaced by their respective Laguerre-domain companion models. Companion models for an inductor, capacitor, and mutual inductance are derived in Chapters 4.2, 4.3, and 4.5.

## 4.2 The Companion Model of an Inductor

The Thevenin and Norton forms of the companion model for an inductor of value L is shown in Figure 23. The *structure* of the companion models are the same for both an inductor as well as a capacitor. The current through the inductor at time t is i(t). The initial current



Figure 23: The Thevenin and Norton forms of the companion model for an inductor or a capacitor.

through the inductor is i(0) and the direction of the current flow is marked by the arrows shown in the figure. The voltages at Node A and Node B are  $V^A(t)$  and  $V^B(t)$ , respectively.  $V_p^A$  and  $V_p^B$  represent the  $p^{th}$  basis coefficients of the voltages  $V^A(t)$  and  $V^B(t)$ , respectively. The  $p^{th}$  basis coefficient of the branch current i is marked as  $i_p$ .

In the Thevenin form, an inductor is replaced by a resistor in series with two voltage sources. The voltage source marked  $V_{o,ind/cap,T}$  is a function of the initial current through the inductor and represents the initial condition. The value of the series resistor is

$$R_{ind,T} = 0.5Ls, (12)$$

where s is the time-scale factor and the subscript T stands for Thevenin. The value of the first voltage source is a function of the previous DC results of the branch currents. The value of the first voltage source is

$$V_{ind,T} = Ls \sum_{k=0, p \ge 1}^{p-1} i_k.$$
 (13)

In the first DC analysis that is done for p = 0,  $V_{ind,T}$  is set to 0. The value of the second voltage source is

$$V_{o,ind,T} = Lsi(0). (14)$$

The rest of the chapter presents the mathematical derivation of the companion model.

The voltage across the inductor is given by

$$V^A - V^B = L\frac{di}{dt} - Li(0)\delta(t). \tag{15}$$

The time varying current and voltages, i,  $V^A$ , and  $V^B$ , can be written as a sum of Laguerre basis functions as

$$i = \sum_{q=0}^{\infty} i_q \varphi_q(\bar{t}) \tag{16}$$

$$V^A = \sum_{q=0}^{\infty} V_q^A \varphi_q(\bar{t}) \tag{17}$$

$$V^{B} = \sum_{q=0}^{\infty} V_{q}^{B} \varphi_{q}(\bar{t}) \tag{18}$$

The variables  $i_q$ ,  $V_q^A$ , and  $V_q^B$  are the  $q^{th}$  basis coefficients of the current and voltages.  $\varphi_q$  is the  $q^{th}$  basis function defined in Equation 7 and  $\bar{t}$  is the scaled time defined in Equation 8. The time derivative of U can be written in the Laguerre domain as [13]

$$\frac{dU}{dt} = \frac{d}{dt} \left( \sum_{q=0}^{\infty} U_q \varphi_q(\bar{t}) \right)$$

$$= s \sum_{q=0}^{\infty} \left( 0.5U_q + \sum_{k=0,q>1}^{q-1} U_k \right) \varphi_q(\bar{t}). \tag{19}$$

Substituting Equations 16-18 in Equation 15 and using the time-derivative relationship in Equation 19, Equation 20 can be obtained.

$$\sum_{q=0}^{\infty} V_q^A \varphi_q(\bar{t}) - \sum_{q=0}^{\infty} V_q^B \varphi_q(\bar{t}) = Ls \sum_{q=0}^{\infty} \left( 0.5i_q + \sum_{k=0,q>1}^{q-1} i_k \right) \varphi_q(\bar{t}) - Li(0)\delta(t)$$
 (20)

Multiplying Equation 20 by  $\varphi_p(\bar{t})$ , integrating over time  $[0, \infty]$ , and using the orthonormal property of basis functions given in Equation 6, Equation 21 can be obtained.

$$V_p^A - V_p^B = Ls \left( 0.5i_p + \sum_{k=0, p \ge 1}^{p-1} i_k \right) - Lsi(0)$$
 (21)

In deriving Equation 21, Equation 22 is used when integrating the delta function term.

$$\int_{0}^{\infty} \delta(t)\varphi_{p}(\bar{t})d\bar{t} = s\varphi_{p}(0) = s \tag{22}$$

Equation 21 can be represented in the Thevenin form by a resistor in series with two voltage sources with the values given in Equations 12-14.

Equation 21 can be rearranged to obtain a Norton representation. Solving for  $i_p$  in Equation 21, Equation 23 can be obtained.

$$i_p = 2i(0) + \frac{1}{0.5Ls}(V_p^A - V_p^B) - 2\sum_{k=0, p \ge 1}^{p-1} i_k$$
(23)

The Norton representation of the companion model for an inductor is a resistor and two current sources, all in parallel configuration. The Norton representation is shown in Figure 23. The value of the resistor term is

$$R_{ind,N} = 0.5Ls. (24)$$

The value of the current source that represents the initial condition is

$$I_{o.ind.N} = 2i(0). \tag{25}$$

The value of the second current source that is placed in parallel with the other components is

$$I_{ind,N} = 2\sum_{k=0,p\geq 1}^{p-1} i_k. (26)$$

Using KCL and KVL equations it can be verified that the companion models satisfy Equations 21 and 23.

## 4.3 The Companion Model of a Capacitor

The companion model of a capacitor in the Thevenin and Norton forms are also shown in Figure 23. The voltage across the capacitor at time t is  $V^{AB}$ . The initial voltage across the capacitor of value C is  $V^{AB}(0)$  and the polarity of the voltage is shown in the figure.  $V_p^A$  and  $V_p^B$  represent the  $p^{th}$  basis coefficients of the voltages  $V^A(t)$  and  $V^B(t)$ , respectively. The  $p^{th}$  basis coefficient of the branch current i is marked as  $i_p$ .

The Norton form of the companion model for a capacitor is two current sources and a resistor placed in parallel, as shown in Figure 23. The current source marked  $I_{o,ind/cap,N}$  is a function of the initial voltage across the capacitor and represents the initial condition. The value of the parallel resistor is

$$R_{cap,N} = \frac{1}{0.5sC},\tag{27}$$

where s is the time-scale factor. The value of the first independent current source is a function of the previously solved DC nodal voltages across the capacitor. The value of the current source is

$$I_{cap,N} = -sC\left(\sum_{k=0,p\geq 1}^{p-1} V_k^A - \sum_{k=0,p\geq 1}^{p-1} V_k^B\right).$$
 (28)

The value of the current source that represents the initial condition is given by

$$I_{o,cap,N} = -sCV^{AB}(0). (29)$$

The derivation of the companion model of a capacitor is similar to an inductor. The current through a capacitor is given by

$$i = C\frac{dV^{AB}}{dt} - CV^{AB}(0)\delta(t). \tag{30}$$

The time-domain current and voltages can be written in terms of the Laguerre basis functions that are given in Equations 16-18. Substituting these into Equation 30, using the time-derivative relation in Equation 19, multiplying both sides by  $\varphi_p(\bar{t})$ , integrating over time  $[0, \infty]$ , and using the orthonormal property of Laguerre basis functions given in Equation 6, Equation 31 can be obtained.

$$i_p = 0.5sC(V_p^A - V_p^B) + sC\left(\sum_{k=0, p \ge 1}^{p-1} V_k^A - \sum_{k=0, p \ge 1}^{p-1} V_k^B\right) - sCV^{AB}(0)$$
(31)

In deriving Equation 31, Equation 22 is used when integrating the delta function term. Equation 31 can be represented in a Norton form by a resistor and two current sources, all in parallel, as shown in Figure 23.

Equation 31 can be rearranged to obtain a Thevenin representation of the companion model. Solving for  $V_p^A - V_p^B$  in Equation 31, Equation 32 can be obtained.

$$V_p^A - V_p^B = \frac{1}{0.5sC}i_p + 2V^{AB}(0) - 2\left(\sum_{k=0, p \ge 1}^{p-1} V_k^A - \sum_{k=0, p \ge 1}^{p-1} V_k^B\right)$$
(32)

The Thevenin represention for the companion model of a capacitor is a resistor in series with two voltage sources. The value of the resistor is given by

$$R_{cap,T} = \frac{1}{0.5sC}. (33)$$

The value of the voltage source that represents the initial condition is given by

$$V_{o,cap,T} = -2V^{AB}(0). (34)$$

The value of the second voltage source is given by

$$V_{cap,T} = -2\left(\sum_{k=0,p\geq 1}^{p-1} V_k^A - \sum_{k=0,p\geq 1}^{p-1} V_k^B\right).$$
 (35)

Using KCL and KVL equations it can be verified that the companion models satisfy Equations 31 and 32.

## 4.4 Transient Simulation of Inductor and Capacitor (LC) Circuits

The series LC circuit, which is shown in Figure 24, was simulated using the SLeEC methodology given in Figure 10. The companion model for the circuit in Figure 24 is shown in Figure 25. The input to the circuit is a triangular waveform with a 10ps rise/fall time and a delay of 10ps, as shown in Figure 12. The value used for the time-scale factor is



Figure 24: A series LC circuit.



Figure 25: The companion model for the circuit in Figure 24.

 $s=3.0\times10^{12}$ . The transient voltage across the capacitor, using 200 basis coefficients, is plotted in Figure 26. The solid line has been obtained using SLeEC and the dotted curve is the result from a commercially available circuit-simulator tool called WinSpice<sup>®</sup>. The simulation results show a good match up to 0.22ns. By increasing the number of basis functions to 400, a good match is obtained up to 0.5ns, as shown in Figure 27. The solid curve abruptly terminates around 0.5ns. As mentioned earlier in Chapter 3, simulation will have to be restarted using initial conditions for longer simulation.

The second example is an LC network with 102 nodes that is shown in Figure 28. The values of the capacitors and inductors are 1nF and 1nH, except for the capacitors connected



Figure 26: Voltage  $V^C$  in the circuit shown in Figure 24 using 200 basis coefficients. Solid: SLeEC and Dots: WinSpice



Figure 27: Voltage  $V^C$  in the circuit shown in Figure 24 using 400 basis coefficients. Solid: SLeEC and Dots: WinSpice

at Nodes 101 and 102 and the inductor connected between Nodes 101 and 102, which are 1fF and 1fH. The presence of the small valued capacitors and inductors considerably reduces the time step resulting from the Courant condition. SLeEC, which is unconditionally stable, has a significant advantage over FDTD. Simulation of this circuit using FDTD for 100ns requires over  $1.0 \times 10^8$  iterations, but simulation using SLeEC needs only 400 iterations.

The simulation results are plotted in Figure 29. The dotted curve has been obtained using WinSpice<sup>®</sup> and the solid curve is by using SLeEC. The input waveform is a triangular input with a rise/fall time of 4ns and a delay of 1ns. The time-scale factor used is  $s = 3.0 \times 10^9$ .

## 4.5 The Companion model for Mutual Inductance

In this section, the companion model for inductors with mutual coupling is derived. Two inductors, L1 and L2 with coupling M, is shown in Figure 30. The voltages across the inductors are  $V_1$  and  $V_2$  whose polarities are as shown in Figure 30. The direction of the current through the inductors is given in Figure 30. Using the *dot convention*, the voltages across the inductors including initial conditions are given by

$$V_1 = L_1 \frac{di_1}{dt} + M \frac{di_2}{dt} - L_1 i_1(0)\delta(t) - M i_2(0)\delta(t)$$
(36)

$$V_2 = M \frac{di_1}{dt} + L_2 \frac{di_2}{dt} - Mi_1(0)\delta(t) - L_2 i_2(0)\delta(t).$$
(37)

In Equations 36 and 37,  $i_1(0)$  and  $i_2(0)$  are the initial current through the inductors at time t = 0. The Laguerre-domain equations, Equations 38 and 39, corresponding to the time-domain differential equations, Equations 36 and 37, can be obtained using a procedure similar to Chapters 4.2 and 4.3.

$$V_1^p = sL_1\left(0.5i_1^p + \sum_{k=0, p>0}^{p-1} i_1^k\right) - sL_1i_1(0) + Ms\left(0.5i_2^p + \sum_{k=0, p>0}^{p-1} i_2^k\right) - Msi_2(0)$$
 (38)

$$V_2^p = sL_2\left(0.5i_2^p + \sum_{k=0, p>0}^{p-1} i_2^k\right) - sL_2i_2(0) + Ms\left(0.5i_1^p + \sum_{k=0, p>0}^{p-1} i_1^k\right) - Msi_1(0)$$
 (39)

Equations 38-39 can be represented using the Thevenin model shown in Figure 31. The



Figure 28: An LC network.



Figure 29: The transient voltage waveform at Node 102 in Figure 28. Solid: SLeEC and Dots: WinSpice



Figure 30: Two inductors with mutual coupling.



Figure 31: The companion model for mutual inductance in the Thevenin form.

values of the resistors are given by

$$R_A = 0.5L_1 s \tag{40}$$

$$R_B = 0.5L_2s. (41)$$

The current-controlled voltage sources are

$$V_A^{CCVS} = 0.5Msi_2^p (42)$$

$$V_B^{CCVS} = 0.5Msi_1^p. (43)$$

The independent voltage sources, which represent the initial conditions, are

$$V_A^{init} = -L_1 si_1(0) - M si_2(0) (44)$$

$$V_B^{init} = -L_2 si_2(0) - M si_1(0). (45)$$

The independent voltage sources, which represent the history terms, are

$$V_A^H = sL_1 \sum_{k=0, p>0}^{p-1} i_1^k + Ms \sum_{k=0, p>0}^{p-1} i_2^k$$
(46)

$$V_B^H = sL_2 \sum_{k=0, p>0}^{p-1} i_2^k + Ms \sum_{k=0, p>0}^{p-1} i_1^k.$$

$$\tag{47}$$

The companion model in the Norton form can also be obtained. Solving for  $i_1^p$  and  $i_2^p$  in Equations 38 and 39

$$i_1^p = \frac{2}{L_1 s} V_1^p - 2 \sum_{k=0, p>0}^{p-1} i_1^k + 2i_1(0) + \frac{2M}{L_1} i_2(0) - \frac{M}{L_1} i_2^p - \frac{2M}{L_1} \sum_{k=0, p>0}^{p-1} i_2^k$$
(48)

$$i_2^p = \frac{2}{L_2 s} V_2^p - 2 \sum_{k=0, p>0}^{p-1} i_2^k + 2i_2(0) + \frac{2M}{L_2} i_1(0) - \frac{M}{L_2} i_1^p - \frac{2M}{L_2} \sum_{k=0, p>0}^{p-1} i_1^k$$
(49)

Equations 48 and 49 can be represented in the Norton form as shown in Figure 32. The



Figure 32: The companion model for mutual inductance in the Norton form.

values of the conductances are given by

$$G_A = \frac{2}{L_1 s} \tag{50}$$

$$G_B = \frac{2}{L_2 s}. (51)$$

The current-controlled current sources are

$$I_A^{CCCS} = \frac{-M}{L_1} i_2^p \tag{52}$$

$$I_B^{CCCS} = \frac{-M}{L_2} i_1^p. \tag{53}$$

The independent current sources that represent the initial conditions are

$$I_A^{init} = 2i_1(0) + \frac{2M}{L_1}i_2(0) \tag{54}$$

$$I_B^{init} = 2i_2(0) + \frac{2M}{L_2}i_1(0).$$
 (55)

The independent current sources that represent the history terms are

$$I_A^H = -2\sum_{k=0, p>0}^{p-1} i_1^k - \frac{2M}{L_1} \sum_{k=0, p>0}^{p-1} i_2^k$$
(56)

$$I_B^H = -2\sum_{k=0, p>0}^{p-1} i_2^k - \frac{2M}{L_2} \sum_{k=0, p>0}^{p-1} i_1^k.$$
 (57)

Using KCL and KVL equations it can be verified that the circuit models in Figure 31 and 32 satisfy Equations 38-39 and Equations 48-49.

## 4.6 Summary

The Laguerre-domain companion models for an inductor, mutual inductance, and a capacitor have been derived in this chapter. These models can be used to do transient simulation using Laguerre polynomials using the Spice simulator.

# 5 Companion Models of the FDTD grid for EM Simulation

### 5.1 Introduction

The companion models of the FDTD grid in 1D, 2D, and 3D are derived in this section. By using the companion models, the number of unknowns in the matrix to be solved can be reduced without the use of long and cumbersome equations. The companion models enable the use of Spice to do transient EM and circuit simulation using Laguerre polynomials. The 1D, 2D, and 3D companion models for the FDTD grid are derived in Chapters 5.2, 5.3, and 5.4.

The Maxwell's equations in the differential form consists of the following set [10] and are summarized here for convenience:

$$\frac{\partial H_x}{\partial t} = \frac{1}{\mu} \left( \frac{\partial E_y}{\partial z} - \frac{\partial E_z}{\partial y} \right) \tag{58}$$

$$\frac{\partial H_y}{\partial t} = \frac{1}{\mu} \left( \frac{\partial E_z}{\partial x} - \frac{\partial E_x}{\partial z} \right) \tag{59}$$

$$\frac{\partial H_z}{\partial t} = \frac{1}{\mu} \left( \frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} \right) \tag{60}$$

$$\frac{\partial E_x}{\partial t} = \frac{1}{\epsilon} \left( \frac{\partial H_z}{\partial y} - \frac{\partial H_y}{\partial z} - J_x \right) \tag{61}$$

$$\frac{\partial E_y}{\partial t} = \frac{1}{\epsilon} \left( \frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} - J_y \right) \tag{62}$$

$$\frac{\partial E_z}{\partial t} = \frac{1}{\epsilon} \left( \frac{\partial H_y}{\partial x} - \frac{\partial H_x}{\partial y} - J_z \right) \tag{63}$$

These set of equations can be conveniently represented using the equivalent circuit model developed in Chapter 5.4.

A note on the accuracy of transient EM simulation using Laguerre polynomials compared to the conventional time-domain scheme. FDTD is second-order accurate in both the spatial and the time domain [10]. The spatial discretization is the same in both SLeEC and FDTD. Therefore SLeEC is also second-order accurate in the spatial domain. In the time domain, however, in an ideal situation when an infinite number of basis coefficients are used, Laguerre FDTD and SLeEC are exact solution without any approximations. Since a finite number

of basis coefficients are used in the simulation, the exact solution is approximated. It will be shown in Chapter 6 that a time-domain response is very sensitive to the number of coefficients that are used to generate it. It will be demonstrated later through test cases that more number of coefficients does not mean results with better match with the FDTD scheme. There is no range for the number of coefficients that would give a good agreement with the FDTD results. To obtain an optimal match with the FDTD scheme, the number of basis coefficients that must be used should be a specific number and the methodology to determine this number is explained in detail in Chapter 6. The optimal number of coefficients to generate the time-domain response varies with every test case, as well as between different probe locations within any particular test case. Numerous test cases have been simulated to verify the proposed algorithm for choosing the optimal number of coefficients.

### 5.2 1D FDTD

A 1D FDTD grid is shown in Figure 33. The only fields present are  $H_y$  and  $E_z$ . The positions of the electric fields are marked by | and those of the magnetic fields are shown by  $\times$ . The boundary conditions on either side of the grid are perfect electric conductor (PEC) boundary conditions.

The companion model of the FDTD grid is described before the derivation. The circuit model of a unit cell in an FDTD grid in terms of resistors, independent voltage sources, and independent current sources are given by the second subfigure in Figure 33. At the end of the  $q^{th}$  DC analysis, the nodal voltages and branch currents represent the  $q^{th}$  Laguerre basis coefficients of the electric fields and the magnetic fields, respectively. The value of the  $q^{th}$  Laguerre basis coefficient of the electric field  $E_z|_i^q$  is represented by the nodal voltage marked  $V_i^q$  and the magnetic fields on either side of  $E_z|_i^q$ ,  $H_y|_{i-1/2}^q$  and  $H_y^q|_{i+1/2}$ , are given by the branch currents  $I_{i-1/2}^q$  and  $I_{i+1/2}^q$ , respectively. The circuit model of the unit cell is cascaded to represent as many unit cells as needed. The model is terminated by a short circuit on both the sides to represent the PEC boundary conditions. More elaborate discussion on the implementation of different types of boundary conditions is given in Chapter 5.5. The nodal voltages represent electric-field coefficients and the short circuit forces the nodal voltage to



Figure 33: The companion model for a unit cell in an FDTD grid.

be zero, therefore modeling a PEC boundary. The values of the components are

$$R_1 = \frac{s\mu\Delta x}{2} \tag{64}$$

$$I_{val,i-1/2}^{q} = 2H_y^{init}|_{i-1/2} - 2\left(\sum_{k=0,q>0}^{q-1} H_y^k|_{i-1/2}\right)$$
(65)

$$R_{TH} = \frac{2}{s\epsilon \Lambda x} \tag{66}$$

$$V_{TH}^{q} = \frac{-2J_{z}^{q}|_{i}}{s\epsilon} - 2\left(\sum_{k=0,q>0}^{q-1} E_{z}^{k}|_{i}\right) + 2E_{z}^{init}|_{i}$$
(67)

 $\mu$ ,  $\epsilon$  represent the material properties of the medium,  $\Delta x$  is the unit-cell dimension,  $H_y^{init}$  and  $E_z^{init}$  are the initial conditions of the electric and magnetic fields at the location marked by their subscripts, and s is the time-scale factor given in Equation 8. The remaining section presents the derivation of the circuit model.

Maxwell's equations with initial conditions in 1D can be written as

$$\frac{\partial H_y}{\partial t} - H_y(\vec{r}, 0)\delta(t) = \frac{1}{\mu} \frac{\partial E_z}{\partial x}$$
(68)

$$\frac{\partial E_z}{\partial t} - E_z(\vec{r}, 0)\delta(t) = \frac{1}{\epsilon} \left[ \frac{\partial H_y}{\partial x} - J_z \right]$$
 (69)

 $H_y(\vec{r},0)$  and  $E_z(\vec{r},0)$  are the initial values of the magnetic and electric fields at position  $\vec{r}$ .  $\delta(t)$  is the Dirac delta function.  $H_y(\vec{r},t), E_z(\vec{r},t)$ , and  $J_z(\vec{r},t)$  can be written as the sum of N Laguerre basis coefficients given by

$$H_y(\vec{r},t) = \sum_{q=0}^{N-1} H_y^q(\vec{r})\varphi_q(\bar{t})$$
(70)

$$E_z(\vec{r},t) = \sum_{q=0}^{N-1} E_z^q(\vec{r})\varphi_q(\bar{t})$$
(71)

$$J_z(\vec{r},t) = \sum_{q=0}^{N-1} J_z^q(\vec{r})\varphi_q(\bar{t})$$
(72)

Substituting Equations 70-72 into Equations 68-69, using the time-derivative relationship given in Equation 19, and by applying the orthonormal property of the Laguerre basis functions given in Equation 6, the following equations can be obtained:

$$H_y^q|_{i+1/2} = -2\left(\sum_{k=0}^{q-1} H_y^k|_{i+1/2}\right) + 2H_y^{init}|_{i+1/2} + \frac{2}{s\mu\Delta x} \left(E_z^q|_{i+1} - E_z^q|_i\right)$$
(73)

$$E_z^q|_i = -2\left(\sum_{k=0,q>0}^{q-1} E_z^k|_i\right) + 2E_z^{init}|_i + \frac{2}{s\epsilon\Delta x} \left[H_y^q|_{i+1/2} - H_y^q|_{i-1/2}\right] - \frac{2}{s\epsilon}J_z^q|_i.$$
 (74)

In deriving Equation 73-74, Equation 75 is used when integrating the delta function term.

$$\int_0^\infty \delta(t)\varphi_p(\bar{t})d\bar{t} = s\varphi_p(0) = s \tag{75}$$

The circuit model of the FDTD grid in Figure 33 satisfies Equations 73-74. The PEC boundary condition dictates that the electric field be zero for all the Laguerre coefficients. This is taken care of by a *short circuit*, forcing the electric-field Laguerre coefficients to be 0.

The matrix to be solved can be set up using the stamp rule [14]. The unknowns to be solved are the nodal voltages  $V_i^q$  and the currents through the independent voltage sources  $V_{TH}^q$ . One possible way of reducing the matrix dimension that needs to be solved is by substituting Equation 73 into Equation 74, so that the only unknowns that needs to be solved are the coefficients of the electric fields. However, this procedure is very cumbersome due to the length of the equations that needs to be manipulated. An easier approach is to

convert the Thevenin representation of the circuit, looking into the circuit marked by the double arrow, into a Norton form as given by the third subfigure in Figure 33.

$$R_N = R_{TH} \tag{76}$$

$$I_N^q = \frac{V_{TH}^q}{R_{TH}} \tag{77}$$

In the Norton representation, the only unknowns are the nodal voltages, which map to the electric-field coefficients. The branch currents that represent the magnetic-field Laguerre coefficients can be obtained from the solved nodal voltages in O(1) time using KCL and KVL equations. The companion model is updated using the  $q^{th}$  DC solution before performing the  $(q+1)^{th}$  DC analysis.

### 5.3 2D FDTD

A 2D FDTD grid with  $H_z$ ,  $E_x$ , and  $E_y$  fields is shown in Figure 34(a). As mentioned earlier



Figure 34: The companion model for the 2D FDTD grid.

in Chapter 3, transient simulation using Laguerre polynomials must be restarted at the end of a time interval. Initial conditions must be included in the differential equations to enable restarting the simulation. Time-domain Maxwell's differential equations without including

the initial conditions are given in [13]. Including initial conditions, similar to the 1D case, the Laguerre representation of Maxwell's equations for the 2D case, consists of the following set:

$$E_y^q|_{i,j+\frac{1}{2}} - 2E_y^{init}|_{i,j+\frac{1}{2}} = -C_x^E|_{i,j} \left( H_z^q|_{i+\frac{1}{2},j+\frac{1}{2}} - H_z^q|_{i-\frac{1}{2},j+\frac{1}{2}} \right)$$

$$-\frac{2}{s\epsilon} J_y^q|_{i,j+\frac{1}{2}} - 2\sum_{k=0,q>0}^{q-1} E_y^k|_{i,j+\frac{1}{2}}$$

$$(78)$$

$$E_x^q|_{i+\frac{1}{2},j} - 2E_x^{init}|_{i+\frac{1}{2},j} = C_y^E|_{i,j} \left( H_z^q|_{i+\frac{1}{2},j+\frac{1}{2}} - H_z^q|_{i+\frac{1}{2},j-\frac{1}{2}} \right) -2\sum_{k=0,q>0}^{q-1} E_x^k|_{i+\frac{1}{2},j}$$

$$(79)$$

$$H_z^q|_{i+\frac{1}{2},j+\frac{1}{2}} - 2H_z^{init}|_{i+\frac{1}{2},j+\frac{1}{2}} = -C_x^H|_{i,j} \left( E_y^q|_{i+1,j+\frac{1}{2}} - E_y^q|_{i,j+\frac{1}{2}} \right) + C_y^H|_{i,j} \left( E_x^q|_{i+\frac{1}{2},j+1} - E_x^q|_{i+\frac{1}{2},j} \right) - 2\sum_{k=0,q>0}^{q-1} H_z^k|_{i+\frac{1}{2},j+\frac{1}{2}}$$

$$(80)$$

where

$$C_y^E|_{i,j} = \frac{2}{s\epsilon_{i,j}\Delta y_i} \tag{81}$$

$$C_x^E|_{i,j} = \frac{2}{s\epsilon_{i,j}\Delta x_i} \tag{82}$$

$$C_x^H|_{i,j} = \frac{2}{s\mu_{i,j}\Delta x_i} \tag{83}$$

$$C_y^H|_{i,j} = \frac{2}{s\mu_{i,j}\Delta y_j}. (84)$$

The two subfigures in Figure 34(b) represent the circuit model of the unit-cell shown in Figure 34(a). The nodal voltages  $V_{i,j+1/2}$  and  $V_{i+1,j+1/2}$ , in the first subfigure in Figure 34(b), represent the  $q^{th}$  Laguerre electric field basis coefficients,  $E_y^q|_{i,j+1/2}$  and  $E_y^q|_{i+1,j+1/2}$ , respectively. The nodal voltages  $V_{i+1/2,j}$  and  $V_{i+1/2,j+1}$ , in the second subfigure in Figure 34(b), are the  $q^{th}$  Laguerre electric field coefficients  $E_x^q|_{i+1/2,j}$  and  $E_x^q|_{i+1/2,j+1}$ , respectively. The branch current marked  $I_{i+1/2,j+1/2}$  are the same values in both the subfigures and represent the  $q^{th}$  Laguerre magnetic field coefficient  $H_z^q|_{i+1/2,j+1/2}$ . The values of  $R_1$  and

 $V_{TH}|_{i,j+1/2}$  are

$$R_1 = C_r^E \tag{85}$$

$$V_{TH}|_{i,j+1/2} = 2E_y^{init}|_{i,j+\frac{1}{2}} - \frac{2}{s\epsilon}J_y^q|_{i,j+\frac{1}{2}} - 2\sum_{k=0,q>0}^{q-1} E_y^k|_{i,j+\frac{1}{2}}$$
(86)

The values of  $R_2$  and  $V_{TH}|_{i+1/2,j}$  are

$$R_2 = C_y^E (87)$$

$$V_{TH}|_{i+1/2,j} = 2E_x^{init}|_{i+\frac{1}{2},j} - 2\sum_{k=0,q>0}^{q-1} E_x^k|_{i+\frac{1}{2},j}.$$
 (88)

Equations 85-86 model Equation 78; Equations 87-88 model Equation 79. The values of  $I_{val,1}$ ,  $I_{val,2}$ ,  $I_{val,3}$ ,  $R_3$ , and  $R_4$  are

$$I_{val,1} = 2H_z^{init}|_{i+\frac{1}{2},j+\frac{1}{2}} - 2\sum_{k=0,q>0}^{q-1} H_z^k|_{i+\frac{1}{2},j+\frac{1}{2}}$$
(89)

$$I_{val,2} = C_y^H \left( V_{i+1/2,j+1} - V_{i+1/2,j} \right)$$
(90)

$$I_{val,3} = C_x^H \left( V_{i,j+1/2} - V_{i+1,j+1/2} \right)$$
(91)

$$R_3 = \frac{1}{C_x^H} \tag{92}$$

$$R_4 = \frac{1}{C_y^H}. (93)$$

Equations 89-93 model Equation 80.  $I_{val,2}$  and  $I_{val,3}$  in Figure 34(b) are shown by dotted circles and are voltage-controlled current sources that couple the two circuits together. It can be verified from KCL and KVL equations that Equations 85-93 represent Equations 78-80.

The number of unknowns that needs to be solved using MNA can be reduced by converting the Thevenin representations into the Norton equivalents. Looking into the circuit marked by the *double arrow* shown in Figure 34(b), the Thevenin circuit can be converted into the Norton model using Equations 76-77 in Chapter 5.2. The number of unknowns can also be reduced by substituting Equation 80 into Equations 78-79, such that only the electric-field Laguerre coefficients need to be solved [13]. However, this is a lot more cumbersome than converting from the Thevenin to the Norton equivalent circuit form. It should

be noted that both of these methods to reduce the number of unknowns result in the same matrix dimension. However, reducing the unknowns by the Thevenin-Norton conversion is much simpler.

### 5.4 3D FDTD

The standard FDTD Yee cell is shown in Figure 35 [28]. The cross sections of the FDTD



Figure 35: The standard Yee cell.

cell at the locations marked by the dotted lines in Figure 35 are shown in Figure 36. These represent the cross sections as viewed by standing on the  $+\infty$  of y, x, and z axes and facing the Yee cell. In the circuit model, as before, the nodal voltages represent the basis coefficients of the electric fields and the branch currents represent the magnetic fields.

Two of the six Maxwell's differential equations are given in Equations 94 and 95.

$$\frac{\partial H_z}{\partial t} - H_z^{init} \delta(t) = \frac{1}{\mu} \left( \frac{\partial E_x}{\partial y} - \frac{\partial E_y}{\partial x} \right)$$
 (94)

$$\frac{\partial E_y}{\partial t} - E_y^{init} \delta(t) = \frac{1}{\epsilon} \left( \frac{\partial H_x}{\partial z} - \frac{\partial H_z}{\partial x} - J_y \right)$$
(95)

The initial conditions are explicitly included in the differential equations before converting them to the Laguerre domain, to enable restarting the simulation beyond a certain time duration, as explained in Chapter 3. Using a procedure similar to Chapter 5.2, Equations 94-95 can be converted to the Laguerre domain as given in Equations 96-97.

$$H_{z}^{q}|_{i+\frac{1}{2},j+\frac{1}{2},k} - 2H_{z}^{init}|_{i+\frac{1}{2},j+\frac{1}{2},k} = -C_{x}^{H}|_{i,j,k} \left( E_{y}^{q}|_{i+1,j+\frac{1}{2},k} - E_{y}^{q}|_{i,j+\frac{1}{2},k} \right) + C_{y}^{H}|_{i,j,k} \left( E_{x}^{q}|_{i+\frac{1}{2},j+1,k} - E_{x}^{q}|_{i+\frac{1}{2},j,k} \right) - 2\sum_{k=0,q>0}^{q-1} H_{z}^{k}|_{i+\frac{1}{2},j+\frac{1}{2},k}$$
(96)

$$E_y^q|_{i,j+\frac{1}{2},k} - 2E_y^{init}|_{i,j+\frac{1}{2},k} = -C_x^E|_{i,j,k} \left( H_z^q|_{i+\frac{1}{2},j+\frac{1}{2},k} - H_z^q|_{i-\frac{1}{2},j+\frac{1}{2},k} \right)$$

$$+C_z^E|_{i,j,k}\left(H_x^q|_{i,j+\frac{1}{2},k+\frac{1}{2}} - H_x^q|_{i,j+\frac{1}{2},k-\frac{1}{2}}\right) - \frac{2}{s\epsilon}J_y^q|_{i,j+\frac{1}{2},k} - 2\sum_{k=0,q>0}^{q-1} E_y^k|_{i,j+\frac{1}{2},k}$$
(97)

Equations 96-97 can be represented in a circuit form as shown in Figure 37. Figure 37 represents the circuit model for the magnetic field  $H_z^q|_{i+1/2,j+1/2,k}$  and the electric field  $E_y^q|_{i,j+1/2,k}$ , at the location marked by the solid edges and their intersection in Figure 36. Only the partial 3D model for a unit cell is given in Figure 37. The complete model can be derived in a similar fashion.

The branch currents represent the  $q^{th}$  Laguerre basis coefficients of the magnetic fields and are given by

$$H_z^q|_{i+\frac{1}{2},j+\frac{1}{2},k} = I_{i+\frac{1}{2},j+\frac{1}{2},k} \tag{98}$$

$$H_z^q|_{i-\frac{1}{2},j+\frac{1}{2},k} = I_{i-\frac{1}{2},j+\frac{1}{2},k} \tag{99}$$

$$H_x^q|_{i,j+\frac{1}{2},k+\frac{1}{2}} = I_{i,j+\frac{1}{2},k+\frac{1}{2}}$$

$$\tag{100}$$

$$H_x^q|_{i,j+\frac{1}{2},k-\frac{1}{2}} = I_{i,j+\frac{1}{2},k-\frac{1}{2}}. (101)$$

The nodal voltages represent the  $q^{th}$  basis coefficients of the Electric fields.

$$E_y^q|_{i,j+\frac{1}{2},k} = V_{i,j+\frac{1}{2},k} \tag{102}$$

The branch-current circuit represents Equation 96 and the circuit connected to the node with voltage  $V_{i,j+1/2,k}$  represents Equation 97. The values of the branch-current circuit are

$$I_{val,1} = 2H_z^{init}|_{i+\frac{1}{2},j+\frac{1}{2},k} - 2\sum_{k=0,q>0}^{q-1} H_z^k|_{i+\frac{1}{2},j+\frac{1}{2},k}$$
(103)

$$I_{val,2} = C_y^H \left( V_{i + \frac{1}{2}, j+1, k} - V_{i + \frac{1}{2}, j, k} \right)$$
 (104)

$$R_1 = \frac{1}{C_x^H}. (105)$$

The circuit connected to the node with the voltage marked  $V_{i,j+1/2,k}$  have the values

$$I_{val,3} = I_{i,j+\frac{1}{2},k+\frac{1}{2}} - I_{i,j+\frac{1}{2},k-\frac{1}{2}}$$

$$\tag{106}$$

$$I_{val,4} = I_{i-\frac{1}{2},j+\frac{1}{2},k} - I_{i+\frac{1}{2},j+\frac{1}{2},k}$$
(107)

$$R_2 = C_x^E \tag{108}$$

$$R_3 = C_z^E \tag{109}$$

$$V_{src}|_{i,j+\frac{1}{2},k} = 2E_y^{init}|_{i,j+\frac{1}{2},k} - \frac{2}{s\epsilon}J_y^q|_{i,j+\frac{1}{2},k} - 2\sum_{k=0,q>0}^{q-1}E_y^k|_{i,j+\frac{1}{2},k}.$$
(110)

The number of unknowns that needs to be solved in DC analysis can be reduced by using the Norton equivalent form looking into the circuit marked by the double arrow in Figure 37. The values of the Norton equivalent circuit are given by

$$I_N = \frac{-I_{val,3}R_2 + V_{src} - I_{val,4}R_3}{R_2 + R_3} \tag{111}$$

$$R_N = R_2 + R_3 ag{112}$$

 $I_N$  has terms involving  $I_{val,3}$  and  $I_{val,4}$  and is therefore a current-controlled current source. In MNA analysis, the current-controlled current sources introduce additional unknowns, besides the unknown nodal voltages [14]. However,  $I_N$  can be implemented as voltage-controlled current sources and independent current sources by stamping the current in a branch. A voltage-controlled current source does *not* introduce additional unknowns besides the unknown nodal voltages [14]. The unknowns to be solved are *only* the electric-field coefficients, which are the nodal voltages, and the matrix dimension to be solved is in its optimal form.

With the values given by Equations 103-112, it can be verified using using KCL and KVL equations that these satisfy Equations 96-97. The partial model in Figure 37 can be completed in a similar fashion to satisfy the complete set of 3D Maxwell's differential equations in the Laguerre domain.

# 5.5 Boundary Conditions

Different types of boundary conditions can be easily represented in the companion model. The companion models, besides making the implementation easier, offer a very elegant way



Figure 36: The  $\times$ Sections of the Yee cell, which are marked by the dotted lines in Figure 35, parallel to the xz, yz and xy planes, respectively. The dots indicate the direction of the fields pointing out of the page.



Figure 37: The companion model of the 3D FDTD grid in the Laguerre domain.

to implement the algorithm. The models for the perfect electric conductor (PEC), perfect magnetic conductor (PMC), and the absorbing boundaries are presented in Chapters 5.5.1, 5.5.2, and 5.5.3.

#### 5.5.1 Perfect Electric Conductor (PEC) Boundary

In the PEC boundary, the tangential electric fields to the boundary are set to zero. As shown in Figure 38, the PEC boundary is implemented by setting a node to ideal ground. In Figure 38, the vertical bars represent the positions of the electric fields on the grid and the



Figure 38: The PEC boundary condition.

symbol  $\times$  represent the locations of the magnetic fields. The last node, which represents the electric field that is tangential to the boundary, has been set to zero. In SLeEC, the nodal voltages represent electric field coefficients. By setting the nodal voltages, which correspond to the tangential electric fields to the boundary, to zero, the PEC boundary condition can be implemented.

#### 5.5.2 Perfect Magnetic Conductor (PMC) Boundary

In the PMC boundary, the tangential magnetic fields to the boundary are set to zero. As shown in Figure 39, the PMC boundary is implemented by leaving a branch, whose current corresponds to the magnetic field that is tangential to the boundary, open circuit. In Figure 39, the vertical bars represent the positions of the electric fields on the grid and the symbols × represent the locations of the magnetic fields. The current in the last branch, which represents the magnetic-field coefficient that is tangential to the boundary, is set to zero. By leaving the branch open circuit, the current through the branch is forced to be zero, thereby implementing the PMC boundary condition.



Figure 39: The PMC boundary condition.

### 5.5.3 Absorbing Boundary Condition (ABC)

An absorbing boundary condition of the type given in [13] can be implemented by a voltagecontrolled voltage source in series with an independent voltage source, as shown in Figure 40. The nodes that correspond to the  $E_y$  fields that are tangential to the boundary are terminated in the manner shown in Figure 40.



Figure 40: The absorbing boundary condition.

# 5.6 Summary

The companion model of the 3D FDTD grid was derived. The companion model permits a very elegant implementation and transforms solving a system of linear equations into DC analysis. The equations can be setup using the stamp rule in modified nodal analysis. The number of unknowns to be solved can be reduced without the use of long cumbersome equations. The circuit representation of the PEC, PMC, and the ABC boundary conditions were presented.

# 6 Choosing the Correct Number of Laguerre Basis Coefficients

### 6.1 Introduction

The final step in the SLeEC methodology is generating the time-domain waveform using the Laguerre basis coefficients that is calculated from the DC analyses. The time-domain waveform is extremely sensitive to the number of coefficients that is used to obtain the waveform and the correct number of coefficients must be used for maximum accuracy. Analytical formulae for determining the correct number of coefficients, such as [13], have been proposed. In this thesis, however, a numerical way to choose the optimal number of basis coefficients has been suggested. Based on the test cases that have been simulated, it has been determined that the numerical approach to finding the correct number of coefficients is the best method.

The time-domain waveforms obtained using the correct number and the incorrect number of Laguerre basis coefficients are illustrated with results from a test case that is shown in Figure 41. The structure to be simulated is two parallel planes sandwiched between dielectric



Figure 41: A 2D power-ground plane structure.

material of relative permittivity 3.4. The fields are approximated with only  $E_z$ ,  $H_x$ , and  $H_y$ . The metal planes are  $100mm \times 50mm$ . The source waveform is a Gaussian pulse placed at the center of the cells marked  $J_z$  in the figure. The cells are  $1mm \times 1mm$ , with a fine mesh of size  $1mm \times 10\mu m$  at the center of the plane. The time-domain waveform of the electric

field at the location marked probe is computed.

The time-domain electric-field waveform using an incorrect number of basis functions is shown in Figure 42. The dotted waveform has been obtained using the conventional finite-difference time-domain scheme and the solid waveform by using SLeEC. As shown in the figure, the waveform obtained from SLeEC is very oscillatory with a large error. The time-domain waveform between 0-5ns using 362 basis functions, which is the optimum number, is shown in Figure 43. Clearly, the time-domain waveform is very sensitive to the number of basis coefficients. Choosing the optimum number of coefficients through numerical analysis is presented in the next subsection.

## 6.2 Methodology

Let  $\{E_0, E_1, ..., E_q\}$  be the coefficients using which the time-domain waveform is generated. Based on the test cases that have been simulated, the right value for q for a 5ns simulation interval lies between 150 - 400. The time-domain waveform is very sensitive to q and there is no range of values for q, which gives the right time-domain waveform. Only a few discrete values for q that can be scattered anywhere between 150 and 400 generates the right time-domain waveform. In the methodology presented, there is no need to know the approximate bounds between which the right value for q lies.

Coefficients  $\{E_0, E_1, ..., E_{q_{max}}\}$  are solved using the SLeEC methodology. The last coefficient  $q_{max}$  is set to an empirically determined value. Based on the test cases simulated 500 coefficients are sufficient to represent the time-domain waveform for an interval of 5ns used in the simulation. There are two steps involved in determining the right value for q:

1. If q is chosen to be small, the time-domain waveform does not have sufficient energy content. For example, as shown in Figure 44, the time-domain waveform does not have sufficient energy content for q = 50. The solid line is the result using SLeEC and the dots are obtained using the conventional FDTD scheme. As shown in the figure, the SLeEC waveform decays to zero as a result of the small number of basis coefficients used to generate the time-domain waveform. The first step is to find the q above which the corresponding time-domain waveform has sufficient energy content.



Figure 42: The time-domain waveform generated using 179 basis functions.



Figure 43: The time-domain waveform generated using 362 basis functions.



Figure 44: If a small value for q is chosen, then the time-domain waveform does not have sufficient energy content.

The value for q above which the time-domain waveform has sufficient energy will be referred to as  $q_{knee}$ . The reason for choosing this name will be explained in a later section.

2. If a value for q is chosen such that although the time-domain waveform has sufficient energy content, the error can be large, as shown in Figure 42. The second step is to choose the right value for q among the set of values  $\{q_{knee}, ..., q_{max}\}$  that has the maximum accuracy.

The two steps are explained in detail in the following paragraphs. The methodology will be illustrated using results from the test case shown in Figure 45. The test case is simulated in 3D. The test case is terminated with PEC boundary. The number of FDTD cells in the simulation is (nx, ny, nz) = (30, 50, 10). The planar metallization is located on the top surface of the cells with z-coordinates k = 0. Modulated Gaussian waveform is used



Figure 45: A planar structure with multiscale dimensions.

as the source waveform and is located at cell (15, 11, 0), marked source in Figure 45. The electric field  $E_z$  is probed at cell (15, 35, 7).

## 6.2.1 Energy analysis to find $q_{knee}$ (Step 1)

The time-domain waveform resulting from using a small number of basis functions has little energy content. The energy content of a time-domain waveform as a function of the number of basis functions used to generate the corresponding time-domain waveform, starts close to 0, increases steadily and flattens to a constant value, as the number of basis functions is increased. The energy profile as a function of the number of basis functions for the example in Figure 45 is shown in Figure 46.



Figure 46: Energy as a function of the number of basis coefficients using Scheme 1.

Two different schemes have been proposed to calculate the energy content of a time-domain waveform. Energy content calculated using Equation 113 by a summation of the square of the time-domain waveform values will be referred to as Scheme 1.

$$E(q) = \sum_{i=0}^{i=N-1} |W_q[i]|^2$$
(113)

E(q) is the energy content of the time-domain waveform generated using basis coefficients  $\{E_0, E_1, ..., E_q\}$ ;  $W_q$  is the time-domain waveform generated using the (q + 1) basis coefficients, and N is the number of discrete time points making up the time-domain waveform.

The energy of the time-domain waveform as a function of the number of basis functions increases steadily until  $q_{knee} = 50$  and then flattens out. From the energy profile, it can be seen that the value for  $q > q_{knee}$  has sufficient energy content.  $q_{knee}$  is the point where the energy profile becomes flat. The subscript has been labeled *knee* appropriately from the resemblance of the energy profile to a knee.

#### 6.2.2 Finding the right value for q (Step 2)

The right number of basis functions can be chosen by doing an error analysis. Minimizing the error at time t=0 is sufficient to determine the right number of basis coefficients. The optimal value for q is chosen among  $\{q_{knee}, ..., q_{max}\}$  that has the smallest error at time t=0.

By using a source waveform with the initial value zero, the field values at all locations also have the value zero at time t=0. By starting the simulation in a known state, the initial value is known. The source waveform used is a Gaussian pulse that is shifted in time to ensure zero value at time t=0. Therefore, FDTD is not needed to determine the initial value at time t=0. The normalized error at time t=0 for q between  $q_{knee}$  and  $q_{max}$  is shown in Figure 47. The normalization is done with respect to the smallest error that occurs for  $q_{opt}$ , as shown by the dotted circle in the figure. The time-domain waveform generated using coefficients  $\{E_0, E_1, ..., E_{q_{opt}}\}$  is shown in Figure 48. There exists a small discrepancy between the FDTD waveform and SLeEC toward the end of the interval, as indicated by the dotted circle. This discrepancy has been resolved using a more accurate evaluation of the energy content of a time-domain waveform, as explained in the next section.

## 6.3 Improved Methods to Calculate Energy

In order to choose the right number of coefficients, the energy contained in the time-domain waveform as a function of the number of basis coefficients needs to be calculated. In the method that was mentioned earlier in Chapter 6.2, the energy content is calculated from the time-domain waveform by computing the sum of the squares of the discrete transient



Figure 47: The normalized error at time t=0 as a function of the number of basis coefficients using Scheme 1.



Figure 48: The time-domain  ${\cal E}_z$  field obtained using Scheme 1.

waveform. An alternate scheme to calculate the energy content is given by Equation 114.

$$E(q) = \sum_{i=0}^{i=N-1} |W_q[i]| \tag{114}$$

E(q) is the energy content of the time-domain waveform generated using basis coefficients  $\{E_0, E_1, ..., E_q\}$ ,  $W_q$  is the time-domain waveform obtained using (q+1) basis coefficients, and N is the number of discrete time points making up the time-domain waveform. The energy definition given in Equation

The time-domain waveform using 85 basis coefficients using Scheme 1 is given in Figure 48. There is a small discrepancy between the FDTD result and SLeEC toward the end of the interval, as indicated by the dotted circle. This inaccuracy is due to the "imprecise" evaluation of the energy content of the time-domain waveform. The energy profile obtained using Equation 114 is given in Figure 49. Equation 114 better reflects the energy content



Figure 49: The energy profile as a function of the number of basis coefficients using Scheme 2.

and Figure 49 clearly shows that the *knee* occurs when the number of basis coefficients used is 100. The normalized error at time t=0 is given in Figure 50. The smallest error occurs when 245 basis coefficients are used, as shown by the dotted circle. The corresponding time-domain waveform is given in Figure 51. The discrepancy in Figure 48, which has been obtained using Scheme 1, is not present in Figure 51, which has been obtained using Scheme 2. Scheme 2 has been verified for numerous examples and matches exactly with the results from the conventional FDTD scheme. Some of these test cases are presented in Chapter 7.



Figure 50: The normalized error as a function of the number of basis coefficients using Scheme 2.



Figure 51: The time-domain  $E_z$  field obtained using Scheme 2.

## 6.4 Summary

The time-domain waveform generated using Laguerre basis coefficients is extremely sensitive to the number of coefficients used. The examples presented clearly demonstrate that there is no range of values which gives the best result. The optimal number of coefficients is best determined through numerical analysis.

## 7 3D EM Simulation Results

#### 7.1 Introduction

SLeEC requires solving a matrix of the form Ax = b at every iteration. However, LU decomposition has to be done only once because the A matrix stays constant thoroughout the iterations. Two different node numbering schemes have been considered. For both the schemes, the A matrix is sparse and structurally symmetric. Only one of the schemes, however, results in a banded matrix that is memory efficient for solving a matrix using LU decomposition. The description of the node-numbering schemes is followed by results from 3D EM test cases.

## 7.2 Node-Numbering Schemes

The Yee cell is shown in Figure 52. The FDTD cells are cascaded in the x, y, and z dimensions to create a 3D mesh. For simplicity, only a single cell is shown in Figure 52, rather than an entire 3D mesh. The cross sections of the FDTD cells that are parallel to the planes xy, yz, and the zx planes in the entire mesh are labeled 1, 2, and 3. In Scheme A, all the nodes lying on Planes 1 are numbered first. The nodes on Planes 2 are numbered next, followed by the nodes on Planes 3. The sparsity pattern of the A matrix that is of dimension  $117712 \times 117712$  (117712 unknowns) resulting from Scheme A is shown in Figure 53. The number of nonzero entries in matrix A is 1 476 652. The structural symmetry can be clearly seen from the pattern. The matrix is always structurally symmetric for PEC and PMC boundary conditions, regardless of the structure that is being modeled.

The sparsity pattern resulting from Scheme B for the same structure is shown in Figure 54. The A matrix is banded, and therefore the number of nonzero entries in L and U factors are much less than the matrix resulting from Scheme A. In Scheme B, the nodes are numbered on a cell by cell basis. The nodes within a cell are numbered first, before moving to another cell.



Figure 52: Planes 1, 2, and 3 in the inefficient node numbering scheme.



Figure 53: The sparsity pattern of the A matrix from an inefficient node-numbering scheme.

#### 7.3 EM Test cases

Four EM test cases are presented in this section. The results show an excellent correlation with the finite-difference time-domain scheme. The planar test cases are drawn on a single metal layer on the top surfaces of the FDTD cells whose coordinates in the z-direction is k = 0. The test cases are enclosed in a PEC box, as shown in Figure 55. The bottom face of the PEC boundary serves as the ground plane. For the four test cases, only the top view of the metallization with dimensions will be shown, as shown in Figure 56. For example, Figure 56 is a shorthand representation for the actual set up in Figure 55.



Figure 54: The sparsity pattern of the A matrix that is suitable for LU decomposition.

A graphical interface using Microsoft Excel<sup>®</sup> has been developed and a sample layout is shown in Figure 57. The planar metallization layer can be drawn on the Excel file and easily transported into the SLeEC code using a macro. The non-uniform dimensions of the FDTD cells are also included in the Excel file, which are shown by the dotted boxes in Figure 57.

An FDTD cell located within the mesh will be referred to with coordinates (i, j, k), where i, j, and k are between [0, nx - 1], [0, ny - 1], and [0, nz - 1], respectively. For the first three test cases to be presented, the probe locations for the  $E_x$ ,  $E_y$ ,  $E_z$ ,  $H_x$ ,  $H_y$ , and  $H_z$  fields are given in Table 1. The first column is the field component and the second column is the cell coordinates whose field component is probed.

## 7.4 A Split Power-Ground Plane

The first test case is a split power-ground plane. The structure is  $10mm \times 10mm$ . The slot width has been reduced to  $1\mu m$  to make the simulation multiscale and see the speedup obtained compared to FDTD. Dimensions smaller than  $1\mu m$  is possible for chip-package cosimulation. The small slot dimension also emulates the presence of on-chip structures along with the package.

The simulation time and the memory requirement comparing SLeEC and FDTD is summarized in Table 7.4. SLeEC shows a speedup of 3× compared to FDTD, at the expense of



Figure 55: The bird's eye view of an EM test case that is enclosed within a PEC box.



Figure 56: The top view of an EM test case.



Figure 57: Microsoft Excel® is used as a GUI for the layout of the test cases.

Table 1: The probe locations of the electric and magnetic fields for the three test cases.

| Field Component | Probed Cell |
|-----------------|-------------|
| $E_x$           | (19, 15, 5) |
| $E_y$           | (20, 27, 5) |
| $E_z$           | (20, 27, 0) |
| $H_x$           | (10, 26, 5) |
| $H_y$           | (20, 8, 5)  |
| $H_z$           | (8, 5, 5)   |

more memory. The Courant time step used in FDTD is  $4 \times 10^{-15} s$ . The source waveform is modulated Gaussian and is located at the location marked Src in Figure 58. The contour maps of the  $E_z$  field with slot widths of  $1\mu m$  and  $20\mu m$  are given in Figure 59. As expected, the coupling to the power island is a lot less for a larger slot-width spacing. The electric and magnetic fields that are probed at the locations given in Table 1 are shown in Figure 60 - Figure 62. The numbers marked on the figures show the maximum amplitude of the fields. The dotted waveform has been obtained using FDTD and the solid waveform using SLeEC. There is an excellent correlation between the two datasets. The captions beneath the figures indicate the number of coefficients used to generate the time-domain waveform in SLeEC. The methodology given in Chapter 6 has been used to choose the optimum number of basis coefficients.



Figure 58: A split power-ground plane.

Table 2: The memory and simulation time comparison for Test case 1.

| Solver | Simulation Time | Memory |
|--------|-----------------|--------|
| FDTD   | 90min.          | 1kB    |
| SLeEC  | 30min.          | 150MB  |



Figure 59: The contour maps of the split-plane test case.



Figure 60:  $E_x$  (126 basis coefficients) and  $E_y$  (208 basis coefficients) fields in the split-plane test case. Solid: SLeEC and Dots: FDTD



Figure 61:  $E_z$  (490 basis coefficients) and  $H_x$  (226 basis coefficients) fields in the split-plane test case. Solid: SLeEC and Dots: FDTD



Figure 62:  $H_y$  (259 basis coefficients) and  $H_z$  (398 basis coefficients) fields in the split-plane test case. Solid: SLeEC and Dots: FDTD

#### 7.5 An EBG Structure

An EBG test case  $18mm \times 36mm$  is shown in Figure 63. The source waveform  $J_z$  is a modulated Gaussian placed at the location indicated on the figure. The slot widths are  $1\mu m$ .



Figure 63: A 2D EBG test case.

The contour maps of the  $E_z$  field after 950ps, 1200ps, and 1300ps are shown in Figures 64, 65, and 66, respectively. The simulation time and the memory requirement comparing SLeEC and FDTD are summarized in Table 7.5. SLeEC shows a speedup of  $3\times$  compared to FDTD, at the expense of more memory. The Courant time step used in FDTD is  $4\times10^{-15}s$ .

Table 3: The memory and simulation time comparison for Test case 2.

| Solver | Simulation Time | Memory |
|--------|-----------------|--------|
| FDTD   | 90min.          | 1kB    |
| SLeEC  | 30min.          | 150MB  |

The number of cells used in SLeEC is  $30 \times 50 \times 10$ . The PEC boundary condition is used to terminate the mesh. The electric and magnetic-field plots at the locations given in Table 1 are shown in Figure 67 - Figure 69.



Figure 64: The 2D EBG contour map of the  $|(E_x, E_y)|$  field after 950ps.



Figure 65: The 2D EBG contour map of the  $|(E_x, E_y)|$  field after 1200ps.



Figure 66: The 2D EBG contour map of the  $|(E_x, E_y)|$  field after 1300ps.



Figure 67:  $E_x$  (413 basis coefficients) and  $E_y$  (413 basis coefficients) fields in the EBG test case. Solid: SLeEC and Dots: FDTD



Figure 68:  $E_z$  (413 basis coefficients) and  $H_x$  (141 basis coefficients) fields in the EBG test case. Solid: SLeEC and Dots: FDTD



Figure 69:  $H_y$  (171 basis coefficients) and  $H_z$  (141 basis coefficients) fields in the EBG test case. Solid: SLeEC and Dots: FDTD

## 7.6 On-chip Coupled Lines

The third test case is shown in Figure 70. Three coupled transmission lines with 100nm spacing between the transmission lines and 1mm long has been simulated. The source waveform is modulated Gaussian placed at the solid dot on the figure.



Figure 70: Three on-chip coupled transmission lines.

The contour map of the  $E_z$  field is shown in Figure 71. The electric and magnetic field plots at the locations given in Table 1 are shown in Figure 72 - Figure 74. The dotted line has been obtained using FDTD and the solid curve is the simulation result using SLeEC. Numbers on the figures show the maximum amplitude of the field. The Courant time step used in FDTD is  $2.0 \times 10^{-16} s$ . The simulation time and the memory requirement comparing SLeEC and FDTD are summarized in Table 7.6. Over  $70 \times$  speedup has been obtained using SLeEC compared to FDTD. The number of cells used in SLeEC is  $30 \times 50 \times 10$ . The PEC

Table 4: The memory and simulation time comparison for Test case 3.

| Solver | Simulation Time  | Memory |
|--------|------------------|--------|
| FDTD   | 2160min. (36hrs) | 1kB    |
| SLeEC  | 30min.           | 150MB  |

boundary condition is used to terminate the mesh.



Figure 71: The contour map of the  $E_z$  field.



Figure 72:  $E_x$  (49 basis coefficients) and  $E_y$  (446 basis coefficients) fields of the transmissionlines test case. Solid: SLeEC and Dots: FDTD



Figure 73:  $E_z$  (409 basis coefficients) and  $H_x$  (188 basis coefficients) fields of the transmission-lines test case. Solid: SLeEC and Dots: FDTD



Figure 74:  $H_y$  (437 basis coefficients) and  $H_z$  (411 basis coefficients) fields of the transmission-lines test case. Solid: SLeEC and Dots: FDTD

## 7.7 A Chip-Package Structure

The results from a chip-package cosimulation test case is presented in this section. The simulation, including defining the test case, has been done by Myunghyun Ha, member of the EPSILON group, Georgia Tech. The top view of the structure is shown in Figure 75. The structure that has been modeled is on-chip interconnects in the metal layers M1 and



Figure 75: A chip-package structure with multiscale features.

M2, connected by vias and routed on the redistribution layer, through the solder pads, to the package and routed as package-level interconnects.

A feature of the chip-package structure is multiscale dimensions. The on-chip structures are in the nm scale, the dimensions of the structure present at the interface betwen the chip and the package, such as the redistribution layer, solder pads, are in the um scale, and package structures such as the power-ground planes are in the mm range. The on-chip structures that are in the nm scale require a very fine mesh, and therefore the simulation time can become prohibitively large using conventional FDTD scheme due to the Courant time-step limit. Non-uniform mesh dimensions are given in Figure 76 and the cross section of the structure is shown in Figure 77. The 3D layout showing the structure is given in Figure 78. Time-domain response of the electric field at the location marked probe in Figure 75 is given in Figure 79. There is an excellent correlation between SLeEC and FDTD. The number of cells used in the simulation is 15 000. FDTD takes 1 day to run, while SLeEC

takes only 5 minutes to complete. The simulation is run on a Pentium quadcore  $2.4\mathrm{GHz}$  processor with  $4\mathrm{GB}$  RAM.



Figure 76: Non-uniform mesh dimensions simulated using SLeEC.



Figure 77: The cross section of the different metal layers.



Figure 78: The 3D Layout of the chip-package structure.



Figure 79: SLeEC and FDTD results of the chip-package structure.

## 7.8 Summary

Simulation results from a chip-package structure were presented to illustrate the scalability of the technique. For this test case, a speed up of over  $150\times$  is obtained compared to the conventional FDTD scheme. The same number of FDTD cells are used in the comparison. The field plots show good correlation with the FDTD scheme. The optimal number of coefficients to generate the time-domain waveform are chosen based on the methodology presented in Chapter 6.

# 8 Time-domain to Frequency-domain Transformation

#### 8.1 Introduction

The procedure to obtain frequency-domain parameters through time-domain simulation is presented in this section. The methodology presented in this section can be applied to structures where voltages are well defined. The technique is demonstrated using a test case.

#### 8.2 A Test Case to Illustrate the Transformation

The power-ground plane that has been simulated in time domain using the conventional FDTD scheme, before converting the results to frequency-domain parameters, is shown in Figure 80. The 2D structure consists of two metal planes sandwiched between a dielectric



Figure 80: A power-ground plane test case.

material with permittivity  $\epsilon_r = 3.8$  and thickness  $60\mu m$ . Only the  $E_z$ ,  $H_x$ , and  $H_y$  components are used to model the structure. The number of FDTD cells used is  $40 \times 40$  and the perfect magnetic conductor (PMC) boundary condition is used to terminate the mesh. The metal plane is  $15mm \times 15mm$ . The locations marked Src and Probe will be referred to as Port 1 and Port 2. The ports are defined with respect to the ground node located directly beneath the postive terminals of the port definitions.

For the two port definitions,  $Z_{11}$  and  $Z_{21}$  can be obtained from Equations 115 and 116.

$$Z_{11} = \frac{F\{V_1[n]\}}{F\{I_1[n]\}}|_{I_2=0}$$
(115)

$$Z_{21} = \frac{F\{V_2[n]\}}{F\{I_1[n]\}}|_{I_2=0}$$
(116)

F is the discrete Fourier transform operator. The source waveform for  $J_z$  is the Gaussian pulse given in Equation 117. The value for  $\beta$  is chosen to ensure that the source waveform has sufficient frequency content between  $0-20 \,\mathrm{GHz}$ .

$$J_z[n] = exp\left(-\frac{(n-n_0)^2}{\beta^2}\right)$$

$$n = 0, 1, 2..., \ n_0 = 25000, \ \beta = 750 \ for \ \Delta t = 1.0 \times 10^{-13}$$
(117)

The voltage at Port i,  $V_i$ , and the input current at Port i,  $I_i$ , can be obtained from Equations 118 and 119.

$$V_i = E_z d \text{ for uniform } E_z$$
 (118)

$$I_i = J_z \Delta x \Delta y. \tag{119}$$

The variables in Equations 118 and 119 are shown pictorially in Figure 81. The cross section of the power-ground plane, with thickness d and uniform  $E_z$ , is shown in Figure 81a. The top view of the FDTD grid with a  $J_z$  source is shown in Figure 81b.  $\Delta x$  and  $\Delta y$  in Equation 119 are the dimensions around the  $J_z$  source.

The time-domain  $E_z$  values at Ports 1 and 2 have been sampled every 25ps. The total simulation time is  $1\mu s$ . For  $\Delta t = 25ps$ , the values for  $\beta$  and  $n_0$ , which correspond to the  $J_z$  parameters given in Equation 117 for  $\Delta t = 1.0 \times 10^{-13}$ , are 3 and 100, respectively. The frequency values corresponding to the discrete Fourier transform of a time-domain response with sampling time T and N sampled points are given by [22]

$$f_k (units \ Hz) = \frac{k}{NT}, k = 0, 1, ..., N - 1.$$
 (120)

The time-domain  $E_z$  waveform at Probe 1 is shown in Figure 82. The frequency-domain parameters obtained using Equation 115, is shown in Figure 83. The solid waveform is the result from FDTD and the dotted waveform is the result from MFDM. The MFDM



Figure 81: Voltage and current definitions.



Figure 82: The time domain  $E_z$  waveform between  $0 - 1\mu s$  at the location marked Src in Figure 80.

methodology is given in [18]. The solid waveform is very oscillatory. The reason for the oscillation is the nonzero steady-state value of the time-domain  $E_z$  waveform, as shown in Figure 82. By multiplying the time-domain  $E_z$  waveform by the right half of the Kaiser window, which is shown in Figure 84, the oscillation can be removed, resulting in a good agreement with MFDM. The time-domain  $E_z$  waveform with windowing is shown in Figure 85 and the  $Z_{11}$  parameters, which has been obtained after windowing of the time-domain  $E_z$ 



Figure 83:  $Z_{11}$ , 0-10GHz, without windowing and the PMC boundary condition has been used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD



Figure 84: The second half of the time-domain Kaiser windowing function.

waveform, is shown in Figure 86. The  $Z_{12}$  parameters obtained with and without windowing of the time-domain  $E_z$  waveform are shown in Figure 87 and 88. A good correlation between FDTD and MFDM has been obtained using windowing of  $E_z$  waveforms.



Figure 85: The time domain  $E_z$  waveform in Figure 82 with windowing.



Figure 86:  $Z_{11}$ , 0-10GHz, with windowing and the PMC boundary condition has been used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD



Figure 87:  $Z_{12}$ , 0-10GHz, without windowing and the PMC boundary condition has been used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD



Figure 88:  $Z_{12}$ , 0-10GHz, with windowing and the PMC boundary condition has been used to terminate the mesh. Dots: MFDM [18] and Solid: FDTD

## 8.3 Summary

The transformation from time domain to the frequency domain may require the use of windowing in the time domain to reduce ripples in the frequency domain. An excellent correlation with a frequency-domain solver has been obtained for the test case presented.

# 9 Efficient Use of Full-Wave Solvers For Chip-Package Cosimulation

#### 9.1 Introduction

A method by which full-wave solvers can be applied efficiently for cosimulation of chippackage structures is presented in this section. The traditional and the proposed methods for chip-package cosimulation are given in Figure 89. In the traditional way, starting with the layout of the chip and the package, full-wave solvers are used to obtain the frequency-domain parameters between ports that are defined on the layout. The frequency-domain parameters, which capture the parasitics of the layout, are used to do the time-domain simulation. In the methodology that has been proposed in this thesis, the signal-distribution network (SDN) in the package, the power-distribution network (PDN) in the package, and the on-chip structures are analyzed independently. As mentioned earlier, the power-distribution network is composed of power-ground planes and decoupling capacitors; the signal-distribution network is made up of interconnects and passive terminations. Full-wave solvers are used to capture the frequency-domain parameters of the SDN, the PDN, and the on-chip structures separately and then integrated together in the following step. The integrated frequency-domain parameters is used for the time-domain simulation. By breaking up the problem into smaller pieces and applying full-wave solvers on the smaller blocks, the memory requirement and the computation time can be reduced. As a starting point, the focus of this section will be on integration of SDN and PDN networks at the package level. The on-chip structures have been simplified to current sources with a pseudo-random bit stream to emulate the CMOS logic.

#### 9.2 SDN-PDN Cosimulation

Existing techniques are limited by either accuracy or time and memory required for computation. The newly proposed transient simulation methodology to cosimulate SDN and PDN is shown in Figure 90. The first step is to separate the power-distribution network (PDN) and the signal-distribution network (SDN). The frequency parameters of the SDN and the



Figure 89: The traditional and proposed methodologies for chip-package cosimulation.

PDN, which capture the parasitics of the structure, are first obtained separately and then integrated together in the following step. The frequency parameters of the power-ground planes can be obtained using the Transmission Matrix Method (TMM) without the need for full-wave solvers [16]. The frequency parameters of transmission lines can be obtained from the ADS® library without any numerical analysis. The second step is to integrate the frequency parameters of the interconnects and the power-ground planes. Not all of the ports resulting after the integration of the SDN and the PDN are needed for transient simulation. Some of these ports can be eliminated to save memory, which is the third step. The bandlimited frequency-domain data of the power-ground planes and the interconnects can violate causality [17]. Delay between two ports can be extracted from the frequency-domain parameters. The delay information that is extracted in Step 4 is used in the final step, which is transient simulation, to obtain simulation results that do not violate causality. The first four steps are operations done in the frequency domain and the final step is in the time domain.

The remaining part of this section is organized as follows: obtaining the frequency-domain parameters of SDN and PDN is given in Chapter 9.3; integration of SDN and PDN is explained in Chapter 9.4; port reduction is presented in Chapter 9.5; delay extraction and causality enforcement is outlined in Chapter 9.6, followed by transient simulation in Chapter 9.7.



Figure 90: The SDN-PDN cosimulation methodology.

## 9.3 Frequency-Domain Parameters of PDN and SDN

The frequency-domain parameters of the power-ground planes can be obtained using the Transmission Matrix Method (TMM). In the transmission matrix method, a power-ground plane is divided into unit cells, as shown in Figure 91. The parasitics of each unit cell is modeled as shown in Figure 91. Expressions relating the values of the parasitics and the structure dimensions and the material properties are given in [16]. Ports are defined at the vertices of the unit cells and the frequency parameters can be obtained by solving a system of linear equations. Frequency-domain parameters of power-ground planes with more than two metal layers can be obtained using the Multilayer Finite Difference Method (MFDM) [18]. Frequency parameters of interconnects can be obtained using the ADS® library or measurement data.

# 9.4 Integration of PDN and SDN

Three common types of interconnect configurations in a package are shown in Figure 92. The three types are microstrip, coplanar waveguide, and stripline configurations. The fourth type shown in the figure is a combination of the first three cases. In the microstrip configuration, an interconnect is routed above a power-ground plane; in a coplanar waveguide, the



Figure 91: A power-ground plane.

interconnect is routed in the same layer as the power or the ground plane; in a stripline configuration, the interconnect is sandwiched between the power plane and the ground plane. Modeling methods for the microstrip and the coplanar waveguide are given in Chapter 9.4.1 and 9.4.2, respectively. A stripline model is given in [19] and a similar methodology has been applied to model the coplanar-waveguide configuration. Models for microstrip, coplanar waveguide, and stripline can be combined together to represent interconnects that are a combination of different types of configurations.

#### 9.4.1 Microstrip Configuration

A microstrip line referenced to a nonideal power-ground plane is separated into a power-ground plane pair and a microstrip line as shown in Figure 93. Two ports P1 and P2 are defined on the power-ground plane at the near-end reference and the far-end reference of the microstrip line, as shown in Figure 93. The frequency parameters of the power-ground plane can obtained using TMM, as explained in Chapter 9.3. The frequency response of the microstrip line can be obtained using the ADS® library.

The circuit model of a single microstrip line referenced to a nonideal power-ground plane for some frequency is shown in Figure 94. For the two-port admittance-matrix model shown in Figure 94, the resistors represent self-admittance parameters  $Y_{11}$  and  $Y_{22}$ ; voltage-



Figure 92: The common types of interconnect configurations in a package.

controlled current sources represent transfer-admittance parameters  $Y_{12}$  and  $Y_{21}$ , which capture the coupling between the ports.

The integrated microstrip line and the power-ground plane can be represented as an admittance matrix using the stamp rule [20]. As explained in [20], Y-parameter blocks that are referenced to a global ground node can be stamped in the admittance matrix without the need for a circuit model. For this reason, the two-port parameters of the power-ground plane is shown by a black box in Figure 93, rather than a circuit model.

To model M coupled transmission lines referenced to a power-ground plane, 2M ports are defined on the power-ground plane. Two ports on the power-ground plane are defined for each transmission line, one at the near-end reference and the other at the far-end reference of the interconnect. The circuit model to represent an M-port network can be extended from



Figure 93: A microstrip line over a power-ground plane.

the two-port network shown in Figure 94. The self admittance at each port is modeled by a resistor and voltage-controlled current sources are placed in parallel to model the coupling to all the other ports. The admittance matrix circuit model for an M-port network referenced to a nonideal power-ground plane is shown in Figure 96.

## 9.4.2 Coplanar-Waveguide Configuration

In this section, a model for a conductor-backed coplanar-waveguide structure is developed. Due to a high wiring density in a package, an interconnect may be routed on the same layer as a power or a ground plane, as shown in Figure 92. A slot is created on a plane and the interconnect is routed in the slot.

The cross section of a conductor-backed coplanar-waveguide structure is shown in Figure 97. The interconnect and the Vdd plane can be viewed as multiconductor transmission lines over a ground plane. Although the cross section in Figure 97 shows three conductors over a ground plane, it is assumed in the derivation that the Vdd conductors are connected together and are treated like a single conductor. By using multiconductor transmission-line theory [9], the Vdd plane and the interconnect can be decoupled from each other. The coupling between them can be represented using controlled sources at the near end and the far end of the transmission lines, as shown in Figure 98. Remaining of this section presents the mathematical derivation of the circuit model.

Multiconductor transmission-line equations in the frequency domain can be written as,

$$\frac{d}{dz}\bar{V}(z) = -(\bar{\bar{R}} + j\omega\bar{\bar{L}})\bar{I}(z) = -\bar{\bar{Z}}\bar{I}(z)$$
(121)

$$\frac{d}{dz}\bar{I}(z) = -(\bar{\bar{G}} + j\omega\bar{\bar{C}})\bar{V}(z) = -\bar{\bar{Y}}\bar{V}(z). \tag{122}$$

 $\bar{R}, \bar{L}, \bar{G}, \bar{C}$  represent per unit length resistance, inductance, conductance and capacitance matrices;  $\bar{V}(z)$  and  $\bar{I}(z)$  represent the voltage and current at some location z along a transmission line with uniform cross section, as shown in Figure 99;  $\bar{Z}$  and  $\bar{Y}$  are per unit length impedance and admittance matrices.

Let  $\bar{T}_I$  and  $\bar{T}_V$  be transformation matrices to transform variables  $\bar{I}$  and  $\bar{V}$  to  $\bar{I}_m$  and  $\bar{V}_m$ ,



Figure 94: The circuit model for a microstrip line referenced to a power-ground plane.



Figure 95: N coupled lines referenced to a power-ground plane.

respectively.

$$\bar{I} = \bar{\bar{T}}_I \bar{I}_m \tag{123}$$

$$\bar{V} = \bar{\bar{T}}_V \bar{V}_m \tag{124}$$

Substituting Equations 123 and 124 in Equations 121 and 122, Equations 125 and 126 can be obtained.

$$\frac{d}{dz}\bar{V}_m(z) = -\bar{\bar{T}}_V^{-1}\bar{\bar{Z}}\bar{\bar{T}}_I\bar{I}_m(z) = -\bar{\bar{Z}}_m\bar{I}_m(z) \tag{125}$$

$$\frac{d}{dz}\bar{V}_m(z) = -\bar{\bar{T}}_V^{-1}\bar{\bar{Z}}\bar{\bar{T}}_I\bar{I}_m(z) = -\bar{\bar{Z}}_m\bar{I}_m(z)$$

$$\frac{d}{dz}\bar{I}_m(z) = -\bar{\bar{T}}_I^{-1}\bar{\bar{Y}}\bar{\bar{T}}_V\bar{V}_m(z) = -\bar{\bar{Y}}_m\bar{V}_m(z)$$
(125)

 $\bar{T}_I$  and  $\bar{T}_V$  are chosen such that  $\bar{Z}_m$  and  $\bar{Y}_m$  result in diagonal matrices. Equations 125 and 126 are transmission-line equations with no coupling between the transmission lines. The decoupled transmission lines will be referred to as modal transmission lines.  $\bar{Z}_m$  and  $\bar{Y}_m$ are per unit length impedance and admittance parameters of the modal transmission lines, whose modal voltages and modal currents are  $\bar{V}_m$  and  $\bar{I}_m$ , respectively (see Figure 99).

The transformation matrices  $\bar{T}_V$  and  $\bar{T}_I$  can be written in terms of self and mutual inductances for the special case of two lossless transmission lines in a homogeneous medium.



Figure 96: The Y-parameter model for an M-port network referenced to a power-ground plane.



Figure 97: The cross section of a coplanar-waveguide structure.

The lossless requirement can be relaxed after the circuit model has been derived and dielectric and conductor losses can be included. For the special case of perfect conductors in a homogeneous medium, the following relationships hold true [9]:

$$\bar{\bar{R}} = \bar{\bar{0}} \tag{127}$$

$$\bar{\bar{C}}\bar{\bar{L}} = \bar{\bar{L}}\bar{\bar{C}} = \mu\epsilon\bar{\bar{I}} \tag{128}$$

$$\bar{\bar{G}}\bar{\bar{L}} = \bar{\bar{L}}\bar{\bar{G}} = \mu\sigma\bar{\bar{I}} \tag{129}$$

From Equation 127, the per unit length impedance matrix is simply the impedance of per unit length inductance matrix, as given in Equation 130.

$$\bar{\bar{Z}} = \bar{\bar{R}} + j\omega\bar{\bar{L}} = j\omega\bar{\bar{L}} \tag{130}$$

Suppose that  $\bar{T}_V$  and  $\bar{T}_I$  can be found such that the inductance matrix  $\bar{L}$  is diagonalized to  $\bar{L}_m$ , then it can be shown that the admittance per unit length matrix is also diagonalized.



Figure 98: The coupling between the transmission lines is captured using controlled sources.



Figure 99: N coupled transmission lines and modal transmission lines.

In Equation 131,  $\bar{L}_m$  is a diagonal matrix that represents per unit length modal inductance.

$$\bar{\bar{T}}_I^{-1}\bar{\bar{Y}}\bar{\bar{T}}_V = \bar{\bar{T}}_I^{-1}(\bar{\bar{G}} + j\omega\bar{\bar{C}})\bar{\bar{T}}_V$$
(132)

$$= \bar{T}_I^{-1}\bar{\bar{G}}\bar{\bar{T}}_V + j\omega\bar{\bar{T}}_I^{-1}\bar{\bar{C}}\bar{\bar{T}}_V \tag{133}$$

$$= \mu \sigma \bar{\bar{T}}_{I}^{-1} \bar{\bar{L}}^{-1} \bar{\bar{T}}_{V} + j \omega \mu \epsilon \bar{\bar{T}}_{I}^{-1} \bar{\bar{L}}^{-1} \bar{\bar{T}}_{V}$$
 (134)

$$= (\mu \sigma + j \omega \mu \epsilon) (\bar{\bar{T}}_V^{-1} \bar{\bar{L}} \bar{\bar{T}}_I)^{-1}$$
(135)

$$= (\mu \sigma + j \omega \mu \epsilon) \bar{\bar{L}}_m^{-1} \tag{136}$$

$$= Y_m \tag{137}$$

Equation 134 is obtained from Equation 133 by substituting the relationships between  $\bar{L}$  and  $\bar{C}$ , as well as  $\bar{G}$  and  $\bar{L}$ , given in Equations 128 and 129. Equation 136 has been obtained based on the assumption that the transformation matrices  $\bar{T}_V$  and  $\bar{T}_I$  diagonalizes the per unit length inductance matrix. Equations 132 - 137 show that for lossless conductors in a homogeneous medium, if the per unit length inductance matrix is diagonalized, then the per unit length admittance matrix is also diagonalized. Therefore, it is sufficient to diagonalize the per unit length inductance matrix to transform the coupled transmission lines to the decoupled modal transmission lines.

For the special case of two lossless transmission lines in a homogeneous medium, an analytical expression for the transformation matrices that is written in terms of the per unit length inductance matrix entries can be derived. Let the per unit length inductance matrix of a two-conductor transmission line over a ground plane be

$$\bar{\bar{L}} = \begin{pmatrix} L_{pp} & L_{ps} \\ L_{sp} & L_{ss} \end{pmatrix}. \tag{138}$$

In Equation 138, subscript p refers to the power plane and s refers to the signal line. If the transformation matrices  $T_V$  and  $T_I$  are chosen as,

$$T_V = \begin{pmatrix} 1 & 0 \\ -k & 1 \end{pmatrix} \tag{139}$$

$$T_I = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \tag{140}$$

where 
$$k = \frac{-L_{sp}}{L_{pp}},$$
 (141)

then  $T_V$  and  $T_I$  diagonalize the per unit length inductance matrix given in Equation 138. Diagonalizing the impedance matrix, which is the same as diagonalizing the inductance matrix due to the lossless condition, also diagonalizes the admittance matrix as shown in Equations 132 - 137.

The two modes of propagation in a conductor-backed coplanar-waveguide structure are the parallel-plate mode and the coplanar-waveguide mode. The electric field patterns for the two modes are shown in Figure 100. In the coplanar-waveguide mode, there is no electric field, or in other words no voltage difference, between the Vdd plane and the Gnd plane. The parallel-plate mode captures the electric field between the Vdd plane and the Gnd plane, as shown in Figure 100.



Figure 100: The E-field patterns for the coplanar-waveguide mode and the parallel-plate mode.

The circuit model for a coplanar-waveguide structure is shown in Figure 101.  $V_{par}^i$ ,  $V_{par}^o$ ,  $I_{par}^i$ , and  $I_{par}^o$  are the modal voltages and currents of the parallel-plate mode at the near



Figure 101: The circuit model for a coplanar-waveguide structure.

end and the far end of the modal transmission line.  $V_{CPW}^i$ ,  $V_{CPW}^o$ ,  $I_{CPW}^i$ , and  $I_{CPW}^o$  are the modal voltages and currents of the coplanar-waveguide mode at the near end and the far end of the modal transmission line.  $V_p^i$ ,  $V_p^o$ ,  $I_p^i$ , and  $I_p^o$  are the voltages and currents at the input and the output of the physical transmission line.  $V_s^i$ ,  $V_s^o$ ,  $I_s^i$ , and  $I_s^o$  are the voltages and currents at the input and the output of the interconnect.

The coupling information between the modal transmission lines is captured in the transformation matrices  $\bar{T}_V$  and  $\bar{T}_I$ . The coupling terms are added to the modal transmission lines such that Kirchoff's Voltage Law (KVL) and Kirchoff's Current Law (KCL) equations at the input and the output ports of the modal transmission lines satisfy Equations 142-143 of the transformation matrices.

$$\begin{pmatrix} V_p \\ V_s \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ -k & 1 \end{pmatrix} \begin{pmatrix} V_{par} \\ V_{CPW} \end{pmatrix}$$
 (142)

$$\begin{pmatrix} I_p \\ I_s \end{pmatrix} = \begin{pmatrix} 1 & k \\ 0 & 1 \end{pmatrix} \begin{pmatrix} I_{par} \\ I_{CPW} \end{pmatrix}$$
(143)

$$where k = \frac{-L_{sp}}{L_{pp}}, (144)$$

The modal transmission lines can be replaced with two-port frequency parameters of the power-ground plane and the interconnect that can be obtained using TMM and the ADS<sup>®</sup> library (see Figure 102). The advantage of this circuit model is that the frequency parameters of the power-ground plane and the interconnect can be obtained separately and integrated together in the following step, making the process memory and time efficient.

#### 9.4.3 A test structure to verify the coplanar-waveguide model

The test case shown in Figure 103 was simulated using the coplanar-waveguide model and the results were compared with Sonnet<sup>®</sup> EM simulator. The top view of the test structure and its cross section are shown in Figure 103. A slot is created in the middle of the Vdd plane and an interconnect is routed in the slot on the same layer as the Vdd plane. The spacing between the interconnect and the Vdd plane is  $100\mu m$ . The dimensions of the power-ground plane is  $10mm \times 10mm$  and the length of the interconnect is 7.4mm. The entire structure is embedded in a dielectric material with  $\epsilon_r = 3.8$ . The width of the interconnect is 1mm. The locations of four ports P1, P2, P3, and P4 are marked in the figure. P1 and P2 are located on the Vdd plane; P3 and P4 are located on the interconnect.  $S_{13}$  (dB and phase),  $S_{14}$  (dB and phase), and  $S_{34}$  (dB and phase) are plotted in Figures 104-109. Results show excellent correlation with Sonnet<sup>®</sup> over a wide bandwidth of 8GHz.

## 9.5 Port Reduction

The integrated power-ground plane and the interconnect can be represented as an admittance matrix, which will be referred to as  $Y_{int}$ , using the stamp rule [20]. The dimension of  $Y_{int}$  can be reduced to save memory because not all of the ports are needed for transient simulation. The ports of interest in transient simulation are the ports where the driver circuit and passives are connected, and ports where the voltage waveforms are to be observed. The purpose of this step is to reduce the number of ports to only those of interest in transient simulation.



Figure 102: The modal transmission lines replaced with two-port frequency parameters.



Figure 103: A test structure to verify the CPW model.

The algorithm for port reduction is described here [20]. The rows and columns of  $Y_{int}$  are reordered such that the matrix entries for the ports of interest,  $Y_{pp}$ , are on the top left of the matrix, as shown in Equation 145.

$$\begin{pmatrix} I_{pp} \\ I_{other} \end{pmatrix} = \begin{pmatrix} Y_{pp} & Y_{pc} \\ Y_{cp} & Y_{cc} \end{pmatrix} \begin{pmatrix} V_{pp} \\ V_{other} \end{pmatrix}$$

$$(145)$$

Setting  $\bar{I}_{other}$  to  $\bar{0}$  and solving for  $\bar{I}_{pp}$ , the Y-matrix of the reduced network is given in Equation 146.

$$Y_{reduced} = Y_{pp} + Y_{pc}(-Y_{cc}^{-1}Y_{cp})$$
 (146)

 $Y_{reduced}$  can be converted to S-parameters using the equations given in [20] [21].

# 9.6 Causality Enforcement Through Delay Extraction

In simulation of long interconnects using frequency-domain parameters the finite bandwidth and the limited number of points in the frequency-domain data may cause simulation results to violate causality [17]. Frequency response from  $[0, \infty]$  will be needed to accurately capture the delay in the impulse response. Delay can be extracted from the frequency-domain parameters using the Hilbert transform and this information can be used to obtain a causal impulse response of the S-parameters [17]. The remainder of this subsection presents a summary of the methodology in [17].



Figure 104:  $S_{13}$  magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW Model



Figure 105:  $S_{13}$  phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model



Figure 106:  $S_{14}$  magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW Model



Figure 107:  $S_{14}$  phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model



Figure 108:  $S_{34}$  magnitude of the CPW test structure. Dots: Sonnet and Solid: CPW Model



Figure 109:  $S_{34}$  phase of the CPW test structure. Dots: Sonnet and Solid: CPW Model

Let  $S_{12}$  be the frequency response between two ports with a long delay.  $S_{12}$  is separated into minimum-phase and all-pass components, as given in Equation 147 [22].

$$S_{12}(j\omega) = S_{12,min}(j\omega)S_{12,AP}(j\omega) \tag{147}$$

The delay is embedded in the all-pass component that can then be calculated. The magnitude and the argument of the minimum-phase component of S-parameters can be calculated using Equations 148 and 149.

$$|S_{12,min}(j\omega)| = |S_{12}(j\omega)|$$
 (148)

$$arg(S_{12,min}(j\omega)) = \frac{-1}{2\pi} P \int_{-\pi}^{\pi} log|S_{12}(j\theta)|cot\left(\frac{\omega-\theta}{2}\right) d\theta$$
 (149)

The all-pass component has unity magnitude and therefore, the magnitude of the minimumphase S-parameters is the same as the magnitude of the S-parameters, as given in Equation 148. The argument of the minimum-phase S-parameters can be computed using the Hilbert transform given in Equation 149. The all-pass component of  $S_{12}$  is given in Equation 150.

$$S_{12,AP}(j\omega) = \frac{S_{12}(j\omega)}{S_{12,min}(j\omega)} = e^{-j\omega T_d}$$
 (150)

The all-pass component is assumed to be of the form  $e^{-j\omega T_d}$ , where  $T_d$  is the delay between Ports 1 and 2. Solving for the port-port delay,

$$T_d = -\frac{arg(S_{12,AP}(j\omega))}{\omega}. (151)$$

The impulse response of  $S_{12}$  can be written as

$$F^{-1}(S_{12}) = F^{-1}(S_{12,min}S_{12,AP}) (152)$$

$$= F^{-1}(S_{12,min}e^{-j\omega T_d}) (153)$$

$$= s_{12,min}(t - T_d), (154)$$

where operator  $F^{-1}$  is the inverse Fourier transform and  $s_{12,min}(t-T_d)$  is the impulse response of the minimum-phase S-parameters shifted by time delay  $T_d$ . For non-uniformly spaced frequency samples, the non-uniform inverse discrete Fourier transform can be used [23]. The causal impulse response of  $S_{12}$  can thus be obtained by ensuring that the impulse response remains zero until the time delay  $T_d$ .

#### 9.7 Transient Simulation

There are two possible schemes for transient simulation. Both schemes require solving a matrix of the form Ax = b at each time step to compute the unknown voltages and currents. The difference between the two schemes are the way in which the S-parameter equations, terminations, and sources are set up in the A matrix. The two schemes are presented in Chapters 9.7.1 and 9.7.2. In the second scheme, the equations are set up in a modified nodal analysis (MNA) framework. The industry standard for circuit simulation is Spice and Spice uses MNA formulation for circuit simulation [14]. The second scheme, which is based on MNA formulation, will therefore be compatible with existing tools and is a more attractive option.

## 9.7.1 Transient Simulation Using Signal-Flow Graphs

The S-parameter equations, together with sources and terminations, are set up in a matrix of the form Ax = b. The matrix is solved once at each time step to calculate the unknown voltages and currents, with the b matrix updated at the end of each iteration. The transient simulation methodology will be illustrated with an example [25].

An S-parameter network may be represented in the form of a signal-flow graph (SFG) [24]. A two-port network and its signal-flow graph is shown in Figure 110. The two-port network represents the transmission line frequency response. The terminations at the near end and the far end of the interconnect are  $Z_1$  and  $Z_2$ ;  $g_1(t)$  and  $g_2(t)$  are voltage sources connected to the near end and the far end of the interconnect. S-parameter equations in the time domain can be written using the convolution operation, as given in Equations 155 and 156.

$$b_1(t) = s_{11}(t) * a_1(t) + s_{12}(t) * a_2(t)$$
(155)

$$b_2(t) = s_{21}(t) * a_1(t) + s_{22}(t) * a_2(t).$$
(156)

Additional equations are obtained from the terminations connected to the S-parameter network. For the example under consideration, the additional equations at Ports 1 and 2



Figure 110: A two port signal-flow graph [25].

are

$$a_1(t) = \Gamma_1 b_1(t) + T_1 g_1(t) \tag{157}$$

$$a_2(t) = \Gamma_2 b_2(t) + T_2 g_2(t). \tag{158}$$

 $\Gamma_1$  and  $\Gamma_2$  are the reflection coefficients at Port 1 and Port 2.  $T_1$  and  $T_2$  are the transmission coefficients at Port 1 and Port 2.

$$\Gamma_i = \frac{Z_i - Z_{ref}}{Z_i + Z_{ref}} \tag{159}$$

$$T_i = \frac{Z_{ref}}{Z_i + Z_{ref}} \tag{160}$$

The expressions for the reflection and the transmission coefficients are given in Equations 159 and 160. Equations 155 - 158 are discretized in the time domain and a system of linear equations of the form Ax = b is solved at each time step to obtain the unknown nodal voltages and currents [17] [25].

#### 9.7.2 Transient Simulation Using S-Parameters in MNA Framework

Transient analysis using S-parameters can be made compatible with the existing tools by setting up the matrix in MNA format. The transient analysis methodology will be illustrated with an example of a two-port network with terminations, which is shown in Figure 111. The nodes are labeled n1, n2, and n3 as shown in the figure.



Figure 111: A two port S-parameter network with sources and terminations.

The relationship between the port currents and the port voltages, in terms of the power waves  $a_{pi}$  and  $b_{pi}$  [4], discretized in the time domain are

$$i_{pi}[n] = Y_o(a_{pi}[n] - b_{pi}[n])$$
 (161)

$$v_{pi}[n] = a_{pi}[n] + b_{pi}[n],$$
 (162)

where  $Y_o$  is the inverse of the reference characteristic impedance of the S-parameter network.  $\bar{a}(t)$  and  $\bar{b}(t)$  are related to the impulse response of the S-parameter network  $\bar{s}(t)$  by

$$\bar{b}(t) = \bar{\bar{s}}(t) * \bar{a}(t), \tag{163}$$

where \* denotes the convolution operation. Equation 163 is discretized in the time domain to obtain Equations 164 and 165.

$$\bar{b}[n] = \bar{s}[0]\bar{a}[n] + \bar{h}[n]$$
 (164)

$$h_i[n] = \sum_{j=1}^{j=N_p} \sum_{m=1}^{n-1} s_{ij}[n-m]a_j[m]$$
(165)

 $N_p$  is the number of ports in the S-parameter block and  $h_i[n]$  is a history term that is a function of  $\bar{a}$  in the previous time steps. Equations 161-162 and Equations 164-165, together with the equations for the passive terminations and sources, enable solving for the nodal voltages and the branch currents.

To include independent DC sources in the transient simulation, DC analysis will have to be done prior to the transient simulation to determine the initial nodal voltages. The initial operating point will be included in the transient simulation. To determine the initial operating point, assume that  $a_{DC}$  and  $b_{DC}$  are constant for all time duration as shown in Figure 112. The convolution between the impulse response  $s_{ij}$  and  $a_{j,DC}$ , which are shown

in Figure 112, reduces to a simple summation of the values of the impulse response, which is given in Equation 166.



Figure 112: The convolution of the impulse response  $s_{ij}$  and constant  $a_{j,DC}$ .

$$s_{ij} * a_{j,DC} = \left(\sum_{k=0}^{N-1} s_{ij}[k]\right) a_{j,DC}$$

$$= s_{ij}^*.$$
(166)

Symbol  $s_{ij}^*$  is the result of the convolution operation and N is the number of points in the impulse response. Assume that for time  $t \leq 0$ , the independent voltage source in Figure 111 has a constant value of  $v_s[0]$ .

The MNA matrix for DC analysis is

$$\begin{pmatrix}
\frac{1}{R_s} & 0 & \frac{-1}{R_s} & 0 & Y_o & -Y_o & 0 & 0 \\
0 & \frac{1}{R_{term}} & 0 & 0 & 0 & 0 & Y_o & -Y_o \\
\frac{-1}{R_s} & 0 & \frac{1}{R_s} & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -s_{11}^* & 1 & -s_{12}^* & 0 \\
0 & 0 & 0 & 0 & -s_{21}^* & 0 & -s_{22}^* & 1 \\
1 & 0 & 0 & 0 & 0 & 0 & -1 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & -1 & -1 & 1
\end{pmatrix}
\begin{pmatrix}
v_{1,DC} \\ v_{2,DC} \\ v_{3,DC} \\ i_{vs,DC} \\ a_{1,DC} \\ b_{1,DC} \\ a_{2,DC} \\ b_{2,DC} \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{pmatrix}.$$
(168)

 $a_{i,DC}$  and  $b_{i,DC}$  appear in the unknown vector x after the conventional MNA variables, which are the nodal voltages and the current through the independent voltage sources.



Figure 113: The values of  $a_j[n]$  for n < 0.

In the transient analysis, which follows the DC analysis,  $a_j[n]$  has value  $a_{j,DC}$  for n < 0, as shown in Figure 113. The discretized S-parameter convolution equation given in Equation 165 is modified to include the DC analysis terms  $a_{j,DC}$ . The modified expression for  $h_i[n]$  is

$$h_{iT}[n] = \sum_{j=1}^{j=N_p} \left\{ \left( \sum_{m=1}^{n-1} (s_{ij}[n-m]a_j[m]) + \sum_{q=n+1}^{N-1} s_{ij}[q]a_{j,DC} \right\}.$$
 (169)

The MNA matrix for the network shown in Figure 111 at time-step n will be

$$\begin{pmatrix}
\frac{1}{R_s} & 0 & \frac{-1}{R_s} & 0 & Y_o & -Y_o & 0 & 0 \\
0 & \frac{1}{R_{term}} & 0 & 0 & 0 & 0 & Y_o & -Y_o \\
\frac{-1}{R_s} & 0 & \frac{1}{R_s} & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -s_{11}[0] & 1 & -s_{12}[0] & 0 \\
0 & 0 & 0 & 0 & -s_{21}[0] & 0 & -s_{22}[0] & 1 \\
1 & 0 & 0 & 0 & 0 & -1 & -1 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & -1 & -1
\end{pmatrix}
\begin{pmatrix}
v_1 \\ v_2 \\ v_3 \\ i_{vs} \\ a_1 \\ b_1 \\ a_2 \\ b_2
\end{pmatrix} = \begin{pmatrix}
0 \\ 0 \\ 0 \\ v_s[n] \\ h_{1T}[n] \\ h_{2T}[n] \\ 0 \\ 0 \\ 0
\end{pmatrix}. (170)$$

Equation 170 is solved once at each time step and the history terms  $h_{iT}$  are updated after the solution has been obtained.

Although the example presented is for a two-port network, the method can be easily generalized for an N-port S-parameter network. Complex non-linear driver models can also be included in the simulation.

## 9.8 Results

Simulation results from three test cases are presented in this section. The first test case in Chapter 9.8.1 has a nonzero DC operating point and DC analysis is done prior to the

transient simulation using the methodology explained in Chapter 9.7.2. The second test case in Chapter 9.8.2 compares simulation results with and without causality enforcement. The third test case was simulated using the transient simulation methodology in Chapter 9.7.1 and shows the scalability of the technique to practical problems.

#### 9.8.1 Transmission Line Simulation

A  $50\Omega$  transmission line with matched termination is shown in Figure 114. The two-port transmission-line frequency-domain parameters are obtained for a bandwidth of 10GHz. The two ports are located at the near end and the far end of the interconnect. The delay of the interconnect is 3ns. The driver is represented by two time-varying resistors to emulate a static CMOS driver. Rpush(t) and Rpull(t) represent the pull-up and the pull-down network of the static CMOS driver. The time-varying resistor waveform, Rpush(t), is plotted in Figure 115. Rpush(t) and Rpull(t) have opposite polarities as shown in Figure 116. The rise/fall time of the driver is 200ps and the data rate is 2.5Gbps. The voltage waveform at the far end of the interconnect is plotted in Figure 117. The dots are the simulation results from ADS and the solid curve is from the simulation methodology presented in Chapter 9.7.2. Note that the nonzero DC operating point at time t = 0 has been accurately captured by the proposed scheme.



Figure 114: Transient simulation of a transmission line with DC analysis using MNA formulation.



Figure 115: A typical time-varying resistor waveform.



Figure 116: Rpush(t) and Rpull(t) have opposite polarities.



Figure 117: The voltage waveform at the far end of the interconnect. Solid: S-Parameter simulation with DC analysis and Dots: ADS

## 9.8.2 Step Response of an Interconnect With Causality Enforcement

A  $50\Omega$  transmission line with a mismatched termination of  $25\Omega$  and a delay of 2.5ns is shown in Figure 118. A long transmission-line delay and a mismatched termination has been chosen to observe violation of causality in the reflected and the transmitted waves. The unit step input has a rise time of 100ps. The frequency-domain parameters of the interconnect has been obtained for a bandwidth of 10GHz. The step response at the far end of the interconnect is plotted in Figure 119. The solid curve has been obtained using ADS<sup>®</sup>, the dotted-dashed line is simulation without causality enforcement, and the dashed line is with causality enforcement. There is a good correlation between ADS<sup>®</sup> and the simulation without causality enforcement at a zoom of Figure 119 is shown in Figure 120. It can be clearly seen that the simulation without causality enforcement starts to increase before the actual delay of the line, which is non-physical. However, simulation with causality enforcement remains zero until the actual delay of the line.



Figure 118: The simulation set up for the step response of an interconnect.

#### 9.8.3 Sixty-Four Bit Bus Referenced to a Nonideal Power-Ground Plane

A sixty-four bit bus referenced to a nonideal power-ground plane is shown in Figure 121. The dimension of the plane is  $10in. \times 10in$ . The characteristic impedance of the microstrip lines are  $22\Omega$ . The termination at the far end of the line is  $43\Omega$  to the Vdd plane and  $43\Omega$  to the ground plane. The rise/fall time is 500ps and the date rate is 1.6Gbps. The frequency-domain parameters of the power-ground plane and the interconnect has been obtained for a bandwidth of 2.5GHz. The number of ports in the S-parameter network after



Figure 119: The step response of a long interconnect from 0-20ns.



Figure 120: The zoom of Figure 119 from 0 - 6ns.

port reduction is 130, sixty-four ports at the near end of the interconnect and sixty-four ports on the power-ground plane at the near-end references of the interconnect to connect the driver, one port on the power-ground plane for the voltage supply, and one port at the far end of the interconnect to observe the output voltage waveform. Eye diagrams without and with causality enforcement are plotted in Figure 122. Results show that eye-diagram simulation without causality enforcement results in an artificial eye closure of 110mV.



Figure 121: The sixty-four bit bus simulation set up.



Figure 122: The eye-diagram simulation results without and with causality enforcement.

## 9.9 Speed And Memory Optimization

The memory required to store an N-port S-parameter block with f frequency points is  $O(N \times N \times f)$ . The number of frequency points f in the S-parameter data must be large enough to obtain accurate impulse response. S-parameters are computed and stored prior to transient analysis, which would become memory intensive for large number of ports. For transient simulation using linear current sources, memory can be optimized by using Z-parameters. It will be shown in Chapter 9.9.1, with Z-parameters there is no necessity to compute the N-port Z-parameter block prior to transient simulation. However, the memory-efficient technique can be applied only for linear time-invariant transient simulation. A smaller memory required for simulation also results in faster simulation time. Linear current sources can be used to model switching logic circuits and therefore, the technique can be valuable for quick analysis prior to a more detailed simulation. The simulation methodology in Chapter 9.9.1 is followed by a test case in Chapter 9.9.2.

## 9.9.1 Methodology

The simulation methodology is best explained with an example. The circuit representation of a transmission line referenced to a power-ground plane, similar to the structure shown in Case 1 of Figure 92, is shown in Figure 123. The parasitics of the nonideal power-ground plane is represented as a 1D structure, rather than 2D, for simplicity. The driver is represented by two current sources shown in the dotted box, one discharging the interconnect, while the other charging it. The switching current patterns for the two sources have opposite polarities as shown in Figure 124. For this example, the desired goal is to obtain the simultaneous switching noise voltage at the node marked P1.

The circuit representation of the interconnect and the power-ground plane can be represented as an admittance matrix, which is given in Equation 171, using the stamp rule [20].



Figure 123: A transmission line referenced to a power-ground plane.

The size of the matrix is the same as the number of nonzero nodes in the circuit.

$$\begin{pmatrix}
I_{1} \\
I_{2} \\
\vdots \\
I_{N-1} \\
I_{N}
\end{pmatrix} = \begin{pmatrix}
Y_{11} & Y_{12} & \dots & Y_{1N} \\
Y_{21} & Y_{22} & \dots & Y_{2N} \\
\vdots \\
Sparse \\
\vdots \\
Y_{N1} & Y_{N2} & \dots & Y_{NN}
\end{pmatrix} \begin{pmatrix}
V_{1} \\
V_{2} \\
\vdots \\
\vdots \\
V_{N-1} \\
V_{N}
\end{pmatrix}$$
(171)

The voltage at P1 can be obtained from the transfer-impedance parameters between P1 and all the other nodes, which is given by the set  $\{Z_{11}, Z_{12}, ...., Z_{1N}\}$ , where N is the number of nodes/ports in the circuit. The impedance matrix is symmetric and the first row,  $\{Z_{11}, Z_{12}, ...., Z_{1N}\}$ , is the same as the first column of the matrix,  $\{Z_{11}, Z_{21}, ...., Z_{N1}\}$ . By the property that the impedance and admittance matrices are the inverse of each other, the required row or the column of the impedance matrix can be obtained. The first row of the impedance matrix can be obtained by solving Equation 172. To obtain the  $i^{th}$  row/column



Figure 124: The charging and discharging currents that model the source have opposite polarities.

of the impedance matrix, the right hand side in Equation 172 must be the  $i^{th}$  row/column of the identity matrix.

$$\begin{pmatrix} Y_{11} & Y_{12} & . & . & . & Y_{1N} \\ Y_{21} & Y_{22} & . & . & . & Y_{2N} \\ . & & & & . \\ sparse & & & & . \\ . & & & & . & . \\ Y_{N1} & Y_{N2} & . & . & . & Y_{NN} \end{pmatrix} \begin{pmatrix} Z_{11}(\omega) \\ Z_{12}(\omega) \\ . \\ . \\ . \\ Z_{1N}(\omega) \end{pmatrix} = \begin{pmatrix} 1 \\ 0 \\ . \\ . \\ 0 \\ 0 \end{pmatrix}$$

$$(172)$$

The desired voltage at P1 can be computed by the matrix vector product given in Equation 173. The column vector  $\bar{I}(\omega)$  in Equation 173 are the current sources that are connected to the nodes in the circuit.

$$V_{1}(\omega) = \begin{pmatrix} Z_{11}(\omega) & Z_{12}(\omega) & \dots & Z_{1N}(\omega) \end{pmatrix} \begin{pmatrix} I_{1}(\omega) & & & \\ I_{2}(\omega) & & & \\ & \ddots & & \\ & I_{N-1}(\omega) & & \\ & I_{N}(\omega) \end{pmatrix}$$

$$(173)$$

The time-domain voltage waveform at P1 can be obtained by computing the inverse Fourier transform of  $V_1(\omega)$ . To ensure operation  $A(\omega)B(\omega)$  is the same as the linear convolution between A(t) and B(t), zero padding must be done [22].

Therefore, given the circuit representation of the integrated power-ground plane and the interconnect, with linear current sources to model the drivers, there is no need to compute and store the S-parameter block prior to a transient simulation. The speed and the memory can thus be optimized.

#### 9.9.2 Enhancement of Power Integrity Using Embedded Capacitors

Passive components that are embedded in a package are becoming increasingly important for the next generation miniaturized systems through the gradual replacement of discrete passives. Improvements in the electrical performance of microelectronic systems can be achieved by the integration of embedded passive elements such as capacitors, resistors, and inductors. The biggest challenge in integration of all passives are those posed by the capacitors. This is primarily because of the high capacitance that is associated with these structures. Decoupling in today's systems is primarily achieved by using surface mount (SMT) capacitors. These capacitors are ineffective at frequencies above 100MHz due to the large inductances associated with the capacitors [26]. Capacitors that are embedded inside a package, which is shown in Figure 125, overcome this limitation because of the low inductance microvias that connect these capacitors to the power and ground planes of the package.

Two types of embedded capacitors are used in the simulation: (1) large planar capacitors, and (2) embedded discrete capacitors, both of which are shown in Figure 125. A planar capacitor is used as a power-ground plane, which also acts as a reference for the microstrips. Figure 125 shows a typical package connected to a printed circuit board (PCB). Two active



Figure 125: A typical package connected to a PCB.

chips are connected to the package. End to end simulation of signal lines connected from the driver to the receiver that is referenced to a high-k planar capacitor is simulated. The embedded discrete capacitors are placed in the package, close to the chip. The proximity of the capacitors to the chip minimizes parasitic inductance and provides charge to the switching circuits quicker. The SMTs, which are placed on the PCB, have larger parasitic inductance (ESL) and resistance (ESR) resulting from the longer current path from the capacitor to the chip. Embedded capacitors have significantly lower ESL and ESR and better pin down the SSN voltage than surface mounts. This is demonstrated by transient

simulation of SSN using the memory optimized scheme that is presented in Chapter 9.9.1.

The simulation set up is shown in Figure 126. The thickness of the power-ground plane substrate is  $14\mu m$ . The plane size is 15mm by 50mm. Hundred single ended  $50\Omega$  microstrips that are each 15mm long are referenced to the power plane. Each microstrip is terminated with  $99\Omega$  resistors to the Vdd plane and the ground plane. SSN is simulated at the location marked by the arrow in Figure 126. The ESL and the ESR of a 100nF SMT is 205.5pH and  $100m\Omega$ . The ESL and the ESR of a 1nF embedded discrete capacitor is 33.54pH and  $9m\Omega$ .



Figure 126: The simulation set up for SSN simulation.

Figure 127 shows the switching noise due to twenty-five 100nF SMTs and a power-ground plane substrate that has a relative permittivity of  $\epsilon_r = 3.8$ . The peak noise voltage is approximately 150mV. If the number of SMTs is increased from twenty five to one hundred, the noise voltage reduces to 100mV, which is shown in Figure 128. Figure 129 shows the SSN for the case with twenty-five 1nF embedded discrete capacitors and a high-k planar capacitor with  $\epsilon_r = 11$  that is used as a power-ground plane. The peak noise voltage for this case is 50mV. Simulation results show that a high-k dielectric material for the plane pair, together with embedded discrete capacitors, help reduce SSN better than surface mount discrete capacitors.

# 9.10 Summary

Simulation of a chip-package structure using a full-wave solver is a memory and timeintensive operation. An efficient way to simulate the structure is to separate it into different blocks, apply the solver on each of these smaller problems separately, followed by integration of the blocks using the modal-decomposition technique.



Figure 127: Twenty-five 100nF SMTs and power-ground plane substrate  $\epsilon_r=3.8.$ 



Figure 128: Hundred 100nF SMTs and power-ground plane substrate  $\epsilon_r=3.8.$ 



Figure 129: Twenty-five 1nF embedded discrete capacitors and power-ground plane substrate  $\epsilon_r = 11$ .

# 10 Future Work: Alternate Schemes for DC Analysis of the FDTD Lattice

## 10.1 Introduction

The companion model of the Yee cell given in Chapter 5 gives rise to alternate schemes for DC analysis, which is Step 3 of the SLeEC algorithm that is shown in Figure 130. The SLeEC methodology was presented in Chapter 2.2 and the flowchart is shown here again for convenience.



Figure 130: The flowchart of the SLeEC methodology.

Alternate schemes for DC analysis, such as the random-walk scheme used to analyze on-chip structures [29], can also be applied in the SLeEC algorithm. The steps involved in the flowchart given in Figure 130 remain the same, except for the way in which the third step *DC Analysis* is done. An alternate method to do DC analysis using ABCD parameters for 1D, 2D, and 3D FDTD grids is given below. Information on ABCD matrices are given in [20]. Only the methodology is given and the implementation is left for future work. The remainder of this section presents the DC analysis technique using ABCD parameters for the 1D, 2D, and the 3D cases.

#### 10.2 1D Grid

The Norton variation of the 1D companion model of the FDTD grid is shown in Figure 131. The derivation of the companion model was presented in Chapter 5.2. The vertical bars



Figure 131: The Norton companion model for a 1D FDTD grid terminated with PEC boundary.

and the multiplication symbols represent the spatial positions of the electric and magnetic fields. The DC values of the nodal voltages represent the electric-field coefficients and the DC branch currents represent the magnetic-field coefficients. The number of nodes in the network can be reduced by constructing ABCD matrices of individual cells and multiplying them together. For example, in Figure 132, a chain of ABCD matrices are multiplied together reducing the chain to a single block. The desired set up for the FDTD grid is shown in Figure 133.  $T_1$  represents the ABCD parameters of the first unit cell and



Figure 132: A cascade of ABCD matrices reduced to a single block by multiplying the parameters of individual blocks.

 $T_2T_3...T_N$  represents the product of the ABCD matrices for Cells  $\{2,3,...,N\}$ , where N is



Figure 133: The reduced FDTD grid network.

the number of cells in the FDTD grid. The DC analysis is done on the smaller problem shown in Figure 133. The DC voltage at the node marked by the arrow in Figure 134 represents the coefficient  $E_1$  that is shown in Figure 134. The DC values of all the E and



Figure 134: All the field coefficients in the entire 1D grid can be calculated using  $E_1$  alone for the given boundary conditions.

H field coefficients can be obtained from  $E_1$  alone in a constant O(1) time using Kirchoff's voltage and current laws. Proceeding in the direction shown by the arrow in Figure 134, using  $E_0$  and  $E_1$ ,  $H_0$  can be obtained; from  $H_0$  and  $E_1$ ,  $H_1$  can be solved;  $E_1$  and  $H_1$  can be used to get  $E_2$ , and so on, until all the field coefficients have been obtained.

## 10.3 2D Grid

The DC analysis for a 2D case using ABCD matrices can be done in a similar fashion to the 1D case. A 2D FDTD grid with fields  $E_x$ ,  $E_y$ , and  $H_z$  is shown in Figure 135. The ABCD matrix for a column of unit cells have to be obtained, as shown by the dotted box in Figure 135. The ABCD matrices of columns of cells are multiplied together to set up the system in the form shown in Figure 133.  $T_1$  in Figure 133 is the ABCD matrix for Column 1 of the 2D FDTD grid shown in Figure 135 and  $T_2T_3...T_N$  are the product of the ABCD matrices for Columns 2, 3, ..., N, where N is the number of columns in the grid. The input ports for the ABCD matrices are the  $E_y$  nodes on the left of the dotted box and the output ports are the  $E_y$  nodes on the right of the dotted box. DC analysis is done on the reduced multiport



Figure 135: A 2D FDTD grid.

network, which is shown in Figure 133, and the  $E_x$  and the  $E_y$  fields for the first column are calculated. Once the nodal voltages for the first column have been obtained, the remaining fields for the entire grid can be obtained in O(1) time using Kirchoff's voltage and current laws.

The calculation of all the field coefficients in constant time for the entire grid is illustrated in Figure 136. Once the E-fields in Column 1 (vertical and horizontal diamonds) have been calculated, the clouds can be determined; using clouds and vertical diamonds, stars can be calculated; from stars, chevrons can be solved; from vertical diamonds, stars, and chevrons, multiplication symbols can be found. Proceeding from left to right, the fields in the entire grid can be solved in a constant or O(1) time using KCL and KVL equations.

A 2D FDTD grid with metallization is shown in Figure 137, where the metallization is represented by the shaded cells. For such a structure, the ABCD matrix for the column marked by the dotted box must have asymmetric input-output ports, as shown in Figure 138. Information on asymmetric input-output transfer scattering parameter matrices is given in [20]. The output ports are located on the nodes corresponding to the  $E_y$  fields that are not tangential to the metallization structure. For the dotted column shown in Figure 137, there will be 4 input ports and 1 output port. Asymmetric input-output ports make it possible to include metallization in the structure, as illustrated by the test case in Figure 137.



Figure 136: The calculation of the fields in the entire grid using the E-field values in Column 1 alone.



Figure 137: A 2D FDTD grid with metallization.



Figure 138: A T-parameter matrix with asymmetric input-output ports.

## 10.4 3D Grid

Similar to solving the 1D and the 2D FDTD grids, ABCD parameters can be used to solve for the field coefficients in the 3D case. Cascaded 3D Yee cells are shown in Figure 139. As before, ABCD matrices are constructed and set up in the form shown in Figure 133 before the DC analysis is done.  $T_1$  is the ABCD network for the cells at k = 0 and  $T_2T_3...T_N$  is the product of the ABCD matrices for the cells k = 1, k = 2, ..., k = N. For  $k = k_o$  the input ports are the nodes located on the bottom face of the FDTD cells at  $k = k_o$  and the output ports are the nodes located on the top face of the FDTD cells at  $k = k_o$ . Once the E-fields at k = 0 have been solved, the fields in the entire 3D grid can be calculated in O(1) time. Similar to the 1D and the 2D cases, the fields at k = 0 can be used to calculate the fields at k = 1 and so on, until the DC nodal voltages and branch currents for the entire grid have been obtained.



Figure 139: FDTD cells in a 3D grid.

# 10.5 Summary

The companion model of the FDTD grid gives rise to new schemes for solving a system of linear equations, rather than using the conventional LU-decomposition method. The alternate schemes could result in a more memory-efficient solution. It has been shown in this chapter that only some of the electric-field coefficients need to be solved and the

remaining electric and magnetic-field coefficients can be obtained in a constant time using KCL and KVL equations.

# 11 Conclusions

The conventional time-domain techniques that are limited by the Courant condition are unsuitable for chip-package cosimulation. The on-chip structures are in the nanometer range and require a mesh with small dimensions. The small mesh dimensions result in a time step that is prohibitively small, making the total simulation time unacceptable for designers. For the chip-package test case presented in this thesis, the time taken for simulation using the FDTD scheme is one day, while SLeEC takes only 5 minutes for the same number of cells. Prior limitations in the Laguerre-FDTD scheme have been overcome in this research work. The new enhanced scheme has been named SLeEC and stands for simulation using Laguerre equivalent circuit. The following improvements have been made:

- Laguerre-FDTD has the drawback of being able to simulate only for a limited time duration. In the new methodology the simulation can be restarted, which will enable simulation to be run for any length of time.
- The companion model for the FDTD grid has been developed, making the implementation easier without the use of long cumbersome equations.
- A methodology for choosing the correct number of basis coefficients has been proposed.
- Laguerre-FDTD has been applied for linear transient circuit simulation composed of inductors, capacitors, resistors, and mutual inductance. Companion models have been developed for each of these components.
- Scalability has been demonstrated by applying SLeEC to practical test cases. A node numbering scheme for optimal memory efficiency has been suggested.

Efficient use of full-wave solvers for chip-package cosimulation has been proposed. Several test cases have been simulated using the proposed methodology. The model for microstrip lines referenced to power-ground planes has been developed. The model for an interconnect routed on the same layer as the power or the ground plane, which form a conductor-backed coplanar-waveguide structure, has been derived using the modal-decomposition method.

# 12 Appendix A: Derivation of the Courant Condition

The Courant condition limits the maximum allowable time step that can be used to obtain stable simulation results. The derivation of the Courant condition using dispersion analysis is given in [10] and is summarized in this section.

A propagating wave in a discretized FDTD grid at position  $(I\Delta x, J\Delta y, K\Delta z)$  and at time step n can be written as

$$\vec{V}|_{I,J,K}^{n} = \vec{V}_{0}e^{j\left[(\tilde{\omega}_{real} + j\tilde{\omega}_{imag})n\Delta t - \tilde{k}_{x}I\Delta x - \tilde{k}_{y}J\Delta y - \tilde{k}_{z}K\Delta z\right]}$$

$$= \vec{V}_{0}e^{-\tilde{\omega}_{imag}n\Delta t}e^{j\left(\tilde{\omega}_{real}n\Delta t - \tilde{k}_{x}I\Delta x - \tilde{k}_{y}J\Delta y - \tilde{k}_{z}K\Delta z\right)}$$
(174)

The relationship between the propagating wave's angular frequency  $\tilde{\omega}$  and the wavevector  $(\tilde{k_x}, \tilde{k_y}, \tilde{k_z})$  for an FDTD grid made up of cells with dimensions  $\Delta x$ ,  $\Delta y$ , and  $\Delta z$  is given by

$$\tilde{\omega} = \frac{2}{\Delta t} sin^{-1}(\xi), where$$
 (175)

$$\xi = c\Delta t \sqrt{\frac{1}{(\Delta x)^2} sin^2 \left(\frac{\tilde{k_x} \Delta x}{2}\right) + \frac{1}{(\Delta y)^2} sin^2 \left(\frac{\tilde{k_y} \Delta y}{2}\right) + \frac{1}{(\Delta z)^2} sin^2 \left(\frac{\tilde{k_z} \Delta z}{2}\right)}.$$
 (176)

The value of  $\xi$  is bounded between

$$0 \le \xi \le c\Delta t \sqrt{\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2} + \frac{1}{(\Delta z)^2}} = \xi_{upper\ bound}$$
 (177)

for all possible real values of k. The possible values of  $\xi$  can be partitioned into two intervals.

Stable Range: 
$$0 < \xi < 1$$
 (178)

Unstable Range: 
$$1 < \xi < \xi_{upper\ bound}$$
 (179)

It can be shown that Equation 179 results in an unstable time-domain response by substituting Equation 175 into Equation 174, resulting in an expression that indicates an increasing amplitude for the time-domain waveform with every time step for  $\xi > 1$ . The unstable range in Equation 179 exists only if

$$\xi_{upper\ bound} = c\Delta t \sqrt{\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2} + \frac{1}{(\Delta z)^2}} > 1.$$
 (180)

In other words

$$\Delta t > \frac{1}{c\sqrt{\frac{1}{(\Delta x)^2} + \frac{1}{(\Delta y)^2} + \frac{1}{(\Delta z)^2}}}$$
 (181)

makes the time-domain response unstable. Equation 181 gives an upper bound on the time step that can be used for stable results, completing the derivation of the Courant condition.

# 13 Publications

## 13.1 Journals and Conference Papers

- 1. **K. Srinivasan**, M. Ha, E. Engin, M. Swaminathan, "Transient Chip-Package Cosimulation Using the Laguerre-FDTD Scheme," Submitted to IEEE Trans. Advanced Packaging.
- 2. K. Srinivasan, P. Yadav, E. Engin, M. Swaminathan, "Fast EM/Circuit Transient Simulation Using Laguerre Equivalent Circuit (SLeEC) for Multiscale Problems," Submitted to IEEE Trans. Electromagnetic Compatibility.
- 3. **K. Srinivasan**, M. Swaminathan, and E. Engin, "Overcoming Limitations of Laguerre-FDTD for Fast Time-domain EM Simulation," *IEEE MTT-S International Microwave Symposium*, June, 2007.
- 4. **K. Srinivasan**, E. Engin and M. Swaminathan, "Fast FDTD Simulation Using Laguerre Polynomials in MNA Framework, *International Symposium on Electromagnetic Compatibility*, July 2007.
- K. Srinivasan, M. Swaminathan, E. Engin, "Enhancement of Laguerre-FDTD With Initial Conditions for Fast Transient EM/Circuit Simulation," *Electronic Components* and Technology Conference, May 2007.
- K. Srinivasan, E. Engin, M. Swaminathan, "Fast FDTD Simulation of Multiscale 3D Models using Laguerre-MNA," IEEE Workshop on Signal Propagation on Interconnects, May 2007.
- 7. K. Srinivasan, P. Yadav, E. Engin, M. Swaminathan, "Choosing the Right Number of Basis Functions in Multiscale Transient Simulation Using Laguerre Polynomials," Electrical Performance of Electronic Packaging, Oct. 2007.
- 8. **K. Srinivasan**, P. Muthana, R. Mandrekar, E. Engin, M. Swaminathan, "Enhancement of Signal Integrity and Power Integrity with Embedded Capacitors in High-

- Speed Packages," International Symposium on Quality Electronic Design (ISQED), Mar. 2006.
- 9. **K. Srinivasan**, R. Mandrekar, E. Engin, M. Swaminathan, "Power Integrity/Signal Integrity Co-Simulation for Fast Design Closure," *IEEE Electronics Packaging Technology Conference*, Dec. 2005.
- K. Srinivasan, H. Sasaki, M. Swaminathan and R. Tummala, "Calibration of Near Field Measurements Using Microstrip Line for Noise Predictions," IEEE Electronic Components and Tech. Conference, Vol. 2, pp. 1432-1436, June 2004.
- 11. R. Mandrekar, **K. Srinivasan**, E. Engin, M. Swaminathan, "Co-simulation of Signal and Power Delivery Networks with Causality," IEEE Electrical Performance of Electronics Packaging, Oct. 2005.
- 12. R. Mandrekar, **K. Srinivasan**, E. Engin, M. Swaminathan, "Co-simulation of Signal and Power Delivery Networks with Causality," *IEEE Electrical Performance of Electronics Packaging*, Oct. 2005.
- 13. J. Choi, D. G. Kam, D. Chung, K. Srinivasan, V. Govind, J. Kim and M. Swaminathan, "Near Field and Far Field Analysis of Alternating Impedance Electromagnetic Bandgap (AI-EBG) Structure for Mixed-Signal Applications," IEEE Electrical Performance of Electronics Packaging, Oct. 2005.
- 14. T. Watanabe, **K. Srinivasan**, H. Asai, M. Swaminathan, "Modeling of Power Distribution Networks with Retardation using the Transmission Matrix Method," *IEEE Electrical Performance of Electronics Packaging*, pp. 233-236, 2004.
- H. Sasaki, V. Govind, K. Srinivasan, S. Dalmia, V. Sundaram, M. Swaminathan,
   R. Tummala, "Electromagnetic Interference (EMI) Issues for Mixed-Signal Systemon-Package (SOP)," Conference Proc., *IEEE Electronic Components and Tech. Con*ference, Vol. 2, pp. 1437-1442, June 2004.
- 16. S. N. Lalgudi, **K. Srinivasan**, G. Casinovi, R. Mandrekar, E. Engin and M. Swaminathan, "Causal Transient Simulation of Systems Characterized by Frequency-Domain

Data in a Modified Nodal Analysis Framework," *Electrical Performance of Electronics Packaging*, Oct. 2006.

## 13.2 Invention Disclosures

- 1. **K. Srinivasan**, E. Engin, M. Swaminathan, "Fast Transient EM/Circuit Simulation Using Laguerre Polynomials," Invention Disclosure, May 2007.
- 2. P. Muthana, E. Engin, **K. Srinivasan**, M. Swaminathan, "Use of Embedded Discrete Capacitor Layers in IC Package Substrates for High Speed I/O Decoupling," Invention Disclosure, Jul. 2005.

# References

- [1] H. S. Lee, "Advanced Computer Architecture," Sep. 2007; http://users.ece.gatech.edu/~leehs/ECE6100/.
- [2] "International Technology Roadmap for Semiconductors (ITRS) 2005", Oct. 2006; http://public.itrs.net.
- [3] K. Sheth, E. Sarto, J. McGrath, "The importance of adopting a package-aware chip design flow," *Design Automation Conference*, July 2006, pp. 853-856.
- [4] B. Young, Digital Signal Integrity, Prentice-Hall, 2000.
- [5] S. N. Lalgudi, Y. K. Kretchmer, M. Swaminathan, "Simulation of switching noise in on-chip power distribution network of FPGAs," *IEEE 14th Trop. Meeting on Electrical Performance of Electronic Packaging*, pp. 319-322, Oct. 2005.
- [6] P. Muthana, M. Swaminathan, E. Engin, R. Markondeya, R. Tummala, "Mid-frequency Decoupling Using Embedded Decoupling Capacitors," *IEEE 14th Tropical Meeting on Electrical Performance of Electronic Packaging*, pp. 271-274, Oct. 2005.
- [7] H. Johnson, M. Graham, *High-Speed Signal Propagation: Advanced Black Magic*, Prentice-Hall, Upper Saddle River, NJ: 2002.
- [8] K. J. Han, E. Engin, and M. Swaminathan, "Cylindrical Conduction Mode Basis Functions for Modeling of Inductive Couplings in System-in-Package (SiP)," *IEEE Electrical Performance of Electronic Packaging*, Oct. 2007, pp. 361-364.
- [9] C. R. Paul, Analysis of Multiconductor Transmission Lines, Wiley-Interscience, NY, 1994.
- [10] A. Taflove, S. C. Hagness, Computational Electrodynamics: the finite-difference time-domain method, Norwood, MA: Artech House, Inc., 2000.
- [11] J. E. Schutt-Aine, "Latency Insertion Method (LIM) for the Fast Transient Simulation of Large Networks," *IEEE Trans. Circuits and Systems-I:Fundamental Theory and Applications*, Vol. 48, pp. 81-89, Jan. 2001.

- [12] Z. Deng, J. E. Schutt-Aine, "Stability analysis of latency insertion method (LIM)," IEEE 13th Tropical Meeting on Electrical Performance of Electronic Packaging, pp. 167-170, 2004.
- [13] Y. Chung, T. K. Sarkar, B. H. Jung and M. Salazar-Palma, "An Unconditionally Stable Scheme for the Finite-Difference Time-Domain Method," *IEEE Trans. Microwave Theory Tech.*, Vol. 51, pp. 697-704, Mar. 2003.
- [14] L. T. Pillage, R. A. Rohrer, C. Visweswariah, Electronic Circuit and System Simulation Methods, McGraw-Hill, 1994.
- [15] Sung-Hwan Min, "Automated Construction of Macromodels from Frequency Data for Simulation of Distributed Interconnect Networks," PhD Thesis, Georgia Institute of Technology, Apr. 2004.
- [16] J. Kim, M. Swaminathan, "Modeling of Irregular Shaped Power Distribution Planes Using Transmission Matrix Method," *IEEE Trans. Adv. Packaging*, pp. 334-346, Vol. 24, Issue 3, Aug. 2001.
- [17] R. Mandrekar and M. Swaminathan, "Causality Enforcement in Transient Simulation of Passive Networks through Delay Extraction," *IEEE 9th Workshop - Signal Propagation* on Interconnects, pp. 25-28, May 2005.
- [18] E. Engin, K. Bharath, M. Swaminathan, M. Cases, B. Mutnury, N. Pham, D. Araujo, E. Matoglu, "Finite-Difference Modeling of Noise Coupling between Power/Ground Planes in Multilayered Packages and Boards," *Electronic Components and Technology Conference*, 2005.
- [19] E. Engin, "Modeling of Lossy Interconnects and Packages with Non-Ideal Power/Ground Planes," *PhD Thesis, Vom Fachbereich Elektrotechnik und Informationstechnik der Universität Hannover*, 2004.
- [20] J. Dobrowolski, Introduction to Computer Methods for Microwave Circuit Analysis and Design, Norwood, MA: Artech House, 1991

- [21] K. C. Gupta, R. Garg and R. Chadha, Computer-Aided Design of Microwave Circuits, Artech House, 1981.
- [22] A. V. Oppenheim, R. W. Schafer and J. R. Buck, Discrete-time Signal Processing, Prentice-Hall, Upper Saddle River, NJ, 1999.
- [23] S. Bagchi and S. K. Mitra, *The Nonuniform Discrete Fourier Transform and Its Applications in Signal Processing*, Kluwer Academic Publishers, Norwell, MA, 1999.
- [24] D. M. Pozar, Microwave Engineering, John-Wiley & Sons, 2005.
- [25] J. E. Schutt-Aine, R. Mittra, "Scattering Parameter Transient Analysis of Transmission Lines Loaded With Nonlinear Terminations," *IEEE Trans. Microwave Theory Tech.*, pp. 529-536, Mar. 1988.
- [26] P. Muthana, M. Swaminathan, R. Tummala, P. M. Raj, E. Engin, L. Wan, D. Balaraman, S. Bhattacharya, "Measurement, modeling and characterization of embedded capacitors for power delivery in the mid-frequency range", *International Symposium on EMC*, 2005.
- [27] K. S. Yee, "Numerical solution of initial boundary value problems involving Maxwell's equations in isotropic media," *IEEE Transactions on Antennas and Propagation*, pp. 302-307, 1966.
- [28] S. M. Rao, Time-domain Electromagnetics, Academic Press, 2006.
- [29] Haifeng Q, Nassif, S.R., Sapatnekar, S.S., "Power grid analysis using random walks," IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, pp. 1204-1224, Aug. 2005.