Eﬃcient Procedure Improving Precision of High Conditioned Matrices in Electronic Circuits Analysis

. In this article, we propose several improvements that could be done to SPICE simulator. The ﬁrst is based on a functional implementation of device models. The advantages of functional implementation are demonstrated on basic Shichman-Hodges model of MOS transistor. It starts with a description of primary algorithms used in SPICE simulator for the solution of circuits with nonlinear devices and identify the problems that can occur during simulations. Main part of the article is devoted to improved factorization procedure for simulation of the nonlinear electronic circuits. The primary intention of the proposed method is to increase ﬁ-nal precision of the result in a case of high condition linear systems. The procedure is based on a use of the iterative methods for solution of nonlinear and linear equations. Utilizing those methods for one iterative process helps to reduce memory consumption during simulation computation, and it can signiﬁcantly improve simulation precision. The procedure allows to use enumeration with deﬁnable precision in a very eﬃcient way.


Introduction
Recent development of the simulation of electronic circuits can be divided into several directions.The first is focused mainly on improving device models of the program SPICE.They go from enhancing possibilities given by standard models, through implementing entirely new ones to constructing new macro-models formed by standard models.From the recent publications, we can point out the following articles [1][2][3][4][5][6][7][8][9] that are entirely or partly focused on model improvement in the SPICE simulators.The second direction of the development is targeted to algorithms used during simulation.It can be divided into the two subsections.The first subsection is dedicated to concrete simulation type or simulation problem where standard simulation core could not be used because given device models or simulation prob-lem is not applicable to SPICE computation procedure.The most often it is the simulation of various DC-DC circuits, but many other articles have been published for enhancing core to simulate magnetic features or another circuit behavior and effects.From recent works focused on strengthening simulation core we can mention [10][11][12][13][14][15][16][17].It could be seen this is very active field of development.The second subsection is focused to SPICE core algorithm performance or performance of an entire simulation process.SPICE core wasn't changed much since it was first developed; therefore it is interesting to adapt it to present hardware or software, mainly with an intention of improving simulator performance or accuracy of the results.From recent publications in this area we recommend [18][19][20].Moreover, new (and often unusual) modes of analysis and optimization have recently been implemented to the SPICE-family programs that are very demanding to the precision of the algorithms [21][22][23][24][25][26][27].This article belongs to the last mentioned subsection with one important note that it does try to enhance standard simulation procedure but it completely redefine the process with the new one where standard procedure is replaced with recursive and factorization free evaluation algorithms and as suitable programming tool is used functional language Lisp.It can reduce memory requirements, computation time and add a new factor of variability not only during device modeling but also during specification of the simulation.The idea originates from adaptive environment for solving of extra large datasets and combines advantages from today's use of functional and aspect-oriented languages [28][29][30][31].

Models and Virtual Representation
The real electronic devices such as resistors, capacitors, diodes, or MOS transistors are represented in the simulation programs by its software models.Those models are interpreted in the simulation core as a set of mathematical equations with the purpose of accurately approximating device behavior in a particular region of operation.It is a fact that device models become not very reliable when simulation gets out of assumed operation bounds.It is mainly caused by the developers' intent to keep models equations simple and efficient with fast convergence behavior while they are still precise within their operation point bounds [32], [33].If we assume that simulation takes place within the device's model bounds, and there is no error in circuit design the next stage of device modeling is a rate of affection by simulation changes.Device models are usually defined as closed structures that enter a simulation by calling its extern functions.This leads to a situation where models are only particular part of the simulation and can be affected only as much as arguments of their functions allow.They can be enhanced when we stop taking model devices as closed structures that can return only numeric values and rewrite them in a functional form.

Functional Device Models
In order to demonstrate the functional device modeling (FDM), let us start with the very simple (Shichman-Hodges) model of MOS transistor used in SPICE as described in [34].The drain-source current I DS of an n-channel device is then where each case defines different region of an operation cutoff, saturation, and linear.Variables V ON and X λ are substitutions defined as V ON = V GS − V TH and X λ = 1 + λV DS .Parameter β f (for a forward transconductance) is substitution where L EFF = L − 2L D is effective channel length corrected for the lateral diffusion, L D , of the drain and source, and is the threshold voltage in the presence of a bulk-gate bias, V BS < 0. Parameters V TH0 , K, γ, φ and λ are the electric parameters of the MOSFET model, representing the zero-bias threshold voltage, transconductance factor, bulk threshold parameter, surface potential, and output conductance factor in saturation, respectively.In our simulations, we are going to use the n-channel devices only with bulk connected to source.Therefore, we can simplify the model putting V TH to a constant value.
The variables V DS , V GS depend on circuit definition and current state.At the time of model definition, those variables remain unknown and will be evaluable at the moment of circuit simulation.Also, from the point of view of the simulation core, it is not clear how many unknown variables each model introduce to a circuit.One possible solution is to a make generic get/set function returning values by their reference.Better implementation can be done utilizing a two or more stage self-redefinition mechanism denoted as "functional chaining".In the first stage model functions returns, as a result, another function with internal variables adapted to simulation unknowns as voltages, currents or temperature.In the second stage through the chaining procedure, all unknown variables will evaluate itself according to simulation state.
Functions defined through this mechanism can be treated as values despite that they are unknown.That allows computation core to save execution time and handle those functions, referred as functionals, as any normal number and store them in vectors, matrices or sparse matrix systems.As a demonstration of the previous statement, a complete definition of previously mentioned equations for a simple model of the MOS transistor in functional language Lisp follows: ) ) The method denoted as "mos_current_drain" is a wrapper (a symbolic representation) for the inner functional definition (see a more detailed description of functionals in [35]): ( defmethod m o s _ c u r r e n t _ d r a i n ( ( mos c l a s s − m o s ) vg v s vd ) # ' ( lambda ( ) Specifically, it is defined without any parameters, but the wrapper function, which, on its call, will set up all internal variables of the device model to references passed through the input argument list.Those arguments vg, vs, and vd are simulation variables and will be set to circuit voltages concerning a location where the device is placed in a circuit.It is done by calling "eval" function during the enumeration: ( v g s (− ( e v a l vg ) ( e v a l v s ) ) ) For completeness, we need to note that for each nonlinear model there have to be defined all derivatives for all nonlinear functions and time dependent parameters.They will be used for the generation of Jacobian matrix that is required by Newton-Raphson algorithm.In our case, it has to be the two partial derivatives that will result in additional functions wrapping their functionals:  The full definition requires derivation for each nodal voltage direction.It will result in growing of Jacobian matrix during evaluation.More information about the concepts of functional programming can be found in [36].Comprehensive definition of nonlinear device modeling in Lisp with explanation of further steps can be found in [37] and [38].
It should be noted that functional language can be very beneficial from a point of view of simulation variability and definition.It can improve experience of the users during simulation, device model definition and creation.On the other hand it will be very inappropriate to use it for demanding numerical operations that are required by simulation analysis.It is still very difficult to get any closer with any other languages to performance of optimized code written in Fortran or C language.In Tab. 1 (it is expanded version of [35]), there is a comparison between performance of basic operations in the C and Lisp programming languages.

Device Models and Simulation
For each simulation type, there is a need to perform certain steps to obtain correct results.Basic DC simulation includes solving a linear system compiled by modified nodal formulation (MNF).It is a fundamental part of any other type of simulation and the essential starting point of transient analysis.Transient simulation characterizes device behavior, mostly nonlinear, with respect to time.It includes solving sets of nonlinear equations.The procedure that is commonly used is presented in the Algorithm 1. From the point of view of the memory handling and computation stability, it is very fast and efficient procedure.The resulting sparse matrix after LU factorization introduces very small amount of new fill-ins, and accompanied with forward elimination & back substitution, it produces result in very short time.However, there are some special cases when usage of supplementary algorithm can be required to improve stability of simulation or precision of the result.For instance, settings of the circuit devices can significantly increase condition number of matrix produced by MNF.It can introduce to already complicated problem additional complexity, low density of sparse matrix together with high matrix condition number can cause numerical errors during floating point operations followed by substantial decrease in final precision.

LU Factorization
The LU factorization (LUF) is an efficient algorithm able to find a solution of linear system defined by highly sparse matrices.It was implemented into SPICE and other simulation programs [39].It can be briefly defined as where U (upper) and L (lower) are two triangular submatrices that lead to a solution of a linear system as The problematic fact about the LU factorization is that it is a very demanding operation consuming much memory and computation time.Even though there have been published many articles on an optimization of LU factorization [40][41][42], they do not solve the main problem that we believe lies in the condition number of the factorized matrix.High condition number can affect final precision of direct factorization method.In Tab. 2, a dependency is presented of the total number of required operations needed for matrix factorization on given dimension.It additionally shows the development of the number of required iterations of the algorithm and matrix sparsity.The visual comparison is shown in Fig. 1.LUF method is usually performed by the Newton-Raphson iterative algorithm.It can be performed many times before algorithm finds a solution of a nonlinear system.It is obvious that obtaining final solution is very time demanding process and should not be repeated.Also once the solution of final LUF is done the precision of the result cannot be improved by LUF algorithm in other way that recomputing it again with higher (or even arbitrary) precision numbers.Therefore, we decided to add the algorithm improving the solution of nonlinear equations with one iteration of Newton algorithm for computing of eigenvalues of linear system.Even it was originally developed as iterative algorithm for inversion matrix it can be applied on system of linear equations in very efficient way.

Matrix-Inversion Iterative Methods
Iterative methods refer to techniques that use successive approximations to obtain a more accurate solution of the linear system at each step.There are so-called stationary methods and the non-stationary methods based on the idea of sequences of orthogonal vectors.Stationary methods perform in each step a same operation on the current values.In the non-stationary method, iterations depend on coefficients.From stationary methods, it can be point out following ones: • The Jacobi Method (JACOBI) • The Gauss-Seidel Method (GS) • The Successive Overrelaxation Method (SOR) From the non-stationary methods (sometimes referred as Krylov subspace methods), it could be pointed out the following methods: The original Newton iteration algorithm [43][44][45] was enhanced for implementation into the core of an electronic circuit simulator.Denoting Jacobian matrix as J and identity matrix as I, for inverted Jacobian matrix J −1 we can write For an arbitrary real matrix, J, the above statement allows for each iteration step n to define a generalized inverse (pseudo inverse) denoted as X n where ε n is an error matrix.From that we get to The above form can be rewritten using X n and X n+1 to a more readable form It is obvious that previous equation will iterate to the solution if It is correct but very impractical for a real implementation.To become a full replacement for standard LUF procedure we need to define an algorithm for obtaining appropriate initial starting values otherwise the iteration will not converge to a solution.

Starting Values
The algorithm uses results from the LUF method as a starting value.The conversion of this result to higher precision numbers is relatively fast operation involving one matrix vector product.Alternatively, the algorithm can recompute solution with own guess based on (11).This relatively slow procedure can be useful in a case of problematic errors caused by floating point arithmetics.
The result considered here is a special case of a more general and well known Newton iteration method which concerns the iterative computation of pseudo-inverses.Let A ∈ R n×n be a non-singular matrix and define the sequence {X n } n 0 of matrices as follows: where e 0 stands for the highest value of the eigenvalue vector e of the matrix J.Then, This mathematically rigorous definition, unfortunately, requires evaluating eigenvalues.It is not a simple task, and it is very impractical especially in the situation when we want to avoid any factorization method.As a possible solution proved to be reducing the entire definition can be used where β simply belongs to interval (0, 1).If some iteration does not converge, then simulation restarts with a smaller β value.It turned out that best value for β is somewhere in the interval (10 −8 , 10 −5 ) (for double floating point precision on 64-bit operating system).

Residual and Stopping Criteria
After each iteration, the algorithm needs to evaluate residual to review convergence.The easiest way to do it is to check whether the vector norms of the current and previous solutions show a decreasing trend, i.e., where The stopping criteria should be defined by at least two different mechanisms; the first by setting a maximum number of iteration loops to protect from an infinite loop (usually maximum 30 iterations), and also by checking a change between the two consecutive iterations steps.It can be mathematically defined in the following way: where is a measure of iteration step size, and depending on required precision should be set to at least 10 −6 .

Iterative Method for Linear System
Sticking to the previous notation, a system of linear equations written as a matrix equation can be denoted as where A is (in general case) an m × n matrix, A −1 is a symbol of matrix inverse and x and y are column vectors with m and n entries, respectively.As we need to avoid factorization methods, again, some iterative algorithm must be used.The simplest solution is to implement well known Gauss-Seidel method [46] for solution of the presented linear system where stopping criteria can be defined similarly to Newton iteration algorithm as Usage of the whole algorithm would be impractical and (very) time demanding.Each iteration adds to the simulation additional number of matrix multiplications.Therefore, we use the algorithm only in one iteration as precision booster of the simulation.In other words, the maximum number of iteration loops is limited to one.

Modified Simulation Algorithm
Putting it all together, we finish with a new procedure that can be implemented to the core of the electronic circuit simulator.The procedure comes with several changes.Apart from special cases of steady state analysis of circuits with oscillators, the simulation process starts with DC analysis.The DC analysis is newly performed by factorization method accompanied with iteration algorithm for the solution of the linear system.In our case, it was simple Newton iteration method.When the transient analysis is the next step in the simulation then results from DC analysis are used as initial starting values for the initial guess of Newton-Raphson algorithm.There is LUF method accompanied with residual computation and our modified Newton iteration method.Full redefined algorithm is written in Algorithm 2.
The core modification of standard algorithm can be seen in initial Jacobian matrix estimation.If the precision of the result is decreased, using NIM (Newton iteration method, see also [47]) as a successive step after the factorization method can significantly increase the accuracy of the result during one iteration loop.position."matrix-minus" will subtract two sparse matrices of the same size."mat-tim" is optimized form of multiplication of the matrix and constant number."eval-function-list" is core function that will evaluate all chained functionals to known values.

Selected Testing Simulated Circuits
We have selected six electronic circuits of various technology types for comparing the accuracy of the proposed methods (Tab.3): • Resistor Mesh -a simple linear circuit interconnected to a large mesh (Fig. 2) • Big Nonlinear Circuit -a large LED matrix control circuit (Fig. 3) • A variant of the standard nonlinear circuit with JFET as shown, e.g., in [48] • Differential Amplifier -a differential amplifier shown in Fig. 4 • CMOS Adder -adder constructed using MOS transistors, this is a standard SPICE (PSpice) demonstration example (called "ADDER -4 BIT ALL-NAND-GATE BINARY ADDER") • CMOS DS Regulator -a CMOS stepping regulator circuit as defined in [49] The results of all the simulations performed with different numerical iterative methods are shown in Tab. 3.

Results
In Fig. 1, the growth of the number of required operations (addition, multiplication, division) is shown vs increasing size of matrix dimension (full-line and left y-axis).As an informative comparison, it is shown together with required computation time for LU factorization (dot-dashed line, right y-axis).It is important to state that this implementation was programmed as an algorithm using mathematical simulation program Matlab with absolutely no optimization, and its purpose here is to be a reference.Therefore, the presented times here should be taken only as a comparison of a scale.Using appropriate real-world implementation, for example in C language, would be certainly a better choice, but the primary intention was to use unified environment.In Fig. 5, we present a dependency between the count of the LU operations (full line) and matrix dimension.It should be noted that it uses top x-axis, it goes from 0 to 800 and together with the left y-axis.This figure additionally shows, using bottom x-axis and right y-axis, the difference in the number of required iterations of the LU factorization for the matrices with different densities (dot and dash lines).In this case, a sparse matrix system was filled with random numbers.They were generated to have several different non-zero densities from full matrix density 100% to 50% density of nonzero values.The numbers on the right y-axis represent how much fewer operations where needed.
Figure 6 shows final accuracy received for three different circuits of very different circuit sizes (from the point of view of number of nonlinear device models).Used models in this simulation have been: nonlinear resistor defined as pure nonlinear voltage driven resistor, MOS transistor defined by (1), and JFET transistor described by a simplified model derived from original SPICE model [33].It is presented to demonstrate that when an iterative algorithm is used the precision of the result is limited.It is limited by stopping criteria as well as by floating point precision.Regardless of the required accuracy of the result or a maximum number of iteration loops the accuracy of the results is given primarily by nature of simulated nonlinear equations (device models) and floating point implementation.
Figure 7 demonstrates interaction and influence of precision of inversion matrix on a result of NR method.The x-axis contains different settings of the accuracy of inversion matrix computed by NIM.The y-axis is the number of iterations for different circuits.Each circuit was holding 100 devices of given type.The full-line curve stands for NIM algorithm (evaluation of inversion matrix).Dotted lines represent the number of iterations needed by NR algorithm to find a solution fulfilling stopping criteria.It is nothing new that the number of iteration loops of NIM depends on a required accuracy of stopping criteria.An interesting observation is that stopping criterion of NIM does not need to be as strict as in NR, and therefore, NIM algorithm can be stopped much earlier which save some computation time.Secondly, there is some point where stopping criteria of NIM algorithm cause very steep growth of the required number of iteration loops on NR algorithm or even its divergence.
Although it proves to be very stable method, it is clearly visible now that the real implementation suffers from one big drawback apart from other iterative algorithms and of course from direct LUF.It is a need of computing and storing of at least one additional (unfortunately dense) matrix.Simplicity of the algorithm allows compiler to optimize the code in such a way that performance is to be rapidly increased, but with increase of matrix dimension memory, requirements of this method rapidly decrease performance.This can be visible from It can be seen from the results that "naive" implementation that represents (only) implementation of the algorithm without respect to procesor cache and fast computation techniques slows entire computation by several orders.Number of required iterations to obtain the result with particular precision unfortunately depends on the initial estimation and matrix dimension.

Comparing Tables of Method Accuracies
In Tab. 4, there is a comparison of the accuracy increase with additional NIM cycle computed with arbitrary numbers to matrix dimension.It got an additional accuracy boost computed by NIM after each successful iteration time step.The adapted transient analysis completed by single loop NIM with arbitrary precision is shown in Algorithm 2. It can be applied to the end of operating-point analysis or as a final procedure after each time step of transient analysis.
The values in Tabs. 3 and 4 represent maximal accuracy of the computation of RHS vector that could be achieved with given method on the specific simulation task.The measure of accuracy is limited by mathematical properties of given numerical method from one side and by digital representation of floating point numbers from the other side.While Tab. 3 gives an idea on rate of change of the accuracy levels, Tab. 4 demonstrates possible improvement in accuracy given by proposed method.Therefore, the values in the tables do not represent the results of the simulations, but the inaccuracy of the calculation from the ideal state.

Conclusion
Basing on the assumption that electronic device models definitions are introduced to the simulation as chained functionals, they will autonomously evaluate themselves to values upon call.It is evident that trying to convert those entries to matrix system to compute matrix inversion or solution of the linear system is redundant operation.It seems to be as a solution to chain all entries to another function that on call evaluate entire simulation.The key factor of the implementation lies in the use of the iterative recursion.In this article, we presented the new procedure utilizing factorization free algorithm that can enhance standard computation core of electronic circuit simulators.
Not only from the results it is evident that accuracy or better convergence of iterative numerical methods depends on matrix properties and also on starting values.It means that in the case when LUF method does finish with sufficient precision then better results can be achieved with numerical iteration (with an assumption that same numeric floating types are used in both scenarios).The article suggests that accuracy of the results should be presented alongside with results and of course in a situation when the standard method does not achieve reasonable accuracy then some iteration method can be used next.The concrete method depends on the problem and picks up could be a tricky part.We suggest NIM approach as very universal solution, but it is of course very memory demanding and can be very inefficient for huge matrixes.
As it has been already said, NIM method supports a use of rationals (numbers expressed as the fraction p/q where p and q are integers and q is nonequal to zero).When iteration utilizes computation with rationals, it is possible with NIM to receive a more precise solution.Performing standard direct method with rational numbers has significant drawbacks that are for wider discussion.Mainly, standard operations with rationals are slower than those with standard types.A large difference has been presented between evaluating of one million operations in C language and Lisp with a use of rationals or with floating point numbers.The significant problem that arises with rational numbers is a size of the numeric container and required time to store and obtain it from computer memory.Our tests showed that it takes 0.7 s to store the rational number of size 10 1000000 .Usage of one iteration of NIM with initial starting value as a result from previous factorization allows to use arbitrary numbers in much more effective way and, therefore, obtain the final solution with improved precision faster.
All computation and simulation tests were made on the same computer machine with a processor Intel Core i3 4160 Haswell, chipset Intel H87 and RAM 8GB DDR3, with 64b architecture and operating system.From 1986 to 1992, he was a researcher of TESLA Research Institute.He is currently with the Department of Radioelectronics of the Czech Technical University in Prague.His current research interests include physical modeling of elements of electronic circuits, especially radio-frequency and microwave transistors and transmission lines, creating or improving special algorithms for circuit analysis and optimization such as time-and frequency-domain sensitivity, poles-zeros or steady-state analyses, and creating a CAD tool for analysis and optimization of radio-frequency circuits.
Stanislav BANÁŠ was born in 1970.He received his M.Sc.degree in Electrical Engineering from the Technical University in Brno in 1994.Last year of his study he spent in scholarship in CNRS institute in Grenoble, where he was interested in optoelectronic properties of hydrogenated amorphous silicon.From 1996 he works as a modeling and characterization engineer in Motorola Czech Design Center in Roznov, later transferred to ON Semiconductor SCG Czech Design Center in Roznov.From 2012 he studies for Ph.D. in Technical University in Prague.His research interests include the modeling of high-voltage semiconductor components.

(
defmethod m o s − c u r r e n t − d r a i n − d v g s ) ( defmethod m o s − c u r r e n t − d r a i n − d v d s )

Fig. 2 .Fig. 3 .
Fig. 2.A fragment of the resistor mesh used as a large exemplary linear circuit.

Fig. 8 .
It uses the following naming conventions: • (LUF BFS n.) LU factorization with forward elimination and back substitution • (NIM n. -Ofast) Naive implementation of algorithm • (LUF BFS naive -Ofast) Compiled with full support of compiler optimization • (LUF BFS Lapack) Algorithms were implemented using Lapack functions • (NIM Lapack) Newton Iteration method • (GNU Octave) Algorithm for matrix inversion implemented in GNU Octave • (Cholesky Lapack) Computation uses Cholesky method
About the Authors . . .David ČERNÝ was born in Prague, Czech Republic, in 1985.He received his M.Sc.degree in 2009 from the Faculty of Electrical Engineering of the Czech Technical University in Prague.He finished his Ph.D. study at the Department of Radioelectronics at Czech Technical University in Prague in 2017 and received Ph.D. degree.His research interests include simulation of high frequency circuits, physical modeling of electrical devices, and very large-scale integrated circuit analysis.Josef DOBEŠ received the Ph.D. degree in Microelectronics from the Czech Technical University in Prague in 1986.

1M Operations Addition Multiplication Division Division by Constant
Procedure Number of operations of LUF for different dimensions and densities.
Comparison of different circuit accuracy vs different iterative methods.