systemfit: A Package for Estimating Systems of Simultaneous Equations in R

This introduction to the R package systemfit is a slightly modified version of Henningsen and Hamann (2007), published in the Journal of Statistical Software. Many statistical analyses (e.g., in econometrics, biostatistics and experimental design) are based on models containing systems of structurally related equations. The systemfit package provides the capability to estimate systems of linear equations within the R programming environment. For instance, this package can be used for "ordinary least squares" (OLS), "seemingly unrelated regression" (SUR), and the instrumental variable (IV) methods "two-stage least squares" (2SLS) and "three-stage least squares" (3SLS), where SUR and 3SLS estimations can optionally be iterated. Furthermore, the systemfit package provides tools for several statistical tests. It has been tested on a variety of datasets and its reliability is demonstrated.


Introduction
Many theoretical models that are econometrically estimated consist of more than one equation. The disturbance terms of these equations are likely to be contemporaneously correlated, because unconsidered factors that influence the disturbance term in one equation probably influence the disturbance terms in other equations, too. Ignoring this contemporaneous correlation and estimating these equations separately leads to inefficient estimates of the coefficients. However, estimating all equations simultaneously with a "generalized least squares" (GLS) estimator, which takes the covariance structure of the residuals into account, leads to efficient estimates. This estimation procedure is generally called "seemingly unrelated regression" (SUR, Zellner 1962). Another reason to estimate a system of equations simultaneously is cross-equation restrictions on the coefficients. Estimating the coefficients under cross-equation restrictions and testing these restrictions requires a simultaneous estimation approach.
Furthermore, these models can contain variables that appear on the left-hand side in one equation and on the right-hand side of another equation. Ignoring the endogeneity of these variables can lead to inconsistent estimates. This simultaneity bias can be corrected for by applying a "two-stage least squares" (2SLS) estimation to each equation. Combining this estimation method with the SUR method results in a simultaneous estimation of the system of equations by the "three-stage least squares" (3SLS) method (Zellner and Theil 1962).
The systemfit package provides the capability to estimate systems of linear equations in R (R Development Core Team 2007). Currently, the estimation methods "ordinary least squares" (OLS), "weighted least squares" (WLS), "seemingly unrelated regression" (SUR), "two-stage least squares" (2SLS), "weighted two-stage least squares" (W2SLS), and "three-stage least squares" (3SLS) are implemented. The WLS, SUR, W2SLS, and 3SLS estimates can be based either on one-step (OLS or 2SLS) (co)variances or these estimations can be iterated, where the (co)variances are calculated from the estimates of the previous step. Furthermore, the systemfit package provides statistical tests for restrictions on the coefficients and for testing the consistency of the 3SLS estimation.
Although systems of linear equations can be estimated with several other statistical and econometric software packages (e.g., SAS, EViews, TSP), systemfit has several advantages. First, all estimation procedures are publicly available in the source code. Second, the estimation algorithms can be easily modified to meet specific requirements. Third, the (advanced) user can control estimation details generally not available in other software packages by overriding reasonable defaults.
In Section 2 we introduce the statistical background of estimating equation systems. The implementation of the statistical procedures in R is briefly explained in Section 3. Section 4 demonstrates how to run systemfit and how some of the features presented in the second section can be used. In Section 5 we replicate several textbook results with the systemfit package. Finally, a summary and outlook are presented in Section 6.
After introducing notations and assumptions, we provide the formulas to estimate systems of linear equations. We then demonstrate how to estimate coefficients under linear restrictions. Finally, we present additional relevant issues about the estimation of equation systems.
Consider a system of G equations, where the ith equation is of the form

  y_i = X_i β_i + u_i,    (1)

where y_i is a vector of the dependent variable, X_i is a matrix of the exogenous variables, β_i is the coefficient vector and u_i is a vector of the disturbance terms of the ith equation.
We can write the "stacked" system as

  [ y_1 ]   [ X_1  0   . . .  0  ] [ β_1 ]   [ u_1 ]
  [ y_2 ] = [ 0   X_2  . . .  0  ] [ β_2 ] + [ u_2 ]    (2)
  [  .  ]   [ .    .    .     .  ] [  .  ]   [  .  ]
  [ y_G ]   [ 0    0   . . . X_G ] [ β_G ]   [ u_G ]

or more simply as

  y = X β + u.    (3)

We assume that there is no correlation of the disturbance terms across observations, so that

  E[u_it u_js] = 0  for  t ≠ s,    (4)

where i and j indicate the equation number and t and s denote the observation number, where the number of observations is the same for all equations.
However, we explicitly allow for contemporaneous correlation, i.e.,

  E[u_it u_jt] = σ_ij.    (5)

Thus, the covariance matrix of all disturbances is

  E[u u⊤] = Ω = Σ ⊗ I_T,    (6)

where Σ = [σ_ij] is the (contemporaneous) disturbance covariance matrix, ⊗ is the Kronecker product, I_T is an identity matrix of dimension T, and T is the number of observations in each equation.
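As a small numerical illustration (hypothetical numbers, base R), the covariance matrix Ω of all disturbances can be built directly with the Kronecker product:

```r
# Illustrative sketch: build Omega = Sigma %x% I_T for a hypothetical
# 2-equation system with T = 4 observations per equation.
G <- 2                      # number of equations
T <- 4                      # observations per equation
Sigma <- matrix(c(2.0, 0.5,
                  0.5, 1.0), nrow = G)   # contemporaneous covariance matrix
Omega <- kronecker(Sigma, diag(T))       # (G*T) x (G*T) covariance of all u
dim(Omega)                               # 8 x 8
```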

Estimation with only exogenous regressors
If all regressors are exogenous, the system of equations (Equation 1) can be consistently estimated by ordinary least squares (OLS), weighted least squares (WLS), and seemingly unrelated regression (SUR). These estimators can be obtained by

  β̂ = (X⊤ Ω^{-1} X)^{-1} X⊤ Ω^{-1} y.    (7)

The covariance matrix of these estimators can be estimated by

  COV[β̂] = (X⊤ Ω^{-1} X)^{-1}.    (8)

Ordinary least squares (OLS)
The ordinary least squares (OLS) estimator is based on the assumption that the disturbance terms are not contemporaneously correlated (σ_ij = 0 ∀ i ≠ j) and have the same variance in each equation (σ²_i = σ²_j ∀ i, j). In this case, Ω in Equation 7 is equal to I_{G·T} and thus cancels out. The OLS estimator is efficient, as long as the disturbances are not contemporaneously correlated.
If the whole system is treated as one single equation, Ω in Equation 8 is σ̂² I_{G·T}, where σ̂² is an estimator for the variance of all disturbances (σ² = E[u²_it]). If the disturbance terms of the individual equations are allowed to have different variances, Ω in Equation 8 is Σ̂ ⊗ I_T, where σ̂_ij = 0 ∀ i ≠ j and σ̂_ii = σ̂²_i is the estimated variance of the disturbance term in the ith equation.
If the estimated coefficients are not constrained by cross-equation restrictions, the simultaneous OLS estimation of the system leads to the same estimated coefficients as an equation-wise OLS estimation. The covariance matrix of the coefficients from an equation-wise OLS estimation is equal to the covariance matrix obtained by Equation 8 with Ω equal to Σ̂ ⊗ I_T.
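This equivalence is easy to verify numerically. The following base-R sketch (simulated data, hypothetical coefficients) stacks two equations into a block-diagonal X and compares system OLS with equation-wise OLS:

```r
# Without cross-equation restrictions, OLS on the stacked system reproduces
# equation-wise OLS coefficients.
set.seed(1)
T <- 50
x1 <- cbind(1, rnorm(T)); y1 <- x1 %*% c(1, 2) + rnorm(T)
x2 <- cbind(1, rnorm(T)); y2 <- x2 %*% c(3, -1) + rnorm(T)
# Stacked (block-diagonal) regressor matrix X and stacked y
X <- rbind(cbind(x1, matrix(0, T, 2)), cbind(matrix(0, T, 2), x2))
y <- c(y1, y2)
bSys <- solve(crossprod(X), crossprod(X, y))          # system OLS
bEq  <- c(solve(crossprod(x1), crossprod(x1, y1)),
          solve(crossprod(x2), crossprod(x2, y2)))    # equation-wise OLS
all.equal(as.numeric(bSys), as.numeric(bEq))          # TRUE
```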

Weighted least squares (WLS)
The weighted least squares (WLS) estimator allows for different variances of the disturbance terms in the different equations (σ²_i ≠ σ²_j ∀ i ≠ j), but assumes that the disturbance terms are not contemporaneously correlated. In this case, Ω in Equations 7 and 8 is Σ̂ ⊗ I_T, where σ̂_ij = 0 ∀ i ≠ j and σ̂_ii = σ̂²_i is the estimated variance of the disturbance terms in the ith equation. Theoretically, σ̂_ii should be the variance of the (true) disturbances (σ_ii). However, these are not known in most empirical applications. Therefore, the true variances are generally replaced by estimated variances (σ̂_ii) that are calculated from the residuals of a first-step OLS estimation (see Section 2.4). The WLS estimator is (asymptotically) efficient only if the disturbance terms are not contemporaneously correlated. If the estimated coefficients are not constrained by cross-equation restrictions, they are equal to OLS estimates.

Seemingly unrelated regression (SUR)
If the disturbances are contemporaneously correlated, a generalized least squares (GLS) estimation leads to an efficient estimator for the coefficients. In this case, the GLS estimator is generally called the "seemingly unrelated regression" (SUR) estimator (Zellner 1962). However, the true covariance matrix of the disturbance terms is generally unknown. The textbook solution for this problem is a feasible generalized least squares (FGLS) estimation. As the FGLS estimator is based on an estimated covariance matrix of the disturbance terms, it is only asymptotically efficient. In case of a SUR estimator, Ω in Equations 7 and 8 is Σ̂ ⊗ I_T, where Σ̂ is the estimated covariance matrix of the disturbance terms.
It should be noted that while an unbiased OLS or WLS estimation requires only that the regressors and the disturbance terms of each single equation are uncorrelated (E[u_i⊤ X_i] = 0 ∀ i), a consistent SUR estimation requires that all disturbance terms and all regressors are uncorrelated (E[u_i⊤ X_j] = 0 ∀ i, j).
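The two-step FGLS logic behind SUR can be sketched in base R as follows (simulated data; the variable names are illustrative, not systemfit internals):

```r
# Two-step FGLS/SUR estimator on simulated data with contemporaneously
# correlated disturbances.
set.seed(42)
T <- 100
u <- matrix(rnorm(2 * T), ncol = 2) %*% chol(matrix(c(1, .8, .8, 1), 2))
x1 <- cbind(1, rnorm(T)); y1 <- x1 %*% c(1, 2) + u[, 1]
x2 <- cbind(1, runif(T)); y2 <- x2 %*% c(3, -1) + u[, 2]
X <- rbind(cbind(x1, matrix(0, T, 2)), cbind(matrix(0, T, 2), x2))
y <- c(y1, y2)
# Step 1: OLS residuals give the estimated disturbance covariance matrix
bOLS <- solve(crossprod(X), crossprod(X, y))
res  <- matrix(y - X %*% bOLS, ncol = 2)
SigmaHat <- crossprod(res) / T
# Step 2: GLS with Omega^{-1} = solve(SigmaHat) %x% I_T
OmegaInv <- kronecker(solve(SigmaHat), diag(T))
bSUR <- solve(t(X) %*% OmegaInv %*% X, t(X) %*% OmegaInv %*% y)
```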

Estimation with endogenous regressors
If the regressors of one or more equations are correlated with the disturbances (E[u_i⊤ X_i] ≠ 0), OLS, WLS, and SUR estimates are biased. This can be circumvented by a two-stage least squares (2SLS), weighted two-stage least squares (W2SLS), or three-stage least squares (3SLS) estimation with instrumental variables (IV). The instrumental variables for each equation (Z_i) can be either different or identical for all equations. They must not be correlated with the disturbance terms of the corresponding equation (E[u_i⊤ Z_i] = 0). At the first stage, new ("fitted") regressors are obtained by

  X̂_i = Z_i (Z_i⊤ Z_i)^{-1} Z_i⊤ X_i.    (9)

Then, these "fitted" regressors are substituted for the original regressors in Equation 7 to obtain unbiased 2SLS, W2SLS, or 3SLS estimates of β by

  β̂ = (X̂⊤ Ω^{-1} X̂)^{-1} X̂⊤ Ω^{-1} y,    (10)

where

  X̂ = diag(X̂_1, X̂_2, . . ., X̂_G)    (11)

is the block-diagonal matrix of the fitted regressors of all equations. An estimator of the covariance matrix of the estimated coefficients can be obtained from Equation 8 analogously. Hence, we get

  COV[β̂] = (X̂⊤ Ω^{-1} X̂)^{-1}.    (12)
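A minimal base-R sketch of the first stage (hypothetical data; the instruments and regressors are chosen purely for illustration):

```r
# First stage of 2SLS: replace the regressor matrix X_i by its projection on
# the instruments, Xhat_i = Z_i (Z_i' Z_i)^{-1} Z_i' X_i.
set.seed(7)
T <- 60
z <- cbind(1, rnorm(T), rnorm(T))       # instruments Z_i
x <- cbind(1, z[, 2] + 0.5 * rnorm(T))  # regressors X_i
xHat <- z %*% solve(crossprod(z), crossprod(z, x))   # fitted regressors
# Second stage: regress y on the fitted regressors
y <- x %*% c(1, 2) + rnorm(T)
b2SLS <- solve(crossprod(xHat), crossprod(xHat, y))
```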

Two-stage least squares (2SLS)
The two-stage least squares (2SLS) estimator is based on the same assumptions about the disturbance terms as the OLS estimator. Accordingly, Ω in Equation 10 is equal to I_{G·T} and thus cancels out. Like for the OLS estimator, the whole system can be treated either as one single equation with Ω in Equation 12 equal to σ̂² I_{G·T}, or the disturbance terms of the individual equations are allowed to have different variances with Ω in Equation 12 equal to Σ̂ ⊗ I_T, where σ̂_ij = 0 ∀ i ≠ j and σ̂_ii = σ̂²_i.

Weighted two-stage least squares (W2SLS)
The weighted two-stage least squares (W2SLS) estimator allows for different variances of the disturbance terms in the different equations. Hence, Ω in Equations 10 and 12 is Σ̂ ⊗ I_T, where σ̂_ij = 0 ∀ i ≠ j and σ̂_ii = σ̂²_i. If the estimated coefficients are not constrained by cross-equation restrictions, they are equal to 2SLS estimates.

Three-stage least squares (3SLS)
If the disturbances are contemporaneously correlated, a feasible generalized least squares (FGLS) version of the two-stage least squares estimation leads to consistent and asymptotically more efficient estimates. This estimation procedure is generally called "three-stage least squares" (3SLS, Zellner and Theil 1962). The standard 3SLS estimator and its covariance matrix are obtained by Equations 10 and 12 with Ω equal to Σ̂ ⊗ I_T, where Σ̂ is the estimated covariance matrix of the disturbance terms.
While an unbiased 2SLS or W2SLS estimation requires only that the instrumental variables and the disturbance terms of each single equation are uncorrelated (E[u_i⊤ Z_i] = 0 ∀ i), Schmidt (1990) points out that the 3SLS estimator is only consistent if all disturbance terms and all instrumental variables are uncorrelated (E[u_i⊤ Z_j] = 0 ∀ i, j). Since there might be occasions where this cannot be avoided, Schmidt (1990) analyses other approaches to obtain 3SLS estimators.
One of these approaches, based on instrumental variable estimation (3SLS-IV), is

  β̂_{3SLS-IV} = (X̂⊤ Ω^{-1} X)^{-1} X̂⊤ Ω^{-1} y.    (13)

An estimator of the covariance matrix of the estimated 3SLS-IV coefficients is

  COV[β̂_{3SLS-IV}] = (X̂⊤ Ω^{-1} X)^{-1}.    (14)

Another approach, based on the generalized method of moments (GMM) estimator (3SLS-GMM), is

  β̂_{3SLS-GMM} = (X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ X)^{-1} X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ y,    (15)

where

  Z = diag(Z_1, Z_2, . . ., Z_G)    (16)

is the block-diagonal matrix of the instruments of all equations. An estimator of the covariance matrix of the estimated 3SLS-GMM coefficients is

  COV[β̂_{3SLS-GMM}] = (X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ X)^{-1}.    (17)

A fourth approach, developed by Schmidt (1990) himself, is

  β̂_{3SLS-Schmidt} = (X̂⊤ Ω^{-1} X̂)^{-1} X̂⊤ Ω^{-1} Z (Z⊤ Z)^{-1} Z⊤ y.    (18)

An estimator of the covariance matrix of these estimated coefficients is

  COV[β̂_{3SLS-Schmidt}] = (X̂⊤ Ω^{-1} X̂)^{-1} X̂⊤ Ω^{-1} Z (Z⊤ Z)^{-1} Z⊤ Ω Z (Z⊤ Z)^{-1} Z⊤ Ω^{-1} X̂ (X̂⊤ Ω^{-1} X̂)^{-1}.    (19)

The econometrics software EViews uses

  β̂_{3SLS-EViews} = β̂_{2SLS} + (X̂⊤ Ω^{-1} X̂)^{-1} X̂⊤ Ω^{-1} (y − X β̂_{2SLS}),    (20)

where β̂_{2SLS} is the two-stage least squares estimator as defined above. EViews uses the standard 3SLS formula (Equation 12) to calculate an estimator of the covariance matrix of the estimated coefficients.
If the same instrumental variables are used in all equations (Z_1 = Z_2 = . . . = Z_G), all the above-mentioned approaches lead to identical estimates. However, if this is not the case, the results depend on the method used (Schmidt 1990). The only reason to use different instruments for different equations is a correlation of the instruments of one equation with the disturbance terms of another equation. Otherwise, one could simply use all instruments in every equation (Schmidt 1990). In this case, only the 3SLS-GMM (Equation 15) and the 3SLS estimator developed by Schmidt (1990) (Equation 18) are consistent.

Estimation under linear restrictions on the coefficients
In many empirical applications, it is desirable to estimate the coefficients under linear restrictions. For instance, in econometric demand and production analysis, it is common to estimate the coefficients under homogeneity and symmetry restrictions that are derived from the underlying theoretical model.
There are two different methods to estimate the coefficients under linear restrictions. First, a matrix M can be specified so that

  β = M β^M,    (21)

where β^M is a vector of restricted (linearly independent) coefficients, and M is a matrix with the number of rows equal to the number of unrestricted coefficients (β) and the number of columns equal to the number of restricted coefficients (β^M). M can be used to map each unrestricted coefficient to one or more restricted coefficients.
The second method to estimate the coefficients under linear restrictions constrains the coefficients by

  R β^R = q,    (22)

where β^R is the vector of the restricted coefficients, and R and q are a matrix and a vector, respectively, that specify the restrictions (see Greene 2003, p. 100). Each linearly independent restriction is represented by one row of R and the corresponding element of q.
The first method is less flexible than the second, but is preferable if the coefficients are estimated under many equality constraints across different equations of the system. Of course, these restrictions can also be specified using the latter method. However, while the latter method increases the dimension of the matrices to be inverted during estimation, the first reduces it. Thus, in some cases the latter method leads to estimation problems (e.g., (near) singularity of the matrices to be inverted), while the first does not.
These two methods can be combined. In this case, the restrictions specified using the latter method are imposed on the linearly independent coefficients that are restricted by the first method, so that

  R β^{MR} = q,    (23)

where β^{MR} is the vector of the restricted β^M coefficients.

Calculation of restricted estimators
If the first method (Equation 21) is chosen to estimate the coefficients under these restrictions, the matrix of regressors X is post-multiplied by the matrix M, so that

  X^M = X M.    (24)

Then, X^M is substituted for X and a standard estimation as described in the previous section is performed. This results in the linearly independent coefficient estimates β̂^M and their covariance matrix. The original coefficients can be obtained by Equation 21, and the estimated covariance matrix of the original coefficients can be obtained by

  COV[β̂] = M COV[β̂^M] M⊤.    (25)

The implementation of the second method to estimate the coefficients under linear restrictions (Equation 22) is described for each estimation method in the following sections.
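A small base-R illustration of this method (hypothetical restriction that two slope coefficients are equal):

```r
# Imposing the equality restriction beta_2 = beta_3 via the mapping
# beta = M %*% betaM by post-multiplying X with M.
set.seed(3)
T <- 40
X <- cbind(1, rnorm(T), rnorm(T))          # three unrestricted coefficients
y <- X %*% c(2, 1, 1) + rnorm(T)           # true slopes are equal
M <- rbind(c(1, 0),                        # intercept maps to betaM[1]
           c(0, 1),                        # both slopes map to betaM[2]
           c(0, 1))
XM <- X %*% M                              # restricted regressor matrix
bM <- solve(crossprod(XM), crossprod(XM, y))  # linearly independent coefficients
b  <- M %*% bM                                # original (restricted) coefficients
```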

Restricted OLS, WLS, and SUR estimation
The OLS, WLS, and SUR estimators restricted by R β^R = q can be obtained by

  [β̂^R; λ̂] = [X⊤ Ω^{-1} X   R⊤; R   0]^{-1} [X⊤ Ω^{-1} y; q],    (26)

where λ̂ is a vector of the Lagrangean multipliers of the restrictions, Ω is defined as in Section 2.1, and [A  B; C  D] denotes a partitioned matrix. An estimator of the covariance matrix of the estimated coefficients is the upper left part of

  COV[β̂^R, λ̂] = [X⊤ Ω^{-1} X   R⊤; R   0]^{-1}.    (27)
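The saddle-point system above can be sketched in base R; here with Ω = I (restricted OLS) and one illustrative restriction:

```r
# Restricted OLS via the Lagrangean saddle-point system, imposing
# R beta = q with R = (0, 1, -1), q = 0 (equal slopes).
set.seed(5)
T <- 40
X <- cbind(1, rnorm(T), rnorm(T))
y <- X %*% c(2, 1, 1) + rnorm(T)
R <- matrix(c(0, 1, -1), nrow = 1); q <- 0
K <- ncol(X); j <- nrow(R)
lhs <- rbind(cbind(crossprod(X), t(R)),
             cbind(R, matrix(0, j, j)))
rhs <- c(crossprod(X, y), q)
sol <- solve(lhs, rhs)
bR     <- sol[1:K]        # restricted coefficients (satisfy R bR = q)
lambda <- sol[K + 1]      # Lagrangean multiplier
```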

Restricted 2SLS, W2SLS, and 3SLS estimation
The 2SLS, W2SLS, and standard 3SLS estimators restricted by R β^R = q can be obtained by

  [β̂^R; λ̂] = [X̂⊤ Ω^{-1} X̂   R⊤; R   0]^{-1} [X̂⊤ Ω^{-1} y; q],    (28)

where Ω is defined as in Section 2.2. An estimator of the covariance matrix of the estimated coefficients is the upper left part of

  COV[β̂^R, λ̂] = [X̂⊤ Ω^{-1} X̂   R⊤; R   0]^{-1}.    (29)

The 3SLS-IV estimator restricted by R β^R = q can be obtained by

  [β̂^R; λ̂] = [X̂⊤ Ω^{-1} X   R⊤; R   0]^{-1} [X̂⊤ Ω^{-1} y; q],    (30)

where an estimator of the covariance matrix of the estimated coefficients is the upper left part of

  COV[β̂^R, λ̂] = [X̂⊤ Ω^{-1} X   R⊤; R   0]^{-1}.    (31)

The restricted 3SLS-GMM estimator can be obtained by

  [β̂^R; λ̂] = [X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ X   R⊤; R   0]^{-1} [X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ y; q],    (32)

where an estimator of the covariance matrix of the estimated coefficients is the upper left part of

  COV[β̂^R, λ̂] = [X⊤ Z (Z⊤ Ω Z)^{-1} Z⊤ X   R⊤; R   0]^{-1}.    (33)

The restricted 3SLS estimator based on the suggestion of Schmidt (1990) is

  [β̂^R; λ̂] = [X̂⊤ Ω^{-1} X̂   R⊤; R   0]^{-1} [X̂⊤ Ω^{-1} Z (Z⊤ Z)^{-1} Z⊤ y; q],    (34)

where an estimator of the covariance matrix of the estimated coefficients is the upper left part of

  COV[β̂^R, λ̂] = [X̂⊤ Ω^{-1} X̂   R⊤; R   0]^{-1}.    (35)

The econometrics software EViews calculates the restricted 3SLS estimator by

  β̂^R_{3SLS-EViews} = β̂^R_{2SLS} + (X̂⊤ Ω^{-1} X̂)^{-1} X̂⊤ Ω^{-1} (y − X β̂^R_{2SLS}),    (36)

where β̂^R_{2SLS} is the restricted 2SLS estimator calculated by Equation 28. EViews uses the standard formula of the restricted 3SLS estimator (Equation 29) to calculate an estimator for the covariance matrix of the estimated coefficients.
If the same instrumental variables are used in all equations (Z_1 = Z_2 = . . . = Z_G), all the above-mentioned approaches lead to identical coefficient estimates and identical covariance matrices of the estimated coefficients.

Residual covariance matrix
Since the (true) disturbances (u_i) of the estimated equations are generally not known, their covariance matrix cannot be determined. Therefore, this covariance matrix is generally calculated from estimated residuals (û_i) that are obtained from a first-step OLS or 2SLS estimation. Then, in a second step, the estimated residual covariance matrix can be employed for a WLS, SUR, W2SLS, or 3SLS estimation. In many cases, the residual covariance matrix is calculated by

  σ̂_ij = (û_i⊤ û_j) / T,    (37)

where T is the number of observations in each equation. However, in finite samples this estimator is biased, because it is not corrected for degrees of freedom. The usual single-equation procedure to correct for degrees of freedom cannot always be applied, because the number of regressors in each equation might differ. Two alternative approaches to calculate the residual covariance matrix are

  σ̂_ij = (û_i⊤ û_j) / √((T − K_i)(T − K_j))    (38)

and

  σ̂_ij = (û_i⊤ û_j) / (T − max(K_i, K_j)),    (39)

where K_i and K_j are the number of regressors in equations i and j, respectively. However, these formulas yield unbiased estimators only if K_i = K_j (Judge et al. 1985, p. 469).
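A short base-R sketch of two of these estimators applied to hypothetical first-step residuals (the second divisor is the geometric-mean degrees-of-freedom correction):

```r
# Residual covariance estimators from hypothetical first-step residuals uHat
# (T observations, G = 2 equations with K_i regressors each).
set.seed(9)
T <- 30
uHat <- matrix(rnorm(2 * T), ncol = 2)
K <- c(3, 2)                          # number of regressors per equation
sigNoDf    <- crossprod(uHat) / T     # no degrees-of-freedom correction
sigGeoMean <- crossprod(uHat) /
  sqrt(outer(T - K, T - K))           # divide by sqrt((T-K_i)(T-K_j))
```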
The WLS, SUR, W2SLS and 3SLS coefficient estimates are consistent if the residual covariance matrix is calculated using the residuals from a first-step OLS or 2SLS estimation. There exists also an alternative, slightly different approach that consists of three steps. In a first step, an OLS or 2SLS estimation is applied to obtain residuals to calculate a (first-step) residual covariance matrix. In a second step, the first-step residual covariance matrix is used to estimate the model by WLS or W2SLS and new residuals are obtained to calculate a (second-step) residual covariance matrix. Finally, in the third step, the second-step residual covariance matrix is used to estimate the model by SUR or 3SLS. If the estimated coefficients are not constrained by cross-equation restrictions, OLS and WLS estimates as well as 2SLS and W2SLS estimates are identical. Hence, in this case both approaches generate the same results.
It is also possible to iterate WLS, SUR, W2SLS and 3SLS estimations. At each iteration, the residual covariance matrix is calculated from the residuals of the previous iteration. If Equation 37 is applied to calculate the residual covariance matrix, an iterated SUR estimation converges to maximum likelihood (Greene 2003, p. 345).
In some uncommon cases, for instance in pooled estimations, where the coefficients are restricted to be equal in all equations, the means of the residuals of each equation are not equal to zero. Therefore, it might be argued that the residual covariance matrix should be calculated by subtracting the equation means from the residuals before applying Equations 37-40.
If the coefficients are estimated under any restrictions, the residual covariance matrix for a WLS, SUR, W2SLS, or 3SLS estimation can be obtained either from a restricted or from an unrestricted first-step estimation.

Degrees of freedom
To our knowledge, the question of how to determine the degrees of freedom for single-coefficient t tests is not comprehensively discussed in the literature. While sometimes the degrees of freedom of the entire system (total number of observations in all equations minus total number of estimated coefficients) are applied, in other cases the degrees of freedom of each single equation (number of observations in the equation minus number of estimated coefficients in the equation) are used. Asymptotically, this distinction does not make a difference. However, in many empirical applications, the number of observations of each equation is rather small, and therefore, it matters.
If a system of equations is estimated by an unrestricted OLS and the covariance matrix of the coefficients is calculated with Ω in Equation 8 equal to Σ̂ ⊗ I_T, the estimated coefficients and their standard errors are identical to an equation-wise OLS estimation. In this case, it is reasonable to use the degrees of freedom of each single equation, because this yields the same P values as the equation-wise OLS estimation.
In contrast, if a system of equations is estimated with many cross-equation restrictions and the covariance matrix of an OLS estimation is calculated with Ω in Equation 8 equal to σ̂² I_{G·T}, the system estimation is similar to a single-equation estimation. Therefore, in this case, it seems reasonable to use the degrees of freedom of the entire system.

Goodness of fit
The goodness of fit of each single equation can be measured by the traditional R² values

  R²_i = 1 − (û_i⊤ û_i) / ((y_i − ȳ_i ι)⊤ (y_i − ȳ_i ι)),    (41)

where R²_i is the R² value of the ith equation and ȳ_i is the mean value of y_i. The goodness of fit of the whole system can be measured by McElroy's R² value

  R*² = 1 − (û⊤ Ω^{-1} û) / (y⊤ (Σ̂^{-1} ⊗ (I_T − ι ι⊤ / T)) y),    (42)

where ι is a column vector of T ones (McElroy 1977).

Testing linear restrictions
Linear restrictions can be tested by an F test, two Wald tests and a likelihood ratio (LR) test.
The F statistic for systems of equations is

  F = [ (R β̂ − q)⊤ (R (X⊤ (Σ ⊗ I_T)^{-1} X)^{-1} R⊤)^{-1} (R β̂ − q) / j ] / [ û⊤ (Σ ⊗ I_T)^{-1} û / (G·T − K) ],    (44)

where j is the number of restrictions, K is the total number of estimated coefficients, and all other variables are as defined before (Theil 1971, p. 314). Under the null hypothesis, F is F distributed with j and G·T − K degrees of freedom.
However, F in Equation 44 cannot be computed, because Σ is generally unknown. As a solution, Theil (1971, p. 314) proposes to replace the unknown Σ in Equation 44 by the estimated covariance matrix Σ̂, which yields

  F̂ = [ (R β̂ − q)⊤ (R (X⊤ (Σ̂ ⊗ I_T)^{-1} X)^{-1} R⊤)^{-1} (R β̂ − q) / j ] / [ û⊤ (Σ̂ ⊗ I_T)^{-1} û / (G·T − K) ].    (45)
Asymptotically, F̂ has the same distribution as F in Equation 44, because the numerator of Equation 45 converges in probability to the numerator of Equation 44 and the denominator of Equation 45 converges in probability to the denominator of Equation 44 (Theil 1971, p. 402).

Furthermore, the denominators of both Equations 44 and 45 converge in probability to 1.

Taking this into account and applying Equation 8, we obtain the usual F statistic of the Wald test

  F̃ = (R β̂ − q)⊤ (R COV[β̂] R⊤)^{-1} (R β̂ − q) / j.    (46)

Under the null hypothesis, F̃ is also asymptotically F distributed with j and G·T − K degrees of freedom.

Multiplying Equation 46 by j, we obtain the usual χ² statistic for the Wald test

  W = (R β̂ − q)⊤ (R COV[β̂] R⊤)^{-1} (R β̂ − q).    (47)

Asymptotically, W has a χ² distribution with j degrees of freedom under the null hypothesis (Greene 2003, p. 347).
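The Wald statistic W can be computed directly from any estimated coefficient vector and covariance matrix; a base-R sketch with a single OLS equation and one illustrative restriction:

```r
# Chi-squared Wald statistic for testing R beta = q, illustrated with an
# OLS fit of one equation on simulated data.
set.seed(11)
T <- 80
X <- cbind(1, rnorm(T), rnorm(T))
y <- X %*% c(1, 2, 2) + rnorm(T)
b  <- solve(crossprod(X), crossprod(X, y))
s2 <- sum((y - X %*% b)^2) / (T - ncol(X))
V  <- s2 * solve(crossprod(X))             # estimated covariance of b
R <- matrix(c(0, 1, -1), nrow = 1); q <- 0 # restriction: equal slopes
d <- R %*% b - q
W <- as.numeric(t(d) %*% solve(R %*% V %*% t(R)) %*% d)
pValue <- pchisq(W, df = nrow(R), lower.tail = FALSE)
```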
The likelihood-ratio (LR) statistic for systems of equations is

  LR = T · ( log |Σ̂_r| − log |Σ̂_u| ),    (48)

where T is the number of observations per equation, and Σ̂_r and Σ̂_u are the residual covariance matrices calculated by Equation 37 of the restricted and unrestricted estimation, respectively.
Asymptotically, LR has a χ 2 distribution with j degrees of freedom under the null hypothesis (Greene 2003, p. 349).

Hausman test
Hausman (1978) developed a test for misspecification. The null hypothesis of the test is that the instrumental variables of each equation are uncorrelated with the disturbance terms of all other equations (E[u_i⊤ Z_j] = 0 ∀ i ≠ j). Under this null hypothesis, both the 2SLS and the 3SLS estimator are consistent, but the 3SLS estimator is (asymptotically) more efficient. Under the alternative hypothesis, the 2SLS estimator is consistent but the 3SLS estimator is inconsistent, i.e., the instrumental variables of each equation are uncorrelated with the disturbances of the same equation (E[u_i⊤ Z_i] = 0 ∀ i), but the instrumental variables of at least one equation are correlated with the disturbances of another equation (E[u_i⊤ Z_j] ≠ 0 for some i ≠ j). The Hausman test statistic is

  m = (β̂_2SLS − β̂_3SLS)⊤ (COV[β̂_2SLS] − COV[β̂_3SLS])^{-1} (β̂_2SLS − β̂_3SLS),    (49)

where β̂_2SLS and COV[β̂_2SLS] are the estimated coefficients and covariance matrix from a 2SLS estimation, and β̂_3SLS and COV[β̂_3SLS] are the estimated coefficients and covariance matrix from a 3SLS estimation. Under the null hypothesis, this test statistic has a χ² distribution with degrees of freedom equal to the number of estimated coefficients.
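Given the two sets of estimates, the Hausman statistic is a simple quadratic form; a sketch with hypothetical coefficient vectors and covariance matrices (the numbers are purely illustrative):

```r
# Hausman statistic from hypothetical 2SLS and 3SLS results: coefficient
# vectors b2 and b3 with covariance matrices V2 and V3.
b2 <- c(1.10, 2.05); V2 <- diag(c(0.040, 0.030))
b3 <- c(1.00, 2.00); V3 <- diag(c(0.025, 0.020))
d  <- b2 - b3
m  <- as.numeric(t(d) %*% solve(V2 - V3) %*% d)
pValue <- pchisq(m, df = length(b2), lower.tail = FALSE)
```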

Source code
The source code of the systemfit package is publicly available for download from the Comprehensive R Archive Network (CRAN, http://CRAN.R-project.org/). The basic functionality of this package is provided by the function systemfit. Moreover, this package provides tools for statistical tests, functions (methods) to show, extract or calculate results, some convenience functions, and internal helper functions.

The basic function systemfit
The systemfit function estimates systems of linear equations by different estimation methods. Where possible, the user interface and the returned object of this function follow those of the function lm, the basic tool for linear regressions in R, to make the usage of systemfit as easy as possible for R users.
The econometric estimation is done by applying the formulas in Sections 2.1 and 2.2 or, if the coefficients are estimated under linear restrictions, by the formulas in Section 2.3. If the restrictions on the coefficients are specified symbolically, the function makeHypothesis of the car package (Fox 2006, 2002) is used to create the restriction matrix.
The systemfit function returns a list of class systemfit that contains the results that belong to the entire system of equations. One special element of this list is called eq, which is a list that contains one object for each estimated equation. These objects are lists of class systemfit.equation and contain the results that belong only to the respective equation. A complete description is available in the documentation of this function that is included in the package. A comparison of the elements returned by lm and by systemfit is available in appendix A.

Statistical tests
The linearHypothesis and lrtest methods for systemfit objects as well as the function hausman.systemfit apply the statistical tests described in Sections 2.7 and 2.8.
The linearHypothesis method for systemfit objects can be used to test linear restrictions on the estimated coefficients by Theil's F test or by usual Wald tests. Internally, Theil's F statistic is computed by the hidden function .ftest.systemfit and the Wald tests are computed by the default linearHypothesis method of the car package (Fox 2006, 2002). The lrtest method for systemfit objects is a wrapper function to the default lrtest method of the lmtest package (Zeileis and Hothorn 2002), which computes the likelihood-ratio (LR) test statistic. All these functions return an object of class anova that contains, amongst others, the empirical test statistic, the degrees of freedom, and the corresponding P value.
The function hausman.systemfit tests the consistency of the 3SLS estimator. It returns an object of class htest, which contains, amongst others, the empirical test statistic, the degrees of freedom, and the P value.

Other methods and functions
The systemfit package provides several methods for objects of both classes systemfit and systemfit.equation: print methods print the estimation results, summary methods calculate summary results, confint methods compute confidence intervals for the coefficients, predict methods calculate predicted values, coef methods extract the estimated coefficients, vcov methods extract their covariance matrix, fitted methods extract the fitted values, residuals methods extract the residuals, formula methods extract the formula(s), terms methods extract the model terms, model.frame methods extract the model frame, and model.matrix methods extract the model matrix. Some methods can be applied to objects of class systemfit only: a correlation method calculates the correlations between the predictions of two equations, an se.ratio method computes the ratios of the standard errors of the predictions between two models, and a logLik method extracts the log-likelihood value. The package provides print methods for objects of classes summary.systemfit, summary.systemfit.equation, and confint.systemfit that are returned by the above-mentioned summary and confint methods. There are also two coef methods to extract the estimated coefficients, their standard errors, t values, and P values from objects of classes summary.systemfit and summary.systemfit.equation. The convenience function createSystemfitModel creates a model for systemfit with random numbers; systemfit.control sets suitable default values for the technical control parameters of systemfit.
Finally, the package includes some internal (hidden) helper functions: .prepareData.systemfit, .stackMatList, and .prepareWmatrix for preparing the data matrices; .calcXtOmegaInv and .calcGLS for calculating the GLS estimator; .calcResidCov and .calcSigma2 for calculating the (co)variances of the residuals; and .ftest.systemfit for calculating Theil's F statistic. If systemfit is applied to a (classical) "seemingly unrelated regression" analysis with panel data, it calls the hidden internal function .systemfitPanel, which reshapes the data, creates the formulas to be estimated, and, if requested, specifies restrictions to ensure that the coefficients of all individuals are equal.

Efficiency of computations
We have followed Bates (2004) to make the code of systemfit faster and more stable. First, if a formula contains an inverse of a matrix that is post-multiplied by a vector or matrix, we use solve(A, b) instead of solve(A) %*% b. Second, we calculate cross products by crossprod(X) or crossprod(X, y) instead of t(X) %*% X or t(X) %*% y, respectively.
The matrix Ω^{-1} that is used to compute the estimated coefficients and their covariance matrix is of dimension (G·T) × (G·T) (see Sections 2.1, 2.2, and 2.3). In the case of large data sets, computing with the full Ω^{-1} becomes infeasible. Therefore, we use the following transformation and compute X⊤ Ω^{-1} by dividing the X matrix into submatrices, doing some calculations with these submatrices, adding up some of these submatrices, and finally putting the submatrices together, so that

  X⊤ Ω^{-1} = [ Σ_i σ^{i1} X_i⊤   Σ_i σ^{i2} X_i⊤   . . .   Σ_i σ^{iG} X_i⊤ ],    (50)

where the σ^{ij} are the elements of the matrix Σ^{-1}, and X_i is a submatrix of X that contains the rows that belong to the ith equation. This computation is done inside the internal (hidden) function .calcXtOmegaInv.
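The following base-R sketch (simulated two-equation system) verifies that this blockwise computation agrees with forming the full Kronecker product:

```r
# Blockwise computation of t(X) %*% OmegaInv versus the direct computation
# via the full (G*T) x (G*T) Kronecker product, for G = 2.
set.seed(13)
T <- 20
x1 <- cbind(1, rnorm(T)); x2 <- cbind(1, rnorm(T))
X <- rbind(cbind(x1, matrix(0, T, 2)), cbind(matrix(0, T, 2), x2))
Sigma <- matrix(c(1, 0.4, 0.4, 2), 2)
SigInv <- solve(Sigma)                       # elements sigma^{ij}
# Direct computation: forms the full Omega^{-1}
direct <- t(X) %*% kronecker(SigInv, diag(T))
# Blockwise computation: never forms Omega^{-1}
X1 <- X[1:T, ]; X2 <- X[T + 1:T, ]           # rows belonging to each equation
blockwise <- cbind(SigInv[1, 1] * t(X1) + SigInv[2, 1] * t(X2),
                   SigInv[1, 2] * t(X1) + SigInv[2, 2] * t(X2))
all.equal(direct, blockwise)                 # TRUE
```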
Since version 1.0, the systemfit function by default uses the Matrix package (Bates and Maechler 2007) for all computations where matrices are involved. The Matrix package provides classes for different types of matrices. For instance, we choose class dgeMatrix ("real matrices in general storage mode") for the matrices X_i in Equation 2, class dgCMatrix ("general, numeric, sparse matrices in the (sorted) compressed sparse column format") for the matrix X in Equation 3, and class dspMatrix ("symmetric real matrices in packed storage (one triangle only)") for the residual covariance matrix Σ̂. If the Matrix package is used, the possibly huge matrix Ω^{-1} is no longer a problem, because it is a sparse matrix that can be stored in a compressed format (class dgCMatrix). Hence, we no longer need the algorithm in Equation 50.
We have tested different ways to calculate a GLS estimator as in Equation 7 and identified the fastest variant. In this variant, residCov is the residual covariance matrix Σ̂, nObs is the number of observations in each equation (T), xMat is the matrix X, and yVec is the vector y in Equation 7.
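The exact code is not reproduced here; the following is a hedged sketch, using the Matrix package and the variable names above, of how such a GLS computation can be arranged (an illustration, not necessarily the code used inside systemfit):

```r
library(Matrix)
# Sketch of a GLS computation (Equation 7) with the Matrix package:
# Omega^{-1} is kept as a sparse matrix, so it never needs dense storage.
set.seed(17)
nObs <- 25                                   # T, observations per equation
x1 <- cbind(1, rnorm(nObs)); x2 <- cbind(1, rnorm(nObs))
xMat <- as(rbind(cbind(x1, matrix(0, nObs, 2)),
                 cbind(matrix(0, nObs, 2), x2)), "CsparseMatrix")
yVec <- rnorm(2 * nObs)
residCov <- matrix(c(1, 0.3, 0.3, 1), 2)     # estimated Sigma
omegaInv <- kronecker(solve(residCov), Diagonal(nObs))  # sparse Omega^{-1}
xtOmegaInv <- t(xMat) %*% omegaInv
coef <- solve(xtOmegaInv %*% xMat, xtOmegaInv %*% yVec)
```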
By default, the systemfit function uses the Matrix package to perform GLS estimations, because using this package considerably decreases the computation time for many common models. However, the estimation of small models with small data sets gets slower when the Matrix package is used (see appendix B). While this increase in computation time is often imperceptible, it might matter in some cases such as iterated estimations or Monte Carlo studies. Therefore, the user can opt not to use the Matrix package, in which case Equation 50 is applied with standard R matrices.

Overlap with other functions and packages in R
Single-equation models can be fitted in R by OLS with function lm (package stats) and by 2SLS with function tsls (package sem, Fox 2011). This is also possible with the systemfit function, but systemfit specializes in estimating systems of equations, i.e., more than one equation. Its capability to estimate single-equation models is just a side effect.
Function sem (package sem, Fox 2011) can be used to estimate structural equation models in R by limited information maximum likelihood (LIML) and full information maximum likelihood (FIML), assuming normal or multinormal errors, respectively. A special feature of this function is the estimation of models with unobserved ("latent") variables, which is not possible with systemfit. While sem cannot be used to consistently estimate systems of simultaneous equations with some endogenous regressors, it can be used to estimate systems of equations where all regressors are exogenous. However, the latter is rather cumbersome (see Appendix C). Hence, systemfit is the only function in R that can be used to estimate systems of simultaneous equations, and it is the most convenient function to estimate systems of equations with purely exogenous regressors.

Using systemfit
In this section we demonstrate how to use the systemfit package. First, we show the standard usage of systemfit by a simple example. Second, several options that can be specified by the user are presented. Then, the usage of systemfit for a (classical) "seemingly unrelated regression" analysis with panel data is described. Finally, we demonstrate how to apply some statistical tests.

Standard usage of systemfit
As described in the previous section, systems of equations can be econometrically estimated with the function systemfit. The only mandatory argument is formula. Typically, it is a list of formulas to be estimated, but it may also be a single formula for estimating a single-equation model. Each formula is a standard regression formula in R (see documentation of formula).
The following demonstration uses an example taken from Kmenta (1986, p. 685). We want to estimate a small model of the US food market:

consump = β1 + β2 price + β3 income + ε1 (demand)
consump = β4 + β5 price + β6 farmPrice + β7 trend + ε2 (supply)

The first equation represents the demand side of the food market. Variable consump (food consumption per capita) is the dependent variable. The regressors are price (ratio of food prices to general consumer prices) and income (disposable income) as well as a constant.
The second equation specifies the supply side of the food market. Variable consump is the dependent variable of this equation as well. The regressors are again price (ratio of food prices to general consumer prices) and a constant as well as farmPrice (ratio of the preceding year's prices received by farmers to general consumer prices) and trend (a time trend in years).

The first line loads the systemfit package. The second line loads the example data that are included with the package. They are attached to the R search path in line three. In the fourth and fifth lines, the demand and supply equations are specified, respectively. In the sixth line, these equations are concatenated into a list and are labeled demand and supply, respectively. Finally, in the last two lines, the regression is performed and the estimation results are printed.
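The eight-line code chunk described above is missing in this version of the text. Based on the line-by-line description (and on the object names eqSystem and fitsur used in later examples), it presumably resembled:

```r
library( "systemfit" )        # line 1: load the systemfit package
data( "Kmenta" )              # line 2: load the example data
attach( Kmenta )              # line 3: attach the data to the search path
eqDemand <- consump ~ price + income              # line 4: demand equation
eqSupply <- consump ~ price + farmPrice + trend   # line 5: supply equation
eqSystem <- list( demand = eqDemand, supply = eqSupply )  # line 6: label equations
fitols <- systemfit( eqSystem )   # line 7: estimate the system
print( fitols )                   # line 8: print the results
```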

User options of systemfit
The user can modify the default estimation method by providing additional optional arguments, e.g., to specify instrumental variables or restrictions on the coefficients. All optional arguments are described in the following:

Estimation method
The optional argument method is a string that determines the estimation method. It must be either "OLS", "WLS", "SUR", "2SLS", "W2SLS", or "3SLS". These methods correspond to the estimation methods described in Sections 2.1, 2.2, and 2.3. The following command estimates the model described above as "seemingly unrelated regression".
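The command itself is missing from this version of the text. Given the equation system eqSystem defined earlier and the object name fitsur used in the subsequent examples, it was presumably:

```r
## SUR estimation of the equation system defined above;
## a reconstruction of the elided command
fitsur <- systemfit( eqSystem, method = "SUR" )
```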

Instrumental variables
The instruments for a 2SLS, W2SLS, or 3SLS estimation can be specified by the argument inst. If the same instruments are to be used for all equations, inst must be a one-sided formula. If different instruments are to be used for each equation, inst must be a list that contains one one-sided formula for each equation. The following example uses instrumental variables to estimate the model described above by "three-stage least squares" (3SLS). While the first command specifies the same instruments for all equations, the second uses different instruments:

fit3sls <- systemfit( eqSystem, method = "3SLS",
   inst = ~ income + farmPrice + trend )
fit3sls2 <- systemfit( eqSystem, method = "3SLS",
   inst = list( ~ farmPrice + trend, ~ income + farmPrice + trend ) )

Data
Having all data in the global environment or attached to the search path is often inconvenient. Therefore, systemfit has the argument data to specify a data frame that contains the variables of the model. In the following example, we use this argument to specify that the data for the estimation should be taken from the data frame Kmenta. Hence, we no longer need to attach this data frame before calling systemfit:

fitsur <- systemfit( eqSystem, method = "SUR", data = Kmenta )

Restrictions on the coefficients
As outlined in Section 2.3, restrictions on the coefficients can be specified in two ways. One way is to use the matrix R and the vector q (see Section 2.3). These restrictions can be specified symbolically by argument restrict.matrix, as in the generic function linearHypothesis of the car package (Fox 2006, 2002). This argument must be a vector of character strings, where each element represents one linear restriction and must be either a linear combination of coefficients or a linear equation in the coefficients (see documentation of function linearHypothesis in the car package). We illustrate this by estimating the model under the restriction β2 + β6 = 0. Since the name of β2 (coefficient of variable price in equation demand) is demand_price and the name of β6 (coefficient of variable farmPrice in equation supply) is supply_farmPrice, this restriction can be specified by

restrict <- "demand_price + supply_farmPrice = 0"
fitsurRmat <- systemfit( eqSystem, method = "SUR", restrict.matrix = restrict )

Alternatively, the restrictions via matrix R and vector q can be specified numerically. The matrix R can be specified with argument restrict.matrix and the vector q with argument restrict.rhs.
Rmat <- matrix( 0, nrow = 1, ncol = 7 )
Rmat[ 1, 2 ] <- 1
Rmat[ 1, 6 ] <- 1
qvec <- c( 0 )
fitsurRmatNum <- systemfit( eqSystem, method = "SUR",
   restrict.matrix = Rmat, restrict.rhs = qvec )

The first line creates a 1 × 7 matrix of zeros, where 1 is the number of restrictions and 7 is the number of unrestricted coefficients. The following two lines specify this matrix in such a way that its multiplication with the coefficient vector results in β2 + β6. The fourth line creates a vector with a single element that contains the right-hand side of the restriction, i.e., zero.
The other way to specify restrictions on the coefficients is to modify the regressor matrix by post-multiplying it with a matrix, say M (see Section 2.3). This kind of restriction can be specified by setting argument restrict.regMat equal to the matrix M. We convert the restriction specified above to β6 = −β2 and set βi = β^M_i for i = 1, …, 5, β6 = −β^M_2, and β7 = β^M_6. We can do this in R by

modRegMat <- matrix( 0, nrow = 7, ncol = 6 )
modRegMat[ 1:5, 1:5 ] <- diag( 5 )
modRegMat[ 6, 2 ] <- -1
modRegMat[ 7, 6 ] <- 1
fitsurRegMat <- systemfit( eqSystem, method = "SUR", restrict.regMat = modRegMat )

The first line creates a 7 × 6 matrix of zeros, where 7 is the number of unrestricted coefficients and 6 is the number of restricted coefficients. The following three lines specify the matrix M (modRegMat) as described before. Finally, the coefficients are estimated under the restriction β^M_2 = β2 = −β6. Of course, the estimation results do not depend on the method that was used to specify this restriction.

Iteration control
The estimation methods WLS, SUR, W2SLS, and 3SLS need a covariance matrix of the residuals that can be calculated from a first-step OLS or 2SLS estimation (see Section 2.4). This procedure can be iterated, and at each iteration the covariance matrix is calculated from the previous-step estimation. This iteration is repeated until the maximum number of iterations is reached or the coefficient estimates have converged. The maximum number of iterations is specified by argument maxiter. Its default value is one, which means no iteration. The convergence criterion is

√( Σi (βi,g − βi,g−1)² / Σi (βi,g−1)² ) < tol,

where βi,g is the ith coefficient of the gth iteration. The default value of the convergence criterion (argument tol) is 10⁻⁵.
In the following example, we estimate the model described above by iterated SUR:

fitsurit <- systemfit( eqSystem, method = "SUR", maxiter = 500 )

Moreover, if the coefficients are estimated under restrictions, the user can use argument residCovRestricted to specify whether the residual covariance matrix for a WLS, SUR, W2SLS, or 3SLS estimation should be obtained from a restricted or from an unrestricted first-step estimation (see Section 2.4). If this argument is TRUE (the default), the residual covariance matrix is obtained from a restricted OLS or 2SLS estimation. If it is FALSE, the residual covariance matrix is obtained from an unrestricted first-step estimation.

Residual covariance matrix
Finally, argument residCovWeighted can be used to decide whether the residual covariance matrix for a SUR (3SLS) estimation should be obtained from a WLS (W2SLS) estimation instead of from an OLS (2SLS) estimation (see Section 2.4). By default, residCovWeighted is FALSE, which means that the residuals of an OLS (2SLS) estimation are used to compute the residual covariance matrix.

3SLS formula
As discussed in Sections 2.2 and 2.3, there exist several different methods to perform a 3SLS estimation. The user can specify the method by argument method3sls. Possible values are presented in Table 2. The default value is "GLS".

System options
Furthermore, two options regarding some internal calculations are available. First, argument solvetol specifies the tolerance level for detecting linear dependencies when inverting a matrix or calculating a determinant (using functions solve and det). The default value depends on the computer system used and is equal to the default tolerance level of solve and det.
Second, argument useMatrix specifies whether the Matrix package (Bates and Maechler 2007) should be used for all computations where matrices are involved (see Section 3.4).

Returned data objects
Finally, the user can decide whether systemfit should return some data objects. Argument model indicates whether a data frame with the data of the model should be returned. Its default value is TRUE, i.e., the model frame is returned. Arguments x, y, and z specify whether the model matrices (X_i), the responses (y_i), and the matrices of instrumental variables (Z_i), respectively, should be returned. These three arguments are FALSE by default, i.e., these data objects are not returned.

Summary results with summary.systemfit
The summary method can be used to compute and print summary results of objects returned by systemfit. First, the estimation method is reported and a few summary statistics for the entire system and for each equation are given. Then, the covariance matrix used for estimation as well as the covariance matrix and the correlation matrix of the (final) residuals are printed. Finally, the estimation results of each equation are reported: the formula of the estimated equation, the estimated coefficients, their standard errors, t values, P values, and codes indicating their statistical significance, as well as some other statistics like the standard error of the residuals and the R² value of the equation.

Degrees of freedom for t tests
The summary method for systemfit objects has an optional argument useDfSys. It selects the approach that systemfit applies to determine the degrees of freedom of t tests of the estimated coefficients (Section 2.5). If argument useDfSys is TRUE, the degrees of freedom of the whole system are taken. In contrast, if useDfSys is FALSE, the degrees of freedom of the single equation are taken. If the coefficients are estimated under restrictions, argument useDfSys is TRUE by default. However, if no restrictions on the coefficients are specified, the default value of useDfSys is FALSE.
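For illustration, system degrees of freedom can be requested explicitly as follows; this is a sketch assuming the fitted object fitsur from the examples above:

```r
## t tests using the degrees of freedom of the whole system
## instead of equation-specific degrees of freedom
summary( fitsur, useDfSys = TRUE )
```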

Reduce amount of printed output
The optional arguments residCov and equations can be used to reduce the amount of printed output. Argument residCov specifies whether the covariance matrix and the correlation matrix of the residuals are printed. Argument equations specifies whether summary results of each equation are printed. By default, both arguments are TRUE. The following command returns a sparse summary output:
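The command itself is missing from this version of the text; based on the description above, it was presumably similar to:

```r
## reduced summary output: neither the residual covariance/correlation
## matrices nor the per-equation results are printed
summary( fitsur, residCov = FALSE, equations = FALSE )
```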

Panel data
The systemfit function can also be used for a (classical) "seemingly unrelated regression" analysis with panel data. For this type of analysis, the data must be provided in a transformed data frame of class pdata.frame, which can be created with the function pdata.frame from the R package plm (Croissant and Millo 2007). In contrast to the previously described usage of systemfit, argument formula must be a single equation (object of class formula). This formula is estimated for all individuals.
We demonstrate the application of systemfit to panel data using an example taken from Greene (2003, p. 340) that is based on Grunfeld (1958). We want to estimate a model for the gross investment of 5 US firms in the years 1935–1954:

invest_it = β1 + β2 value_it + β3 capital_it + ε_it,

where invest is the gross investment of firm i in year t, value is the market value of the firm at the end of the previous year, and capital is the capital stock of the firm at the end of the previous year.
This model can be estimated by

### this code chunk is evaluated only if the 'plm' package is available
data( "GrunfeldGreene" )
library( "plm" )
GGPanel <- pdata.frame( GrunfeldGreene, c( "firm", "year" ) )
greeneSur <- systemfit( invest ~ value + capital, method = "SUR",
   data = GGPanel )

The first line loads the example data set GrunfeldGreene that is included in the systemfit package. The second line loads the plm package, and the following line creates a data frame of class pdata.frame, where the variables firm and year indicate the individual (cross-section) and time identifiers, respectively. Finally, a seemingly unrelated regression is performed.
The optional argument pooled is a logical variable indicating whether the coefficients are restricted to be equal for all individuals. By default, this argument is set to FALSE. The following command does a seemingly unrelated regression of the same model as before, but with coefficients restricted to be equal for all individuals.
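The command is missing in this version of the text; it presumably resembled the following (the object name greeneSurPooled is an assumption):

```r
## SUR estimation with coefficients restricted to be equal
## for all individuals (pooled estimation)
greeneSurPooled <- systemfit( invest ~ value + capital, method = "SUR",
   data = GGPanel, pooled = TRUE )
```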

Testing linear restrictions
As described in Section 2.7, linear restrictions can be tested by an F test, two Wald tests, and an LR test. The systemfit package provides the method linearHypothesis for the F and Wald tests as well as the method lrtest for LR tests.
We will now test the restriction β2 = −β6 that was specified by the matrix Rmat and the vector qvec in the example above (p. 18). The linear restrictions are tested first by Theil's F test (Equation 45), second by the F statistic of a Wald test (Equation 46), third by the χ² statistic of a Wald test (Equation 47), and finally by an LR test (Equation 48).

linearHypothesis( fitsur, Rmat, qvec, test = "FT" )
linearHypothesis( fitsur, Rmat, qvec, test = "F" )
linearHypothesis( fitsur, Rmat, qvec, test = "Chisq" )
lrtest( fitsurRmat, fitsur )
The first argument of the linearHypothesis method for systemfit objects must be an unrestricted regression returned by systemfit. The second and third arguments are the restriction matrix R and the optional vector q, as described in Section 2.3. Analogously to the argument restrict.matrix of the systemfit function, the restrictions can be specified either in matrix form or symbolically. The optional argument test must be a character string, "FT", "F", or "Chisq", specifying whether to compute Theil's finite-sample F test (with approximate F distribution), the finite-sample Wald test (with approximate F distribution), or the large-sample Wald test (with asymptotic χ² distribution).
All arguments of the lrtest method for systemfit objects must be fitted model objects returned by systemfit. It consecutively compares all provided fitted model objects.
All tests print a short description of the test and the tested model objects first. Then, a small table is printed, where each row belongs to one (unrestricted or restricted) model. The second row reports (amongst others) the degree(s) of freedom of the test, the empirical test statistic, and the marginal level of significance (P value). Although all tests check the same hypothesis, there is some variation in the P values. However, all tests suggest the same decision: the null hypothesis β2 = −β6 cannot be rejected at any reasonable level of significance.

Hausman test
A Hausman test, which is described in Section 2.8, can be carried out with the following commands:

fit2sls <- systemfit( eqSystem, method = "2SLS",
   inst = ~ income + farmPrice + trend, data = Kmenta )
fit3sls <- systemfit( eqSystem, method = "3SLS",
   inst = ~ income + farmPrice + trend, data = Kmenta )
hausman.systemfit( fit2sls, fit3sls )

##
##  Hausman specification test for consistency of the 3SLS estimation
##
## data:  Kmenta
## Hausman = 2.5357, df = 7, p-value = 0.9244

First, the model is estimated by 2SLS and then by 3SLS. Finally, in the last line, the test is carried out by the command hausman.systemfit. This function requires two arguments: the result of a 2SLS estimation and the result of a 3SLS estimation. The Hausman test statistic is 2.536, which has a χ² distribution with 7 degrees of freedom under the null hypothesis. The corresponding P value is 0.924. This shows that the null hypothesis is not rejected at any reasonable level of significance. Hence, we can assume that the 3SLS estimator is consistent.

Replication of textbook results
In this section, we reproduce results from several textbook examples using the systemfit package, for several reasons. First, a comparison of systemfit's results with results published in the literature confirms the reliability of the systemfit package. Second, this section helps teachers and students become familiar with using the systemfit package. Third, the section encourages reproducible research, which should be a general goal in scientific analysis (Buckheit and Donoho 1995; Schwab, Karrenbach, and Claerbout 2000). For instance, by preparing this section, the exact estimation methods of the replicated analyses have been discovered and a few errors in Greene (2003) have been found (see Greene 2006a).
Initially, the data are loaded, and the three equations as well as the instrumental variables are specified. As in the example before, the equation system is estimated by OLS, 2SLS, 3SLS, and iterated 3SLS, and commands to print the estimated coefficients are presented.

data( "KleinI" )
eqConsump <- consump ~ corpProf + corpProfLag + wages
eqInvest <- invest ~ corpProf + corpProfLag + capitalLag

The first command calculates the covariance matrix without centering the residuals (see Section 2.4); the returned values are equal to those published in Greene (2003, p. 351). The second command calculates the residual covariance matrix after centering the residuals; these returned values are equal to those published in the errata (Greene 2006a).
After extracting the data from the GrunfeldGreene data set, the individual (cross-section) and time identifiers are indicated.Then, the formula is specified, and the model is estimated by OLS and SUR.Commands to print the estimated coefficients are reported after each estimation.

Non-linear estimation
Finally, the systemfit package provides a function to estimate systems of non-linear equations. However, the function nlsystemfit is currently under development, and the results are not yet always reliable due to convergence difficulties.

A. Object returned by systemfit
systemfit returns a list of class systemfit that contains the results that belong to the entire system of equations. One special element of this list is called eq. It is a list that contains one object for each estimated equation. These objects are of class systemfit.equation and contain the results that belong only to the corresponding equation. The elements returned by systemfit are similar to those returned by lm, the basic tool for linear regressions in R. While some counterparts of elements returned by lm can be found directly in objects of class systemfit, other counterparts are available for each equation in objects of class systemfit.equation. This is shown in Table 3.

B. Computation times
Theoretically, one would expect that the calculations with the Matrix package are faster and more robust than calculations with the traditional method. To test this hypothesis, we use function createSystemfitModel to create a medium-sized multi-equation model with 8 equations, 10 regressors in each equation (without constant), and 750 observations. Then, we estimate this model with and without using the Matrix package. Finally, the results are compared.

library( "systemfit" )
set.seed( 1 )
systemfitModel <- createSystemfitModel( nEq = 8, nReg = 10, nObs = 750 )
system.time(
   fitMatrix <- systemfit( systemfitModel$formula, method = "SUR",
      data = systemfitModel$data ) )

##    user  system elapsed
##   0.177   0.016   0.193

system.time(
   fitTrad <- systemfit( systemfitModel$formula, method = "SUR",
      data = systemfitModel$data, useMatrix = FALSE ) )

##    user  system elapsed
##   0.406   0.032   0.437

all.equal( fitMatrix, fitTrad )

## [1] "Component \"call\": target, current do not match when deparsed"
## [2] "Component \"control\": Component \"useMatrix\": 1 element mismatch"

The returned computation times clearly show that using the Matrix package makes this estimation faster. The comparison of the estimation results shows that both methods return the same results: the only differences between the returned objects are, as expected, the call and the stored control variable useMatrix. However, the estimation of rather small models is much slower with the Matrix package than without it. Moreover, the differences in computation time accumulate if the estimation is iterated.

C. Estimating systems of equations with sem
This section compares the commands to estimate a system of equations by sem and by systemfit. This comparison uses Klein's "Model I" (see Section 5.2). Before starting the estimation, we load the sem and systemfit packages as well as the required data set.

Table 1 :
It was explained in Section 2.4 that several different methods have been proposed to calculate the residual covariance matrix. The user can specify which method systemfit should use. Possible values of the argument methodResidCov are presented in Table 1. By default, systemfit uses Equation 38.
Possible values of argument methodResidCov

Furthermore, the user can specify whether the means should be subtracted from the residuals before Equations 37, 38, 39, or 40 are applied to calculate the residual covariance matrix (see Section 2.4). The corresponding argument is called centerResiduals. It must be either TRUE (subtract the means) or FALSE (take the unmodified residuals). The default value of centerResiduals is FALSE.

Table 2 :
Possible values of argument method3sls

Sigma squared

In case of OLS or 2SLS estimations, argument singleEqSigma can be used to specify whether different σ²s for each single equation or the same σ² for all equations should be used. If argument singleEqSigma is TRUE, Ω in Equation 8 or 12 is set to Σ ⊗ I_T. In contrast, if argument singleEqSigma is FALSE, Ω in Equation 8 or 12 is set to σ² I_{G·T}. In case of an unrestricted regression, argument singleEqSigma is TRUE by default. However, if the coefficients are estimated under restrictions, this argument is FALSE by default.
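As a short sketch following the calling conventions of the other examples (the object name fitolsSingleSigma is hypothetical), the argument could be used as follows:

```r
## OLS estimation with a separate sigma^2 for each equation
fitolsSingleSigma <- systemfit( eqSystem, method = "OLS",
   data = Kmenta, singleEqSigma = TRUE )
```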

Table 3 :
Elements returned by systemfit and lm (* if requested by the user with default TRUE, ** if requested by the user with default FALSE).
Theoretically, the LIML results should be identical to the OLS results. Therefore, we re-estimate this model by OLS with systemfit. As expected, the results are identical. Now, we estimate the system by full information maximum likelihood (FIML) with sem:

### this code chunk is evaluated only if the 'sem' package is available