Econometrics in R: Past, Present, and Future

Recently, computational methods and software have been receiving more attention in the econometrics literature, emphasizing that they are integral components of modern econometric research. This has also promoted the development of many new econometrics software packages written in R and made available on the Comprehensive R Archive Network. This special volume on “Econometrics in R” features a selection of these recent activities that includes packages for econometric analysis of cross-section, time series and panel data. This introduction to the special volume highlights the contents of the contributions and embeds them into a brief overview of other past, present, and future projects for econometrics in R.

Introduction
Stigler's (1994) study of citation patterns in statistical research documents an impressive inflow of ideas from econometrics journals. Among the journals he surveys, Econometrica is the most influential according to his "balance of trade" measure. And yet econometrics has lagged behind other statistical fields in embracing computational methods as an intrinsic part of the research process. One can attribute this reticence to many factors: failure of disciplinary mechanism design, excessive faith in market solutions, general indolence, and so forth. But we have detected a recent change in the attitude of econometricians toward computational methods, an increasing willingness especially among younger researchers to regard software development and dissemination as a natural concomitant of econometric research. This change in attitude is also reflected in various journals publishing papers in the field of computational econometrics (e.g., in Computational Economics or several special issues of Computational Statistics & Data Analysis) and on econometric software (e.g., the software review section of the Journal of Applied Econometrics).
Certainly, recent developments in computer hardware, software, and networking have dramatically lowered the costs of such activities and increased their potential benefits. New methods that used to wait for years for commercial implementations can now be shared almost immediately over the Web. Writing software for general use (and abuse) is, of course, quite different from writing code for one-off research projects, but it has its own charms and rewards. The R project (R Development Core Team 2008) has made it considerably easier to undertake such development efforts, providing standards and templates for documentation, version control, and consistency checking. R is still relatively obscure in the econometrics community, but has a devoted and growing following.

Past
As a language with a rich toolbox for matrix-based computations, R is a natural candidate for writing econometric software (Cribari-Neto and Zarkos 1999), and its open-source license makes it attractive for teaching econometrics (Racine and Hyndman 2002). Nevertheless, only relatively few packages available from the Comprehensive R Archive Network (CRAN, http://CRAN.R-project.org/) are devoted explicitly to econometrics, which is also reflected in the fact that only a few such packages have been published in the Journal of Statistical Software (JSS). Notable exceptions include the packages strucchange (Zeileis, Leisch, Hornik, and Kleiber 2002), sandwich (Zeileis 2004, 2006), and systemfit (Henningsen and Hamann 2007).
In the last couple of years, however, the number of econometrics-related packages on CRAN grew (see e.g., the econometrics CRAN task view at http://CRAN.R-project.org/view=Econometrics) and several of these packages were presented by their authors at econometrics sessions of the useR! conferences in 2004 (http://www.R-project.org/conferences/useR-2004/) and 2006 (http://www.R-project.org/conferences/useR-2006/). At the latter conference, held at Wirtschaftsuniversität Wien in Vienna, Austria, the JSS editors decided to encourage the authors from several sessions to put together "Foometrics in R" special volumes, with values of "foo" including "chemo" (volume 18), "psycho" (volume 20), "environ" (volume 22), and "econo" (volume 27), among others. We hope that the packages/papers in this special volume will stimulate further interest in R development in econometrics.

Present
This special volume features some recent efforts to complement the econometrics functionality available in R. Some of the packages/papers fill "gaps" in the R toolbox for econometrics by providing standard tools previously unavailable in R; others present original implementations of cutting-edge methods.
In the first paper, Croissant and Millo (2008) describe their plm package for analysis of panel data using linear models. The paper starts out with basic panel models estimated via ordinary least squares (OLS), including the familiar within, between, and random effects models, and continues to present some extensions including models estimated by the generalized method of moments (GMM) or feasible generalized least squares (FGLS).
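To make the within model mentioned above concrete, the following is a minimal sketch in base R only; it does not use the plm interface, and the simulated data and all variable names are assumptions made for illustration. It shows that demeaning each variable by individual and running OLS on the transformed data gives the same slope as least squares with a full set of individual dummies.

```r
# Simulated balanced panel (hypothetical data): 50 individuals, 5 periods each.
set.seed(1)
n_id <- 50
n_t <- 5
id <- rep(seq_len(n_id), each = n_t)
alpha <- rnorm(n_id)[id]               # unobserved individual effects
x <- rnorm(n_id * n_t) + 0.5 * alpha   # regressor correlated with the effects
y <- 1 + 2 * x + alpha + rnorm(n_id * n_t)

# Within transformation: subtract individual-specific means, then run OLS.
x_w <- x - ave(x, id)
y_w <- y - ave(y, id)
beta_within <- unname(coef(lm(y_w ~ x_w - 1)))

# Equivalent least-squares dummy variable (LSDV) regression.
beta_lsdv <- unname(coef(lm(y ~ x + factor(id)))["x"])

# Pooled OLS ignores the individual effects and is inconsistent here,
# because x is correlated with alpha by construction.
beta_pooled <- unname(coef(lm(y ~ x))["x"])

print(c(within = beta_within, lsdv = beta_lsdv, pooled = beta_pooled))
```

The agreement between the within and LSDV slopes is an instance of the Frisch-Waugh-Lovell theorem; a dedicated package additionally takes care of the correct degrees of freedom for standard errors, which this sketch does not address.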
The two subsequent papers are devoted to the analysis of time series. Hyndman and Khandakar (2008) focus on automatic forecasting of univariate time series using their forecast package. Specifically, functions for exponential smoothing methods are presented in a general state space model framework, as well as tools for automatically selecting ARIMA models. Pfaff (2008) contributes software for modeling and forecasting multivariate time series using (structural) vector autoregressive (VAR and SVAR) and structural vector error correction (SVEC) models in package vars. In addition to standard fitting and prediction functionality, a set of diagnostic procedures, based on both significance tests and visualizations, is discussed.
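The idea of selecting a time series model automatically can be illustrated with a deliberately simple sketch using only arima() from base R. This is not the algorithm implemented in the forecast package; the simulated series, the small grid of candidate autoregressive orders, and the use of AIC as the sole criterion are all assumptions chosen for illustration.

```r
# Simulated AR(2) series (hypothetical data).
set.seed(123)
y <- arima.sim(model = list(ar = c(0.5, -0.3)), n = 200)

# Fit AR(p) models over a small, hypothetical grid of orders and record the AIC.
orders <- 0:3
aic <- sapply(orders, function(p) {
  arima(y, order = c(p, 0, 0), method = "ML")$aic
})

# Pick the order with the smallest AIC and refit that model.
best_p <- orders[which.min(aic)]
best_fit <- arima(y, order = c(best_p, 0, 0), method = "ML")

# Point forecasts for the next five periods from the selected model.
fc <- predict(best_fit, n.ahead = 5)$pred

print(data.frame(order = orders, AIC = aic))
print(fc)
```

A production tool would typically also search over differencing and moving average orders and use a more careful search strategy than an exhaustive loop.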
The remaining contributions to the special volume present various packages for regression, including methods for binary, count, and censored data, which are probably particularly useful in microeconometrics. Hayfield and Racine (2008) implement a rich collection of tools for nonparametric and semiparametric econometrics using a kernel-based approach with data-driven bandwidth selection. In particular, their np package includes functions for regression (nonparametric, partially linear, single index, quantile, among others) and density estimation for a mix of continuous, discrete, and categorical data. Koenker (2008) introduces an extension of his quantreg package: the new function crq() implements quantile regression for censored data using various estimation techniques, which are assessed with asymptotic and simulation comparisons. The package sampleSelection of Toomet and Henningsen (2008) provides a class of sample selection models, ranging from the classic Heckman model to other so-called "generalized tobit models". The special volume is concluded by Zeileis, Kleiber, and Jackman (2008), who provide an overview of count data regression in R, dealing with both overdispersion and excess zeros. Starting out from Poisson and negative binomial regression using previously available functionality in R, new functions hurdle() and zeroinfl() in package pscl are introduced for fitting hurdle and zero-inflated versions of the classic count regression models.
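The two problems named for count data, overdispersion and excess zeros, can be made visible with functionality that has long been part of base R. The sketch below is only an illustration under stated assumptions: the data are simulated with a hypothetical share of structural zeros, a plain Poisson model is fitted with glm(), and two simple diagnostics are computed. It does not call hurdle() or zeroinfl().

```r
# Simulated counts with structural zeros (hypothetical data-generating process).
set.seed(42)
n <- 1000
x <- rnorm(n)
lambda <- exp(1 + 0.5 * x)
structural_zero <- rbinom(n, size = 1, prob = 0.3)
y <- ifelse(structural_zero == 1, 0, rpois(n, lambda))

# Standard Poisson regression, ignoring the zero inflation.
fit <- glm(y ~ x, family = poisson)

# Pearson dispersion statistic: values well above 1 indicate overdispersion.
dispersion <- sum(residuals(fit, type = "pearson")^2) / df.residual(fit)

# Observed number of zeros versus the number expected under the fitted model.
zeros_observed <- sum(y == 0)
zeros_expected <- sum(dpois(0, fitted(fit)))

print(c(dispersion = dispersion,
        zeros_observed = zeros_observed,
        zeros_expected = zeros_expected))
```

When the dispersion statistic clearly exceeds one and the fitted Poisson model predicts far fewer zeros than are observed, a hurdle or zero-inflated specification is a natural next candidate.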
Of course, the packages/papers in this special volume are only a selection from the many current projects doing econometrics in R. For somewhat broader overviews (without making claims to be complete) of what is available presently we refer the reader to Farnsworth (2006) and Kleiber and Zeileis (2008) as well as the econometrics task view mentioned above.

Future
We hope that the current trend of increasing interest in R from econometricians continues, not least due to the exciting collection of new tools presented in this special volume, which should help researchers, practitioners, and teachers alike to use R and its contributed packages in their work. But further innovations are on their way: At the upcoming useR! 2008 conference (http://www.R-project.org/useR-2008/), held at Technische Universität Dortmund, Germany, there will be several sessions related to econometrics that feature packages with new instruments for the econometric toolbox. All of this should help to spawn many new tools for econometrics written in R. Information about these will hopefully be discussed on the mailing list of the related special interest group "R-SIG-Finance" (see https://stat.ethz.ch/mailman/listinfo/R-SIG-Finance), made available on CRAN, listed in the econometrics CRAN task view at http://CRAN.R-project.org/view=Econometrics, and published in future issues of the Journal of Statistical Software.