
First Quarter 2021, Vol. 103, No. 1
Posted 2021-01-14

FRED-QD: A Quarterly Database for Macroeconomic Research

by Michael W. McCracken and Serena Ng

Abstract

In this article, we present and describe FRED-QD, a large, quarterly frequency macroeconomic database that is currently available and regularly updated at https://research.stlouisfed.org/econ/mccracken/fred-databases/. The data provided are closely modeled on those used in Stock and Watson (2012a). As in our previous work on FRED-MD (McCracken and Ng, 2016), which is at a monthly frequency, our goal is simply to provide a publicly available source of macroeconomic "big data" that is updated in real time using the FRED® data service. We show that factors extracted from the FRED-QD dataset exhibit similar behavior to those extracted from the original Stock and Watson dataset. The dominant factors are shown to be insensitive to outliers, but outliers do affect the relative influence of the series, as indicated by leverage scores. We then investigate the role unit root tests play in the choice of transformation codes, with an emphasis on identifying instances in which the unit root-based codes differ from those already used in the literature. Finally, we show that factors extracted from our dataset are useful for forecasting a range of macroeconomic series and that the choice of transformation codes can contribute substantially to the accuracy of these forecasts.


Michael W. McCracken is an assistant vice president and economist at the Federal Reserve Bank of St. Louis. Serena Ng is the Edwin W. Rickert Professor of Economics at the Department of Economics and Data Science Institute at Columbia University and a research associate at the National Bureau of Economic Research. Financial support to Serena Ng is provided by the National Science Foundation (SES 1558623). The authors thank Aaron Amburgey and Joe McGillicuddy for excellent research assistance and especially thank Yvetta Fortova for being the FRED® insider who operationalized this project. 



INTRODUCTION

In our previous work, McCracken and Ng (2016), we describe and investigate a monthly frequency database of macroeconomic variables called FRED-MD. At some level, FRED-MD is not particularly innovative. It is, after all, just a collection of N = 128 standard U.S. macroeconomic time series that date back to January 1959, have primarily been taken from FRED®, the data service maintained by the Federal Reserve Bank of St. Louis, and have been organized into a .csv file. That description, however, misses the point. Our main goal was to facilitate easy access to a standardized example of a data-rich environment that can be used for academic research. Because the dataset is updated automatically and monthly vintages are posted to a website, those who are interested in conducting research on big data can focus on the statistical problems associated with big data rather than having to put the dataset together themselves. This dataset frees the practitioner from dealing with issues such as updating the dataset when new data are released, managing series that become discontinued, and splicing series from different sources. More prosaically, FRED-MD facilitates comparison of methodologies developed for a common purpose.
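As a concrete illustration, a file in this style can be read with a few lines of pandas. The layout sketched below (a header row of series names, a second row of transformation codes, then one row per date) is an assumption mirroring the posted FRED-MD/FRED-QD files, and `load_fred_database` is a hypothetical helper name, not part of any official tooling:

```python
import pandas as pd

def load_fred_database(path):
    """Read a FRED-MD/FRED-QD style CSV.

    Assumed layout: row 1 holds series names, row 2 (labeled 'Transform:')
    holds the per-series transformation codes, and the first column holds
    the observation dates.
    """
    raw = pd.read_csv(path)
    codes = raw.iloc[0, 1:].astype(int)            # transformation code per series
    data = raw.iloc[1:].copy()
    data.index = pd.to_datetime(data.iloc[:, 0])   # first column holds dates
    data = data.iloc[:, 1:].astype(float)          # remaining columns are the panel
    return data, codes
```

The returned `codes` can then be mapped to the usual level/log/difference transformations before any factor estimation.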

FRED-MD has been successful. It has been used as a foil for applying big data methods including random subspace methods (Boot and Nibbering, 2019), sufficient dimension reduction (Barbarino and Bura, 2017), dynamic factor models (Stock and Watson, 2016), large Bayesian VARs (Giannone, Lenza, and Primiceri, 2018), various lasso-type regressions (Smeekes and Wijler, 2018), functional principal components (Hu and Park, 2017), complete subset regression (Kotchoni, Leroux, and Stevanovich, 2019), and random forests (Medeiros et al., 2019). In addition, these various methods have been used to study a wide variety of economic and financial topics including bond risk premia (Bauer and Hamilton, 2017), the presence of real and financial tail risk (Nicolò and Lucchetta, 2016), liquidity shocks (Ellington, Florackis, and Milas, 2017), recession forecasting (Davig and Hall, 2019), identification of uncertainty shocks (Angelini et al., 2019), and identification of monetary policy shocks (Miranda-Agrippino and Ricco, 2017). Finally, and perhaps most rewarding, FRED-MD has been described as the inspiration for the development of a Canadian version of the database (Fortin-Gagnon et al., 2018).

While useful, FRED-MD has a glaring weakness. It does not include quarterly frequency data and thus does not provide information on gross domestic product (GDP), consumption, investment, government spending, or other macroeconomic series that come from the National Income and Product Accounts (NIPA). This is unfortunate because there are plenty of examples in the literature in which a quarterly frequency, data-rich environment is used for economic analysis. Examples include Stock and Watson (2012a,b), Schumacher and Breitung (2008), Gefang, Koop, and Poon (2019), Rossi and Sekhposyan (2015), Gonçalves, Perron, and Djogbenou (2017), Carrasco and Rossi (2016), Koopman and Mesters (2017), and Koop (2013).

In this article, we extend our previous work to a quarterly frequency dataset we call FRED-QD. The dataset is currently available at https://research.stlouisfed.org/econ/mccracken/fred-databases/. Like FRED-MD, FRED-QD is benchmarked to previous work by Stock and Watson (2012a, hereafter S&W). There, the authors organized a collection of N = 200 quarterly frequency macroeconomic series dating back to 1959:Q1 that they then used to analyze the dynamics of the Great Recession. Our quarterly frequency version of their dataset contains nearly all the series they used but, in addition, includes 48 more series, with an emphasis on series related to non-household balance sheets. In total, the dataset consists of N = 248 quarterly frequency series dating back to 1959:Q1. While many of the series are natively quarterly, others are higher-frequency series that have been aggregated to the quarterly frequency, typically as quarterly averages of monthly series.

It is worth noting that we provide the data in levels, without transforming them in any way. As such, some series are stationary in levels, while others likely need to be transformed by taking logs, differencing, or both to reasonably be considered stationary. For each series we provide a benchmark transformation code: for series that appear in the S&W dataset, we use their transformation codes, and for the additional series, many of which are taken from FRED-MD, we use the FRED-MD benchmark codes. One reason to do this is to facilitate replication of the factor analysis in S&W as well as other results that may have used a similar dataset. Even so, given the well-documented changes in volatility and persistence of macroeconomic series described in Campbell (2007) and Stock and Watson (2007), it may be a good idea to reconsider the default transformation codes. After providing more details on the data, we investigate this possibility through the lens of unit root tests. While the unit root tests often align with the original transformation codes, they are not uniformly supportive.

We then investigate whether factors extracted from FRED-QD are useful for forecasting macroeconomic aggregates. In particular, we focus on whether the unit root-implied transformation codes matter for factor-based forecasting. Among the series that we forecast, we find that for real and financial series, factors estimated using the unit root-based transformation codes can provide additional predictive content but are more often dominated by those using the original transformation codes. In contrast, we find that when forecasting nominal price series, forecast accuracy is typically better when using factors estimated using the unit root-based codes. This result coincides with evidence provided by Medeiros et al. (2019) and Goulet Coulombe et al. (2019), who find that treating price inflation as I(0) leads to better forecasts of inflation than treating it as I(1), which is precisely what the benchmark transformation codes recommend.
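The estimator behind such factor-based forecasts is the standard principal-components/diffusion-index approach of the Stock-Watson literature: standardize the transformed panel, extract factors by singular value decomposition, and regress the target h steps ahead on the factors. A minimal sketch, with `extract_factors` and `factor_forecast` as hypothetical helper names rather than the authors' actual code:

```python
import numpy as np

def extract_factors(X: np.ndarray, r: int):
    """Estimate r factors by principal components from a T x N balanced panel.

    Columns are standardized first, as is standard in the diffusion-index
    literature; factors are normalized so that F'F / T = I.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    U, S, Vt = np.linalg.svd(Z, full_matrices=False)
    T = Z.shape[0]
    F = np.sqrt(T) * U[:, :r]     # T x r estimated factors
    Lam = Z.T @ F / T             # N x r estimated loadings
    return F, Lam

def factor_forecast(y: np.ndarray, F: np.ndarray, h: int) -> float:
    """Diffusion-index forecast: regress y_{t+h} on a constant and F_t,
    then predict from the last observed factor values."""
    X = np.column_stack([np.ones(len(F) - h), F[:-h]])
    beta, *_ = np.linalg.lstsq(X, y[h:], rcond=None)
    return float(np.concatenate([[1.0], F[-1]]) @ beta)
```

Swapping the transformation codes changes the panel fed into `extract_factors`, which is exactly the margin the forecast comparisons in the article exercise.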

