A firm-level dataset for analyzing entry, exit, employment and R&D expenditures in the UK: 1997–2012

This data article is related to the research article entitled “Inverted-U relationship between R&D intensity and survival: Evidence on scale and complementarity effects in UK data” (Ugur et al., In press) [1]. It describes the trends in R&D expenditures, employment of R&D personnel and firm entry and exit rates in the UK from 1998 to 2012. We also provide statistics on net employment creation and net R&D investment due to firm entry and exit. In addition, we compute the correlation coefficients between entry and exit rates at the two-digit industry level so as to examine whether the correlations are contemporaneous or inter-temporal. Finally, we provide information about the underlying dataset, to which secure access is available through UK Data Service Archive 7716 at http://dx.doi.org/10.5255/UKDA-SN-7716-1.


Value of the data
Annual statistics on entry and exit rates in Table 1 highlight the implications of firm dynamics (entry and exit rates) for job creation, job destruction and net R&D expenditure. Furthermore, the underlying dataset can stimulate further research on firm dynamics, labor reallocation and productivity.
The correlations between entry and exit rates in Table 2 can inform further research on the absence of a sorting effect in firm dynamics in the UK.
The link to the underlying dataset provides researchers with consistent and reliable microdata on UK firms from 1997 to 2012. The dataset has significant potential for future research in areas such as: (a) size distribution of firms; (b) firm diversity and survival; (c) geographical spillovers of R&D; and (d) job creation versus job destruction during the crisis and post-crisis periods.

Data
In this article, we first present two graphs depicting the trends in R&D expenditure by type (Fig. 1) and in R&D personnel (Fig. 2), drawing on the panel dataset we constructed from two Office for National Statistics (ONS) databases for the period 1997-2012. These are followed by Table 1 on annual entry and exit rates, net balances of employment and net balances of R&D investment, using data for 37,930 UK firms from 1998 to 2012. Table 2 follows with correlations between firm entry and exit rates at the 2-digit industry level, with and without correction for industry fixed effects.

Dataset: sources and indicative content
Our dataset was obtained by merging the Business Expenditure on Research and Development (BERD) [3] with the Business Structure Database (BSD) [4]. The BERD database is an annual survey of firms with information on research and development. The BSD database is an annual snapshot of the Inter-Departmental Business Register (IDBR), a live register of all UK firms registered for value-added tax (VAT) and/or Pay-as-You-Earn (PAYE) tax purposes. We merged the two datasets using the unique enterprise identifier. The merged dataset contains 37,930 firms with 185,094 firm/year observations after excluding firms with incorrect birth dates. The merged dataset [2] contains firm demographic information such as births and deaths, as well as employment, turnover, R&D measures, SIC codes, etc. We ensured that all key variables were cleaned and consistent across years. Finally, the dataset was augmented with derived variables such as SIC-2007 consistent industry classification codes, UK versus foreign ownership codes, consistent output deflators, Pavitt classes and the Herfindahl index at the 3-digit level. Further information on the merging and cleaning process is available in [1].
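
The merging step and one of the derived variables can be illustrated with a minimal sketch. It is not the procedure used to build the dataset: the field names (entref, year, rd_total, employment, turnover, sic3) are hypothetical placeholders, the join on identifier and year is an assumption, and the Herfindahl index is computed from turnover shares, which is also an assumption. Only the Python standard library is used.

```python
from collections import defaultdict

# Toy records. All field names are hypothetical placeholders,
# not the actual BERD/BSD variable names.
berd = [
    {"entref": 1, "year": 2005, "rd_total": 120.0},
    {"entref": 2, "year": 2005, "rd_total": 30.0},
    {"entref": 4, "year": 2005, "rd_total": 15.0},  # no BSD match
]
bsd = [
    {"entref": 1, "year": 2005, "employment": 50, "turnover": 600.0, "sic3": "261"},
    {"entref": 2, "year": 2005, "employment": 10, "turnover": 200.0, "sic3": "261"},
    {"entref": 3, "year": 2005, "employment": 5, "turnover": 200.0, "sic3": "262"},
]


def merge_on_enterprise(berd_rows, bsd_rows):
    """Inner-join the two sources on (enterprise identifier, year).

    Joining on the year as well as the identifier is an assumption.
    """
    bsd_index = {(r["entref"], r["year"]): r for r in bsd_rows}
    merged = []
    for row in berd_rows:
        match = bsd_index.get((row["entref"], row["year"]))
        if match is not None:
            merged.append({**match, **row})
    return merged


def herfindahl(rows, group_key="sic3", size_key="turnover"):
    """Sum of squared market shares within each industry group."""
    totals = defaultdict(float)
    for r in rows:
        totals[r[group_key]] += r[size_key]
    hhi = defaultdict(float)
    for r in rows:
        share = r[size_key] / totals[r[group_key]]
        hhi[r[group_key]] += share ** 2
    return dict(hhi)


merged = merge_on_enterprise(berd, bsd)
hhi_by_industry = herfindahl(bsd)
```

In this toy example only enterprises 1 and 2 appear in both sources, so the merged list holds two records. An industry with a single firm has an index of one.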

R&D investment and personnel
Total R&D expenditures from 1997 to 2012 are presented in Fig. 1. R&D expenditures are also broken down into intramural R&D, applied R&D and two key sources of R&D funding (own-funded R&D and R&D funded by the UK central government). This is followed by Fig. 2, which depicts the trend in R&D personnel (scientists and technicians employed for the purpose of conducting R&D) from 1997 to 2012.

Entry and exit by year and industry
Annual entry and exit rates, together with associated changes in employment and total R&D expenditures, are given in Table 1. The table summarizes the entry and exit rates by year and over the whole period from 1998 to 2012. It also summarizes the net balances of employment and R&D expenditures after taking into account the values associated with new entrants and exiting firms. The annual figures allow for comparing entry and exit rates, and employment and R&D expenditure balances, before, during and after the financial crisis of 2007-2009.
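
As an illustration of how net balances of this kind could be computed, the sketch below adds the employment and R&D expenditure of entrants and subtracts those of exiting firms, year by year. Treating the net balance as entrants' values minus exiters' values is an assumption, as are the record layout and all field names.

```python
from collections import defaultdict

# Hypothetical firm-year records. The flags mark the year in which a firm
# enters or exits; field names are illustrative only.
records = [
    {"firm": "A", "year": 2008, "entered": True, "exited": False, "employment": 20, "rd": 5.0},
    {"firm": "B", "year": 2008, "entered": False, "exited": True, "employment": 35, "rd": 8.0},
    {"firm": "C", "year": 2008, "entered": False, "exited": False, "employment": 100, "rd": 40.0},
    {"firm": "D", "year": 2009, "entered": True, "exited": False, "employment": 12, "rd": 3.0},
    {"firm": "E", "year": 2009, "entered": True, "exited": False, "employment": 8, "rd": 1.5},
    {"firm": "F", "year": 2009, "entered": False, "exited": True, "employment": 15, "rd": 2.0},
]


def net_balances(rows):
    """Per year: values of entrants minus values of exiters (assumed definition)."""
    out = defaultdict(lambda: {"net_employment": 0, "net_rd": 0.0})
    for r in rows:
        # +1 for an entrant, -1 for an exiter, 0 for an incumbent
        sign = int(r["entered"]) - int(r["exited"])
        out[r["year"]]["net_employment"] += sign * r["employment"]
        out[r["year"]]["net_rd"] += sign * r["rd"]
    return dict(out)


balances = net_balances(records)
```

Incumbent firms contribute nothing to the balance, so only entrants and exiters move the yearly totals.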
In Table 2, we present the correlations between entry and exit rates at the industry level, with and without correction for industry fixed effects (see [5,6]). Uncorrected entry and exit rates by year and 2-digit industry are calculated in accordance with Eq. (1):

\[ NR_{it} = \frac{N_{it}}{F_{it-1}}, \qquad XR_{it} = \frac{X_{it}}{F_{it-1}} \tag{1} \]

where $NR_{it}$ and $XR_{it}$ are entry and exit rates in industry $i$ in year $t$; $N_{it}$ and $X_{it}$ are the numbers of entrants and exiters in industry $i$ in year $t$; and $F_{it-1}$ is the total number of firms in industry $i$ in the previous year. Uncorrected entry and exit rates enable us to verify whether correlations are contemporaneous or inter-temporal, and whether the correlations reflect industry-specific fixed effects due to technological conditions.

We also calculated entry and exit rates corrected for industry fixed effects. These allow for verifying the existence of a 'sorting effect' and are calculated in accordance with Eq. (2):

\[ NR^{*}_{it} = NR_{it} - \overline{NR}_{i}, \qquad XR^{*}_{it} = XR_{it} - \overline{XR}_{i} \tag{2} \]

where $NR^{*}_{it}$ and $XR^{*}_{it}$ are demeaned entry and exit rates; and $\overline{NR}_{i}$ and $\overline{XR}_{i}$ are average entry and exit rates by industry.

When industry fixed effects are not corrected for, we can verify whether correlations are contemporaneous or inter-temporal, irrespective of industry-specific technological conditions. The evidence is in line with earlier findings from UK and US data [5,6]. It indicates that periods of high (low) entry are also periods of high (low) exit, irrespective of differences in industry-specific technological conditions. However, entry and exit rates are not correlated inter-temporally. Again, this is in line with existing findings and indicates that periods of high (low) entry are not followed by periods of high (low) exit.

When corrected for industry-specific fixed effects, the correlations between entry and exit rates indicate whether a 'sorting effect' is at work. The latter arises if the marginal entrant in a period of above-average entry is of lower quality and, hence, increases the exit rate in the subsequent period. The findings in the diagonal cells of the right panel of Table 2 indicate that contemporaneous correlations between entry and exit rates are not driven by industry-specific fixed effects. The findings in the off-diagonal cells of the right panel indicate that a sorting effect is not at work in UK data. The absence of a sorting effect is in line with evidence from UK data in [5], but in contrast to evidence from US data in [6], where a sorting effect exists.
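
The calculations described above can be sketched as follows. This is an illustration under stated assumptions rather than the code behind Table 2. It computes entry and exit rates as entrants and exiters divided by the previous year's stock of firms, demeans them by industry, and then correlates entry in year t with exit in year t + lag. The toy counts, the function names, and the choice to pool industry-year pairs into a single Pearson correlation are all assumptions.

```python
from collections import defaultdict
from math import sqrt

# counts[(industry, year)] = (entrants, exiters, total firms); toy values only.
counts = {
    ("10", 2000): (0, 0, 100),
    ("10", 2001): (10, 8, 100),
    ("10", 2002): (6, 5, 100),
    ("10", 2003): (12, 10, 100),
    ("20", 2000): (0, 0, 50),
    ("20", 2001): (10, 9, 50),
    ("20", 2002): (8, 7, 50),
    ("20", 2003): (12, 11, 50),
}


def rates(count_map):
    """Uncorrected rates: entrants and exiters over last year's firm count."""
    out = {}
    for (ind, year), (n, x, _) in count_map.items():
        prev = count_map.get((ind, year - 1))
        if prev is None:
            continue  # no previous-year stock, so the rate is undefined
        out[(ind, year)] = (n / prev[2], x / prev[2])
    return out


def demean(rate_map):
    """Corrected rates: subtract each industry's average over time."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for (ind, _), (nr, xr) in rate_map.items():
        sums[ind][0] += nr
        sums[ind][1] += xr
        sums[ind][2] += 1
    return {
        (ind, yr): (nr - sums[ind][0] / sums[ind][2], xr - sums[ind][1] / sums[ind][2])
        for (ind, yr), (nr, xr) in rate_map.items()
    }


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / sqrt(vx * vy)


def entry_exit_corr(rate_map, lag=0):
    """Correlate entry in year t with exit in year t + lag, pooling industries."""
    xs, ys = [], []
    for (ind, yr), (nr, _) in rate_map.items():
        later = rate_map.get((ind, yr + lag))
        if later is not None:
            xs.append(nr)
            ys.append(later[1])
    return pearson(xs, ys)


uncorrected = rates(counts)
corrected = demean(uncorrected)
contemporaneous = entry_exit_corr(corrected, lag=0)
one_year_lag = entry_exit_corr(corrected, lag=1)
```

In this framing, a sorting effect would show up as a positive correlation between demeaned entry in one year and demeaned exit in the following year, that is, in the lagged rather than the contemporaneous coefficient.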