and Operations A toolbox for calculating and decomposing Total Factor Productivity indices

Total Factor Productivity Toolbox is a new set of functions to calculate the main Total Factor Productivity (TFP) indices and their decompositions, based on Shephard’s distance functions, and using Data Envelopment Analysis (DEA) programming techniques. The package includes code for the standard Malmquist, Moorsteen–Bjurek, price-weighted and share-weighted TFP indices, allowing for the choice of orientation (input or output), reference period (base, comparison, geometric mean), returns to scale (variable or constant), and speciﬁc decompositions (aggregate, or identifying scale effects, as well as input and output mix effects). Classic deﬁnitions of TFP corresponding to the Laspeyres, Paasche, Fisher, or Törnqvist formulas can also be calculated as particular cases. This paper describes the methodology and implementation of the productivity functions in MATLAB . We compare the results corresponding to the different deﬁnitions by studying productivity trends in the US agriculture at the individual state level. the license.


Introduction
Total factor productivity (TFP) change is an important concept in economics because it measures the ability of firms, industries, and national economies to increase the aggregate volume of outputs they yield, relative to the aggregate volume of inputs they use. TFP constitutes a standard instrument for monitoring and benchmarking observations, and represents the cornerstone in multilateral studies of technological and economic performance, OECD (2001) . Defined as the ratio of an output quantity index to an input quantity index, TFP change can be calculated and decomposed in various ways.
TFP measurement has grown in importance over the past decades due to the increasing availability of data to study the productive performance of units, regardless of their market, governmental, or not-for-profit orientation. However, there are many ways to measure TFP depending on whether quantities and prices are available. TFP measurement requires the aggregation of quantities through suitable functions. If prices are available, it is pos-to calculate the main quantity-only and price-based TFP indices proposed in the literature, along with their associated decompositions. The toolbox allows for a choice of orientation (input or output), reference period (base, comparison, geometric mean), returns to scale (variable or constant), and specific decompositions (aggregate or identifying scale effects, as well as input and output mix effects). The code calculates Shephard's input and output distance functions approximating a production technology empirically through DEA. The toolbox also calculates classic TFP measures that do not rely on distance functions for their definition, including Laspeyres, Paasche, Fisher, and Törnqvist.
Quantity-only TFP indices based on distance functions, corresponding mainly to the Malmquist index, and coupled occasionally with the Moorsteen-Bjurek index, can be found in standard software packages like Stata ( StataCorp, 2015 ); through user-written commands as Lee et al. (2011) ; in LIMDEP ( Econometric Software (2014) ; in dedicated non-commercial software accompanying academic handbooks such as in Cooper et al. (2007) and Bogetoft and Otto (2011) (implemented in R ); in commercial software, including trial versions ( O'Donnell, 2011 andEmrouznejad andThanassoulis, 2011 ); in free-ware programs ( Coelli, 1998 ); and even in tutorials for spreadsheets ( Zhu, 2014 ). For the classic TFP indices using prices as aggregators, Shehata and Mickaiel (2015) developed a Stata module to calculate value index numbers, while Coelli (1999) offered a stand-alone program that calculates Fisher and Törnqvist indices, including transitive versions using the EKS method. O'Donnell (2011) expanded this last option to Lowe-type indices under a commercial license. Recently, Dakpo et al. (2018) have coded an R -based productivity package, based on O'Donnell (2011,2012,2018) . However, while it covers many of the definitions considered in this toolbox, the factors corresponding to technological change, as well as the scale and the input-and output-mix effects vary as result of differences in the underlying theoretical models. As remarked by O'Donnell (2012) , there is a potentially infinite number of exhaustive decompositions of a TFP index. For example, one family of these decompositions, identified by Dakpo et al. (2018) and O'Donnell (2011O'Donnell ( , 2012O'Donnell ( , 2018 , defines technological change globally as the change in maximum productivity between two time periods, while in this toolbox a local measure of technological change is considered. Consequently, the rest of factors must accommodate the numerical difference between the two. However, all these decompositions can be rightly interpreted, while complementing each other. Hence, researchers can rely on one or another depending on their preferred definitions, and compare them using all the available packages. Additionally, as a novelty of this toolbox, we offer the possibility of decomposing share-weighted geometric indices, which is unavailable in previous packages. Finally, a comprehensive review of the available general purpose and dedicated software options for efficiency and productivity analysis can be found in Daraio et al. (2019) .
Although these packages implement the main TFP indices, there is a lack of a full set of functions for MATLAB , and none of them includes a complete decomposition of productivity change according to its multiple components. Thus, besides implementing the basic TFP definitions based on quantities and prices in the MATLAB environment, our toolbox calculates a large array of index numbers capturing (technical) efficiency change, technological change, as well as scale effects, and input and output mix effects. Quantities-only Malmquist and Moorsteen-Bjurek indices, as well as price-based Fisher and Törnqvist indices, are decomposed into mutually exclusive factors with meaningful interpretations in terms of economic index number theory. The Total Factor Productivity Toolbox introduces a set of functions, covering a wide range of TFP indices, and reports numerical results using a common example to ease comparability and to illustrate their use. The toolbox is available as free software, under the GNU Gen-eral Public License version 3, and can be downloaded from http: //www.tfptoolbox.com , with all the supplementary material (data, examples, and source code) to replicate all the results presented in this paper. The toolbox is compatible with the 2017b (8.5) version of MATLAB , ( The MathWorks, Inc., 2017 ). The U.S. agricultural data employed in this article, as well the code used to calculate the various productivity indices are available as supplementary on-line material. The toolbox is also hosted on an open source repository on GitHub. 1 This paper is organized as follows. Section 2 presents the data structures characterizing the production possibility sets, the structure of the functions, results, and the U.S. agricultural data that is used as real-case study and to illustrate the toolbox. Section 3 covers the Malmquist productivity indices, relying on radial output or input distance functions. We show how these indices can be decomposed into factors with meaningful interpretations, such as technical efficiency change, technological change, scale effect, and input and output mix effect. We also relate and interpret these factors in terms of the output code that is obtained when running the specific functions. Malmquist productivity indices take into consideration only one of the two sides of the production process, output or input. In Section 4 we consider the class of Moorsteen-Bjurek indices, defined as ratio of an output quantity index to an input quantity index, which in turn can be expressed in terms of output and input distance functions, respectively. We present the decomposition of these indices using various alternatives to identify the above effects. If input and output prices are available, it is possible to calculate and decompose classical indices such as Fisher and Törnqvist. The price-weighted productivity indices are considered in Section 5 . Rather than multiplying input and output quantities by their prices, one can aggregate individual quantity ratios by a geometric mean, weighing with cost or revenue shares. Section 6 deals with the class of share-weighted indices, whose best known representative is the Törnqvist productivity index. Advanced options, including displaying and exporting results are discussed in Section 2.2 . Section 8 concludes.

Data structures
Total Factor Productivity change measures the temporal variation in the productive performance of a DMU (decision making unit such as a plant, firm, industry, or economy) over time. In time period t = 0 , 1 , . . . , T each DMU transforms a vector of inputs x kt ∈ R N ++ into a vector of outputs y kt ∈ R M All estimation functions return a structure tfpout that contains fields with the estimation results as well as the input of the estimation function. Fields can be accessed directly using the dot notation, and the whole structure can be used as an input to other functions that print or export results (e.g., tfpdisp ). Some of the fields of the tfpout structure are the following: 2 • X and Y : Contain the input and output quantities, respectively. • W and P : Contain the input and output prices, respectively. For example, the default dispstr after using the deatfpgprod function with the geometric mean option is names/tfp.GProd/tfp.EC/tfp.TC/tfp.IME/tfp.SEC .
The fields displayed in the output table must be separated by a / and include the names corresponding to the field names of the tfpout structure. The available fields are those presented in Table 1 .

Exporting results
Results can easily be exported to various file formats for posterior analysis and sharing. First, the tfpout structure should be converted to a MATLAB

Agricultural productivity in the US
To illustrate the toolbox we compare the various productivity trends in US agriculture obtained with the existing TFP def- Table 1 Fields of the tfpout structure available for the dispstr string.  tfp.IME_100 initions. The data has been collected and tabulated by the Economic Research Section (ERS) of the United States Department of Agriculture (USDA) in an effort to study long term productivity trends. It corresponds to a subset of the state-level tables including price indices and implicit quantities of farm outputs and inputs. It is readily available in MATLAB format as supplemental on-line material to this article. We calculate productivity growth between 1960 and 2004, corresponding to the first and last available years in the dataset. The full data, corresponding to Table #23, can be downloaded from https://www.ers.usda.gov/data-products/ agricultural-productivity-in-the-us/ . Our dataset consists of three outputs (livestock, crops, and other farm related output), and four inputs (capital, land, labor, and intermediate inputs). More information on the data, including descriptive statistics and a discussion of the agricultural productivity growth in the US can be found in the above link. Due to its comprehensiveness and reliability, this dataset has been used over the years by many authors, whose results are complemented with those obtained by solving the models included in this toolbox; see, to cite but a few, Färe et al. (2008) , Ball et al. (2001) , and Zofío and Lovell (2001) . Once downloaded, data on input and output quantities and prices can be brought into the workspace by running the following code included in the example file:

Malmquist productivity index
When only quantities are available the most popular measurement instrument is the Malmquist productivity index (MPI). The MPI requires calculation of the output-or input-orientated distance of observation ( x kt , y kt ) in two consecutive periods (say, base period t = 0 and comparison period t = 1 ) to the frontier of a certain benchmark technology exhibiting CRS. The imposition of CRS is necessary for the distance function to satisfy the homogeneity conditions that ensure that the MPI has the required proportionality properties, see Balk and Zofío (2018) . Thus in general the benchmark technology will differ from any actual technology.
It is usual to take the cone technology of a certain period t , Š t , as benchmark. Its output distance function is defined by Ď t Operationally, within the DEA framework, this can be calculated by solving the program Ď t y)) is the point on the frontier of the period t cone technology that is obtained by holding the input quantity vector x constant while radially expanding the output quantity vector y . The input distance function is defined as Ď t i (x , y) ≡ sup { δ | δ > 0 , (x /δ, y) ∈ Š t } , and can be calculated by solving the pro- is the point on the frontier of the period t cone technology that is obtained by holding the output quantity vector y constant while radially contracting the input quantity vector x . Notice that Ď t The counterparts of the above output and input distance functions, defined on the actual technology S t which in general exhibits VRS, denoted by D t o (x , y) and D t i (x , y) , are computed in the same way, with eλ = 1 as additional constraint.

The output-orientated MPI
The output-orientated MPI, conditional on the period t cone technology, for a certain DMU, is defined by 4 Selecting the base period cone technology then leads to M 0 o (x 1 , y 1 , x 0 , y 0 ) , and selecting the comparison period cone technology leads to M 1 o (x 1 , y 1 , x 0 , y 0 ) . The TFP toolbox calculates both, as well as their geometric mean. Let us start with the first option.

The base-period-output-orientated MPI
Following Balk and Zofío (2018) , who provide meaningful theoretical interpretations for the different factors, the first extended decomposition of the base-period-output-orientated MPI (termed In this expression there are four mutually independent factors, with the following interpretation: representing the change in the technical efficiency of the DMU, also known as the catch-up effect . • Technological change: capturing the change in the actual technological frontier, also known as the frontier-shift effect . • Radial scale and input mix effect: to scale efficiency improvements associated to radial increases in the input quantities, and the additional effect from changes in the input quantity mix. 5 • Output mix effect: showing the counterpart effect associated to changes in the output quantity mix. Factors with values greater than 1 contribute to productivity growth (e.g., through technical efficiency gains or technological progress), while factors whose values are smaller than 1 are detrimental. 5 Balk and Zofío (2018) show that SEC 0 o (x 1 , x 0 , y 0 ) can be decomposed into a radial scale effect and an input quantity mix effect. The implementation in a DEA framework is however too complicated to be practically useful.
An alternative decomposition of the base-period-output-orientated MPI reverses the order in which changes in the input and output space take place in the last two factors in expression (2) . This yields M 0 which is termed 'Path B'. The differences between this decomposition and that in expression (2) are subtle but noteworthy. The parts capturing efficiency change and technological change are identical. The factor capturing the radial scale effect and the input mix effect is conditional on y 0 in expression (2) , but on y 1 in expression (3) . The reverse happens with the output mix effect; this effect is conditional on x 1 in expression (2) , but on x 0 in expression (3) . Consequently, there are two, equally meaningful, decompositions of the Malmquist productivity index M 0 o (x 1 , y 1 , x 0 , y 0 ) . If there is no preference for one of them then the geometric mean of expressions (2) and (3) can be taken, reading To compute the MPI in MATLAB the user calls the deatfpm(X, Y, ...) function with i) the orient parameter set to the output orientation oo ; 2) the period parameter set to base ; and iii) the decomposition parameter decomp set to complete . With the optional parameter names we can specify a cell string with the names of the DMUs, which in this case are the names of the U.S. states. 6 The model parameters and all the ancillary information can be found in the editor under tfpm_oo_base_complete . The results are displayed in the 'command window', and the structure with all the variables and results can be found in the workspace by clicking on tfpm_oo_base_complete.tfp . Fig. 1 shows the structures for the model parameters and results in the MATLAB editor. The output of the function successively shows the MPI ( M ), the technical efficiency change factor ( EC ), the technological change factor ( TC ), the geometric mean of the scale effect factors ( SEC ), the geometric mean of the output mix effect factors ( OME ), and the separate values of the two last factors corresponding to 'Path A' and 'Path B' above. These values are identified by the time superscripts of the input and output arguments in each factor; for example, SEC_100 corresponds to SEC 0 (2) and (4) above. The function also returns the main descriptive statistics of the various factors. Finally, a relevant clarification concerns the nature of returns to scale. The Malmquist index is defined with respect to a cone technology, independently of whether the actual technology is characterized by global CRS or not. Thus the default decomposition is based on VRS technologies, which is identified by the corresponding returns-to-scale header below; i.e., "Returns to scale: vrs (Variable) ". The case of a technology characterized by global CRS is considered in the following subsection.
We exemplify individual results by discussing the productivity trends of the first and last three states. Most states increased their agricultural productivity, Illinois (IL) leading with a 592.00% increase, corresponding to the percentage change of the maximum MPI value. Exceptions exhibiting productivity decline are Oklahoma (OK), unreported, and West Virgina (WV), with a 18.23% reduction (the minimum value of the MPI). The main component of productivity growth, on average 215.99%, is technological progress, which generally outpaces efficiency gains by far (on average they contribute 199.44% and 11.01%, respectively). Changes in the scale of production or the output mix do not play a significant role, implying that there have no been relevant changes in the production scale and the input and output structure of the states over the forty-four year period 1960-2004. Similar comments can be made for each individual state. 7

Related decompositions of the base-period-output-orientated MPI
The above four-factor decompositions can be related to simpler proposals in the literature. First, they generalize the earlier proposal by Ray and Desli (1997) . By merging the scale effect and the output mix effect in expressions (2) or (3) , both decompositions reduce to the same three-factor decomposition: where Lovell (2003) suggests that this factor measures the contribution of returns-to-scale to productivity change. Following this interpretation, if it is greater than 1, increasing returns to scale result in productivity growth, while if it is smaller than 1, decreasing returns to scale are detrimental. If equal to one, constant returns to scale prevails at the observed input-output values.
To obtain this simpler decomposition, the syntax is the same as in the extended one but with the parameter decomp set to rts . In the output table the column RTS contains the contribution of returns-to-scale to productivity change.
Notice that the contribution of returns-to-scale to productivity growth, consistent with small values of the scale and output mix effects, is quite limited. In the case of West Virginia (WV) the existence of decreasing returns to scale, along with the subsidence of the production frontier, associated to values of technological change smaller than one, results in the previously reported productivity decline to the tune of −18 . 23% .
Second, the technical efficiency change and technological change terms common to expressions (2), (3) , and (4) correspond to the base-period-output-orientated index proposed by Caves et al. (1982) Substituting expression (6) in any of the above decompositions yields alternative expressions, To obtain this decomposition the parameter decomp must be set to ccd . 7 Descriptive statistics, productivity distributions, and associated graphs, can be easily calculated and plotted in MATLAB .
In the extant literature on productivity measurement, the CCD index frequently figures under the name 'the (output orientated) Malmquist productivity index'. However, the CCD index, expression (6) , cannot be regarded as a productivity index. First, it cannot be written as a measure of aggregate output growth divided by a measure of aggregate input growth, the corresponding aggregators being non-negative non-decreasing linearly homogenous scalar functions. O'Donnell (2012) uses the term "multiplicatively complete" to refer to TFP indexes constructed in this way. More importantly, the CCD index does not satisfy the proportionality property, as initially shown by Grifell-Tatjé and Lovell (1995) , unless the actual base period technology exhibits CRS. In the specific case of a technology satisfying CRS, all the SEC and OME terms in expressions (7), (8) , and (9) become identical to 1, and the MPI M 0 The MPI is then decomposed as initially proposed by The two-factor decomposition of the MPI for a CRS technology can be calculated by setting the parameter decomp to crs , 8 The toolbox offers the possibility of testing the hypothesis that a certain technology is characterized by VRS or CRS. The function implements the algorithms developed by Simar and Wilson (2002) , following Bogetoft and Otto (2011) , to determine whether the VRS and CRS distance function values are significantly different or not. The test can be performed by calling the deatestrtsm(X, Y, ...) function, with the orient parameter set to the same orientation as the MPI.
Notice that, as the MPI does not depend on an actual technology, its outcomes are identical to those in the previous tables. The decompositions in EC and TC factors, however, are different. The differences are not large, as for most of the DMUs the RTS factor in expression (5) resides in the neighbourhood of 1. We again see the prominent role played by technological change in most of the American states, except Oklahoma (OK), not reported in the table above, and West Virginia (WV).

The comparison-period-output-orientated MPI
As anticipated, we can also select the comparison period technology, Š 1 , as benchmark. The comparison-period-output-orientated MPI is then given by M 1 o (x 1 , y 1 , x 0 , y 0 ) . The complete four-factor decomposition analogous to expressions (2) and (3) are, respectively, and M 1 which are identified by 'Path C' and 'Path D' in Balk and Zofío (2018) . Again, if there is no preference for either of the two then it is advised to take the geometric mean of these two decompositions, The comparison-period-output-orientated CCD index is defined by The complete decomposition of the comparison-period-output-orientated MPI is calculated by calling the deatfpm(X, Y, ...) function, replacing base by comparison in the period parameter, and leaving the other parameters unchanged: The output has the same structure as the base period analogue and is therefore not reported here. Also, obtaining the related decompositions is done by setting the parameter decomp to either rts , ccd , or crs .

The geometric-mean-output-orientated MPI
In general the output-orientated MPIs with base and comparison period benchmarks will deliver different results. A compromise between the two MPIs is their geometric mean, This index can be decomposed into the following factors, This decomposition is computed by setting the period parameter to geomean in the deatfpm(X, Y, ...) function.
The corresponding individual terms associated to 'Path A', 'Path B', 'Path C', or 'Path D' can be recovered by running the previous base or comparison options. It is then possible to confirm that the two decompositions are indeed different and their geometric mean is a useful compromise. Also, the related decompositions are obtained by setting the parameter decomp to rts , ccd , or crs .
The importance of the benchmark period is clear in the case of US agriculture. The geometric mean of the base and comparison period benchmarks delivers a 54.94% increase in productivity over the 44 year period. This is substantially lower than the previously reported result from the base period benchmark, 215.99%, suggesting that the comparison period benchmark delivers a rather limited productivity change. Since the EC factor is common to both decompositions (11.01%), the reduction in productivity growth corresponds to lower TC to the tune of 33.52%. Finally, regardless of the choice of the benchmark technology, input and output scale and mix effects barely contribute to productivity growth, their magnitudes being close to one.

The input-orientated MPI
Recall that a cone technology exhibits CRS, Thus the output-orientated MPI defined by expression (1) can also be written as that is, as input-orientated MPI, conditional on the period t cone technology, Š t . The options for decomposing an input-orientated MPI run parallel to those already presented for the output-orientated counterpart, and therefore it is sufficient to describe the changes that need to be introduced in the MATLAB functions to obtain the different decompositions. For formal definitions and a graphical representation of the input decompositions see Balk and Zofío (2018 , Section 4). In general, all that is required is to substitute io for oo in the orient parameter in the MPI function.
The base-period-input-orientated MPI, defined as M 0 i (x 1 , y 1 , x 0 , y 0 ) , is calculated and decomposed by running the following code: Following procedures identical to those for the output-orientated MPI the related decompositions can be obtained. Merging the input mix and scale effects into the returns-to-scale factor yields the three-factor decomposition counterpart to expression (5) . To obtain this simpler decomposition the parameter decomp must be changed to rts . (6) , one can merge the technical efficiency change and technological change factors into the base-period-inputorientated version of CCD index. For this decomposition the parameter decomp must be changed from complete to ccd .

As in expression
The CRS version can be calculated by setting the parameter decomp to crs .
Choosing the cone technology Š 1 as benchmark leads to the comparison-period-input-orientated MPI, M 1 i (x 1 , y 1 , x 0 , y 0 ) . The complete decomposition requires changing base to comparison in the period parameter of the deatfpm(X, Y, ...) function.
Finally, the geometric mean of the base and comparison period perspectives, , as in the case of the output-orientated MPI (expression (15) ) is obtained by setting the period parameter to geomean .
Related decompositions for the comparison-period-and geometric-mean-input-orientated MPI are obtained by setting the parameter decomp to either rts , ccd , or crs . For each variant, the output of the function is interpreted in the same way as their base-period counterparts.

Moorsteen-Bjurek productivity index
The second definition of a productivity index based only on quantities takes into account both the output and input orientations. Specifically, the family of Moorsteen-Bjurek productivity indices (MBPI) is defined as the ratio of a Malmquist output quantity index to a Malmquist input quantity index, conditional on a benchmark technology S t . A Malmquist output quantity index, comparing output quantities y 1 to y 0 , conditional on certain input quantities x , is defined as Q t y 0 ) . Similarly, a Malmquist input quantity index, comparing input quantities x 1 to x 0 , conditional on certain output quantities ȳ , is defined as Q t Both indices can be traced back to suggestions by Moorsteen (1961) , and their properties were extensively discussed in Balk (1998) . Typically, in empirical applications involving many DMUs, x and ȳ would be chosen as vectors of sample means. This is the approach followed in the toolbox. The MBPI is then defined by The last term shows that the MBPI can be seen as a ratio of two productivity levels. Up to a scalar normalization, and conditional on x and ȳ , the productivity level at the input-output situation ( x , y ) is thereby measured as D t where superscript t refers to the benchmark technology S t . Thus, the MBPI belongs to the class of "multiplicatively complete" TFP indices, as defined by O'Donnell (2012) . Based on the properties of the MBPI, Balk and Zofío (2018) show that this index can be decomposed into factors corresponding to those already shown and interpreted. Decompositions can be based on output distance functions or input distance functions. As benchmark the technologies of the base period 0 and comparison 1 are used. We follow the order in which the MPI decompositions were discussed.

The base-period MBPI
Taking as benchmark the base-period technology, along 'Path A' the MBPI is decomposed as The factors on the right-hand side of the equality sign represent, respectively, technical efficiency change, technological change (conditional on ( x 1 , y 1 )), the radial scale and input mix effect (conditional on y 0 ), and the output mix effect (conditional on x 1 ).
With the same benchmark technology, along 'Path B' the following decomposition is obtained: The first two terms are the same but the radial scale and input mix effect are now conditional on y 1 , while the output mix effect is conditional on x 0 . The geometric mean of the two decompositions is computed by the deatfpmb(X, Y, ...) function with: (i) the orient parameter set to output orientation oo ; (ii) the period parameter set to base ; and (iii) the decomposition parameter decomp set to complete . In this function, the (arithmetic) averages of the input quantities x and output quantities ȳ in the two periods 0 and 1 are automatically calculated from the arrays of X and Y .
The output of the function successively shows the MBPI ( MB ), EC , TC , and the geometric mean of the SEC and OME factors. The last four columns show the separate values of the SEC and OME factors corresponding to 'Path A' and 'Path B', identified by the time superscripts of each of their input and output arguments. For instance, SEC_100 identifies the third factor in expression (19) , counterpart Regarding US agricultural productivity growth, it is interesting to compare the results corresponding to the Malmquist and Moorsteen-Bjurek definitions. The difference between the two is rather small, as average growth for the latter amounts to 229.52% versus 215.99% for the former. Since the efficiency change and technological change factors are common to both indices, the difference between the two must be due to the scale and output mix effects. While these effects were clearly negative in the case of the MPI, with values well below unity, here we observe that they positively contribute to productivity growth. Nevertheless we conclude that, for the US agriculture, the choice of the Malmquist or Moorsteen-Bjurek definition does not result in relevant differences at the aggregate or individual level. Indeed, the Spearman rank correlation between the two indices is statistically significant at the 1% confidence level with a value of ρ = 0 . 8629 .

Related decompositions
The MBPIs in expressions (19) and (20) are related to the CCD index, defined in expression (6) . Thus the two expressions can be rewritten as The geometric mean decomposition is obtained by running the following code: If the base period technology exhibits CRS then the corresponding MBPI and its decomposition is obtained by setting the parameter decomp to crs :

The comparison-period MBPI
Similar decompositions of the MBPI can be obtained if the comparison period technology is selected as benchmark, y 1 , x 0 , y 0 ;x , ȳ ) . The geometric mean of the decompositions along 'Paths C and D' is obtained by setting the period parameter to comparison . Related decompositions are obtained by setting the decomp parameter to ccd or crs .

The geometric-mean MBPI
The base-and comparison-period MBPIs yield different results unless the benchmark technologies satisfy extremely restrictive conditions. As we have seen, there are four decompositions of the productivity index MB t (x 1 , y 1 , x 0 , y 0 ;x , ȳ ) from an output orientation. Expressions (19) and (20) provide two decompositions of MB 0 (x 1 , y 1 , x 0 , y 0 ;x , ȳ ) . There are two similar decompositions of MB 1 (x 1 , y 1 , x 0 , y 0 ;x , ȳ ) . By taking their geometric mean, we obtain a decomposition of the geometric mean index MB (x 1 , y 1 , Four th terms paths AC , OME Four th terms paths BD , OME . The first row of expression (23) delivers the efficiency change and technological change effects, respectively. The second and third rows together correspond to the radial scale plus input mix effect (gathering the third factors in the component decompositions), and the fourth and fifth rows together measure the output mix effect (gathering the fourth factors in the component decompositions). This decomposition is obtained by setting in the deatfpmb(X, Y, ...) function the period parameter to geomean .
The separate TC , SEC , and OME components can be recovered from the previous decompositions under the base or comparison specification. The related decompositions can be obtained by changing the argument complete in the decomp parameter to ccd or crs .
Again, for US agriculture, it is possible to compare productivity trends delivered by the geometric-mean-output-orientated Moorsteen-Bjurek index and its Malmquist counterpart. As with the base-period perspective comparison, both sets of results are highly correlated with a Spearman coefficient equal to ρ = 0 . 8763 , statistically significant at the 1% level. Indeed, average productivity growth according to the Moorsteen-Bjurek index amounts 47.96%, slightly smaller than the Malmquist index at 54.97%. Also the maximum and minimum values are very similar, corresponding in both cases to Delaware (DE) and Oklahoma (OK), respectively. The only difference between the two indices is in the scale and output mix effects, but these differences are rather small.

The input-orientated decomposition of MBPI
The counterpart decomposition of the MBPI -see expression (19) -taking the technology of the base period as benchmark, along 'Path E' is given by The factors on the right-hand side of the equality sign represent, respectively, technical efficiency change, technological change (conditional on ( x 1 , y 1 )), the input mix effect (conditional on y 0 ), and the radial scale plus output mix effect (conditional on x 1 ). The decomposition along 'Path F' is the same except that the input mix effect is conditional on y 1 and the radial scale plus output mix effect is conditional on x 0 . The geometric mean of the two decompositions is calculated setting the orient parameter to io , and the period parameter to base in the deatfpmb(X, Y, ...) function.
Related decompositions are obtained by changing the decomp parameter to ccd or crs .
Alternatively, the input-orientated decompositions of the comparison-period MBPI along 'Path G' and 'Path H', and their geometric mean, are obtained by setting the period parameter to comparison . Changing the decomp parameter to ccd or crs yields the related decompositions.
All in all, there are four decompositions of the productivity index MB t (x 1 , y 1 , x 0 , y 0 ;x , ȳ ) from the input orientation: two from the base period perspective, and two from the comparison period perspective. By taking the geometric mean of these four decompositions we obtain a decomposition of the geometric mean index MB(x 1 , y 1 , x 0 , y 0 x 1 , y 1 , x 0 , y 0 ;x , ȳ )] 1 / 2 . This corresponds to the decomposition in expression (23) , but is based on input distance functions instead of output distance functions.
The geometric mean decomposition is obtained by setting the period parameter to geomean .

Price-weighted productivity indices (Fisher)
The generic definition of a price-weighted productivity index comparing ( x 1 , y 1 ) to ( x 0 , y 0 ), conditional on input prices w t and output prices p t , is Defined as a ratio of Lowe quantity indices of output and input, respectively, it can also be seen as a ratio of the productivity levels of period 1 and period 0, respectively. Period t serves as benchmark period, determining the prices with which quantities are multiplied. For t = 0 we obtain the Laspeyres productivity index, and for t = 1 the Paasche productivity index. The geometric mean of these two is the Fisher productivity index.
To calculate these indices, users must provide not only input and output quantities ( X and Y matrices) but also corresponding prices ( W and P matrices), which are three-dimensional arrays, with the third dimension corresponding to the time periods t = 0 , 1 , . . . , T . Note that in principle, the price-weighted indices could be calculated using any set of prices provided by the user. However, if prices are chosen as above, the toolbox calculates the Laspeyres, Paasche, and Fisher indices by default.

The Laspeyres PI
Following 'Path A', the decomposition of the Laspeyres PI is At the right-hand side, the first factor measures technical efficiency change, and the second factor measures technological change. Balk and Zofío (2018) show that the third factor corresponds to the scale (including input mix) effect, while the fourth factor corresponds to the output mix effect. Following the alternative 'Path B' delivers which differs from expression (26) in the conditioning variables of the scale and output mix effects. Taking the geometric mean of these two expressions yields This decomposition is computed by the deatfprod(X, Y, ...) function. The orient parameter should be set to the output orientation oo , and the period parameter to base . The decomposition parameter decomp should be complete .
Again, the output successively shows the Laspeyres PI ( PROD ), EC , TC , the scale (plus input mix) factor SEC , and the output mix factor OME . The last four columns show the separate values of the scale and output mix effects along 'Path A' and 'Path B'. They are identified by the time superscripts of the input and output arguments; for instance, SEC_100 identifies the third right-hand side factor in expression (26) , which is the counterpart to SEC 0 (3) . In particular cases distance functions may be infeasible, and therefore some of the factors in the above decompositions cannot be calculated. This is why the toolbox directly computes the Laspeyres PI by using expression (25) .
For the US agriculture, we can compare the results of the standard Laspeyres PI with the previous quantities-only counterparts. Using prices to aggregate inputs and outputs results in productivity growth magnitudes which are much smaller. Over the whole period Laspeyres productivity ( PROD ) increased by 20.31% on average, which is a fraction (about ten percent) of the 215.99% and 229.52% increase of the base-period Malmquist and Moorsteen-Bjurek productivity indices. Since the EC and TC factors are the same, we see that the SEC and OME effects are responsible for the reduction in productivity. Indeed, these terms are responsible for a negative contribution to the tune of −45 . 55% and −22 . 06% , respectively, over forty-four years. It is then clear that the overall picture greatly changes when moving from quantities-only to price-based productivity indices, and different interpretations and policy conclusions could be drawn depending on the specific definitions. There are also some changes at the individual level. Oregon (OR) now leads the productivity growth with a 75.75% increase, and Oklahoma (OK) resides at the bottom with a 32.88% decrease. However, the rankings are relatively similar, exhibiting positive correlation. The Spearman correlation between the Laspeyres and Malmquist indices is ρ = 0 . 3088 , while that between the Laspeyres and Moorsteen-Bjurek indices is ρ = 0 . 3067 , both statistically significant at the usual 5% level.

Related decompositions of the Laspeyres PI
Using expression (6) , expression (26) can be simplified to The same holds for expressions (27) and (28) . To obtain such a decomposition, the decomp parameter must be set to ccd , combined with the base, comparison, or geometric mean perspective. For the Laspeyres PI we would run If the base period technology is characterized by CRS, then the corresponding decomposition is obtained by setting the parameter decomp to crs .

The Paasche PI
Following "Path C" and 'Path D' and avaraging the resulting decompositions, the decomposition of the Paasche Pi (that is, expression (25) with t = 1 ) reads P ROD (x 1 , y 1 , x 0 , y 0 ; w 1 , p 1 To obtain this decomposition the period parameter must be set to comparison in the deatfprod(X, Y, ...) function. For specific DMUs some factors may be infeasible to compute. The toolbox, however, computes the Paasche PI directly rather than as a product of the components.
The first two factors at the right-hand side of the equality sign in expression (30) correspond to the CCD index M 1 o (x 1 , y 1 , x 0 , y 0 ) , defined by expression (14) . The related decomposition is obtained by changing the decomp parameter from complete to ccd . Also, if CRS holds for the comparison period technology, then the decomposition is obtained by setting the decomp parameter to crs .

The Fisher PI
The Fisher PI is the geometric mean of the Laspeyres and Paasche PIs. By merging expressions (28) and (30) its decomposition is where Q F ( p 1 , y 1 , p 0 , y 0 ) ≡ [( p 0 · y 1 / p 0 · y 0 )( p 1 · y 1 / p 1 · y 0 )] 1/2 is the Fisher output quantity index, and Q F ( w 1 , x 1 , w 0 , This decomposition is obtained by calling the deatfprod(X, Y, ...) function with the period parameter set to geomean , the orientation parameter set to oo , and the decomp parameter to complete .
The output of the function successively shows the Fisher PI ( PROD ), EC , the geometric mean of the TC factors, the geometric mean of the four separate SEC effects as presented in the third row of expression (31) , and the geometric mean of the four separate OME effects as presented in the fourth row of the same expression. Each of these four separate terms in SEC or OME corresponds to a specific path, A, B, C or D, which can be recovered by running the base and comparison period decompositions as explained above. To ensure that the Fisher index is reported, regardless of the fact that some distance function values may be infeasible to compute, the toolbox calculates the geometric mean of the Laspeyres and Paasche PIs.
It is interesting to compare the results obtained for the Fisher index to those of the geometric-mean Malmquist and Moorsteen-Bjurek productivity indices (MPI and MBPI). While the differences between the last two indices were minimal, with high correlation coefficients, this is not the case when compared to the Fisher index. However, the values in this case are in the same order of magnitude, as opposed to the base period counterparts where the spread between the indices was rather large. Average US agricultural productivity change measured by the Fisher index is 39.51%, lower than the MPI and MBPI indices, 54.94% and 47.96%, respectively. The range of productivity change is also rather similar, with the best and worst performing states being Rhode Island (RI) and Oklahoma (OK), the latter state consistently ranking last as in the previous two cases. The dissimilarity across rankings for the geometric-mean indices is captured by the Spearman coefficients: ρ = 0 . 5199 for Fisher-MPI, and ρ = 0 . 5680 for Fisher-MBPI, both statistically significant at the 1% confidence level. These values are notably smaller than the coefficient for MPI-MBPI, ρ = 0 . 8763 . Notice that, as in the base-period case, SEC and OME are responsible for the diffferences between the indices.
Finally, expression (31) with the EC and TC factors replaced by the geometric-mean CCD index is obtained by changing the decomp parameter to ccd . Under CRS the decomposition is obtained by setting the parameter to crs .

The input-orientated decomposition
The Laspeyres, Paasche, and Fisher productivity indices can also be decomposed from the input-orientated perspective. Along 'Path E' and 'Path F' one obtains decompositions for the Laspeyres PI similar to expressions (26) and (27) , except that input distance functions are used instead of output distance functions. Their geometric mean, corresponding to expression (28) , is The first two factors capture technical efficiency change and technological change, respectively. The third and fourth factors measure the input-mix effect and the scale (including output-mix) effect, respectively. This decomposition is obtained through the deatfprod(X, Y, ...) function by setting the orient parameter to the input orientation io , setting the period parameter to base , and setting the decomposition parameter decomp to complete .
To see the CCD index M 0 i (x 1 , y 1 , x 0 , y 0 ) as part of the decomposition, the decomp parameter must be set to ccd . If CRS holds for the base period technology (determined after running the function deatestrtsm(X, Y, ...) with the orient parameter set to io ), then setting the decomp parameter to crs provides the correct decomposition. In this case, the third factor measures the input-mix effect, but the fourth factor measures only the output-mix effect, since under CRS the scale effect vanishes.
By combining the results of following 'Path G' and 'Path H' one obtains the input-orientated decomposition of the Paasche PI. In the deatfprod(X, Y, ...) function the period parameter must then be set to comparison . The related decompositions require setting the decomp parameter to ccd or crs .
Finally, by taking the geometric mean of the decomposition in expression (32) and its comparison period counterpart, one obtains the input-distance-function based decomposition of the Fisher PI. This is calculated by setting the period parameter set to geomean , the orientation parameter to io , and the decomp parameter to complete .
The decomposition featuring the geometric-mean CCD index is obtained by setting the decomp parameter to ccd . The CRS specification, valid if both technologies indeed exhibit CRS, is obtained by setting this parameter to crs .

Share-weighted productivity indices (Törnqvist)
A share-weighted productivity index is defined as a weighted geometric mean of output quantity changes divided by a weighted geometric mean of input quantity changes. In particular, individual input quantity changes, x 1 n /x 0 n , and output quantity changes, y 1 m /y 0 m , are weighted by their respective cost and revenue shares in a certain period t . The generic definition is where s t n ≡ w t n x t n /w t · x t > 0 (n = 1 , . . . , N) , N n =1 s t n = 1 , u t m ≡ p t m y t m /p t · y t > 0 (m = 1 , . . . , M) , M m =1 u t m = 1 . For t = 0 the index is called GeoLaspeyres, and for t = 1 the index is called GeoPaasche. The geometric mean of the two indices is the Törnqvist productivity index.
The first two factors after the equality sign are the same as the first two factors of the Laspeyres PI in expression (26) ; that is, technical efficiency change and technological change, respectively. The third factor captures the scale (including input-mix) effect, and the fourth factor measures the output-mix effect.
Following the alternative 'Path B' delivers which differs from expression (34) in the scale and the output-mix effect. Taking the geometric average of these two expressions yields The function to calculate these decompositions is deatfgprod(X, Y, ...) , with the orient parameter set to output orientation oo and the period parameter to base . The decomposition parameter decomp must be set to complete .
The output of the function successively shows the GeoLaspeyres productivity index ( GPROD ), EC , TC , and the geometric means and component values of the SEC and OME effects corresponding to 'Path A' and 'Path B'. The components are identified by the time superscripts of the input and output arguments in each factor; for instance, SEC_100 represents the third factor in expression (34) , counterpart to SEC 0 o (x 1 , x 0 , y 0 ) in expression (2) . As calculation of some distance function values may be infeasible, the toolbox calculates the GeoLaspeyres PI directly from the data.
The share-weighted results for US agriculture are similar to their price-weighted counterparts. For instance, the GeoLaspeyres and the Laspeyres deliver on average 24.25% and 20.31%, respectively. Also the maximum and minimum values are very similar, but on this occasion corresponding to Georgia (GA) and Florida (FL). This means that the individual rankings differ substantially between the two indices. Thus, using prices or value shares to aggregate outputs and inputs makes a difference. Indeed, the Spearman correlation between the two is ρ = 0 . 5404 , the lowest across all possible pairs so far, but still significant at the 5% level. However, the correlations between the GeoLaspeyres and the Malmquist (MPI) or the Moorsteen-Bjurek (MBPI) indices are even lower: ρ = 0 . 078 and ρ = 0 . 2224 , respectively, and not statistically significant at the standard confidence levels. Since the technical efficiency change ( EC ) and technological change ( TC ) factors are the same, this shows the importance of the scale and output-mix effects. All this calls for caution on the part of practitioners when choosing a specific definition of productivity change. If prices are available and reliable, then using a popular definition may yield results which differ substantially from analogous quantities-only definitions.

Related decompositions of the GeoLaspeyres PI
Recalling expression (6) , expression (34) can be written as Similar simplifications hold for expressions (35) and (36) . To obtain these decompositions, the decomp parameter must be set to ccd .
If the base period technology exhibits CRS then one must set the parameter decomp to crs .

The GeoPaasche PI
Along 'Path C' and 'Path D' we obtain decompositions of the GeoPaasche PI. Their geometric mean, the analogue to expression (36) , reads To obtain this decompositions, in the deatfgprod(X, Y, ...) function the period parameter must be set to comparison .
The first two factors can be aggregated to the CCD index M 1 o (x 1 , y 1 , x 0 , y 0 ) , as defined by expression (14) where the Törnqvist output and input quantity indices are defined by Q T (p 1 , y 1 , p 0 , n /x 0 n ) (s 0 n + s 1 n ) / 2 , respectively. To calculate this decomposition, in the deatfgprod(X, Y, ...) function the user sets period to geomean , orientation to oo , and decomp to complete .
The output of the function successively shows the Törnqvist productivity index ( GPROD ), the technical efficiency change factor ( EC ), the geometric mean of the two technological change factors ( TC ), the geometric mean of the four component scale efficiency effects in the third row of expression (39) ( SEC ), and the geometric mean of the four component output-mix effects in the fourth row of the same expression ( OME ). Each of these four components corresponds to a specific path A, B, C, or D, and can be recovered by running the previous base and comparison period decompositions. The toolbox calculates the Törnqvist index as the geometric mean of the GeoLaspeyres and GeoPaasche indices. Some factors might be missing because of infeasible distance function values.
The three-factor decomposition exhibiting the geometric-mean CCD index is obtained by changing the decomp parameter to ccd . If technologies of base and comparison periods exhibit CRS, then the decomposition is calculated by setting the same parameter to crs .
We can now compare the results of the Törnqvist productivity index to those of the geometric-mean output-orientated Malmquist (MPI), Moorsteen-Bjurek (MBPI), and Fisher indices. Average productivity growth in US agriculture, 37.68%, is the lowest compared to the previous three: 54.94% (MPI), 47.96% (MBPI), and 39.51% (Fisher). All these indices share the same efficiency change ( EC ) and technological change ( TC ) factors, and therefore the differences come from the scale and output-mix effects ( SEC ) and ( OME ). In this case the states exhibiting the maximum and minimum values coincide with those observed for the Fisher index, that is, Rhode Island (RI) and Oklahoma (OK). As for the rankings, the Spearman correlation coefficient between the two is almost one, ρ = 0 . 9948 , and statistically significant. Those for Törnqvist-MPI and Törnqvist-MBPI are lower: ρ = 0 . 4822 and ρ = 0 . 5397 , respectively.

The input-orientated decomposition
All the share-weighted productivity indices discussed in the previous subsection can also be decomposed using input distance functions. Along 'Path E' and 'Path F' one obtains expressions analogous to (34) and (35) , but now using input distance functions. The geometric mean counterpart to expression (36) is The first two factors measure technical efficiency change and technological change, respectively. The third factor measures the input-mix effect and the fourth factor measures the scale (including output-mix) effect. In the deatfgprod(X, Y, ...) function, this decomposition requires setting the orient parameter to io , the period parameter to base , and the decomposition parameter decomp to complete .
To recover the CCD index M 0 i (x 1 , y 1 , x 0 , y 0 ) , the decomp parameter must be set to ccd . If the base period technology exhibits CRS then setting the decomp parameter to crs calculates the appropriate decomposition.
Alternatively, the comparison-period analogue of expression (40) , corresponding to the geometric mean of 'Path G' and 'Path H', is computed by setting the period parameter to comparison . The decompositions featuring the comparison-period CCD index, and the decomposition under CRS (of the comparison period technology), require setting the decomp parameter to ccd or crs , respectively.  Finally, by taking the geometric mean of expression (40) and its comparison-period analogue, one obtains the input-orientated decomposition of the Törnqvist productivity index. This decomposition is obtained from the deatfprod(X, Y, ...) function by setting the period parameter to geomean , the orientation parameter to io , and the decomp parameter to complete .
The three-factor decomposition featuring the geometric-mean CCD indices, is obtained by setting the decomp parameter to ccd . The decomposition under CRS is obtained by setting this parameter to crs .

Empirical results: TFP growth in US agriculture
As for the empirical applications, the literature provides us with a number of implementations. However, most of these implementations, sometimes on microdata, sometimes on macrodata, appear to be partial, usually concerning only one or two decompositions at a time. The unique feature of this paper is that all the theoretically possible decompositions are applied to the same US agricultural dataset with information at the state level. In the particular case of our dataset, we are able to judge the extent to which the various measures and decompositions are empirically different. Focusing on the geometric mean definitions, average productivity growth in US agriculture in the period 1960-2004 lays in the 40 . 0% − 50 . 0% range. Aggregate and individual TFP growth is mostly driven by technological progress, contributing around 30%. There are also mild efficiency gains, but scale, and input-and output-mix effects play a lesser role. Fig. 2 depicts box-plots for the four productivity indices: Malmquist, Moorsteen-Bjurek, Fisher, and Törnqvist. Each box represents the interval between the first and third quartiles of the productivity change distribution, and the horizontal line represents the median. The dispersion of the distributions within these interquartile ranges appears to be rather small. Finally, the higher and lower whiskers signal values one and half times the interquartile range, while the dots represent outliers laying beyond those values. In the case of the Malmquist and Moorsteen-Bjurek indices, the highest and lowest values correspond to the state of Delaware (DE) and Oklahoma (OK), respectively. Note also that the price-based indices exhibit less variability and no outliers.
It is interesting to compare our results with those reported by the USDA (see Table 19, Indices of Total Factor Productivity by State, available at https://www.ers.usda.gov/data-products/agricultural-productivity-in-the-us/ ). The USDA Economic Research Service (ERS) publishes TFP measures based on a sophisticated system of farm production accounts. The measures are based on a translog transformation function which relates the growth rates of multiple outputs to the cost-share weighted growth rates of labor, capital, and intermediate inputs. See Ball et al. (1999) for a detailed description. Table 2 displays correlations and distribution tests between TFP growth reported by the USDA and those obtained by the indices discussed in this paper. The results show a positive and significant correlation -Pearson's r , Spearman's ρ, and Kendall's τ -between the USDA TFP index numbers and the Malmquist and Moorsteen-Bjurek index numbers, and a stronger correlation with the Fisher and Törnqvist index numbers. However, the t test on the equality of means and the Wilcoxon matched-pairs signed-rank test reject the null hypotheses that the means and the distributions of the USDA and this paper's index numbers are the same. The underlying reason is presumably that input cost shares obtained via the estimation procedure employed by USDA substantially differ from those obtained by DEA or conventional indices. It is, however, reassuring to see that the ranking of states tends to be preserved, as shown by the positive Spearman and Kendall rank correlation coefficients.

Conclusion
The Total Factor Productivity Toolbox allows the calculation and decomposition of the most popular productivity indices in an organized environment. The concept of distance function, implemented through non-parametric Data Envelopment Analysis, is the building block for the various decompositions. If only quantities are available, the toolbox calculates and decomposes TFP change according to the Malmquist and Moorsteen-Bjurek definitions. If also prices are available, classical TFP indices such as Fisher and Törnqvist can be calculated and decomposed.
Complex and lengthy decompositions involving panel data are rendered operational in a systematic way. The toolbox allows for many options concerning orientation, benchmark period, VRS or CRS, type of TFP index, and type of decomposition, so as to identify the contribution that different factors make to productivity change. We showed how to organize the data, use the available functions, and interpret the results. To illustrate the toolbox we calculated all the indices on a common example so that results can easily be compared. The toolbox is a valid self-contained MATLAB package for the measurement and decomposition of TFP change.
From the empirical findings, we draw some general conclusions: (i) Since the benchmark period matters empirically, we recommend researchers to use geometric means of base-period and comparison-period benchmark indices, unless there are compelling reasons that a particular benchmark should be used. (ii) The Malmquist and Moorsteen-Bjurek indices, based on quantities only, deliver higher productivity growth figures than the price-weighted and share-weighted indices which are based on quantities and (market) prices. To which extent this is due to the fact that the technologies were estimated by DEA in our empirical exercise remains to be seen. (iii) The results of price-weighted (for example, Fisher) and share-weighted (for example, Törnqvist) indices are numerically very close. Thus the choice between the two families of indices is immaterial. (iv) In our empirical example (local) technological change generally appeared to be the main component of TFP change.
Finally, since the code is freely available in an open source repository, users will benefit from the collaboration and review of the community. They can check and modify the code to adapt it to their own needs and extend it with new definitions.