A Meta-Analysis of Aid Effectiveness: Revisiting the Evidence

As research on the empirical link between aid and growth continues to grow, it is time to revisit the accumulated evidence on aid effectiveness. This study extends previous meta-analyses, noting that the increased availability of data enables us to conduct a sub-group analysis by disaggregating the sample into different time horizons to assess whether there are temporal shifts in aid effectiveness. The new and updated results show that the previously reported positive evidence of aid’s impact is robust to the inclusion of more recent studies and this holds for different time horizons as well. The authenticity of the observed effect is further confirmed by results from funnel plots, regression-based tests, and a cumulative meta-analysis for publication bias.


Introduction
Analyzing the aid-growth nexus continues to be an area of focus in development economics. The empirical research on the effect of aid on growth goes back as far as the early 1970s. Though the methodological rigour varies, the profession has made numerous efforts since then to empirically analyze the effectiveness of aid in promoting growth. Results range from 'aid works' to 'aid does not work', and in yet other cases 'aid works but only under certain conditions'. Until 2007, the empirical evidence from individual studies varied, but the past decade has witnessed convergence towards a positive assessment regarding the potency of aid in spurring economic growth (see, among others, Arndt, Jones, & Tarp, 2010, 2016). Over the years, a variety of efforts have been made in the aid effectiveness literature to scrutinize and critically analyze the nature of the existing mixed aid-growth evidence with the aim of showing where the balance of evidence lies. For instance, Hansen and Tarp (2000) carefully analyzed three generations of the aid effectiveness literature, and more recently, Arndt et al. (2010) discussed a fourth generation. Our aim here is to complement these efforts by synthesizing the existing empirical results from the accumulated evidence on aid and growth. In particular, we are interested in knowing what the range of findings (negative, zero, or positive) that has been evolving over the years tells us, on average, about aid's impact on growth.
Mekasha and Tarp (2013) addressed this issue relying on aid and growth empirical studies carried out over the period from 1970 to 2004. The accumulated evidence showed a positive impact of aid on growth during the 34-year period in question, and the authors documented that this effect is authentic, rather than an artefact of publication selection.
As the sample period in the work of Mekasha and Tarp (2013) only stretches until 2004, and given that more than a decade has passed since then, we present an update of the accumulated evidence here by including aid and growth empirical articles produced after 2004. Apart from enlarging the sample coverage and hence working with a larger sample size, this also deepens the analysis in two important ways: (i) we now cover a longer time period and so are able to conduct a more disaggregated analysis, mainly by splitting the sample into different time periods (sub-groups); and (ii) we are able to assess whether there are temporal shifts in aid effectiveness.
In this line of thinking, the present study answers the following questions. First, does the addition of new studies have any impact on the results documented by Mekasha and Tarp (2013)? Second, has aid effectiveness changed over time and, if so, is the change genuine or an artefact of publication bias? Third, is there heterogeneity between studies and, if so, what explains the observed heterogeneity? To address these questions, we use a data set of 141 empirical studies on aid and growth that were conducted over the 1970-2011 period. This gives a total of 1,778 estimates for the meta-analysis.
The article is structured as follows. Section 2 first updates the aid effectiveness meta-analysis evidence documented by Mekasha and Tarp (2013) and then proceeds to present a sub-group analysis by disaggregating the data by year of publication. Section 3 presents a cumulative meta-analysis to establish how the weight of the evidence has shifted over time. This is followed by an in-depth investigation of publication bias in Section 4. In Section 5, we present a multivariate meta-regression analysis to understand the source of heterogeneity in effect estimates across studies. Finally, concluding remarks are given in Section 6.

Overall Effect
One of the main objectives of meta-analysis is to obtain an overall effect estimate (weighted average) from a body of literature by combining the appropriate summary statistics from each study. The choice of an appropriate model to combine the summary statistics extracted from each study is a major step in any meta-analysis, and this choice depends on the degree of heterogeneity in effect sizes. In this regard, there are two alternative models: a fixed-effects model, which assumes away heterogeneity between studies and hence only uses within-study variances as study weights, and a random-effects model, which takes the across-study variation in the true effect estimates into account and uses both the within- and between-study variances as weights.
Denoting the number of studies considered for the meta-analysis by k and the corresponding effect size estimates by x₁, x₂, x₃, …, x_k, the overall effect estimate is the weighted average x̄ = Σᵢ ŵᵢxᵢ / Σᵢ ŵᵢ (Equation 1), where ŵᵢ is given by 1/(σᵢ² + τ²) in the random-effects model and by 1/σᵢ² in the fixed-effects model, with σᵢ² and τ² denoting the within-study and between-study variance of the effect estimates, respectively.
As can be seen from Equation 1, the random-effects model accounts for both within- and between-study variance when calculating the weighted average effect. Compared to the fixed-effects model, which only accounts for the within-study variance, the random-effects model gives a wider confidence interval for the overall effect and hence more conservative estimates (see also Kontopantelis, Springate, & Reeves, 2013). The assumption of effect homogeneity in the fixed-effects model is often criticized. In practice, a certain degree of variation in the true effect is expected. This is due to differences in the study populations as well as in the type, duration, and intensity of interventions (see Thompson & Pocock, 1991).
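To make the weighting in Equation 1 concrete, the following Python sketch (not the authors' code; the effect sizes and variances are hypothetical numbers) pools a handful of estimates under both assumptions. Setting the between-study variance τ² to zero recovers the fixed-effects estimate.

```python
def pooled_effect(effects, variances, tau2=0.0):
    """Inverse-variance weighted average with weights 1/(v_i + tau2).
    tau2 = 0 gives the fixed-effects estimate; tau2 > 0 gives the
    random-effects estimate."""
    weights = [1.0 / (v + tau2) for v in variances]
    est = sum(w * x for w, x in zip(weights, effects)) / sum(weights)
    var = 1.0 / sum(weights)  # variance of the pooled estimate
    return est, var

# Hypothetical partial correlations and within-study variances:
effects = [0.10, 0.05, 0.20, -0.02]
variances = [0.001, 0.004, 0.002, 0.003]

fe, fe_var = pooled_effect(effects, variances)             # fixed effects
re, re_var = pooled_effect(effects, variances, tau2=0.01)  # random effects
# The random-effects variance is never smaller, so its CI is wider.
assert re_var >= fe_var
```

Because the random-effects weights add τ² to every denominator, the variance of the pooled estimate can only grow, which is exactly why the random-effects confidence interval is wider.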
In this study, we rely on a random-effects model to obtain an overall average effect from the aid effectiveness literature, using estimates from empirical aid-growth articles that became available over the 1970-2011 period. This choice is motivated by the apparent between-study heterogeneity in aid-growth empirical studies. This can easily be checked using statistical tests and graphical tools, as shown in Mekasha and Tarp (2013), which discusses in detail why the random-effects model is more appropriate for conducting a meta-analysis of aid and growth empirical studies.
The Bootstrapped DerSimonian-Laird (BDL) model was used to estimate the random-effects model. This is a non-iterative, moments-based estimator which improves upon the DerSimonian-Laird model, a commonly used random-effects model, by estimating the between-study variance and other heterogeneity parameters using a non-parametric bootstrap method. The BDL model has proven to be the best method in terms of detecting heterogeneity, particularly for large-scale meta-analyses (see Kontopantelis et al., 2013).
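A minimal sketch of the idea, using hypothetical inputs: the moments-based DerSimonian-Laird estimator of the between-study variance τ², plus a bootstrap wrapper that resamples studies with replacement in the spirit of the BDL approach (this is an illustration, not the estimator of Kontopantelis et al., 2013).

```python
import random

def dl_tau2(effects, variances):
    """DerSimonian-Laird moment estimator of between-study variance tau^2."""
    w = [1.0 / v for v in variances]
    xbar = sum(wi * xi for wi, xi in zip(w, effects)) / sum(w)
    q = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, effects))
    df = len(effects) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    return max(0.0, (q - df) / c)  # truncated at zero

def bootstrap_dl_tau2(effects, variances, reps=200, seed=1):
    """Resample studies with replacement and average the DL estimate
    over replicates -- a sketch of the bootstrap idea behind BDL."""
    rng = random.Random(seed)
    k = len(effects)
    total = 0.0
    for _ in range(reps):
        idx = [rng.randrange(k) for _ in range(k)]
        total += dl_tau2([effects[i] for i in idx],
                         [variances[i] for i in idx])
    return total / reps

# Hypothetical effect sizes and within-study variances:
effects = [0.10, 0.05, 0.20, -0.02, 0.12]
variances = [0.001, 0.004, 0.002, 0.003, 0.002]
tau2 = dl_tau2(effects, variances)
b = bootstrap_dl_tau2(effects, variances)
assert tau2 >= 0.0 and b >= 0.0
```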
Against this background, Table 1 presents the weighted average overall effect estimate from the aid-growth literature. We first disaggregated the sample into 'old period' and 'new period', where the former is the same as the sample period used in Mekasha and Tarp (2013) and the latter is a new sample focusing on the years added in this study. We finally report an overall effect estimate for the full sample period by combining the old and new periods indicated above. Such a subgroup analysis is useful in assessing whether the effect size has shifted over time (see Borenstein, Hedges, Higgins, & Rothstein, 2009). Factors such as improvement in data quality, changes in donor priorities, and the evolution of better estimation techniques, among others, are the likely explanations for potential changes in research findings within the aid effectiveness literature.
As can be seen from Table 1, the overall effect is found to be positive and statistically significant at the 5 per cent level of significance. This is true both in the full and the disaggregated samples. (Notes to Table 1: BDL refers to the Bootstrapped DerSimonian-Laird random-effects model. A bootstrap of 10,000 repetitions is used in all cases. I² ranges from 0-100 per cent, where a larger score shows a higher level of heterogeneity. Source: authors' estimates.) Even if the magnitude of the effect varies across periods and shows some decline over time, the overall conclusion regarding the potency of foreign aid in spurring growth remains the same. Regarding the practical relevance of the effect size estimate from a meta-analysis, no standard cut-off value exists to label an effect estimate as 'small', 'medium', or 'large'. However, according to a preliminary guideline in the literature that suggests cut-offs for meta-analysis in economics, the effect sizes (the partial correlations) from our meta-analysis reported in Table 1 fall in the small to medium range. Given that this is only a preliminary guideline, one needs to be cautious about drawing firm conclusions. Further discussion is available in Mekasha and Tarp (2018).
In addition to the above analysis, we have also estimated the overall effect at the study level, i.e. by taking a single estimate from each study. The results from this exercise are presented in Table A2, which shows that the combined effect remains positive and statistically significant, and is higher compared to the case where the estimation is based on regression-level data. Moreover, as a further robustness check, we report in the Appendix a weighted average overall effect using a sample disaggregation based on the discussion in the aid effectiveness literature regarding the different generations of aid-growth empirical studies (see Arndt et al., 2010). As can be seen from Table A3 in the Appendix, our result remains robust.
Apart from showing the average effect size from studies included in the meta-analysis, the results presented in Table 1 show the level of heterogeneity as indicated by the I² statistic. In particular, the I² statistic shows the percentage of the between-study heterogeneity that can be attributed to variability in the true treatment effect rather than to sampling variation. An I² value of more than 50 per cent is normally considered to be high (see, for example, Kontopantelis et al., 2013).
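The I² statistic can be computed directly from Cochran's Q heterogeneity statistic; a small illustration with hypothetical inputs follows.

```python
def i_squared(effects, variances):
    """I^2: share of total variability attributable to true-effect
    heterogeneity rather than sampling error (in per cent).
    I^2 = max(0, 100 * (Q - df) / Q), with Q = Cochran's statistic."""
    w = [1.0 / v for v in variances]
    xbar = sum(wi * xi for wi, xi in zip(w, effects)) / sum(w)
    q = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, effects))
    df = len(effects) - 1
    return max(0.0, 100.0 * (q - df) / q)

# Hypothetical inputs: precise but widely scattered estimates give a
# high I^2, i.e. most variability reflects true-effect heterogeneity.
i2 = i_squared([0.10, 0.05, 0.20, -0.02], [0.001, 0.004, 0.002, 0.003])
assert 0.0 <= i2 <= 100.0
```

With these made-up numbers I² lands above the 50 per cent threshold the text describes as high, which is the situation in which a random-effects model is preferred.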
In Table 1, there is, in all the cases, considerable heterogeneity (in the true effect of aid) across studies, suggesting that the effect homogeneity assumption implied by the fixed-effects model is not valid. In other words, the use of a random-effects model, which allows the true effect of aid to vary between studies, is an appropriate choice.
To put our results into perspective, our finding stands in contrast to the results reported in Doucouliagos and Paldam (2015). These authors mainly focus their analysis on the 2007-2011 period and particularly argue that the 2007-2008 years are 'dark years' in aid effectiveness. They further add that the effect estimates in the 2009-2011 period show the presence of an 'upward kink' which, according to these authors, is purely a result of publication bias rather than a real improvement in aid effectiveness.
We use the same dataset as Doucouliagos and Paldam (2015), so checking the assertions made by the authors makes our analysis more complete. We do so by answering the following four questions: (i) is there any reasonable justification behind the classification of the different periods?; (ii) is the 2007-2008 period really a dark period in aid effectiveness?; (iii) is the 'upward kink' real and is there any theoretical/intuitive reason to expect an upward kink in the 2009-2011 period?; (iv) can the concern regarding publication bias be justified by the data at hand?
To begin, we find that the decision to categorize the years 2005 and 2006 as part of the 'old period' is arbitrary and actually matters for the results. As shown in Table 2 (comparing row 2 and row 3 in the middle section), when one includes the years 2005 and 2006 in the 'new period', the effect of aid is positive (albeit small) and statistically significant. Moreover, for the 2008-2011 sample period (Table 2), the result appears to be contrary to what Doucouliagos and Paldam (2015) found: the impact of aid on growth is, on average, positive (0.05) and precisely estimated, while the bias coefficient is negative and statistically indistinguishable from zero. Furthermore, the Doucouliagos and Paldam (2015) claim of an 'upward kink' in the 2009-2011 period is not robust to how one defines periods A and B. Given that there is no clear reason why one should expect any jump in this period, the 'upward kink' reported in Doucouliagos and Paldam (2015) does not seem to reflect real changes. As will become clear in what follows, this jump is exclusively due to the inclusion of a large set of observations from one single study, Rajan and Subramanian (2008). It is important to highlight that estimating the effect of aid on growth while excluding estimates from Rajan and Subramanian (2008) gives a positive and statistically significant effect of aid on growth for the 2007-2008 period.

Patterns of Evidence over Time-Cumulative Meta-Analysis
Another question of interest to both researchers and policymakers is whether there are temporal changes in aid effectiveness. The present article has made an effort to assess whether the magnitude and precision of the impact of aid on growth change with the passage of time or following the addition of newer studies. To this end, we followed Lau et al. (1992) and conducted a cumulative meta-analysis, with studies sequentially added to the analysis according to a variable of interest and a new pooled estimate recalculated every time a study was added. Since the objective is to uncover the pattern of evidence over time and to see how the conclusions may have shifted, the variable of interest is the year of publication of each study. Thus, in conducting the cumulative meta-analysis, studies were sorted in chronological order for the 1970-2011 period. In cases where studies report multiple estimates, the data were pooled by study and an overall effect estimate calculated for each study. Figure 1 and Table A4 in the Appendix present the results from the cumulative random-effects meta-analysis of the aid-growth literature. In Figure 1, the circles show the estimates from the cumulative meta-analysis and the horizontal lines show the 95 per cent confidence intervals. Moreover, the vertical dotted line in the middle of the figure shows the combined estimate. The value in each row shows the summary estimate for a meta-analysis based on all studies up to and including that row. The point estimate in the last row is the same as the effect estimate shown in the summary line, as the analysis in the last row includes data from all 141 studies.
As can be seen from the results in Figure 1 and Table A4, there is evidence of the positive impact of aid on growth since the early 1980s with a magnitude of 0.206. As one moves further down the plot, the effect size shows some decline and stabilizes around a combined effect equal to 0.074 with a confidence interval from 0.051 to 0.097. Over the years, the addition of new studies does not substantially change the aid effectiveness conclusion. In general, even if the answers to the aid effectiveness question in terms of growth impact have evolved over the years, the balance of evidence, on average, points to a positive (albeit small to moderate) and statistically significant impact of aid on growth.
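The mechanics of the cumulative exercise can be sketched as follows (hypothetical study data; for brevity the sketch pools with fixed-effect weights, whereas the article uses a random-effects model).

```python
def cumulative_meta(studies):
    """Cumulative meta-analysis: studies is a list of
    (year, effect, variance) tuples; returns the pooled estimate after
    each chronological addition."""
    out, acc = [], []
    for year, x, v in sorted(studies):      # sort by publication year
        acc.append((x, v))
        w = [1.0 / vi for _, vi in acc]
        pooled = sum(wi * xi for wi, (xi, _) in zip(w, acc)) / sum(w)
        out.append((year, pooled))
    return out

# Hypothetical studies: early large estimates, later smaller ones.
path = cumulative_meta([(1985, 0.21, 0.004), (1995, 0.10, 0.002),
                        (2005, 0.06, 0.001), (2010, 0.07, 0.001)])
# The running estimate drifts down and stabilises, mirroring the
# pattern the article reports in Figure 1 (0.206 early, ~0.074 finally).
assert path[0][1] > path[-1][1]
```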

Assessing Publication Bias
One issue that can jeopardize the credibility of results from a meta-analysis is publication bias. It arises if there is a tendency to only publish research findings with statistically significant treatment effects (Sterne, Gavaghan, & Egger, 2000). That is, if the studies included in the meta-analysis are a biased sample of the target population of studies (for example, if small studies with statistically insignificant findings remain unpublished or in the grey literature), the meta-analysis may overestimate the true effect (see Borenstein et al., 2009). In the following sections, we assess, using various methods, whether publication bias is a concern within the aid effectiveness literature.

Funnel Plot
One way to assess the issue of publication bias in a body of literature is to use funnel plots that relate the precision of studies (study size) to the size of the effect estimate. In the absence of publication bias, smaller studies are expected to scatter widely at the bottom of the graph, with the spread getting narrower as study precision increases. Thus, if publication bias is not a problem, the plot takes the shape of a symmetric inverted funnel. Figure 2 presents a funnel plot of the aid effectiveness literature. The vertical line at the centre of the plot shows the combined effect estimate from the aid effectiveness literature. As can be seen from the figure, the estimates appear randomly distributed around the combined effect estimate, and the plot exhibits symmetry, showing a lack of evidence to suggest the existence of publication bias in the aid-growth literature. Note in particular that smaller studies with statistically insignificant results are not missing. A further check for publication bias relies on contour-enhanced funnel plots. This approach uses the idea that the main reason for studies to remain unpublished is lack of statistical significance, with studies that cannot achieve standard levels of statistical significance left out of mainstream publications (Dickersin, 1997).
To check whether this is the case in the aid effectiveness literature, we add contours of statistical significance to the funnel plot shown in Figure 2. This makes it easier to assess the statistical significance of hypothetically missing studies. That is, we can check whether the areas where studies are likely to be missing are areas of low statistical significance and whether the areas where studies are more visible are areas of high statistical significance.
Publication bias is likely to exist if the areas where studies are missing are areas of low statistical significance. As shown in the contour-enhanced funnel plot depicted in Figure 3, this is not the case for the aid effectiveness literature studied here. Overall, the distribution of the estimates is reasonable in the regions of both low and high statistical significance, and there is no evidence that studies with insignificant results have been suppressed.

Cumulative Meta-Analysis and Publication Bias
Cumulative meta-analysis can also be used to investigate whether the combined effect estimate presented in Section 2 suffers from publication bias in the literature. This is done first by sorting studies based on their level of precision (from the most precise to the least precise) and then by sequentially adding studies to the analysis.
That is, in the cumulative meta-analysis, the first estimate represents an estimate of the most precise study, and the second estimate represents meta-analysis of the first two precise studies, and so on. The assumption here is that precise studies are less likely to suffer from publication bias, and it is the less precise studies that are more prone to overstating their effect estimates to compensate for their large standard errors in order to achieve a statistically significant effect.
This approach helps us to see if the effect estimates of the less precise studies that are likely to report biased (larger) effect estimates to increase their chances of publication influence the combined effect estimate. Thus, if the effect size increases, as less precise studies are included in the analysis, it is likely that there is a bias from small studies (see Borenstein et al., 2009). Figure 4 presents the cumulative meta-analysis of studies conducted over the 1970-2011 period. Here studies are sorted from most to least precise, and the vertical reference line represents the combined effect estimate based on the random-effects model. While the circles show the cumulative effect estimates, the horizontal lines show the 95 per cent confidence intervals. On the vertical axis, study names ordered based on their level of precision are shown and the horizontal axis shows the partial effect estimate. Since the names of these 141 studies and respective cumulative effect estimates are not visible in this plot, we have also presented the same cumulative meta-analysis in a table format (see Table A5).
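The same machinery, re-sorted by precision rather than by year, illustrates the signature that small-study bias would leave (hypothetical data, fixed-effect weights for brevity; in this constructed example the imprecise studies report inflated effects, so the running estimate climbs).

```python
def cumulative_by_precision(studies):
    """Cumulative meta-analysis for publication-bias checking: studies is
    a list of (effect, variance) pairs, pooled in order of decreasing
    precision (smallest variance first)."""
    out, acc = [], []
    for x, v in sorted(studies, key=lambda s: s[1]):
        acc.append((x, v))
        w = [1.0 / vi for _, vi in acc]
        out.append(sum(wi * xi for wi, (xi, _) in zip(w, acc)) / sum(w))
    return out

# Hypothetical studies where the least precise ones overstate the effect:
path = cumulative_by_precision([(0.05, 0.001), (0.06, 0.002),
                                (0.15, 0.010), (0.30, 0.040)])
# The running estimate rises as imprecise studies enter -- the pattern
# that would signal small-study bias. The article finds no such pattern.
assert path[-1] > path[0]
```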
As shown in Figure 4 and Table A5, there is no consistent pattern of an increase in the cumulative effect estimate as less and less precise studies are added to the analysis. For instance, the most precise study has an effect estimate of 0.076 with a confidence interval from 0.037 to 0.115, while the cumulative meta-analysis of the ten most precise studies shows an estimate of 0.05. After that, the combined effect estimate starts to increase, reaching 0.07 and 0.08 with the top 20 and 30 most precise studies added, respectively. As more and more (relatively less precise) studies are added, the cumulative effect instead shows a decline, reaching 0.05 and gradually converging at 0.074.
In general, the further addition of less and less precise studies does not reveal a steadily increasing pattern in the cumulative effect estimates that would suggest the existence of publication bias in the literature. It is also worth noting that the confidence intervals from the cumulative meta-analysis of the least precise studies overlap with those obtained from the cumulative effect estimates of the most precise studies; i.e. compare the confidence interval from the least precise studies (final rows) with the confidence intervals when the 1st, 10th, 20th, etc. most precise studies are added to the analysis. This shows that the effect estimates from the most and least precise studies are not statistically significantly different, making the issue of publication bias less of a concern here.

Regression-Based Test
Since visual inspection of a funnel plot is subjective, we also conducted a regression-based test to objectively assess the presence or absence of publication bias. The test of Egger, Smith, Schneider and Minder (1997) is the most commonly used test of asymmetry in funnel plots. It regresses the standardized effect from each study on precision (the inverse of the standard error). The regression to be estimated takes the following form: tᵢ = β₀ + β₁(1/SEᵢ) + εᵢ, where tᵢ is the standardized effect and 1/SEᵢ is the measure of precision. The parameters of interest are β₀ and β₁, which capture bias and the genuine effect, respectively. A detailed discussion of the test, the importance of doing a multivariate analysis, and the choice of covariates can be found in Mekasha and Tarp (2013). The result from the Egger et al. (1997) funnel asymmetry test is reported in Table 3. As can be seen from the results, in both the bivariate and multivariate regressions the bias coefficient is found to be statistically indistinguishable from zero, confirming the absence of publication bias in the aid-growth literature, in line with the funnel plot analysis. Moreover, in both the bivariate and multivariate results, the coefficient on precision (the estimate of the impact of aid on growth) is found to be positive and statistically significant. Note that when we look at our preferred estimation, controlling for all study characteristics (Columns 2, 5 and 6), the estimated effect of aid from the existing literature is 0.13, 0.05, and 0.05 for the 'old period', 'new period', and 'full sample', respectively, with the coefficients being statistically significant in all cases. This is in stark contrast to the finding of Doucouliagos and Paldam (2015), who reported that this coefficient was insignificant in both a statistical and an economic sense.
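The bivariate version of the test can be sketched in a few lines (hypothetical effect sizes and standard errors; the article's multivariate version adds study-characteristic covariates, which this sketch omits).

```python
def egger_test(effects, ses):
    """Egger et al. (1997) funnel-asymmetry regression, bivariate form:
    regress the standardised effect t_i = x_i / SE_i on precision 1/SE_i.
    The intercept beta0 captures bias; the slope beta1 the genuine effect."""
    t = [x / s for x, s in zip(effects, ses)]
    prec = [1.0 / s for s in ses]
    n = len(t)
    mp, mt = sum(prec) / n, sum(t) / n
    sxx = sum((p - mp) ** 2 for p in prec)
    sxy = sum((p - mp) * (ti - mt) for p, ti in zip(prec, t))
    beta1 = sxy / sxx           # slope: genuine effect
    beta0 = mt - beta1 * mp     # intercept: bias term
    return beta0, beta1

# Hypothetical effect sizes and standard errors:
effects = [0.09, 0.11, 0.10, 0.08, 0.12]
ses = [0.02, 0.05, 0.03, 0.04, 0.06]
bias, effect = egger_test(effects, ses)
```

In practice one would also compute standard errors for β₀ and β₁ and test whether the bias term differs from zero, which is what Table 3 reports.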
Overall, based on graphical tools and the regression-based tests, publication bias is not found to be a concern in the aid-growth empirical literature. This confirms that the overall effect estimate obtained from the aid effectiveness literature is not an artefact of publication bias.

Meta-Regression Analysis
As seen in Table 1, there is considerable heterogeneity in the aid effectiveness literature. In this section, we explore whether this observed heterogeneity can be attributed to one or more study characteristics. To this end, we employ a random-effects meta-regression analysis. In this regression, following estimation of the between-study variance τ² by the method of moments, the coefficients are estimated by weighted least squares with 1/(σᵢ² + τ²) as the weight. The results from the meta-regression are presented in Table A6 in the Appendix. According to the statistics reported at the bottom of the table, 72 per cent of the residual variance is due to heterogeneity of the true effect, with the remaining 28 per cent attributed to sampling variability. Moreover, the proportion of between-study variance explained by the covariates can be seen from the adjusted R². This is calculated by comparing the estimated between-study variance with its value when no covariates are included. We note that 25 per cent of the between-study variance is explained by the covariates, and the remaining between-study variance is found to be 0.008.
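A single-moderator sketch of such a weighted least squares meta-regression (hypothetical data; a fixed τ² is simply plugged in here rather than estimated by the method of moments as in the article).

```python
def wls_meta_regression(effects, variances, covariate, tau2):
    """Random-effects meta-regression with one moderator: weighted least
    squares with weights 1/(v_i + tau^2). Returns (intercept, slope)."""
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    mx = sum(wi * zi for wi, zi in zip(w, covariate)) / sw
    my = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    sxx = sum(wi * (zi - mx) ** 2 for wi, zi in zip(w, covariate))
    sxy = sum(wi * (zi - mx) * (yi - my)
              for wi, zi, yi in zip(w, covariate, effects))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical moderator: 1 if the study uses panel data, 0 otherwise.
effects = [0.12, 0.11, 0.04, 0.05, 0.13, 0.03]
variances = [0.002, 0.003, 0.002, 0.004, 0.003, 0.002]
panel = [1, 1, 0, 0, 1, 0]
b0, b1 = wls_meta_regression(effects, variances, panel, tau2=0.005)
# A positive slope means panel-data studies report larger effects,
# the sign pattern the article reports for its 'Panel' dummy.
assert b1 > 0
```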
Turning to the role of the study characteristics in explaining the variation in reported effects, it appears that more than 20 covariates are important. However, caution needs to be exercised in interpreting the results from this regression. According to Higgins and Thompson (2004), testing several covariates without adjusting for multiplicity will lead to inflated false-positive rates in meta-regression. (Notes to Table 3: standard errors in parentheses. * p < .1, ** p < .05, *** p < .01. Old period, new period (2005-2011), and full sample. Source: authors' estimates.) To deal with this issue, these authors suggest a permutation test to assess statistical significance in meta-regression and warn researchers not to make claims about statistical significance before conducting such a test. Thus, following the suggestion of Higgins and Thompson (2004), we conduct the permutation test on the meta-regression reported in the Appendix. The results are reported in Table 4. The first column shows permutation p-values without adjustment for multiplicity and the second column shows p-values adjusted for multiplicity. While Table 4 reveals which study characteristics are, statistically speaking, important in explaining the variation in reported effect estimates within the aid-growth literature, Table A6 shows in which direction (how) each particular study characteristic affects the reported estimates. After adjusting for multiple testing, only 10 of the included covariates appear to have a role in explaining the heterogeneity in effect size; these are shown in bold in Table 4. We highlight that the type of publication outlet, the data type (structure), and the type of controls included in the growth regression are found to be important in explaining the observed heterogeneity in reported effect estimates of the impact of aid on economic growth.
For instance, the positive and statistically significant coefficient on the variable 'Panel' (from Table A6 and Table 4) implies that, ceteris paribus, studies using panel data on average report higher (positive) partial correlations. Another point worth noting from the results in the tables is that the coefficients on the decade dummies are statistically indistinguishable from zero. This implies that the sample period covered by the original studies does not have a role in explaining the reported variation in research findings on aid and growth.
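The permutation test suggested by Higgins and Thompson (2004) can be sketched as follows for a single moderator (hypothetical data; unweighted slopes for brevity, and the adjustment for multiplicity across many covariates is omitted).

```python
import random

def permutation_pvalue(effects, covariate, reps=2000, seed=7):
    """Permutation test for a meta-regression moderator: shuffle the
    moderator across studies and compare the observed slope with the
    permutation distribution of slopes."""
    def slope(y, z):
        n = len(y)
        mz, my = sum(z) / n, sum(y) / n
        sxx = sum((zi - mz) ** 2 for zi in z)
        sxy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
        return sxy / sxx

    rng = random.Random(seed)
    observed = abs(slope(effects, covariate))
    z = list(covariate)
    hits = 0
    for _ in range(reps):
        rng.shuffle(z)
        if abs(slope(effects, z)) >= observed:
            hits += 1
    return (hits + 1) / (reps + 1)  # permutation p-value

# Hypothetical data in which the moderator clearly matters:
effects = [0.12, 0.11, 0.04, 0.05, 0.13, 0.03]
panel = [1, 1, 0, 0, 1, 0]
p = permutation_pvalue(effects, panel)
assert 0.0 < p <= 1.0
```

Because the p-value comes from reshuffling the moderator itself, the test keeps its size even when many covariates are screened, which is why Higgins and Thompson (2004) recommend it over naive t-tests in meta-regression.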

Conclusion
The main aim of this study was to update the aid effectiveness meta-analysis evidence in Mekasha and Tarp (2013) by adding newly available studies which emerged from 2004 to 2011. To this end, we employed a random-effects model. This is the appropriate choice in the presence of considerable heterogeneity in the true effects, which is the case in the aid effectiveness literature. The positive impact of aid on growth in Mekasha and Tarp (2013) is shown here to be robust to the inclusion of new studies in the meta-analysis, and this appears to be true for different time horizons.
Having established this result, we carefully assessed whether publication bias has any impact on the observed effect estimates. Results from funnel plots, a regression-based test, and a cumulative meta-analysis for publication bias all suggest that publication bias is not a concern within the aid-growth literature and that the observed effect is not an artefact hereof. Finally, given the considerable heterogeneity observed in the data, we conducted a meta-regression analysis to explain the heterogeneity in reported effect estimates. After adjusting the p-values for multiple testing, it is found that only ten out of the 50 study characteristics appear to be important in explaining the observed heterogeneity. These include the type of publication outlet, data types, and the type of controls used in the growth regression. (Note: see Table A1 for a detailed description of the variables used in Table 4. Source: authors' estimates.) In sum, a careful meta-analysis including more recent studies does not suggest any material changes to the previously established insight that aid promotes growth in a statistically significant manner. The results presented here, coupled with the previously documented evidence in Mekasha and Tarp (2013), provide a systematic and objective (quantitative) assessment of the current body of findings within the literature and hence give a clear answer to the question raised by Cassen and Associates (1994): Does Aid Work? Having drawn this conclusion, the following points need attention in future evaluations of aid effectiveness.
First, the evidence presented here is clearly not the full story of aid effectiveness. Promoting economic growth is often not the primary objective of foreign aid, and neither should it be. Following the adoption of the Millennium Development Goals (MDGs) back in 2000 and the Sustainable Development Goals (SDGs) in 2015, donors tend to channel most of their assistance to social sectors such as health and education, as well as to poverty reduction interventions in general. With such multifaceted objectives, aid effectiveness meta-analysis needs to move beyond examining the role of aid in economic growth. A meta-analysis of aid and poverty reduction would be an interesting future avenue to explore, once sufficient empirical evidence from individual studies has accumulated. Furthermore, beyond aid effectiveness analysis, careful attention should also be given to the increasing focus on the concept of development effectiveness, which covers rather broader outcomes.
Second, there is a need to complement the existing empirical evidence on aid and growth with country-specific success and failure stories, which we believe are a valid and yet often neglected aspect of the discourse surrounding aid effectiveness. For instance, Arndt, Jones and Tarp (2007) have shown how a high level of sustained aid to Mozambique helped the country establish peace, manage the difficulties of post-war stabilization, and embark on widespread reconstruction. In addition, the experiences of Vietnam and South Korea are also examples of the role that aid can play in facilitating a country's development process.
Last, but by no means least, future aid effectiveness studies need to deal with the data and methodological concerns associated with current aid-growth empirical studies. These concerns include, but are not limited to, the need to control for important factors such as export price (terms of trade) shocks, exports, and private capital flows, and the need to compare aid effectiveness results using alternative aid data, such as Country Programmable Aid, which better reflect actual aid flows to countries and have increasingly become available in recent years. Moreover, in assessing aid effectiveness, it is crucial to look for the longer-term impact of aid, as a large proportion of aid goes to social sectors like health and education following global development commitments such as the MDGs and SDGs.