Robust rank‐based meta‐analyses for two‐sample designs with application to platelet counts of malaria infection data

In this paper, we propose robust meta‐analysis procedures for individual studies that report a broad range of robust summary statistics for a two‐sample problem. Summary statistics of individual studies could be presented in different forms including full data, medians of the two samples, the Hodges‐Lehman and Wilcoxon estimates of the location shift parameters. Data synthesis is made under both fixed‐effect and random‐effect meta‐analysis models. We systematically compare these robust meta‐analysis procedures via simulation studies to meta‐analysis procedure based on sample means and variances from individual studies under a wide range of error distributions. We show that the coverage probabilities of the robust meta‐analysis confidence intervals are quite close to the nominal confidence level. We also show that mean square error (MSE) of the robust meta‐analysis estimator is considerably smaller than that of the non‐robust meta‐analysis estimator under the contaminated normal, heavy tailed and skewed error distributions. The robust meta‐analysis procedures are then applied to platelet count reduction for malaria infected patients in Ghana.

Meta-analysis can be modeled either using a fixed-effect or a random-effect model. The fixed-effect model assumes a single parameter (effect size, Δ) that represents all independent studies considered in the data synthesis. The differences among individual studies are attributed to sampling error. This model is also referred as the common-effect model. It provides a weighted average of parameter estimates from several different studies, where the weights are given by the inverse of the estimated variances of these estimates. The random effect-model assumes that the true effect size is not constant. Even though all independent studies share the same common effect size, the effect size varies from study to study to account for the heterogeneity in the population. The random-effect model estimate of the effect size has the same form as the fixed-effect model estimate, but their weights are adjusted to account for between-study variation. Many between-study variance estimators have been developed which includes the DerSimonian-Laird estimator, 9 the Paule-Mandel estimator, 10 and the Restricted Maximum-likelihood estimator, 11 to name a few. By far, the most popular one is the DerSimonian-Laird estimator, which is a part of the traditional random-effect procedure.
Most of the meta-analysis research in the literature rely on the normality of the underlying distribution with few exceptions. Wan et al 12 provided approximate formulas to compute the sample mean and standard deviation from the sample size, range and/or interquartile range. They used these approximated sample means and standard deviations to estimate the effect size using fixed effect meta-analysis estimates. Their approximate formulas still assume that the underlying population has a normal distribution. Hence, the meta-analysis estimate of the common-effect size yields a significant amount of bias for nonnormal distributions. McGrath et al 8 considered synthesizing the medians from several independent studies. They considered both fixed-and random-effect models. Their meta-analysis inference does not require the normality assumptions, but the efficiency of the meta-analysis estimates is lower than the efficiency of the meta-analysis estimates based on the normality assumption. Ozturk and Balakrishnan 13 considered fixed-and random-effect models for the quantiles of the underlying distribution, where the summary statistics from individual studies are reported either as order statistics or quantile confidence intervals.
Traditional meta-analyses are efficient if the random error structure is normal. Real data, though, are often generated by longer (heavier) tailed random error structures. Such data impair the traditional methods, leading to a loss in efficiency of the traditional meta-analysis. In this paper, we present meta-analyses based on robust estimators for two-sample location models with treatments T 1 and T 2 . The research interest in the two-sample model is the inference of location shift between the two treatments. We consider meta-analyses based on sample medians and Hodges-Lehmann (HL) estimators, both the one-and two-sample HL estimators. For long-tailed error structures, these robust estimators are more efficient than the traditional estimators.
We consider two sampling designs for our meta-analysis procedure; (i) a full samples are available, and (ii) only summary statistics are provided: (i) Full samples: With the storage space of modern hard drives and cloud-based servers getting larger, many meta-analyses data are saved as full samples instead of summary statistics, so that the data can be used as a part of future study without losing information. Our full sample methods calculate weighted (inverse variance) medians or HL estimators to estimate Δ, then use the Jackknife to estimate the corresponding variance. Then an approximate confidence interval is constructed for the common effect-size. (ii) Summary statistics: Traditional meta-analysis is based on the sample mean and standard deviation of interest. In practice, it is not uncommon to find studies that only provide medians, quartiles or HL estimators. For example, studies [14][15][16] report the median and quartiles of the aspartate transaminase levels in COVID-19 nonsurvivors. Our methods for the fixed-effect model calculate weighted (based on sample sizes) medians or HL estimators to estimate Δ, then use the Jackknife to estimate the corresponding variance from which an approximate confidence interval for Δ is constructed. For the random-effect model, the variance of the random effect meta-analysis estimate of the effect size is obtained with an appropriate bootstrap resampling method. We also consider the median-based approach of McGrath et al. 8 In Section 2, we introduce the fixed-effect meta analysis model for a two-sample problem and list the design assumptions needed for the development of the procedures. We then outline the construction of the meta-analysis estimator based on a generic estimator of location shift parameter Δ from the individual studies. Sections 2.2 and 2.3 construct meta-analysis estimators for Δ using full data and summary statistics from individual studies, respectively. Section 3 considers the random-effect meta-analysis estimators for full data sets and limited information summary statistics. Section 4 investigates the empirical properties of the robust data synthesis procedures. It is shown that almost all confidence intervals of the location shift parameter Δ based on robust meta-analyses result in valid coverage probabilities for a nominal confidence level of 95%. In Section 5, we apply the methods to an example data set to estimate the pooled platelet counts of malaria patients from seven different studies in Ghana. We provide a concluding remark in Section 6.
We have written R functions that compute the meta-analyses discussed in this paper, which are available upon request. These R functions make use of the CRAN library Rfit by Kloke and McKean. 17

FIXED EFFECT MODELS
We have at our disposal K independent two-sample studies of T 1 versus T 2 . For the kth study, let X k1 , X k2 , … , X kn (1) k denote the n (1) k responses under T 1 and let Y k1 , Y k2 , … , Y kn (2) k denote the n (2) k responses under T 2 . We also make use of the vector notation: X k = (X k1 , … , X kn (1) k ) ′ and Y k = (Y k1 , … , Y kn (2) k ) ′ . We often refer to these as respectively samples 1 and 2 for the kth study. Let n k = n (1) k + n (2) k denote the total sample size for study k. Assume that these sample items follow the location model X ki = (1) k + e ki i = 1, … , n (1) k Y kj = (1) k + Δ + e k,n (1) where the random errors e k1 , e k2 , … , e kn k are iid with continuous pdf f k (x) and cdf F k (x). We assume without loss of generality that the median of e ki is 0 and, further, that f k (0) > 0, (ie, the true median is unique). Then (1) k and (1) k + Δ are the respective medians of X ki and Y kj . Hence, Δ is the shift in location from responses receiving treatment T 1 to responses receiving treatment T 2 . The parameter Δ is our parameter of interest; hence, we consider inference procedures consisting of estimators of Δ along with a confidence interval for Δ.
Because we are discussing rank-based procedures, we set the location functional to be the median; however, we could have used any other location functional as well, for example the mean (assuming that the mean exists). For any location functional, the shift in location from responses receiving treatment T 1 to responses receiving treatment T 2 is Δ. We allow the distributions of the random errors to vary across studies, but we do assume that the shift Δ remains the same. This allows, for example, variation in noise from study to study. Within a study, although, we have assumed homogeneous random error distributions for the samples, we also discuss procedures which allow for heteroscedastic random error distributions, (random error distributions which differ in scale), within a study. Further, our simulation studies include heteroscedastic situations as well as homogeneous situations.
Let n = ∑ K k=1 n k denote the total sample size of all K studies. To discuss the theory of the procedures, we make the following design assumptions: and where 0 < * (k) and * (1) + · · · + * (K) = 1.
Other assumptions are specific to the procedure and are stated when the procedure is presented.

Generic meta-analysis procedure
Below we define the estimators of Δ that we investigate in this paper. The associated meta-analysis of each is similar down to location and scale estimators. To avoid repetition, we discuss the meta-analysis using a generic estimator of Δ. We denote the generic estimator for the kth study byΔ G,k =Δ G,k (X k , Y k ). Assume thatΔ G,k has an asymptotic normal distribution with mean Δ and variance 2 G,k , N(Δ, 2 G,k ). In general, 2 G,k is a function of scale parameters and sample sizes.
Assume that̂2 G,k is a consistent estimator of 2 G,k . Based on the asymptotic distribution ofΔ G,k , the fixed-effect model for Δ G,k is given byΔ in Hettmansperger and McKean 18 for a general rank based location functional estimator, where Z 1 , … , Z k are iid N(0, 1) random variables. Treating the equality in expression (4) as exact, the maximum likelihood estimator of Δ is the weighted averageΔ * where the weights are given by These weights are often called the inverse variance weights. Note that the weights sum to one. For k = 1, 2, … , K, letŵ G,k denote the estimated weights where in expression (6) 2 G,k is replaced by its consistent estimator̂2 G,k . Then the estimator of Δ for the generic procedure isΔ BecauseΔ G is a finite sum of independent random variables each of which is asymptotically normal; the weights sum to 1; and the estimators of the scale parameters are consistent, it follows under the design assumptions (2) and (3) thatΔ G is asymptotically normal with asymptotic mean Δ.
For an associated confidence interval, we need an estimator of the variance ofΔ G . SinceΔ G is a sum of independent random variables this can be calculated, but, as discussed below, in general this variance is a complicated expression and, more importantly, for several of our procedures, due to limited information from the K studies, this is not possible. Meta-analyses, however, involve samples, which suggests the use of a resampling procedure to estimate the variance. For meta-analysis where the effect is a percentile, Ozturk and Balakrishnan 13 provided an alternative method to estimate the variance of the weighted average by using a Jackknife resampling procedure. We next outline this Jackknife procedure to estimate the variance ofΔ G and then using it to obtain a confidence interval for Δ. The empirical evidence in Section 4 verifies the success of this Jackknife in terms of inference.
For i = 1, … , K, letΔ G,−i be the generic estimator of Δ based on K − 1 studies where the ith study has been omitted. The formula forΔ G,−i is given byΔ where the weights areŵ where t ∕2,K−1 is the ∕2 quantile of a t-distribution with K − 1 degrees of freedom. This is the confidence interval of Δ for the generic meta-analysis based on the Jackknife. For a given method, our meta-analysis for it is completely specified once the estimator of the shift parameter Δ is defined for each study and the parameter 2 G,k is identified along with its corresponding estimator̂G ,k . Then based on G,k , k = 1, … , K, the weights are formulated as in expression (6), leading to the weighted estimator of Δ along with the associated confidence interval for Δ based on the Jackknife.
Generally, our meta-analysis procedures satisfy the following equivariances which are stated using the generic procedure. For the kth study (k = 1, … , k), any a > 0, and all b, These imply the following equivariances for the meta-analysis inference.
This implies, for example, that we can use convenient forms of the error distributions in the simulation study, without loss of generality.

Full sample meta-analyses
We discuss our meta-analyses for traditional methods based on means (least squares (LS)) and for three robust methods, one of which is based on medians while the other two are based on Hodges-Lehmann estimators. In terms of rank-based procedures, medians are the inversion of sign tests (the rank score function is the sign function) while Hodges-Lehmann estimators are the inversion of Wilcoxon type tests (the rank score function is the linear function). These are the most popular rank-based methods. Both medians and Hodges-Lehmann estimators are robust. In terms of efficiency, though, the median has the low asymptotic efficiency value of 0.63 relative to the mean when the random error distribution is normal, while, the Hodges-Lehmann has the high asymptotic efficiency value of 0.955 relative to the mean. As discussed in Chapters 1 and 2 of Hettmansperger and McKean, 18 other score functions can be used, but sample medians and Hodges-Lehmann type estimators may also be available when only limited information (not full samples) is available from the studies. We label the procedures with acronyms which we use in the subsequent sections. We first consider the case when full samples are available from the studies.

2.2.1
Least squares meta-analyses: Least squares non-pooled and least squares pooled samples For the least squares procedures, the estimator of Δ for the kth study is the difference in sample meanŝ Provided the random errors have finite variance, the asymptotic distribution ofΔ L,k is normal with mean Δ and variance where 2 k is the variance of the random errors. We define two LS procedures based on the estimator of variance. The first is LSNP that uses the non-pooled estimate of 2 k . In this case the estimate of variance ofΔ L,k for the kth study iŝ where, for j = 1, 2, s j,k is the jth sample standard deviation in study k. Combining these yields the meta estimator of Δ for the LSNP procedure given byΔ As discussed for the generic procedure, we then Jackknife this estimator to obtain an estimate of the variance that we use to compute the associated confidence interval for Δ to complete the inference. This procedure allows for heteroscedastic error distributions within a study. The estimator (14) is the estimator of Δ for the traditional non-pooled least squares meta-analysis. The second LS meta-procedure (LSPS) uses the pooled estimate of 2 k given by Then the estimate of variance ofΔ L,k for the kth study iŝ .
Using these estimates of variance, we can proceed as in the LSNP case to determine the weighted average estimator of Δ and its associated Jackknife confidence interval. The theory for the LSPS procedure assumes homogeneous error distributions within a study.

2.2.2
Median meta-analyses: Median meta-analyses non-pooled and median meta-analyses pooled sample For the kth study, denote the medians of samples 1 and 2 bŷk ,1 and̂k ,2 , respectively. Then the estimator of Δ for the kth study is the differenceΔ The asymptotic distribution of̂k ,j , j = 1, 2, is normal with mean k,j and variance 2 M,k ∕n see, for example, chapter 1 of Hettmansperger and McKean. 18 As an estimator of the scale parameter M,k , we use the consistent estimator which is proportional to the length of a distribution-free confidence interval for the median.
This estimator performed well in terms of the empirical validity, efficiency, and power in a large Monte Carlo study conducted by McKean and Schrader. 19 For the level of confidence, they recommended using 95% confidence interval. Then for the first sample in the kth study, this estimator is given bŷ where c = (n (1) The value of c follows from the normal approximation to the binomial, bin(n, 1∕2), distribution with a continuity correction. The first factor on the right-side of expression (16) is a degree of freedom correction. For our example and simulation study in this paper, we used the Rfit function taustar for computation of̂M ,k,1 . Using independence between the samples in the kth study implies that the asymptotic distribution of Δ M,k is normal with mean Δ and variance As with LS, we present two median meta-analyses, non-pooled (MMNP) and pooled (MMPS). For the non-pooled estimator, use the estimated variancê2 wherêM ,k,1 and̂M ,k,2 are the respective estimates of M,k , (15), for the first and second samples in study k. Hence, the weighted average estimate of Δ for the MMNP procedure is This is the estimator that is Jackknifed to obtain the associated confidence interval for Δ. As with LS non-pooled procedure, the MMNP procedure allows for heteroscedastic error distributions within a study. For the pooled median procedure, the estimated weights are based on wherê2 M,k is the pooled estimator̂2 Usinĝ2 M,k , the weights are obtained and the MMPS weighted average estimator of Δ can be formulated along with its associated Jackknife confidence interval. The theory for this procedure assumes homogeneous error distributions within a study.

2.2.3
Hodges-Lehmann meta-analyses: Hodges-Lehmann non-pooled and Hodges-Lehmann pooled sample In this section, we discuss two procedures based on the difference of the one-sample Hodges-Lehmann location estimators. For the kth study consider the first sample. Then the Hodges-Lehmann estimator is the median of the Walsh averages that we denote aŝH Assuming that the pdf f k (x) of the random errors is symmetric about 0 then̂H ,k,1 has an asymptotic normal distribution with mean k,1 and variance 2 W,k ∕n (1) k where see, for example, chapter 1 of Hettmansperger and McKean. 18 The W stands for Wilcoxon. We use the consistent estimator of the scale parameter W,k,1 proposed by Koul et al, 20 which we label as the KSM estimator; see, also, section 3.7.1 of Hettmansperger and McKean 18 for a discussion of this estimator. We denote the KSM estimator bŷW ,k,1 = . The subscript 1 denotes the first sample, but to avoid confusion we occasionally use the long form also. In this paper, we use the Rfit function gettauF0 for computation of̂W ,k,1 .
Similar results hold for the second sample in the kth study. The estimator of Δ in the kth study is then the differencê Because the samples are independent,Δ H,K is asymptotically normal with mean Δ and variance This result holds if the error pdf is symmetric, an assumption that we do not want to make. If the error distribution is asymmetric, thenΔ H,K is still asymptotically normal with mean Δ but with a different variance; see Høyland. 21 On the other hand, the KSM estimator,̂W ,k , is a consistent estimator of 2 W,k for asymmetric as well as symmetric error distributions. With this in mind, our̂'s for the HL procedures use the KSM estimator.
We define two HL procedures, non-pooled and pooled. The non-pooled procedure uses weights based on wherêW ,k,1 and̂W ,k,2 are the respective estimates of W,k based on the first and second samples in study k; that is, Hence, the weighted average estimate of Δ for the HLNP procedure is This is the estimator that is Jackknifed to obtain the associated confidence interval for Δ. As with LS non-pooled procedure, the HLNP procedure allows for heteroscedastic error distributions within a study. For the pooled HL procedure (HLPS), the estimated weight for the kth study is based on wherê2 H,k is the pooled estimator̂2 Using these weights, the HLPS weighted average estimator of Δ can be formulated along with its associated Jackknife procedure to obtain the confidence interval for Δ.

Wilcoxon meta-analyses: Wilcoxon non-pooled and Wilcoxon pooled sample
Each of the K studies is a two-sample location problem, so a natural rank-based estimator of Δ is the Hodges-Lehmann estimator of the shift Δ defined bŷ The asymptotic distribution ofΔ W,k is normal with mean Δ and variance where W,k is defined in expression (21); see, for example, chapter 2 of Hettmansperger and McKean. 18 To avoid confusion we use W, instead of HL, to denote these Wilcoxon procedures. Consider the residuals Koul et al. 20 Recall that the estimator is a function of the difference of the residuals so the nuisance parameter 1k need not be estimated. For the pooled procedure, WPS, the estimated weight for the kth study is based on̂2 Hence, the weighted average estimate of Δ for the WPS procedure is This is the estimator that is Jackknifed to obtain the associated confidence interval for Δ. The WNP procedure allows for heteroscedastic error distributions within a study. It uses the same estimatorΔ W,k , (26), of Δ but the weights are based on̂2 wherêW ,k,1 and̂W ,k,2 are the respective estimates of W,k based on the first and second samples in study . These are the same weights that are used by the HLNP procedure.

Notes on theory
The rank-based theory cited above holds for Model (1). For heteroscedastic random error distributions within a study, it holds if the pdfs of the random errors are also symmetric. Since the normal distribution is symmetric, this includes the classical Behrens-Fisher problem. For the asymmetric heteroscedastic case the differences of means, medians, and Hodges-Lehmann estimators may be estimating different parameters, as discussed in Høyland. 21

Limited information methods
The robust rank-based meta-analyses discussed in Section 2.2 are essentially based on full samples. The LS analyses, though, require only the sample means and variances from the studies. In this section, we discuss modifications of the rank-based analyses of Section 2.2 due to limited information. Within each method, (median, Hodges-Lehmann, and Wilcoxon), several new methods are discussed. These new meta-analysis methods can be categorized to three groups: (1) meta-analysis uses sample sizes as weights (LI), (2) meta-analysis uses the median of the reported medians of individual studies (LI2), and (3) meta-analysis uses the Walsh averages of the reported medians of individual studies (LI3).

Median methods: MMLI, MMLI2, and MMLI3
Suppose for each study, the sample medians and corresponding distribution-free confidence intervals for the median are available. In this case, using the confidence intervals, the scale parameters M,k are estimated as in (16), except if the confidence level of the interval is (1 − ) and ≠ 0.05 then the critical value z .975 is replaced by z 1− ∕2 . These are essentially the MMNP and MMPS methods. For the next three methods only sample medians are available.
For the method MMLI, suppose for each study k only the sample medianŝk ,2 and̂k ,1 are available. In this case, we still have the same estimators of shift that were used in the MMNP and MMPS procedures; that is, the difference of sample medians,Δ M,k =̂k ,2 −̂k ,1 , k = 1, … , K. For the weighted average, we use the weights given by The estimate of Δ is then the weighted average given by, The Jackknife procedure as described in Section 2.1 is then used to estimate the variance ofΔ MMLI and to form the confidence interval for Δ.
For our MMLI2 procedure we useΔ M2 as the estimator of Δ and this confidence interval with = 0.05.
For the method MMLI3 procedure, form the Walsh averages of the sampleΔ M,1 , … ,Δ M,K ; that is, Then the estimator of Δ is the median of these Walsh averages, and the confidence interval is the distribution-free confidence interval associated with the Hodges-Lehmann estimator, where D ′ (i) are the ordered Walsh averages and d Note that the underlying sample for the methods MMLI2 and MMLI3 consists of independent random variables but not identically distributed random variables, because of the different sample sizes of the studies. In contrast, the MMLI procedures accounts for the different sample size in its weights. This difference may be the reason that the MMLI procedure outperforms the MMLI2 and MMLI3 procedures in terms of efficiency in our simulation studies of Section 4.

2.3.2
Hodges-Lehmann method: HLLI Suppose for the kth study, k = 1, … , K, the Hodges-Lehmann estimates of the medians are available, that is,̂H ,k,1 and H,k,2 . Assume also that the corresponding distribution-free confidence intervals are available. Then the estimate of the shift for the kth study is given by (22). Furthermore, using the distribution-free confidence intervals for each sample in the study, the scale parameter W,k is consistently estimated as discussed in section 1.5.5 of Hettmansperger and McKean. 18 So with this information we essentially have the meta analyses HLNP and HLPS.
Consider next, the case where only the Hodges-Lehmann estimates are available. Then the estimator of shift for the kth study is given in expression (22), that is, The weight for the kth study is p 2 k defined in expression (30). The estimate of Δ is then the weighted average given by, The Jackknife procedure as described in Section 2.1 is then used to estimate the variance ofΔ HLLI and to form the confidence interval for Δ. We call this the HLLI meta-analysis. Similar to the MMLI2 and MMLI3 methods, the procedures HLLI2 and HLLI3 based on the sample of estimated shiftŝ Δ H,k without using sample size weights can be formulated. In general though, our initial simulations studies showed that the HLLI method outperforms them.

Wilcoxon method: Wilcoxon non-pooled LI
If the two-sample Wilcoxon estimates, (26), are available for each study, then limited information Wilcoxon methods can be constructed. In the case that the distribution-free two-sample confidence intervals for Δ are also available then the scale parameter W,k , (21), is consistently estimated as discussed in section 2.4.5 of Hettmansperger and McKean. 18 This procedure is essentially the WP meta-analysis. Suppose then we only have the two-sample Wilcoxon estimates, (26), available; that is, For WNPLI procedure, the weighted average estimator uses the weights defined in expression (30). The estimate of Δ is then the weighted average given by, The Jackknife procedure as described in Section 2.1 is then used to estimate the variance ofΔ WNPLI and to form the confidence interval for Δ. Similar to the sign procedures MMLI2 and MMLI3, procedures WLI2 and WLI3 based on the sample of estimated shiftsΔ W,k without using the sample size weights can be formulated. In general, though, our initial Monte Carlo studies indicate that they are outperformed by the meta-analysis WLI.

RANDOM-EFFECTS MODEL
The meta-analysis random-effect model accounts for variation between studies at the estimation stage. We begin by discussing the meta-analysis of the random-effect model for the generic procedure of Section 2.1. For the kth study, k = 1, … , K, letΔ G,k be the generic estimator of the shift parameter Δ. The random-effect model is defined by, where a k is the random effect due to study k. Assume that a 1 , … , a K are independent and identically distributed with E(a k ) = 0 and V(a k ) = 2 a . The random variables a k and e k are assumed to be independent of one another. Also, in the absence of the random effect, this model is the fixed-effect model forΔ k . Hence, the distribution for e k is the asymptotic distribution ofΔ k − Δ in the fixed-effect model which is N(0, 2 k ); see expression (4). Based on the random-effect model, the asymptotic variance ofΔ k is Thus for the random-effect model,Δ G,k has the asymptotic N(Δ, 2 a + 2 k ) distribution. For the correct weights, an estimator of 2 a is required. We consider the method of moments estimator proposed by DerSimonian and Laird 9 that we now briefly sketch for our procedures.
As a measure of variation, consider the sums of squares between the estimators of study shifts and the final estimator of Δ that is given by Rewrite the final generic estimator, (7), asΔ whereÂ T denotes the sum of the weights. For the derivation, treat the equality in expression (36) as exact. Next, replace the estimators of 2 G,k and A T by their true parameters. It then follows that Var(Δ G ) = 1∕A T . Further by adding in and out the parameter Δ inside the parentheses of the expression for Q 2 and expanding, an algebraic derivation shows that Replacing the left-side with Q 2 and reinserting the estimators of 2 k , results in the method of moments estimator of 2 a given bŷ2 Because k > 0, for k = 1, … , K, the denominator is positive but the numerator, (Q 2 − (K − 1)), may not be positive. Also, E(Q 2 ) = K − 1 for the fixed-effect model. Hence, to avoid negative estimates of variance, we use the conventional estimator Under the random-effect model, the meta-analysis for the generic procedure proceeds as follows. For each study k, k = 1, … , K, the fixed-effect model is fit and the estimate of shiftΔ G,k along with its asymptotic variance 2 G,k are obtained as in Section 2. Next, the weighted averageΔ G , (36), and̂2 G,a , (39) are obtained. Then the new estimator of the variance The final estimator for Δ isΔ * For inference, the Jackknife procedure outlined in Section 2 is followed. This results in estimate of the variance ofΔ * G which we label asB * G . Similar to Section 2, an approximate 100(1 − )% confidence interval for Δ is given by .
wherêM ,k,1 and̂M ,k,2 are the respective estimates of M,k , (15), for the first and second samples in study k. The weighted average estimate of Δ for the MMNP random-effect procedure is theñ This estimator is Jackknifed to obtain the associated confidence interval for Δ. We use the acronym MMNPREST for this meta-analysis. The pooled version is MMPSREST. Likewise, for the other full sample meta-analyses for the random-effect model we use the acronyms: LSNPREST, LSPSREST, HLNPREST, HLPSREST, WPSREST, and WNPREST.

Limited information methods
The meta-analysis procedures for the random-effect model, discussed above, are essentially based on full samples within the studies. In particular, the full samples are used to obtain standard errors of the shift estimators. So these procedures cannot be used for the limited information situations when only the estimates of the shift are available. Instead, we use a bootstrap to obtain an estimate of the variance of the final estimator of Δ that we use to obtain a confidence interval for the shift Δ. This procedure is similar for all the rank-based estimates, so we describe it in terms of a generic estimator.
For study k, k = 1, … , K, letΔ G,k be the generic estimator of the shift Δ. Then the final estimator of Δ is given bỹ where p k are the weights given in expression (30). These weights are functions of the sample sizes only. Since this is an average, a simple bootstrap of the summands suggests itself. Let v G = {Δ G,1 , … ,Δ G,K } be the vector of the items resampled and let v p = {p 1 , … , p k } be the vector of corresponding weights. Then for our bootstrap we draw random samples of size K with replacement from v G along with their corresponding weights from v p . These sampled weights are then normalized so that they sum to one. Suppose then that we draw n B such bootstrap samples. For the ith bootstrap sample, i = 1, … , n B , letΔ * G,i denote the final meta-analysis estimator for the i-th bootstrap replication. Then our estimate of the variance ofΔ G is the sample variance of the sampleΔ * G,1 , … ,Δ * G,n B that we denote by V * G . The confidence interval for Δ is then given by For the limited information procedures based on the sample medians, the Hodges-Lehmann one-sample estimators, and the two-sample Wilcoxon estimators, we need only substitute their respective estimates of the shift in the generic procedure. We label these procedures respectively as MMNPREBS, HLNPREBS, and WNPREBS.

MONTE CARLO STUDIES
In this section, we report the results of four Monte Carlo studies. In the first simulation study the fixed-effect model is used; the second involves the fixed-effect model but the samples within a study are heteroscedastic; the third utilizes the random-effect model; while the fourth is a small empirical power study.

Setup
For the homogeneous fixed-effect simulations, the data were generated from Model (1). There are two settings for the number of studies, K = 15 and K = 75. We consider nine different random error distributions, that cover symmetric, long-tailed, and heavy-tailed distributions. These distributions are the normal distribution, the contaminated normal distribution (CN), the skewed contaminated normal distribution (SCN), the Laplace distribution, the logistic distribution, the logF distribution, the Cauchy distribution, the logexp distribution, and the skewed normal distribution (SN). All the procedures are location and scale equivariant so without loss of generality convenient forms of these distributions were used; for example, the standard normal distribution was used for the normal situations. For the CN distribution, the percentage of contamination was 15% and the ratio of variances was 100 to 1; for the SCN distribution, the percentage of contamination was 15%, the ratio of variances was 100 to 1; and the mean of the contaminated part was set at 3; for the log F distribution, the degrees of freedom were 2 and 0.4; and for the SN distribution, the shape parameter was set at 5. Within a study, sample sizes were randomly selected from the set {25, 30, 35, … , 100}. So there are 9 × 2 = 18 situations. For each situation, 5000 simulations were run. All the fixed-effect meta analyses discussed in Section 2 were run in the simulation. Their estimates of Δ and the corresponding confidence intervals were captured.

Summary of the results
We measure the validity of a meta-analysis in terms of the empirical coverage of the nominal 95% confidence interval that was defined for each analysis in Section 2. In Tables 1,2,3, and 4 for each situation, these empirical coverages are found in the column EC. As a goal, we use the interval (0.9400, 0.9600) as our criteria for labeling a meta-analysis valid. This is the 99.88% confidence interval 0.9500 ± 3.24 √ .95(0.05)∕5000) for binomial proportion with the value 0.95 and sample size 5000. Of the two errors, we are more concerned about the liberal error; that is, empirical coverages falling below 0.9400. So, we label a meta-analysis to have satisfactory validity for a situation if its empirical coverage is ≥ 0.9400.
In the column labeled RE of Tables 1,2,3, and 4 we report the relative efficiencies of the meta-analyses for each situation. For each particular meta-analysis say G, this is the ratio of MSE of the LS based nonpooled meta-analysis LSNP to the MSE of the meta-analysis G. Hence, ratios less than one imply that the LSNP meta-analysis is more empirically efficient while ratios greater than one imply that G is more efficient.
The following is an itemized summary of the results:  For comparisons of the analyses in terms of efficiency, the LS analyses dominated the other analyses at the normal and skewed normal situations, except for the normal situation with K = 75 where it essentially tied the HLLI analysis. The HL and W-type meta-analyses have relative efficiency with their LS counterparts close to 0.95 which is the asymptotic relative efficiency between Wilcoxon methods and LS methods in the one-and two-sample location problems. For the other distributions, the rank-based meta-analyses dominated their LS counterparts. For the heavy-tailed distributions these relative efficiencies were quite large; for example, for K = 15, the relative efficiencies between WNP and LSNP are 22.8, 7.2, and 8.0 for the Cauchy, contaminated normal, and skewed contaminated normal situations, respectively. Among the rank-based analyses, for the Cauchy and Laplace distributions, MMNP performed best while WNP performed best for the other distributions. The relative efficiencies of the Hodges-Lehmann analysis HLNP were close to WNP except for the heavy-tailed skewed distributions where WNP clearly dominated HLNP.

Heteroscedastic fixed-effect Monte Carlo study
In this set of simulations, we consider situations in which there is heteroscedasticity between the samples within the studies. For such situations, the assumption of symmetric error distributions ensures that all our estimators of shift are estimating (consistent for) the same parameter Δ. Hence, for these simulations we omitted the four skewed distributions that were used in the homogeneous study. We consider two sets of simulations. For the first set, the ratio of the scale between the two distributions within a study is set at 2, while for the second set this ratio is set at 3. We denote these cases respectively by a s = 2 and a s = 3. As in the homogeneous case, for each of these simulations there are two parts, one part consisting of of K = 15 studies and the other with K = 75 studies.
The results of the simulations reported in Tables 5,6,7, and 8 follow: 1. Validity. Unless otherwise noted, we discuss the a s = 2 and a s = 3 sets of simulations together. The comments concerning validity for the homogeneous situations generally hold for the heteroscedastic situations. For most situations, the meta-analyses are valid. As in the homogeneous case, the limited information meta-analyses MMLI, HLLI, and WLI are slightly liberal for the K = 15 situations but not the K = 75 situations. It is generally just the reverse for the limited information analysis MMLI2. The LS meta-analyses are not valid for the CN situations for K = 15 for both the a s = 2 and a s = 3 sets of simulations. 2. Efficiency. General comments concerning the empirical relative efficiencies of the meta-analyses are similar to the comments for the homogeneous situations with one notable exception. In the homogeneous situations, within each of   For the normal situations, LSNP is the most efficient meta-analysis. Among the robust meta-analyses, HLNP is the most efficient with relative efficiency about 0.93 overall normal situations. MMNP dominated all other analyses for the Cauchy situations and two of the Laplace situations, while WNP dominated in the K = 75 Laplace situations. For the logistic and contaminated normal situations, either HLNP or WNP dominated. Between HLNP and WNP over all 20 situations, HLNP won 11 times and WNP won 9 times. For the limited informations procedures, generally MMNP was more efficient than MMLI. Over all situations, MMLI dominated MMLI2 and MMLI3 meta-analyses. The analyses HLNP and WNP were more efficient than their limited information counterparts.

Random-effect Monte Carlo study
The set up for the random effects simulations is the same as that for the fixed-effect simulations except for the generation of the random effect. We selected a normal distribution for the random effects with mean 0 and variance 2 a and with two settings for the variance: 2 a = 0.1 and 2 a = 0.3. All the meta-analyses discussed in Section 3 were run.

4.3.1
Results of the random-effect simulations Tables 9,10,11,12,13, and 14 contain the results of the random-effect simulations. As with the fixed-effect simulations the outcomes displayed are the empirical confidence levels for nominal 95% confidence intervals and relative efficiencies that are relative to the LS meta-analysis LSNPREST. The only empirical confidence levels less than 0.94 are those for the LSNPEST, Cauchy K = 15 situations that are 0.934 and 0.931 respectively for the 2 a = 0.1 and 2 a = 0.3 situations. The other meta-analyses are valid for all situations.
In terms of efficiency, generally, the NP meta-analyses are more efficient than their PS counter parts. For the few exceptions where this is reversed, there is only a slight edge for the PS analyses. For the normal situations, the LS meta-analysis LSNPREST outperforms the robust meta-analyses. The empirical relative efficiencies of Wilcoxon type and Hodges-Lehmann type procedures are over 0.95, while the median type procedures have low efficiencies of about 0.80. For the SN situations, the Wilcoxon meta analyses outperform the other analyses but the edge was slight, except for the median based analysis which have RE's in the 0.80's. For the Laplace situations, the median meta-analyses outperform TA B L E 9 Results for K = 15, random-effects simulations. the others; however, the relative efficiencies of the Wilcoxon and Hodges-Lehmann type are quite close to those of the median type. For the logistic situations, the Wilcoxon types outperform the other procedures but the Hodges-Lehmann types are quite close. For the heavy-tailed situations, the robust analyses outperformed LS meta-analyses by a large margin. The Wilcoxon type analyses perform the best except that the median procedures have the best performance for the Cauchy situations.
In general, the Wilcoxon analyses outperform the Hodges-Lehmann analyses. This edge is slight except for the heavy right-tailed distributions where the Wilcoxon did outperform the Hodges-Lehmann analysis. Over all the situations, the limited information analyses perform nearly as well as their full samples counterparts. In terms of the variance 2 a , for most of the situations, the relative efficiencies were higher for the 2 a = 0.1 situations than the 2 a = 0.3 situations.

Power simulation
For a power comparison of the meta-analyses, four fixed-effect procedures were selected and random errors were drawn from four different distributions with same setup as in the homogeneous fixed-effect study. The number of studies was set at K = 15, and within a study, sample sizes were randomly selected from the set {25, 30, 35, … , 100}. The number of simulations for each situation was 5000. The following meta-analyses were run: LSNP, MMNP,HLNP, and WNP. The comparison of power curves of the meta-analyses against effect size Δ are presented in Figure 1.
For the normal situations, MMNP lagged in empirical power to the other meta-analyses whose empirical power curves are indistinguishable on the plot. For the contaminated normal situations, the HLNP, WNP, and MMNP are much more powerful than the LSNP analysis. Further, the Wilcoxon and Hodges-Lehmann analyses are more powerful than the TA B L E 13 Results for K = 75, random-effects simulations. MMNP analysis. For the Laplace situations, the MMNP analysis had a slight edge in empirical power over the WNP and HLNP analyses but a definite edge in power over the LSNP analysis. Finally for the logF situations, the WNP analysis dominated the other analyses in empirical power.

EXAMPLE-PLATELET COUNT REDUCTION FOR MALARIA-INFECTED PATIENTS
The hematological study conducted by Aninagyei 22 is a multi-center case-control study, that contains recorded data from seven public healthcare facilities in Ghana. The dataset contains several immunological, hematological, and clinical attribute variables, but we only considered platelet counts in the blood that is defined as the number of platelets per micro-liter of blood. The grouping variable is whether or not the patient is infected with malaria. The comparison boxplots display the platelet counts in Figure 2 for malaria (left plot) versus non-malaria over the seven studies. The plots indicate that platelet counts are higher for the non-malaria patients. This appears to hold for all seven studies.
The comparison boxplots also indicate that the response variable is drawn from a right-skewed distribution. This is more apparent based on the following residual plot displayed in Figure 3. The left panel of this plot displays the normal q − q-plot of the residuals based on the MMNP fit of the data. The right-tail is longer than expected from a normal distribution while the left-tail is shorter than expected. Hence, the plot indicates a skewed error distribution with heavy right-tails. The right panel displays the residuals based on the LS fit, LSNP. Notice that the extreme residuals standout more for the robust procedure than for the LS procedure.
Using all seven studies, we applied our fixed-effect and random-effect meta-analyses, using full sample and limited information methods, to estimate Δ the true platelet count difference between malaria and non-malaria infected patients. Table 15 presents a comparison of the fixed-effect meta-analyses. The columns contain the estimates of Δ, the corresponding 95% confidence intervals, and the lengths of the confidence intervals. As a measure of efficiency (precision), for each analysis, the last column of the table, with the heading e(⋅, LSNP), contains the ratio of the squared lengths of confidence interval of the LSNP analysis to the squared length of the confidence interval for that analysis. Hence, ratios less than one indicate that the LSNP is more efficient (precise), while ratios greater than one indicate that the given analysis is more F I G U R E 2 Comparison boxplots of platelet counts for malaria and non-malaria infected patients in Ghana over the seven studies.

F I G U R E 3
Normal q − q-plot for the platelet counts dataset. The left panel is based on MMNP residuals while the right panel is based on LSNP residuals. efficient (precise). Theoretically, this ratio converges to the asymptotic relative efficiency; see chapter 2 of Hettmansperger and McKean. 18 In terms of inference, because zero is not in any of the confidence intervals, all fixed effect meta-analyses indicate that the non-malaria group has significantly higher platelet counts than the infected malaria group with 95% confidence. The MMNP analysis estimates the reduction in platelets Δ to be −89.4 platelets per micro-liter of blood with the 95% confidence interval (−99.8, −79.0). In contrast, LSNP analysis estimates Δ to be −80.6. This gap between the estimates is about two median standard errors, (based on the MMNP confidence interval). Table 16 shows the individual studies estimates of Δ for the MMNP and LSNP analyses. For study 6, the outliers in the right-tail of the infected group adversely affected the sample mean resulting in a much smaller estimate of Δ. This is true for studies 2, 3, and 4 also.
In terms of efficiency, though, all the robust meta-analyses, except for the MMLI2 analysis, are much more efficient (more precise) than the LSNP analysis. The LS meta-analyses were impaired by the outliers in the right-tail of the generating distribution. The limited information Wilcoxon, WNPLI, analysis is the most efficient robust meta-analysis. The MM analyses, MMNP and MMLI, agreed and are also highly efficient. The Wilcoxon analysis outperformed the Hodges-Lehmann counterparts, reflecting the skewness of the data. The MMLI2 and MMLI3 analyses perform the worse. These are the only analyses that do not take sample size into account.

TA B L E 15
Meta-analysis fixed-effect estimate of platelet counts for malaria and non-malaria infected patients in Ghana.  Table 17 presents a comparison of random-effect meta-analyses on the example. The format of the table is the same as that for the fixed-effects analyses. The results are similar to the fixed-effect results. All the robust analyses are more efficient (precise) than the least squares LSNPREST analysis. The limited information median, MMREST, analysis is the most efficient meta-analysis with relative efficiency 416.1% relative to the LSNPREST analysis. As with the fixed-effect analyses, the outliers have an adverse effect on the LS procedure. The MMPREST analysis is also highly efficient. As with the fixed-effect results, due to the skewness of the data, the Wilcoxon analysis outperforms the Hodges-Lehmann counterparts.

CONCLUDING REMARKS
In this paper, we have developed meta-analysis procedures for two-sample designs where the error structure of the two-sample model of individual studies may not be normal and may have heavy-tails (skewed as well as symmetric) and be prone to outliers. We have developed analyses for both the full sample studies and summary statistics (limited information) situations. The final estimator of the common effect Δ is a weighted average of robust estimators. Along with the estimation of Δ, we utilize a Jackknife to robustly estimate the variance of the estimate of Δ in order to conduct inference on Δ. We have developed meta-analyses for both fixed-effect and random-effect models. For random-effect full sample analyses we utilize the method of moments estimator of DerSimonian and Laird 9 to estimate the variation between studies. For situations where the studies' information consists of summary statistics, we make use of a bootstrap to estimate the total variance. Our rank-based (nonparametric) meta-analyses are based on robust estimators. These estimators have bounded influence functions and positive breakdown points; see chapters 1 and 2 of the Reference 18 for details. Further, the Hodges-Lehmann and Wilcoxon estimators are highly efficient. For normally distributed random errors, their efficiency relative to the LS estimator is 0.955, while for heavy-tailed error distributions their relative efficiency is generally much larger than 1. These robustness and efficiency properties are verified in our simulation studies in Section 4 for both symmetric and skewed heavy-tailed error distributions. The rank-based meta analysis offers protection against error distributions that are outlier prone and/or have heavy-tails.
In the literature, there are parametric approaches that model aggravated data with likelihoods based on heavy-tailed distributions. For example, Lee and Thomson 23 use symmetric and skewed t-distributions to model the fixed and random effects based on LS aggravated data while Baker and Jackson 24 use mixtures of normal and exponential distributions (the exponentially modified Gaussian distribution) to model the fixed and random effects based on LS aggravated data. It is hoped that using these likelihoods will offer protection against outliers and heavy-tailed distributions.
The parametric approach involves choosing distributional models. These often involve three or more parameters which must be estimated. Due to small sizes of the aggravated data, the computational algorithms may be unstable. Lee and Thompson 23 avoid this by casting the model in a Bayesian framework and using MCMC methods. This, however, involves choosing prior distributions. On the other hand, for the nonparametric (rank-based) approach no choice of models is necessary. In the random effects problem, also, no choice of the distribution of the random effect is necessary. The error distributions can be skewed as well as symmetric and have light to moderate or heavy-tails.
We conducted four Monte Carlo studies: over fixed-effect models; fixed-effects but individual study models are heteroscedastic; random-effect models; and a power comparison study. Error distributions included the normal and heavier-tailed distributions, (skewed and symmetrical). The outcomes of interest are validity that is measured by the empirical confidence level of a nominal 95% confidence interval for Δ and relative efficiency that is measured by the ratio of MSE's of traditional LS meta-analysis to the robust meta-analyses. Overall, the robust meta-analyses were valid with most of the empirical coverages in the interval (94%, 96%). For the normal situations, the empirical relative efficiencies of the Hodges-Lehmann and Wilcoxon type meta-analyses were close to 95%. For the heavier-tailed situations, these relative efficiencies were much higher. Hence, the Hodges-Lehmann and Wilcoxon type meta-analyses are highly efficient, robust meta-analyses. For the heteroscedastic situations the meta-analyses that allow for different variances (non-poolers) considerably outperformed the meta-analyses that did not allow for different variances (poolers). Furthermore, the non-poolers did as well as the poolers in the homogeneous situations. Hence, in general, we recommend non-pooler meta-analyses. The results of the power comparison study reinforce the results of the efficiency studies. For normal error, the LS meta-analysis was slightly more powerful than the Hodges-Lehmann and Wilcoxon type meta-analyses, while for heavy-tailed situations these robust analyses had substantially more power than the LS analysis.
In general, if the user desires a highly efficient, robust meta-analysis, we would recommend the HLNP or WNP meta-analysis for the full samples case and their counterparts HLLI or WNPLI for the summary statistics case. Between HLNP or WNP, neither one consistently dominated the other; although, for heavy-tailed skewed data WNP is more efficient. For very heavy tailed data we also recommend the MMNP meta-analysis. The behavior of the meta-analyses on the malaria data serves as a verification of their behavior on skewed data with heavy-tails.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are openly available in Harvard Dataverse at http://dx.doi.org/10.7910/ DVN/DMUDQ3.