Rank-based multiple test procedures and simultaneous confidence intervals

We study simultaneous rank procedures for unbalanced designs with independent observations. The hypotheses are formulated in terms of purely nonparametric treatment effects. In this context, we derive rank-based multiple contrast test procedures and simultaneous confidence intervals which take the correlation between the test statistics into account. The individual test decisions and the simultaneous confidence intervals are compatible: whenever an individual hypothesis has been rejected by the multiple contrast test, the corresponding simultaneous confidence interval does not include the null, i.e. the hypothetical value of no treatment effect. The procedures allow for testing arbitrary purely nonparametric multiple linear hypotheses (e.g. many-to-one, all-pairs, changepoint, or even average comparisons). We do not assume homogeneous variances of the data; in particular, the distributions can have different shapes even under the null hypothesis. Thus, a solution to the multiple nonparametric Behrens-Fisher problem is presented in this unified framework.


Introduction
In many experiments more than two treatment groups are involved. In such experiments, the global null hypothesis, i.e. no impact of the factor "treatment" on the response, is often not the main question. In statistical practice, however, inference on the effects of interest is traditionally carried out in three steps. First, the global null hypothesis is tested by an appropriate procedure (e.g. ANOVA). Second, if the global null hypothesis is rejected, multiple comparisons are used to test the different sub-hypotheses. Third, confidence intervals for the effects of interest are computed.
Although stepwise procedures that apply different approaches to the same data are common in practice, they have an undesirable property: the global null hypothesis may be rejected although none of the individual hypotheses is rejected, and vice versa. That is, the global test procedure and the multiple testing procedure may be non-consonant (Gabriel 1969 [14]). Furthermore, the confidence intervals may include the null, i.e. the value of no treatment effect, even if the corresponding individual null hypotheses have been rejected. That is, the individual test decisions and the corresponding confidence intervals may be incompatible (Bretz, Genz, and Hothorn 2001 [3]). In randomized clinical trials, the computation of compatible simultaneous confidence intervals (SCI), i.e. confidence intervals which always lead to the same test decisions as the multiple comparisons, is consequently required by regulatory authorities: "Estimates of treatment effects should be accompanied by confidence intervals, whenever possible..." (ICH E9 Guideline 1998, chap. 5.5, p. 25 [24]). It is well known that the classical Bonferroni adjustment can be used to perform multiple comparisons as well as to compute compatible SCI. This approach, however, has low power, particularly when the test statistics are not independent. In practice, the computation of compatible SCI that account for the correlation is often neglected, because no adequate procedures exist. Thus, there is a demand for statistical procedures that can be used for (i) testing the global null hypothesis, (ii) performing multiple comparisons, and (iii) computing compatible SCI, while taking the correlation between the test statistics into account. Such procedures are of particular practical importance.
In recent years, parametric multiple contrast test procedures (MCTP) with accompanying compatible SCI for linear contrasts in terms of the expectations of homoscedastic normal samples were derived by Mukerjee, Robertson, and Wright (1987) [28] and Bretz, Genz, and Hothorn (2001) [3]. These MCTP take the correlation between the test statistics into account. The procedures by Bretz et al. (2001) [3] can be used for testing arbitrary contrasts, e.g. many-to-one, all-pairs, average, or even changepoint comparisons. Thus, MCTP provide an extensive tool for the computation of compatible SCI. They are calculated by establishing the exact joint multivariate $t$-distribution of the different test statistics with correlation matrix $\mathbf{R}$ for testing the individual hypotheses, and they control the familywise type I error rate (FWER) in the strong sense (Hochberg and Tamhane 1987 [22]). The results by Bretz et al. (2001) [3] were extended to general linear models by Hothorn, Bretz, and Westfall (2008) [23] and to heteroscedastic models by Hasler and Hothorn (2008) [20] and Herberich, Sikorski, and Hothorn (2010) [21]. For a comprehensive overview of existing parametric methods we refer to Bretz, Hothorn, and Westfall (2010) [4]. We note that the parametric SCI use critical values from the extreme tail of the multivariate $t$-distribution, which is the portion most sensitive to nonnormality (Gao et al. 2008 [17]). Therefore, when the normality assumption is violated, the problem of robustness is more serious for SCI than for individual intervals. In practice, however, skewed data or even ordered categorical data occur in a natural way. Thus, there is a demand for nonparametric MCTP and compatible SCI, particularly in the case of ordinal data and discontinuous distributions.
Munzel and Hothorn (2001) [30], Wolfsegger and Jaki (2006) [40], and Ryu (2009) [35] derive nonparametric SCI for the treatment effects $p_{ij} = P(X_i < X_j) + \frac{1}{2} P(X_i = X_j)$, where $X_i$ and $X_j$ are independently distributed. In the literature, $p_{ij}$ is known as the relative effect (Brunner and Puri 2001 [10]; Gao et al. 2008 [17]) or, in the case of ordered categorical data, as the ordinal effect size measure (Ryu and Agresti 2008 [36]; Ryu 2009 [35]). We note that $p_{ij}$ is an intransitive measure, i.e. it may occur that $p_{12} \le p_{23} \le p_{31}$ (Brown and Hettmansperger 2002 [5]). Therefore, the use of pairwise defined relative effects in multiple comparisons may result in paradoxical statements in terms of Efron's dice (see, e.g. Gardner 1974 [18]; Thangavelu and Brunner 2006 [39]) and should be avoided.
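The intransitivity of pairwise relative effects can be checked by hand on Efron's dice. The following minimal sketch computes $P(X < Y) + \frac{1}{2}P(X = Y)$ exactly for one common labelling of the four dice; the function name `relative_effect` and the particular face values are illustrative assumptions and are not taken from the cited references.

```python
from fractions import Fraction
from itertools import product


def relative_effect(x, y):
    """Exact P(X < Y) + 1/2 P(X = Y) for two dice with equally likely faces."""
    less = sum(1 for u, v in product(x, y) if u < v)
    ties = sum(1 for u, v in product(x, y) if u == v)
    return Fraction(2 * less + ties, 2 * len(x) * len(y))


# One common labelling of Efron's dice (assumed here for illustration).
A = [0, 0, 4, 4, 4, 4]
B = [3, 3, 3, 3, 3, 3]
C = [2, 2, 2, 2, 6, 6]
D = [1, 1, 1, 5, 5, 5]

# Each die tends to show larger values than the next one, and the chain closes.
cycle = [(B, A), (C, B), (D, C), (A, D)]
effects = [relative_effect(x, y) for x, y in cycle]
print(effects)
```

Every pairwise effect along the cycle exceeds 1/2, so no ordering of the four dice is consistent with all pairwise statements.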
The purpose of this article is to propose rank-based MCTP and compatible SCI for transitive relative effects in unbalanced one-way designs with independent observations, a fixed number of levels, and arbitrary contrasts. Their derivation requires the asymptotic joint distribution of rank statistics under arbitrary alternatives. The covariance structure of the rank statistics has a quite involved representation in Puri (1964) [32] and Rust and Fligner (1984) [33], who consider only continuous distributions. Here, we represent the structure of the covariance matrix in a simple and unified way that allows for discontinuous distributions, and we provide procedures for multiple comparisons and SCI. For a user-friendly application of the proposed methods, the freely available R-software package nparcomp was developed.
We further note that the MCTP proposed here do not assume homogeneous variances. The distributions can have different shapes even under the null hypothesis. Thus, a solution to the multiple nonparametric Behrens-Fisher problem will be presented in a closed form. The new procedures generalize the well-known multiple rank sum tests by Dunn (1964) [11], Steel (1959 [37], 1960 [38]), and Gao et al. (2008) [17] to heteroscedastic designs.
The paper is organized as follows. The nonparametric model, treatment effects, and hypotheses are presented in Section 2. A unified estimation approach for relative effects and the asymptotic multivariate normality of linear rank statistics in this general setup are derived in Section 3, where also consistent estimators of the parameters of this distribution are given. In Section 4, MCTP and compatible SCI are derived. A modification of the test statistics and approximations to their finite-sample distributions are presented in Section 5. As a practical example, a real data set is analyzed in Section 6. All procedures follow from a general asymptotic theory, which is presented in the Appendix.
Throughout the article we use the following notation. By $\mathbf{1}_a$ we denote the $a \times 1$ column vector of 1's and by $\mathbf{I}_a = \mathrm{diag}\{1, \ldots, 1\}$ the $a$-dimensional unit matrix. Further, $\otimes$ denotes the Kronecker product and $\oplus$ the Kronecker sum of two matrices. Finally, $\mathrm{vec}(\cdot)$ denotes the operator which stacks the columns of a matrix on top of each other.

Nonparametric model and hypotheses
Let $X_{ik}$ be the $k$th (independent) replicate in the $i$th group among the total of $a$ groups. Let $n_i$ be the sample size within the $i$th group, and let $N = \sum_{i=1}^{a} n_i$ be the total sample size. Let $F_i(x) = P(X_{ik} < x) + \frac{1}{2} P(X_{ik} = x)$ denote the normalized version of the distribution function, which is the average of the left-continuous and right-continuous versions of the distribution function. In the context of nonparametric models, the normalized version of the distribution function $F_i(x)$ was first mentioned by Lévy (1925) [27]. Later on, it was used by Ruymgaart (1980) [34], Akritas, Arnold, and Brunner (1997) [1], Munzel (1999) [29], and Gao et al. (2008) [17], among others, to derive asymptotic results for rank statistics including the case of ties in a unified way. We note that the $F_i$ may be arbitrary distributions, with the exception of the trivial case of one-point distributions. The general model specifies only that

$$X_{ik} \sim F_i, \quad i = 1, \ldots, a; \; k = 1, \ldots, n_i, \tag{2.1}$$

and does not require that the distributions are related in any parametric way; in particular it does not require homoscedasticity (Akritas et al. 1997 [1]). Factorial designs can be described in this setup by putting a "factor pattern" on the index $i$ in the same way as in the theory of linear models. The vector of the distributions is denoted by $\mathbf{F} = (F_1, \ldots, F_a)'$.

The general model (2.1) does not entail any parameters by which a difference between the distributions could be described. Therefore, the distribution functions $F_i(x)$ are used to define treatment effects by

$$p_i = \int G \, dF_i, \quad i = 1, \ldots, a, \tag{2.2}$$

where $G = \sum_{i=1}^{a} w_i F_i$ denotes a mean distribution in its weighted ($w_i = n_i/N$; Akritas et al. 1997 [1]) or unweighted ($w_i = 1/a$; Brunner and Puri 2001 [10]; Gao et al. 2008 [17]) form. If $p_i < p_j$, then the values from $F_i$ tend to be smaller than those from $F_j$. In case of $p_i = p_j$, none of the observations from $F_i$ and $F_j$ tend to be smaller or larger. Gao et al. (2008, p. 2576) [17] state that the unweighted relative effect "has the advantage of not being influenced by the allocation of sample sizes in the data". Therefore, we will mainly concentrate on this effect throughout the paper. Paradoxical statements in terms of Efron's dice cannot occur, because each comparison in (2.2) refers to a fixed reference distribution $G$ (Thangavelu and Brunner 2006 [39]).

We will rewrite $p_j$ as a linear combination of the pairwise effects $p_{ij} = \int F_i \, dF_j$:

$$p_j = \int G \, dF_j = \sum_{i=1}^{a} w_i \int F_i \, dF_j = \sum_{i=1}^{a} w_i \, p_{ij}. \tag{2.3}$$

Let $\mathbf{p} = (p_1, \ldots, p_a)' = \int G \, d\mathbf{F}$ denote the vector of the relative treatment effects. The representation of $p_j$ in (2.3) enables a simple representation of the covariance matrix of linear rank statistics under arbitrary alternatives. We further note that the weights $w_i$ may be arbitrarily chosen under the constraint $\sum_{i=1}^{a} w_i = 1$. To have a reasonable interpretation of hypotheses in terms of these generalized relative effects, the weights should not depend on the sample sizes.
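To make the reference-distribution idea concrete, the following sketch evaluates the unweighted relative effects $p_j = \sum_i w_i p_{ij}$ with $w_i = 1/a$ for four discrete distributions (Efron's dice, in one common labelling). The helper names and the dice values are illustrative assumptions; the point is only that each $p_j$ is a single number measured against the same mean distribution $G$, so the resulting effects can always be ordered.

```python
from fractions import Fraction
from itertools import product


def pairwise_effect(x, y):
    """Exact p_xy = P(X < Y) + 1/2 P(X = Y) for equally likely support points."""
    less = sum(1 for u, v in product(x, y) if u < v)
    ties = sum(1 for u, v in product(x, y) if u == v)
    return Fraction(2 * less + ties, 2 * len(x) * len(y))


def relative_effects(groups, weights):
    """p_j = sum_i w_i * p_ij, i.e. each group compared with the mean distribution G."""
    return [
        sum(w * pairwise_effect(g_i, g_j) for w, g_i in zip(weights, groups))
        for g_j in groups
    ]


# Efron's dice in one common labelling (assumed for illustration).
dice = [
    [0, 0, 4, 4, 4, 4],
    [3, 3, 3, 3, 3, 3],
    [2, 2, 2, 2, 6, 6],
    [1, 1, 1, 5, 5, 5],
]
weights = [Fraction(1, len(dice))] * len(dice)  # unweighted form, w_i = 1/a
p = relative_effects(dice, weights)
print(p)
```

Although the pairwise effects of these dice form a cycle, the effects relative to $G$ are ordinary numbers and therefore admit a consistent ordering.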
In the nonparametric setup discussed above, Akritas et al. (1997) [1] propose to formulate hypotheses in terms of the distribution functions as $H_0^F: \mathbf{C}\mathbf{F} = \mathbf{0}$, where $\mathbf{C}$ denotes an arbitrary contrast matrix, i.e. $\mathbf{C}\mathbf{1}_a = \mathbf{0}$, and derive global test procedures for $H_0^F$. Gao et al. (2008) [17] consider the family of hypotheses

$$\Omega^F = \{H_0^F: \mathbf{c}_\ell' \mathbf{F} = 0, \; \ell = 1, \ldots, q\}, \tag{2.4}$$

where $\mathbf{c}_\ell' = (c_{\ell 1}, \ldots, c_{\ell a})$ denotes an arbitrary contrast. They derive multiple test procedures for many-to-one and all-pairs comparisons. All test procedures for $H_0^F$, however, are limited to testing problems and cannot be used to construct confidence intervals for the underlying treatment effects $\delta_\ell = \mathbf{c}_\ell' \mathbf{p}$. In this paper, we consider the family of hypotheses

$$\Omega^p = \{H_0^p: \mathbf{c}_\ell' \mathbf{p} = 0, \; \ell = 1, \ldots, q\}, \tag{2.5}$$

and we derive MCTP for $\Omega^p$ and compatible SCI for the effects $\delta_\ell = \mathbf{c}_\ell' \mathbf{p}$.
We note that the hypothesis in the classical Behrens-Fisher model is contained in this general setup as a special case. This is easily seen from the fact that $p_{ij} = 1/2$ if $F_i$ and $F_j$ are both symmetric distributions with the same center of symmetry. The nonparametric hypothesis $H_0^F: \mathbf{C}\mathbf{F} = \mathbf{0}$ is very general and implies $H_0^p: \mathbf{C}\mathbf{p} = \mathbf{0}$. In a location model with location parameters $\mu_i$, $i = 1, \ldots, a$, the nonparametric and parametric hypotheses in terms of the $\mu_i$ are equivalent. For a detailed discussion of the hypotheses formulated above we refer to Akritas et al. (1997) [1] and Brunner and Munzel (2000) [8].

Asymptotic normality of linear rank statistics
Rank estimators of the quantities $p_j$ defined in (2.3) are derived by replacing the unknown distribution functions $F_i(x)$ by their empirical counterparts $\hat{F}_i(x)$ (Ruymgaart 1980 [34]). For $i \ne j$, the normed placements can be computed from ranks by

$$\hat{F}_i(X_{jk}) = \frac{1}{n_i}\left(R^{(ij)}_{jk} - R^{(j)}_{jk}\right). \tag{3.1}$$

Note that $R^{(ij)}_{jk}$ is the rank of observation $X_{jk}$ among all $n_i + n_j$ observations in the combined sample $(i, j)$, and $R^{(j)}_{jk}$ is the internal rank of $X_{jk}$ among all $n_j$ observations in sample $j$. If there are no ties, then $R^{(ij)}_{jk}$ is the usual rank of $X_{jk}$. In the presence of ties, $R^{(ij)}_{jk}$ is the midrank of $X_{jk}$. The quantities $n_i \hat{F}_i(X_{jk})$ are also called placements (Orban and Wolfe 1982 [31]).
To estimate the relative effect $p_{ij}$ used in (2.3), we use the normed placements given in (3.1) and obtain, for $i \ne j$,

$$\hat{p}_{ij} = \int \hat{F}_i \, d\hat{F}_j = \frac{1}{n_i}\left(\bar{R}^{(ij)}_{j\cdot} - \frac{n_j + 1}{2}\right), \tag{3.2}$$

where $\bar{R}^{(ij)}_{j\cdot} = n_j^{-1} \sum_{k=1}^{n_j} R^{(ij)}_{jk}$ is the mean of the ranks in sample $j$, and $\hat{p}_{jj} = 1/2$. Thus, one obtains an estimator of $p_j$ in (2.3) as a linear combination of the $\hat{p}_{ij}$ in (3.2) by

$$\hat{p}_j = \sum_{i=1}^{a} w_i \, \hat{p}_{ij}. \tag{3.3}$$

This representation of the estimator provides a unified approach for both the usual ranks and the so-called pseudo ranks (Gao and Alvo 2005 [15], 2008 [16]; Gao et al. 2008 [17]). The usual ranks are obtained by letting $w_i = n_i/N$ and the pseudo ranks by letting $w_i = 1/a$. Let $\hat{\mathbf{p}} = (\hat{p}_1, \ldots, \hat{p}_a)'$ denote the vector of the estimators $\hat{p}_j$ in (3.3). We note that $\hat{p}_j$ is an unbiased and consistent estimator of $p_j$, which follows from the unbiasedness and consistency of $\hat{p}_{ij}$ (Brunner et al. 2002 [9]).
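As a plausibility check of the rank representation, the sketch below estimates a pairwise relative effect in two ways: from the mean midrank of sample $j$ in the combined sample, assuming the usual form $\hat{p}_{ij} = n_i^{-1}(\bar{R}^{(ij)}_{j\cdot} - (n_j+1)/2)$, and by direct counting of $P(X_i < X_j) + \frac{1}{2}P(X_i = X_j)$. The function names and the small tied data set are hypothetical and chosen only for illustration.

```python
def midranks(values):
    """Midrank of every element within `values` (ties share the average rank)."""
    return [
        sum(1 for v in values if v < x) + (sum(1 for v in values if v == x) + 1) / 2
        for x in values
    ]


def p_hat_ranks(x_i, x_j):
    """Estimate p_ij from the mean rank of sample j in the combined sample (i, j)."""
    n_i, n_j = len(x_i), len(x_j)
    combined = list(x_i) + list(x_j)
    ranks_j = midranks(combined)[n_i:]  # ranks belonging to sample j
    mean_rank_j = sum(ranks_j) / n_j
    return (mean_rank_j - (n_j + 1) / 2) / n_i


def p_hat_counts(x_i, x_j):
    """The same quantity by counting pairs, with ties counted as one half."""
    total = sum((u < v) + 0.5 * (u == v) for u in x_i for v in x_j)
    return total / (len(x_i) * len(x_j))


# Hypothetical ordinal scores with ties.
x1 = [0, 0, 1, 1, 2]
x2 = [1, 2, 2, 3]
print(p_hat_ranks(x1, x2), p_hat_counts(x1, x2))
```

Both routes give the same value, which illustrates why midranks handle ties without any further adjustment.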
The asymptotic equivalence stated in the next theorem will facilitate the representation of the asymptotic covariance matrix in a simple and unified form.
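where $\doteq$ denotes the asymptotic equivalence of two sequences of random variables. Gao et al. (2008) [17] state that the asymptotic covariance matrix of $\sqrt{N}\,\mathbf{C}(\hat{\mathbf{p}} - \mathbf{p})$ leads to a simple representation under the hypothesis $H_0^F: \mathbf{C}\mathbf{F} = \mathbf{0}$.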
The representation of the covariance matrix of (3.4) by means of $G$ leads to the same obstinate structure as the representation by Puri (1964) [32]. Therefore, we first rewrite the right-hand side of (3.4) componentwise by some simple algebraic arguments, which expresses the $j$th component in terms of random variables $Z_{ij}$. For a convenient vectorial representation, we define the random vector $\mathbf{Z} = \mathrm{vec}[(Z_{ij})_{i,j=1,\ldots,a}]$. Let $\mathbf{W} = (w_1, \ldots, w_a) \otimes \mathbf{I}_a$ denote the known matrix of weights $w_1, \ldots, w_a$. Thus, an equivalent representation of the asymptotically equivalent sums defined in (3.4) is given by $\sqrt{N}\,\mathbf{W}\mathbf{Z}$, with covariance matrix

$$\mathbf{V}_N = \mathbf{W}\boldsymbol{\Sigma}\mathbf{W}', \tag{3.5}$$

where $\boldsymbol{\Sigma}$ has the elements $\sigma_{ij,rs}$ with, in particular, $\sigma_{ij,ij} = \frac{N}{n_j}\theta_{ij,ij} + \frac{N}{n_i}\theta_{ji,ji}$ and $\sigma_{ij,ji} = -\sigma_{ij,ij}$. Note that the representation of $p_j$ as a linear combination of the $p_{ij}$ leads to the simple representation of the structure of $\mathbf{V}_N$ as given in (3.5). The asymptotic normality of the linear rank statistic $\sqrt{N}(\hat{\mathbf{p}} - \mathbf{p})$ will be given in the next theorem.
has, asymptotically, as $N \to \infty$, a multivariate normal distribution with expectation $\mathbf{0}$ and covariance matrix $\mathbf{V}_N$.
A consistent estimator of $\mathbf{V}_N$ will be provided in the next section.

Estimation of the covariance matrix
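Note that it is sufficient to derive consistent estimators of the variances $\theta_{ij,ij} = \mathrm{Var}(Y_{ij1})$ and the covariances $\theta_{ij,rs} = \mathrm{Cov}(Y_{ij1}, Y_{rs1})$ as given in (3.6). If the random variables $Y_{ijk}$ were observable, then a natural estimator of the covariance $\theta_{ji,si}$ (case 3 in (3.6), for example) would be the empirical covariance given in (3.7). The random variables $Y_{ijk}$, however, are not observable, and, for the computation of an estimator, they must be replaced by observable random variables which are "close enough" to the originals in an appropriate norm. To this end, let $\hat{Y}_{ijk} = \hat{F}_i(X_{jk})$ denote the normed placements as given in (3.1) and define the centered placements $\hat{D}_{ijk}$ as in (3.8). Then, an estimator of $\mathbf{V}_N = \mathbf{W}\boldsymbol{\Sigma}\mathbf{W}'$ is given by $\hat{\mathbf{V}}_N = \mathbf{W}\hat{\boldsymbol{\Sigma}}_N\mathbf{W}'$, where $\hat{\boldsymbol{\Sigma}}_N$ denotes the matrix $\boldsymbol{\Sigma}$ with $\sigma_{ij,rs}$ replaced by the estimators $\hat{\sigma}_{ij,rs}$; in particular, $\hat{\sigma}_{ij,ij} = \frac{N}{n_j}\hat{\theta}_{ij,ij} + \frac{N}{n_i}\hat{\theta}_{ji,ji}$ and $\hat{\sigma}_{ij,ji} = -\hat{\sigma}_{ij,ij}$. Here, $\hat{\theta}_{ij,ij}$ and $\hat{\theta}_{ij,rs}$ denote the empirical variances of the $\hat{Y}_{ijk}$ and the empirical covariances of the $\hat{Y}_{ijk}$ and $\hat{Y}_{rsk}$, respectively. Theorem A.4 shows that $\hat{\mathbf{V}}_N$ is a consistent estimator of $\mathbf{V}_N$.

The asymptotic distribution of $\sqrt{N}(\hat{\mathbf{p}} - \mathbf{p})$ and the estimator $\hat{\mathbf{V}}_N$ can now be used for the derivation of MCTP for $\Omega^p$ and compatible SCI for $\delta_\ell = \mathbf{c}_\ell'\mathbf{p}$.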

Multiple contrast test procedures
In order to develop MCTP for the family of hypotheses $\Omega^p$ defined in (2.5), we first need to derive the test statistics for each individual hypothesis $H_0^p: \mathbf{c}_\ell'\mathbf{p} = 0$, $\ell = 1, \ldots, q$. Let $T_\ell^p = \sqrt{N}\,\mathbf{c}_\ell'(\hat{\mathbf{p}} - \mathbf{p}) / \sqrt{\hat{v}_{\ell\ell}}$ with $\hat{v}_{\ell\ell} = \mathbf{c}_\ell'\hat{\mathbf{V}}_N\mathbf{c}_\ell$. By the asymptotic normality of $\sqrt{N}\,\mathbf{c}_\ell'(\hat{\mathbf{p}} - \mathbf{p})$ and Slutsky's theorem, it follows that $T_\ell^p \xrightarrow{d} N(0, 1)$. The test statistics $T_\ell^p$ are collected in the vector

$$\mathbf{T} = (T_1^p, \ldots, T_q^p)'. \tag{4.1}$$

Note that the MCTP and compatible SCI are derived by establishing the asymptotic joint distribution of the test statistics $\mathbf{T}$ defined in (4.1). Although the exact marginal distributions of the test statistics $T_\ell^p$ depend on the sample sizes and are not identical for unbalanced designs with finite replications, the marginal distribution of $T_\ell^p$ is asymptotically standard normal, as $N \to \infty$. Further, the asymptotic distribution of $\mathbf{T}$ is derived under arbitrary alternatives. Thus, it is completely specified under any configuration of the null hypotheses.
Proof. Let $\mathbf{C} = (\mathbf{c}_\ell')_{\ell=1,\ldots,q}$ denote the contrast matrix obtained from the $q$ single contrasts $\mathbf{c}_\ell'$. The proof follows by the asymptotic normality of $\sqrt{N}\,\mathbf{C}(\hat{\mathbf{p}} - \mathbf{p})$ and Slutsky's theorem.
Corollary 1 particularly states that the type of contrast is incorporated in the correlation matrix of the vector of test statistics $\mathbf{T}$. For a strong control of the FWER, however, the limiting joint distribution of the test statistics under alternatives is not sufficient. In addition, it must be shown that the limiting distribution of the statistics is completely specified under arbitrary intersections of the null hypotheses, i.e. that the family of hypotheses $\Omega^p$ and $\mathbf{T}$ constitutes a joint testing family.

Lemma 1. The family of hypotheses $\Omega^p$ and the corresponding test statistics $\mathbf{T}$ constitute a joint testing family asymptotically.
So far, multiple comparison rank procedures compute the variances and covariances of the statistics under the global null hypothesis (Munzel and Hothorn 2001 [30]). Thus, the resulting test statistics do not constitute a joint testing family asymptotically, and do not provide a strong control of the FWER (Hochberg and Tamhane 1987, p. 249 [22]). Next we will derive a simultaneous test procedure (STP) from the joint testing family $\{\Omega^p, \mathbf{T}\}$.
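Let $z_{1-\alpha,2,\mathbf{R}}$ denote the two-sided $(1-\alpha)$ equicoordinate quantile of the multivariate normal distribution $N(\mathbf{0}, \mathbf{R})$, i.e.

$$P\left(\bigcap_{\ell=1}^{q} \{-z_{1-\alpha,2,\mathbf{R}} \le X_\ell \le z_{1-\alpha,2,\mathbf{R}}\}\right) = 1 - \alpha$$

for $\mathbf{X} = (X_1, \ldots, X_q)' \sim N(\mathbf{0}, \mathbf{R})$ (Bretz et al. 2001 [3]). We write $z_{1-\alpha,2,\mathbf{R}}$ to emphasize that it is the two-sided equicoordinate quantile; one-sided quantiles are written as $z_{1-\alpha,1,\mathbf{R}}$. For bivariate distributions, $z_{1-\alpha,2,\mathbf{R}}$ geometrically forms a cuboid having a square base. The quantiles become smaller with an increasing correlation (see Figure 1). For the numerical computation of $z_{1-\alpha,2,\mathbf{R}}$ we refer to Bretz et al. (2001) [3] and Genz and Bretz (2009) [19].

The asymptotic correlation matrix $\mathbf{R}$, however, is unknown and must be estimated. Let $\hat{v}_{\ell\ell}$ and $\hat{v}_{\ell m}$ denote the consistent estimators of $v_{\ell\ell}$ and $v_{\ell m}$ in Corollary 1, obtained by replacing $\mathbf{V}_N$ with $\hat{\mathbf{V}}_N$ as given in Theorem A.4. Then, a consistent estimator of the correlation matrix $\mathbf{R}$ is given by $\hat{\mathbf{R}} = (\hat{r}_{\ell m})_{\ell,m=1,\ldots,q}$, where $\hat{r}_{\ell m} = \hat{v}_{\ell m} / \sqrt{\hat{v}_{\ell\ell}\,\hat{v}_{mm}}$. Thus, the set $\{\Omega^p, \mathbf{T}, z_{1-\alpha,2,\hat{\mathbf{R}}}\}$ of hypotheses, corresponding test statistics, and one critical value for all individual hypotheses constitutes an asymptotic STP (Gabriel 1969 [14]). The strong error control of the proposed method is shown in the next theorem.

For large sample sizes, the individual hypothesis $H_0^p: \mathbf{c}_\ell'\mathbf{p} = 0$ will be rejected at a two-sided multiple level $\alpha$ if $|T_\ell^{0.5}| \ge z_{1-\alpha,2,\hat{\mathbf{R}}}$. Asymptotic $(1-\alpha)$ simultaneous confidence intervals for the treatment effects $\delta_\ell = \mathbf{c}_\ell'\mathbf{p}$ are obtained from

$$\left[\mathbf{c}_\ell'\hat{\mathbf{p}} - z_{1-\alpha,2,\hat{\mathbf{R}}}\sqrt{\hat{v}_{\ell\ell}/N}\,;\; \mathbf{c}_\ell'\hat{\mathbf{p}} + z_{1-\alpha,2,\hat{\mathbf{R}}}\sqrt{\hat{v}_{\ell\ell}/N}\right], \quad \ell = 1, \ldots, q. \tag{4.3}$$

Note that the test decision for $H_0^p: \mathbf{c}_\ell'\mathbf{p} = 0$ and the SCI defined in (4.3) are compatible by construction. This means that whenever an individual hypothesis is rejected, the corresponding confidence interval does not include the null. Further, for large sample sizes, the global null hypothesis $H_0^p: \mathbf{C}\mathbf{p} = \mathbf{0}$ will be rejected at a two-sided multiple level $\alpha$ if $\max\{|T_1^{0.5}|, \ldots, |T_q^{0.5}|\} \ge z_{1-\alpha,2,\hat{\mathbf{R}}}$.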
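The role of the correlation in the two-sided equicoordinate quantile can be illustrated with a crude Monte Carlo approximation. The sketch below simulates $\mathbf{X} \sim N(\mathbf{0}, \mathbf{R})$ and takes the empirical $(1-\alpha)$ quantile of $\max_\ell |X_\ell|$. This is not the numerical integration approach of the cited references or of the mvtnorm package; the function names, the seed, and the simulation size are assumptions made for illustration only.

```python
import math
import random


def cholesky(R):
    """Lower-triangular L with L L' = R for a positive definite matrix R."""
    q = len(R)
    L = [[0.0] * q for _ in range(q)]
    for i in range(q):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L


def equicoordinate_quantile_mc(R, alpha=0.05, n_sim=100_000, seed=1):
    """Monte Carlo estimate of z with P(max_l |X_l| <= z) = 1 - alpha, X ~ N(0, R)."""
    rng = random.Random(seed)
    L = cholesky(R)
    q = len(R)
    maxima = []
    for _ in range(n_sim):
        e = [rng.gauss(0.0, 1.0) for _ in range(q)]
        # Correlate the independent draws: x = L e has covariance R.
        x = [sum(L[i][k] * e[k] for k in range(i + 1)) for i in range(q)]
        maxima.append(max(abs(v) for v in x))
    maxima.sort()
    return maxima[math.ceil((1 - alpha) * n_sim) - 1]


z_indep = equicoordinate_quantile_mc([[1.0, 0.0], [0.0, 1.0]])
z_corr = equicoordinate_quantile_mc([[1.0, 0.9], [0.9, 1.0]])
print(z_indep, z_corr)
```

With two positively correlated statistics the simulated critical value is smaller than in the independent case, which is the gain in power over a Bonferroni-type adjustment that the text refers to.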
In practical applications it can be reasonable to consider one-sided confidence regions. For the special cases of trend alternatives and genetic models, compatible SCI based on pairwise rankings are provided by Konietschke and Hothorn (2012) [25] and Konietschke, Libiger, and Hothorn (2012) [26].
Corollary 2. Under the assumptions of Theorem 2, the vector Proof. The proof follows by Theorem 2 and by Cramér's multivariate Delta-Theorem.
Since $g(x_\ell)$ and $g^{-1}(y_\ell)$ are both strictly monotone transformations, the range-preserving SCI are compatible with the individual test decisions, by construction.
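The compatibility of test decisions and SCI can be illustrated numerically. The sketch below assumes intervals of the usual form $\mathbf{c}_\ell'\hat{\mathbf{p}} \pm z\sqrt{\hat{v}_{\ell\ell}/N}$ with $\hat{v}_{\ell\ell} = \mathbf{c}_\ell'\hat{\mathbf{V}}_N\mathbf{c}_\ell$ and statistics $\sqrt{N}\,\mathbf{c}_\ell'\hat{\mathbf{p}}/\sqrt{\hat{v}_{\ell\ell}}$. All inputs (the estimates, the covariance matrix, the sample size, and the critical value 2.3) are hypothetical placeholders and are not results from the paper; in an application the critical value would be the equicoordinate quantile for the estimated correlation matrix.

```python
import math


def contrast_inference(C, p_hat, V_hat, N, z):
    """Test statistic, decision, and interval for every contrast (row of C)."""
    results = []
    for c in C:
        a = len(c)
        delta = sum(ci * pi for ci, pi in zip(c, p_hat))
        v = sum(c[r] * V_hat[r][s] * c[s] for r in range(a) for s in range(a))
        half_width = z * math.sqrt(v / N)
        t_stat = math.sqrt(N) * delta / math.sqrt(v)
        results.append({
            "estimate": delta,
            "statistic": t_stat,
            "lower": delta - half_width,
            "upper": delta + half_width,
            "reject": abs(t_stat) >= z,
        })
    return results


# Hypothetical inputs for a = 3 groups and all-pairs contrasts.
C = [[-1, 1, 0], [-1, 0, 1], [0, -1, 1]]
p_hat = [0.38, 0.48, 0.64]
V_hat = [[0.06, -0.02, -0.04],
         [-0.02, 0.05, -0.03],
         [-0.04, -0.03, 0.07]]
results = contrast_inference(C, p_hat, V_hat, N=60, z=2.3)
for r in results:
    print(r)
```

Because the same critical value enters both the decision rule and the interval, a hypothesis is rejected in this sketch exactly when its interval excludes zero.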

Small sample approximations and simulation results
The procedures considered in the previous section are valid for large sample sizes. The quality of the approximation of the proposed methods by multivariate normal distributions was investigated in simulation studies for different numbers of factor levels, sample sizes, and kinds of contrasts. The simulations indicate that the convergence of $\mathbf{T}$ defined in (4.1) to its asymptotic multivariate normal distribution is rather slow. In general, the approximation is worse for a large number of factor levels and smaller sample sizes. Thus, we also consider a small sample modification of this statistic. We adopt the Box-type approximation (Box 1954 [2]) proposed by Brunner, Dette and Munk (1997) [6] and Gao et al. (2008) [17] to approximate the distribution of $\mathbf{T}$ by a multivariate $T(\nu, \mathbf{0}, \mathbf{R})$ distribution with $\nu$ degrees of freedom, expectation $\mathbf{0}$, and correlation matrix $\mathbf{R}$.
For each linear contrast $\mathbf{c}_\ell' = (c_{\ell 1}, \ldots, c_{\ell a})$, $\ell = 1, \ldots, q$, define the random variables

$$A_{\ell ik} = c_{\ell i}\left(G(X_{ik}) - w_i F_i(X_{ik})\right) - \sum_{s \ne i} c_{\ell s} w_s F_s(X_{ik}).$$

By reorganizing the asymptotically equivalent sums of random variables in (3.4), it is easily seen that the variance of the $\ell$th contrast can be written in terms of variances $\omega_{\ell i}^2$ of these random variables. With the same arguments as in the proof of Theorem A.4, the unknown variances $\omega_{\ell i}^2$ can be consistently estimated by the corresponding empirical variances. Following Gao et al. (2008) [17], the distribution of $\mathbf{T}$ can be approximated by a multivariate $T(\nu, \mathbf{0}, \mathbf{R})$ distribution with $\nu = \max\{1, \min\{\nu_1, \ldots, \nu_q\}\}$ degrees of freedom.

The quality of the approximation of the finite-sample distributions of the MCTP in (4.1) and in (4.6) by multivariate $T(\nu, \mathbf{0}, \mathbf{R})$ distributions was investigated for Dunnett-type (D), Tukey-type (T), Average-type (A), and changepoint (C) comparisons in different one-way layouts and sample size configurations. The corresponding contrast matrices for these four kinds of contrasts are given in Section A.5.
The results reported here constitute a representative set from a much larger simulation study using R (www.r-project.org). All simulation results were obtained from 10,000 simulation runs. The equicoordinate quantiles were computed with the R-package mvtnorm (Genz and Bretz 2009 [19]). We also include the parametric counterpart of T proposed by Hasler and Hothorn (2008) [20] (H) in the simulation study. This MCTP denotes T without ranking the data. In case of skewed distributions, H tends to be quite conservative in case of positive dependent test statistics (e.g. many-to-one comparisons), but very liberal when the test statistics are negatively dependent (average comparisons). The simulation results for the different designs are displayed in Table 1.
The simulation results indicate that the rank-based MCTP T controls the FWER quite accurately, even in case of small sample sizes, all considered numbers of factor levels, and arbitrary contrasts for both normal and lognormal distributions (see Table 1). For other distributions, e.g. exponential or even ordered categorical data, the simulation results were quite similar and are not shown here. The MCTP T tends to be slightly liberal in case of extremely small sample sizes and large numbers of factor levels. The parametric MCTP H shows a quite liberal or even quite conservative behavior in case of skewed distributions, depending on the chosen contrast. The powers of the rank-based MCTP T and T were compared with the power of the parametric MCTP H for all-pairs comparisons. Both the all-pairs power P("reject all false null hypotheses") and the any-pairs power P("reject any true or false null hypothesis") were investigated for a one-point shift alternative $\boldsymbol{\delta} = (0, 0, 0, \delta)'$. The simulation results for $a = 4$ levels and equal sample sizes ($n_i = 25$) are shown in Figure 2.
The powers of the tests were investigated for both normal and lognormal distributions. For normal distributions, the nonparametric rank tests are nearly as powerful as the parametric MCTP H, even for relatively small sample sizes. For lognormal distributions, the powers of the rank tests are considerably higher than the power of the parametric version. Simulation results for other distributions, unbalanced designs, and different kinds of contrasts were quite similar and are not shown here.

Example
In this section we apply the MCTP and compatible SCI proposed in the previous sections to a dataset with ordinal data analyzed by Akritas et al. (1997) [1]. Originally, two inhalable test substances (drug 1 and drug 2), each in a concentration of 2 ppm, 5 ppm, and 10 ppm, were compared with regard to their irritative activity in the respiratory tract of the rat after subchronic inhalation. At each concentration level, 20 rats were graded on an ordinal scale: 0 = no irritation, 1 = slight irritation, 2 = distinct irritation, and 3 = severe irritation. Here, we only analyze the data for drug 1 with the R-software package nparcomp. The results and point estimators $\hat{p}_2$, $\hat{p}_5$, and $\hat{p}_{10}$ of the relative treatment effects $p_2$, $p_5$, and $p_{10}$ are displayed in Table 2.

From Table 2, it follows that the scores obtained with concentration 2 ppm tend to be smaller ($\hat{p}_2 = .31$) than the scores in group 5 ppm ($\hat{p}_5 = .45$) and in group 10 ppm ($\hat{p}_{10} = .73$), respectively. The MCTP T in (4.6) can be used to test the multiple hypotheses $H_0^p: p_2 = p_5$, $H_0^p: p_2 = p_{10}$, and $H_0^p: p_5 = p_{10}$ at multiple level $\alpha = 5\%$ by taking the correlation between the test statistics into account, as well as for the computation of compatible SCI. The estimated degrees of freedom of the corresponding multivariate $t$-distribution are $\hat{\nu} = 28.72$. The adjusted p-values for the individual hypotheses $H_0^p: \mathbf{c}_\ell'\mathbf{p} = 0$ are calculated by $1 - \Phi(-|T_\ell^{0.5}|\mathbf{1}_3, |T_\ell^{0.5}|\mathbf{1}_3, 28.72, \mathbf{0}, \hat{\mathbf{R}})$, where $\Phi(\cdot, \cdot, \nu, \mathbf{0}, \mathbf{R})$ denotes the cumulative distribution function of the multivariate $T(\nu, \mathbf{0}, \mathbf{R})$ distribution.
The results are displayed in Table 3. It follows immediately that the irritation of the respiratory tract of the rats in group 10 ppm is more severe than the damage caused by 2 ppm and 5 ppm, respectively ($p < .0001$ and $p = 1.67 \cdot 10^{-3}$). The lower bounds of the corresponding 95%-SCI are larger than zero. The data do not provide any evidence to reject the null hypothesis $H_0^p: p_2 = p_5$ at multiple level $\alpha = 5\%$ ($p = 6.31 \cdot 10^{-2}$; 95%-SCI: $[-4 \cdot 10^{-3}; .28]$). We can conclude that the damage gets worse with an increasing concentration of the test substance.

Discussion
Recently, Elliot and Hynan (2011) [13] proposed a SAS macro implementation of a multiple comparison post hoc test for a Kruskal-Wallis analysis. The procedure is an omnibus test based on two steps: (1) testing the global null hypothesis, and (2) performing multiple comparisons. This nonparametric procedure cannot be used for the computation of confidence intervals for the effects of interest. In this manuscript, rank-based MCTP and compatible SCI for transitive relative effects in unbalanced designs have been introduced. The procedures are based on the asymptotic multivariate normality of linear rank statistics under arbitrary alternatives. An explicit expression for the covariance matrix of the rank statistics, as well as their multivariate normality, is obtained in a technically simple and general framework. Subsequent covariance estimation is achieved in terms of the empirical distribution functions. Under this unified framework, the procedures can be used for testing arbitrary multiple linear hypotheses in terms of relative effects, with an accompanying computation of compatible SCI for the treatment effects. Some simulation results demonstrate the practical benefit of the proposed methods. For a convenient application of the proposed methods, the R-software package nparcomp was developed and is available on CRAN.

Here, $\mathbf{D}_1 = \mathrm{diag}\{\lambda_{1,N}, \ldots, \lambda_{j,N}\}$ and $\mathbf{D}_2 = \mathrm{diag}\{\lambda_{j+1,N}, \ldots, \lambda_{a,N}\}$. Thus, $\mathbf{D}_1 \to \mathbf{0}$, by assumption. If $j = a$, then $\mathbf{V}_N = \mathbf{0}$, which can be considered as a multivariate normal one-point distribution. The asymptotic multivariate normality of the sums of independent random variables $\sqrt{N}\,\mathbf{W}\mathbf{Z}$ is now established by the Cramér-Wold device. Let $\mathbf{k} = (k_1, \ldots, k_a)'$ denote an arbitrary vector of constants. Since $\mathbf{S}$ is invertible and thus describes a bijective map, there exists for each $\mathbf{k}$ a vector $\tilde{\mathbf{k}}$ with $\tilde{\mathbf{k}}' = \mathbf{k}'\mathbf{S}$.
From the Lindeberg-Feller limit theorem it follows that This means that the sums of the variances of N k ′ WZ diverge for $N \to \infty$ and Lindeberg's condition is fulfilled, because the random variables N/n i Y ijk are uniformly bounded by the assumption that $N/n_i \le N_0 < \infty$.

A.2. Proof of Lemma 1
Under the assumptions of Theorem 2, $\mathbf{T}$ follows, asymptotically, as $N \to \infty$, a multivariate normal distribution with expectation $\mathbf{0}$ and correlation matrix $\mathbf{R}$. Thus, the asymptotic joint distribution of $\mathbf{T}$ is completely specified. Each test statistic $T_\ell^p$ converges, as $N \to \infty$, to the standard normal distribution. In particular, the asymptotic distribution of $T_\ell^p$ is independent of the distribution of $T_m^p$ ($\ell \ne m$). This means that under arbitrary intersection hypotheses $H_0^p: \bigcap_{j \in J} \{H_0^p: \mathbf{c}_j'\mathbf{p} = 0\}$, the asymptotic joint distribution of $\mathbf{T}_J = \{T_j^p, j \in J\}$ is completely specified. Here, $J \subseteq \{1, \ldots, q\}$ denotes an arbitrary set of indices. This completes the proof.

A.4. Estimation of the covariance matrix $\mathbf{V}_N$
Lemma A.2. Let $\|\cdot\|_\infty$ denote the sup norm and let $\hat{D}_{ijk}$ be as given in (3.8).
where the last step follows from the Glivenko-Cantelli theorem.
Proof. Since $a$ is bounded, it is sufficient to show consistency elementwise. Consider the covariance $\theta_{ji,si} = \mathrm{Cov}(Y_{ji1}, Y_{si1})$ and let $\tilde{\theta}_{ji,si}$ be as given in (3.7). By the strong law of large numbers, $\tilde{\theta}_{ji,si} - \theta_{ji,si} \to 0$ a.s.
For the other elements the proof is basically the same and is therefore omitted. The rest of the proof follows by considering linear combinations of the estimators.

A.5. Simulation settings
The contrast matrices used in the simulation studies (see Section 5) are the Dunnett-type (D), Tukey-type (T), Average-type (A), and changepoint (C) contrast matrices; here $N_{ij} = n_i + \ldots + n_j$, $i < j$, denotes a partial sum of the sample sizes. For a comprehensive overview of different contrasts we refer the reader to Bretz et al. (2001) [3].
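For orientation, the two simplest contrast families can be generated programmatically. The following sketch builds the many-to-one (Dunnett-type) and all-pairs (Tukey-type) contrast matrices for $a$ groups, taking the first group as the control; the function names are hypothetical, and the Average-type and changepoint matrices, which depend on the sample sizes, are not covered.

```python
def dunnett_contrasts(a):
    """Many-to-one contrasts: groups 2, ..., a are each compared with group 1."""
    rows = []
    for j in range(1, a):
        row = [0] * a
        row[0], row[j] = -1, 1
        rows.append(row)
    return rows


def tukey_contrasts(a):
    """All-pairs contrasts: one row for every pair (i, j) with i < j."""
    rows = []
    for i in range(a):
        for j in range(i + 1, a):
            row = [0] * a
            row[i], row[j] = -1, 1
            rows.append(row)
    return rows


D = dunnett_contrasts(4)
T = tukey_contrasts(4)
for row in D + T:
    print(row)
```

Every row sums to zero, which is the defining property $\mathbf{C}\mathbf{1}_a = \mathbf{0}$ of a contrast matrix, and the all-pairs matrix has $a(a-1)/2$ rows.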