Ball: An R package for detecting distribution difference and association in metric spaces

The rapid development of modern technology has produced numerous complex datasets that do not satisfy the axioms of Euclidean geometry, whereas most statistical hypothesis tests are only available in Euclidean or Hilbert spaces. To properly analyze data with more complicated structures, efforts have been made to solve the fundamental testing problems in more general spaces. In this paper, a publicly available R package, Ball, is provided to implement Ball statistical test procedures for the K-sample distribution comparison and the test of mutual independence in metric spaces, which extend the corresponding procedures for two-sample distribution comparison and the test of independence. Tailor-made algorithms as well as engineering techniques are employed in the Ball package to speed up computation to the best of our ability. Two real data analyses and several numerical studies have been performed, and the results certify the power of the Ball package in analyzing complex data, e.g., spherical data and symmetric positive definite matrix data.


Introduction
With advanced modern instruments such as the Doppler shift acoustic radar, the functional magnetic resonance imaging (fMRI) apparatus, and the Heidelberg retina tomograph device, a large number of complex datasets are being collected for contemporary scientific research. For example, to investigate whether the wind directions of two places are distinct, meteorologists measure wind directions by colatitude and longitude coordinates on the earth. Another typical example arises in biology: by using fMRI data, biologists are able to study the association between brain connectivity and age. Although these complex datasets are increasingly prevalent, most classical hypothesis tests cannot handle them directly. Recently, two novel concepts, Ball Divergence (Pan et al. 2018a) and Ball Covariance (Pan et al. 2018b), were proposed to measure the discrepancy between two distributions and the dependence between two random objects in metric spaces, respectively. Ball Divergence (BD) enjoys a remarkable property, homogeneity-zero equivalence, and Ball Covariance (BCOV) holds another brilliant property, independence-zero equivalence. The BD and BCOV statistics, as the empirical versions of BD and BCOV, can tackle the two-sample test and the test of independence, which are special cases of the K-sample test and the test of mutual independence. The BD and BCOV statistics are both robust rank statistics, and the test procedures based on them are consistent against any general alternative hypothesis without distribution or moment assumptions (Pan et al. 2018a,b). Besides, the BD statistic is proved to cope well with imbalanced data, and the BCOV statistic can be standardized to the Ball Correlation statistic to extract important features from ultra-high dimensional data (Pan, Wang, Xiao, and Zhu 2019).
In this paper, we introduce a user-friendly R package Ball (Wang et al. 2018). The Ball package contributes to the open-source statistical software community in the following aspects: (i) it provides the BD two-sample test and the BCOV independence test to R users; (ii) it implements several purpose-built algorithms to accelerate the BD two-sample test and the BCOV independence test; (iii) it provides three powerful K-sample BD test statistics and an efficient K-sample permutation test procedure to distinguish distributions in metric spaces; (iv) it extends the BCOV test statistic to detect mutual dependence among complex random objects in metric spaces; (v) it supports several generic sure independence screening procedures capable of extracting important features associated with complex objects in metric spaces. At present, the Ball package is available from the Comprehensive R Archive Network (CRAN) at https://CRAN.R-project.org/package=Ball.
The remaining sections are organized as follows. In Section 2, we propose our Ball test statistics and Ball test procedures to tackle the K-sample test and the test of mutual independence in metric spaces. In Section 3, we introduce several novel and efficient algorithms for the Ball test statistics and procedures. Section 4 gives a detailed account of the main functions in the Ball package and provides two real data examples to demonstrate their usage on complex datasets. Section 5 discusses the numerical performance of the Ball test statistics in the K-sample test and the test of mutual independence. Finally, the paper concludes with a short summary in Section 6.

Ball test statistics
In this section, we define and illustrate the Ball Divergence (BD) and Ball Covariance (BCOV) statistics in Sections 2.1 and 2.2, respectively. We describe the details of the Ball test procedures in Section 2.3.

K-sample test and Ball Divergence statistic
In a metric space $(V, d)$, suppose we observe $K$ independent samples $\mathcal{X}_k = \{X_{ki} \mid i = 1, \ldots, n_k\}$, $k = 1, \ldots, K$, where, within the $k$-th sample, $X_{k1}, \ldots, X_{kn_k}$ are i.i.d. with associated Borel probability measure $\mu_k$. The null hypothesis of the K-sample test problem is formulated as $H_0: \mu_1 = \cdots = \mu_K$.
We first revisit the BD statistic designed for the two-sample test problem (Pan et al. 2018a), a special case of the K-sample test problem with $K = 2$ and $N = n_1 + n_2$. Let $I(\cdot)$ be an indicator function. For any $x, y, z \in V$, denote by $B(x, y)$ the closed ball with center $x$ and radius $d(x, y)$, and let $\delta(x, y, z) = I(z \in B(x, y))$. Therefore, $\delta(x, y, z)$ takes the value 1 when $z$ is inside the closed ball $B(x, y)$, and 0 otherwise. Let
$$P_{ij}^{\mu_1\mu_1} = \frac{1}{n_1}\sum_{t=1}^{n_1}\delta(X_{1i}, X_{1j}, X_{1t}), \qquad P_{ij}^{\mu_1\mu_2} = \frac{1}{n_2}\sum_{t=1}^{n_2}\delta(X_{1i}, X_{1j}, X_{2t}),$$
and define $P_{kl}^{\mu_2\mu_1}$ and $P_{kl}^{\mu_2\mu_2}$ analogously with the balls $B(X_{2k}, X_{2l})$. Then the two-sample BD statistic is defined as
$$BD_N = \frac{1}{n_1^2}\sum_{i,j=1}^{n_1}\left(P_{ij}^{\mu_1\mu_1} - P_{ij}^{\mu_1\mu_2}\right)^2 + \frac{1}{n_2^2}\sum_{k,l=1}^{n_2}\left(P_{kl}^{\mu_2\mu_1} - P_{kl}^{\mu_2\mu_2}\right)^2.$$
Intuitively, if $\mathcal{X}_1$ and $\mathcal{X}_2$ come from the same distribution, the proportions of the elements of $\mathcal{X}_1$ and $\mathcal{X}_2$ in the closed balls $B(X_{1i}, X_{1j})$ and $B(X_{2k}, X_{2l})$ are almost the same; in other words, $P_{ij}^{\mu_1\mu_1} \approx P_{ij}^{\mu_1\mu_2}$ and $P_{kl}^{\mu_2\mu_1} \approx P_{kl}^{\mu_2\mu_2}$. Consequently, $BD_N$ approaches zero in this scenario. Otherwise, if $\mathcal{X}_1$ and $\mathcal{X}_2$ come from two distinct distributions, $BD_N$ is, relatively, far away from zero.
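For illustration, a direct and deliberately unoptimized R sketch of $BD_N$, computed from a pooled distance matrix exactly as in the definition above, might look as follows; the function name is ours, and the package's far faster C implementation is described in Section 3.

# Naive O(N^3) computation of the two-sample BD statistic from a pooled
# distance matrix D (rows/cols 1..n1 are sample 1, the rest sample 2).
bd_naive <- function(D, n1) {
  n2 <- nrow(D) - n1
  idx1 <- seq_len(n1); idx2 <- n1 + seq_len(n2)
  A <- 0
  for (i in idx1) for (j in idx1) {
    r <- D[i, j]                   # radius of the closed ball B(X_1i, X_1j)
    P11 <- mean(D[i, idx1] <= r)   # proportion of sample 1 inside the ball
    P12 <- mean(D[i, idx2] <= r)   # proportion of sample 2 inside the ball
    A <- A + (P11 - P12)^2
  }
  C <- 0
  for (k in idx2) for (l in idx2) {
    r <- D[k, l]
    P21 <- mean(D[k, idx1] <= r)
    P22 <- mean(D[k, idx2] <= r)
    C <- C + (P21 - P22)^2
  }
  A / n1^2 + C / n2^2
}

This direct version costs $O(N^3)$ time, which is precisely what the algorithms of Section 3 improve upon.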
Generally, for $K > 2$ and $N = \sum_{k=1}^K n_k$, the K-sample BD statistic can be defined by directly summing up all of the two-sample BD statistics,
$$BD_{S,N}^K = \sum_{1 \le s < t \le K} BD_{n_s + n_t},$$
by finding the one group with the largest difference from the other groups,
$$BD_{MS,N}^K = \max_{1 \le k \le K} \sum_{t \ne k} BD_{n_k + n_t},$$
or by aggregating the $K - 1$ most significant two-sample BD statistics,
$$BD_{M,N}^K = \sum_{k=1}^{K-1} BD_{(k)},$$
where $BD_{(1)}, \ldots, BD_{(K-1)}$ are the largest $K - 1$ two-sample BD statistics among the set $\{BD_{n_s + n_t} \mid 1 \le s < t \le K\}$.
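Given the pairwise two-sample BD statistics, the three aggregations themselves are a few lines of R (an illustrative helper of ours, assuming a symmetric $K \times K$ matrix with zero diagonal whose $(s, t)$ entry holds $BD_{n_s + n_t}$):

# Aggregate pairwise two-sample BD statistics into the three K-sample forms.
kbd_aggregate <- function(pairwise_bd) {
  K <- nrow(pairwise_bd)
  pairs <- pairwise_bd[upper.tri(pairwise_bd)]
  c(sum    = sum(pairs),                    # BD^K_S: all K(K-1)/2 pairs
    maxsum = max(rowSums(pairwise_bd)),     # BD^K_MS: most deviant group
    max    = sum(sort(pairs, decreasing = TRUE)[seq_len(K - 1)]))  # BD^K_M
}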

Test of mutual independence and Ball Covariance statistic
Assume that $(V_1, d_1), \ldots, (V_K, d_K)$ are metric spaces, $D$ is the Euclidean norm on $\mathbb{R}^K$, and $(V, d)$ is the product metric space of $(V_1, d_1), \ldots, (V_K, d_K)$, where the product metric is defined by $d((x_1, \ldots, x_K), (y_1, \ldots, y_K)) = D(d_1(x_1, y_1), \ldots, d_K(x_K, y_K))$. Suppose $\mathcal{X} = \{(X_{i1}, \ldots, X_{iK}) \mid i = 1, \ldots, N\}$ is a sample of $N$ i.i.d. observations in the product metric space $V$, whose associated joint Borel probability measure is $\mu$ and whose marginal Borel probability measures are $\mu_1, \ldots, \mu_K$. The null hypothesis of the test of mutual independence problem is formulated as $H_0: \mu = \mu_1 \otimes \cdots \otimes \mu_K$. For $x, y, z \in V_k$, denote by $B_k(x, y)$ the closed ball with center $x$ and radius $d_k(x, y)$, and let $\delta_k(x, y, z) = I(z \in B_k(x, y))$. Thus, $\delta_k(x, y, z)$ is an indicator taking the value 1 if $z$ is within the closed ball $B_k(x, y)$, and 0 otherwise. Let
$$P_{ij}^{\mu} = \frac{1}{N}\sum_{t=1}^{N}\prod_{k=1}^{K}\delta_k(X_{ik}, X_{jk}, X_{tk}), \qquad P_{ij}^{\mu_k} = \frac{1}{N}\sum_{t=1}^{N}\delta_k(X_{ik}, X_{jk}, X_{tk}),$$
then our BCOV statistic can be presented as
$$BCov_N^K = \frac{1}{N^2}\sum_{i,j=1}^{N}\left(P_{ij}^{\mu} - \prod_{k=1}^{K} P_{ij}^{\mu_k}\right)^2.$$
We provide a heuristic explanation for $BCov_N^K$. If $\mu = \mu_1 \otimes \cdots \otimes \mu_K$, the proportion of the elements of $\mathcal{X}$ in $\otimes_{k=1}^K B_k(X_{ik}, X_{jk})$ should be close to the product of the proportions of $X_{1k}, \ldots, X_{Nk}$ in $B_k(X_{ik}, X_{jk})$, i.e., $P_{ij}^{\mu} \approx \prod_{k=1}^K P_{ij}^{\mu_k}$. Since $BCov_N^K$ is the average of $\{(P_{ij}^{\mu} - \prod_{k=1}^K P_{ij}^{\mu_k})^2 \mid i, j = 1, \ldots, N\}$, a significantly large $BCov_N^K$ implies that the null hypothesis $\mu = \mu_1 \otimes \cdots \otimes \mu_K$ is implausible.
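Analogously, a naive R sketch of $BCov_N^K$ from a list of per-coordinate distance matrices (again our own illustrative version, costing $O(KN^3)$ time, whereas the package uses the algorithms of Section 3):

# Naive computation of the mutual-independence BCOV statistic from a list
# of N x N distance matrices, one per coordinate metric space.
bcov_naive <- function(dist_list) {
  N <- nrow(dist_list[[1]])
  total <- 0
  for (i in seq_len(N)) for (j in seq_len(N)) {
    # delta[t, k] = 1 if X_tk lies in the ball B_k(X_ik, X_jk).
    delta <- sapply(dist_list, function(D) D[i, ] <= D[i, j])
    P_joint <- mean(apply(delta, 1, all))   # P^mu_ij
    P_marg  <- prod(colMeans(delta))        # prod_k P^mu_k_ij
    total <- total + (P_joint - P_marg)^2
  }
  total / N^2
}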
The BCOV statistic can be extended with positive weights $\hat\omega_k(X_{ik}, X_{jk})$, $k = 1, \ldots, K$:
$$BCov_{\omega,N}^K = \frac{1}{N^2}\sum_{i,j=1}^{N}\left(P_{ij}^{\mu} - \prod_{k=1}^{K} P_{ij}^{\mu_k}\right)^2 \prod_{k=1}^{K}\hat\omega_k(X_{ik}, X_{jk}).$$
As a more general dependence measure framework based on the BCOV statistic, such a weighted extension not only allows flexible test statistics but also connects the BCOV statistic with HHG. Several choices of weights are feasible. For example, we could choose the probability weight $\hat\omega_k(X_{ik}, X_{jk}) = [P_{ij}^{\mu_k}]^{-1}$ or the Chi-square weight $\hat\omega_k(X_{ik}, X_{jk}) = [P_{ij}^{\mu_k}(1 - P_{ij}^{\mu_k})]^{-1}$, and we denote the corresponding statistics by $BCov_{\Delta,N}^K$ and $BCov_{\chi^2,N}^K$, respectively. $BCov_{\Delta,N}^K$ focuses on smaller balls, while $BCov_{\chi^2,N}^K$ standardizes $(P_{ij}^{\mu} - \prod_{k=1}^K P_{ij}^{\mu_k})^2$ by the variance of $\delta_k(X_{ik}, X_{jk}, X_{tk})$ ($i, j$ fixed). Furthermore, $BCov_{\chi^2,N}^K$ ($K = 2$) is asymptotically equivalent to HHG (Pan et al. 2018b). From the definitions of $BCov_N^K$, $BCov_{\Delta,N}^K$, and $BCov_{\chi^2,N}^K$, we may expect: (i) $BCov_N^K$ is good at detecting linear relationships, especially when noise is influential, because treating each ball indiscriminately is a reasonable strategy that keeps the weights away from potential instabilities; (ii) $BCov_{\Delta,N}^K$ has more power in detecting strong nonlinear relationships, since paying more attention to smaller balls makes $BCov_{\Delta,N}^K$ tend to detect locally linear relationships; (iii) $BCov_{\chi^2,N}^K$ (or HHG) is an intermediate between $BCov_N^K$ and $BCov_{\Delta,N}^K$.
The BCOV statistic can be normalized. In the case $K = 2$, writing the statistic above as $BCov_{\omega,N}(\mathcal{X}, \mathcal{Y})$, the normalized version of the BCOV statistic is defined as the square root of
$$BCor_{\omega,N}^2(\mathcal{X}, \mathcal{Y}) = \frac{BCov_{\omega,N}(\mathcal{X}, \mathcal{Y})}{\sqrt{BCov_{\omega,N}(\mathcal{X}, \mathcal{X})\, BCov_{\omega,N}(\mathcal{Y}, \mathcal{Y})}},$$
which is the Ball Correlation statistic (Pan et al. 2018b) and ranges from 0 to 1.

Ball permutation test procedure
After computing an observed Ball test statistic, say B, the permutation methodology described in Efron and Tibshirani (1994) and Davison and Hinkley (1997) is employed to derive the p value of the Ball test procedures in a distribution-free manner. We specify the permutation procedures for the K-sample test and test of mutual independence problems below.
For the K-sample test problem, let $k_{n_k}$ be an $n_k$-dimensional vector whose entries are all $k$, and let $L = (1_{n_1}, 2_{n_2}, \ldots, K_{n_K})$ be the group label vector of the pooled sample $\mathcal{X} = \mathcal{X}_1 \cup \cdots \cup \mathcal{X}_K$. Denote by $L^*$ a permutation of $L$, and let the shuffled pooled sample $\mathcal{X}^*$ be the pooled sample regrouped according to $L^*$. With the shuffled pooled sample $\mathcal{X}^*$, we can compute the K-sample BD statistic. The permutation is replicated $M$ times to derive the K-sample BD statistics under the null hypothesis, $B_1, \ldots, B_M$. Finally, the p value is computed according to
$$\hat{p} = \frac{1 + \sum_{m=1}^{M} I(B_m \ge B)}{1 + M}. \qquad (1)$$
With respect to the test of mutual independence problem, each permutation is performed in the following manner: for each $k \in \{1, \ldots, K\}$, $X_{1k}, \ldots, X_{Nk}$ are randomly shuffled while the other coordinates $X_{1k'}, \ldots, X_{Nk'}$ ($k' \ne k$) are kept fixed. The permutation is replicated $M$ times and the p value is estimated by Equation 1.
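As a sketch, the whole K-sample permutation scheme fits in a few lines of R; here stat_fun is a placeholder for any K-sample statistic computed from the pooled data and a label vector.

# Generic K-sample permutation p value (Equation 1).
perm_pvalue <- function(x, group, stat_fun, M = 99) {
  B <- stat_fun(x, group)                              # observed statistic
  B_perm <- replicate(M, stat_fun(x, sample(group)))   # shuffled labels
  (1 + sum(B_perm >= B)) / (1 + M)
}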

Algorithm
The computational complexity of a Ball test statistic is $O(N^3)$ if we compute it directly according to its definition; however, in many cases the complexity can be reduced to a more moderate level (see Table 1) by efficient algorithms that utilize the rank nature of the Ball test statistics. The rank nature motivates acceleration in three aspects. First, the computation of the Ball test statistics can avoid repeatedly evaluating whether a point is in a closed ball, because $n_t P_{ij}^{\mu_s\mu_t}$ ($s, t \in \{1, \ldots, K\}$) and $N P_{ij}^{\mu_k}$ ($k = 1, \ldots, K$) are related to certain ranks. Second, for the K-sample permutation test, the ranking procedures can be made more efficient by preserving information the first time we compute the K-sample BD statistics. Third, for univariate datasets, we can optimize the ranking procedure for computing $n_t P_{ij}^{\mu_s\mu_t}$ and $N P_{ij}^{\mu_k}$ without preserving auxiliary information. These three aspects are illustrated in Sections 3.1, 3.2, and 3.3, respectively. For simplicity, we assume there are no ties among the datasets; the Ball package itself properly handles tied data and computes the exact Ball test statistics.
Unfortunately, in the case of measuring mutual dependence among at least three random objects, it is not easy to optimize the computation procedure of the BCOV statistic. Nevertheless, with engineering optimizations such as multi-threading, the time consumption of computing the BCOV statistic can be cut down.

Table 1: Time complexities of the Ball test statistics.

Statistics                                 Univariate     Other
$BD_N$ and the K-sample BD statistics      $O(N^2)$       $O(N^2 \log N)$
$BCov_{\omega,N}^K$ ($K = 2$)              $O(N^2)$       $O(N^2 \log N)$
$BCov_{\omega,N}^K$ ($K > 2$)              $O(N^3)$       $O(N^3)$

Rank based algorithm
We first recap the $O(N^2 \log N)$ algorithm for $BD_N$ proposed by Pan et al. (2018a). Write the pairwise distance matrix of the pooled sample $\mathcal{X} = \mathcal{X}_1 \cup \mathcal{X}_2$ in block form:
$$D^{\mathcal{X}\mathcal{X}} = \begin{pmatrix} D^{\mathcal{X}_1\mathcal{X}_1} & D^{\mathcal{X}_1\mathcal{X}_2} \\ D^{\mathcal{X}_2\mathcal{X}_1} & D^{\mathcal{X}_2\mathcal{X}_2} \end{pmatrix}.$$
Notice that $n_1 P_{ij}^{\mu_1\mu_1}$ is the rank of $d(X_{1i}, X_{1j})$ in the $i$-th row of $D^{\mathcal{X}_1\mathcal{X}_1}$, and $n_1 P_{ij}^{\mu_1\mu_1} + n_2 P_{ij}^{\mu_1\mu_2}$ is the rank of $d(X_{1i}, X_{1j})$ in the $i$-th row of $D^{\mathcal{X}\mathcal{X}}$. Consequently, after spending $O(N^2 \log N)$ time ranking $D^{\mathcal{X}_1\mathcal{X}_1}$ and $D^{\mathcal{X}\mathcal{X}}$ row by row to obtain the corresponding rank matrices $R^{\mathcal{X}_1\mathcal{X}_1}$ and $R^{\mathcal{X}\mathcal{X}}$, we only need $O(1)$ time to compute $n_1 P_{ij}^{\mu_1\mu_1}$ and $n_1 P_{ij}^{\mu_1\mu_1} + n_2 P_{ij}^{\mu_1\mu_2}$ by directly extracting the $(i, j)$-elements of $R^{\mathcal{X}_1\mathcal{X}_1}$ and $R^{\mathcal{X}\mathcal{X}}$. Therefore, computing $\{(P_{ij}^{\mu_1\mu_1}, P_{ij}^{\mu_1\mu_2}) \mid i, j = 1, \ldots, n_1\}$ takes $O(N^2 \log N)$ time, and similarly for $\{(P_{kl}^{\mu_2\mu_1}, P_{kl}^{\mu_2\mu_2}) \mid k, l = 1, \ldots, n_2\}$. In summary, the time complexities of $BD_N$ and the K-sample BD statistics are $O(N^2 \log N)$.
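In R terms, the ranking step amounts to the following (an illustration of the idea on simulated data, not the package's C implementation):

set.seed(1)
x1 <- matrix(rnorm(20), 10); x2 <- matrix(rnorm(20), 10)
D_XX   <- as.matrix(dist(rbind(x1, x2)))   # pooled distance matrix
D_X1X1 <- D_XX[1:10, 1:10]
# Row-wise ranks: with no ties, R_XX[i, j] equals
# n1 * P^{mu1 mu1}_{ij} + n2 * P^{mu1 mu2}_{ij} for i, j <= n1,
# and R_X1X1[i, j] equals n1 * P^{mu1 mu1}_{ij}.
R_X1X1 <- t(apply(D_X1X1, 1, rank))
R_XX   <- t(apply(D_XX, 1, rank))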
With respect to $BCov_{\omega,N}^K$ ($K = 2$), the aforementioned algorithm can be directly applied to calculate $P_{ij}^{\mu_1}$ and $P_{ij}^{\mu_2}$, so the remaining task is computing the joint proportions $\{N P_{ij}^{\mu} \mid i, j = 1, \ldots, N\}$, where $N P_{ij}^{\mu}$ counts the sample points falling simultaneously into $B_1(X_{i1}, X_{j1})$ and $B_2(X_{i2}, X_{j2})$. Notice that, after sorting the points by their ranks in the first coordinate, counting those whose second-coordinate ranks are also small enough is the typical "count of smaller numbers after self" problem (https://leetcode.com/problems/count-of-smaller-numbers-after-self/), which can be solved within $O(N \log N)$ time via well-designed algorithms such as the binary indexed tree, the binary search tree, and merge sort. Our implementation (see Appendix) utilizes merge sort to solve the problem due to its efficiency. Thus, the time complexity of computing $\{N P_{ij}^{\mu} \mid i, j = 1, \ldots, N\}$ reduces to $O(N^2 \log N)$, and finally, the computation of $BCov_{\omega,N}^K$ ($K = 2$) costs $O(N^2 \log N)$ time.
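For illustration, the counting subroutine can be sketched in pure R with merge sort (our own didactic version; the package's C implementation is given in the Appendix):

# For each position i, count the elements after i that are smaller than a[i],
# in O(N log N) time, by counting cross-pairs while merge-sorting positions.
count_smaller_after <- function(a) {
  counts <- integer(length(a))
  merge_count <- function(idx) {
    n <- length(idx)
    if (n <= 1) return(idx)
    left  <- merge_count(idx[seq_len(n %/% 2)])
    right <- merge_count(idx[(n %/% 2 + 1):n])
    merged <- integer(n); i <- 1; j <- 1
    for (k in seq_len(n)) {
      if (i <= length(left) &&
          (j > length(right) || a[left[i]] <= a[right[j]])) {
        # every right-half element already merged is smaller and comes after
        counts[left[i]] <<- counts[left[i]] + (j - 1)
        merged[k] <- left[i]; i <- i + 1
      } else {
        merged[k] <- right[j]; j <- j + 1
      }
    }
    merged
  }
  merge_count(seq_along(a))
  counts
}

For example, count_smaller_after(c(5, 2, 6, 1)) returns 2 1 1 0.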

Optimized K-sample permutation test procedure
In this section, an efficient procedure of $O(N^2)$ time complexity is introduced to compute the K-sample BD statistics on a shuffled dataset $\mathcal{X}^*$. According to the definition of the K-sample BD statistics, to reduce their time complexity to $O(N^2)$, we need to reduce the time complexity of every $BD_{n_s + n_t}$ to $O(N^2)$; by Section 3.1, this in turn requires computing the row-wise ranks of the pairwise distance matrices $D^{\mathcal{X}^*_s\mathcal{X}^*_s}$, $D^{\mathcal{X}^*_t\mathcal{X}^*_t}$, and $D^{\mathcal{X}^*_{s,t}\mathcal{X}^*_{s,t}}$ within $O(N^2)$ time. We propose Algorithm 1 to achieve this goal. The core of Algorithm 1 is utilizing an $N \times N$ order information matrix $I^{\mathcal{X}\mathcal{X}}$ as well as an $N$-dimensional vector $G$. For the order information matrix, $I^{\mathcal{X}\mathcal{X}}_{ij} = q$ means that the $j$-th smallest element of the $i$-th row of the pairwise distance matrix $D^{\mathcal{X}\mathcal{X}}$ is $D^{\mathcal{X}\mathcal{X}}_{iq}$. $I^{\mathcal{X}\mathcal{X}}$ can be obtained the first time we rank each row of $D^{\mathcal{X}\mathcal{X}}$. The vector $G$ is determined by the shuffled group label vector $L^*$ and the cumulative sample size vector $C = (0, n_1, n_1 + n_2, \ldots, \sum_{k=1}^{K-1} n_k)$; specifically, when $L^*_i = k$, $G_i$ records the position of observation $i$ within the $k$-th shuffled group, offset by $C_k$. By scanning $I^{\mathcal{X}\mathcal{X}}$ in a row-wise manner, Algorithm 1 can assign a proper rank value to each element of the row-wise rank matrices of $D^{\mathcal{X}^*_s\mathcal{X}^*_s}$, $D^{\mathcal{X}^*_t\mathcal{X}^*_t}$, and $D^{\mathcal{X}^*_{s,t}\mathcal{X}^*_{s,t}}$ with the help of the vector $G$.
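A simplified pure-R reading of this idea is sketched below (our own illustration, not the package's Algorithm 1 itself); instead of the three rank matrices, it returns the number of observations of each shuffled group inside every ball $B(X_i, X_j)$, from which the required ranks follow, in $O(KN^2)$ total time.

# I_XX[i, ] holds the column indices of row i of the pooled distance matrix
# sorted increasingly, precomputed once via I_XX <- t(apply(D_XX, 1, order));
# L_star is the shuffled group label vector (values 1..K).
ball_counts_after_shuffle <- function(I_XX, L_star) {
  N <- nrow(I_XX); K <- max(L_star)
  counts <- lapply(seq_len(K), function(g) matrix(0L, N, N))
  for (i in seq_len(N)) {
    seen <- integer(K)                 # running per-group counters
    for (pos in seq_len(N)) {
      j <- I_XX[i, pos]                # pos-th nearest neighbour of i
      seen[L_star[j]] <- seen[L_star[j]] + 1L
      # counts[[g]][i, j]: group-g observations inside the ball B(X_i, X_j)
      for (g in seq_len(K)) counts[[g]][i, j] <- seen[g]
    }
  }
  counts
}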

Fast algorithm for univariate data
We first consider $BD_N$. For convenience, assume $X_{11}, \ldots, X_{1n_1}$ have been sorted in ascending order. For univariate data we have
$$\delta(X_{1i}, X_{1j}, X_{1t}) = I\left(X_{1i} - |X_{1i} - X_{1j}| \le X_{1t} \le X_{1i} + |X_{1i} - X_{1j}|\right). \qquad (3)$$
Let $X_{1l_{ij}}$ be the smallest value satisfying $X_{1l_{ij}} \ge X_{1i} - |X_{1i} - X_{1j}|$ and $X_{1r_{ij}}$ be the largest value satisfying $X_{1r_{ij}} \le X_{1i} + |X_{1i} - X_{1j}|$. Thanks to Equation 3, it is easy to verify that $n_1 P_{ij}^{\mu_1\mu_1} = r_{ij} - l_{ij} + 1$, and consequently, an alternative way to compute $n_1 P_{ij}^{\mu_1\mu_1}$ is to find $X_{1l_{ij}}$ and $X_{1r_{ij}}$. Inspired by this, we develop Algorithm 2 to accomplish the computation of $\{n_1 P_{ij}^{\mu_1\mu_1} \mid j = 1, \ldots, n_1\}$ in linear time. By slightly modifying Algorithm 2, the computational complexity of $\{n_1 P_{ij}^{\mu_1\mu_1} + n_2 P_{ij}^{\mu_1\mu_2} \mid j = 1, \ldots, n_1\}$ also reduces to $O(N)$, and hence, the computational complexity of $\{(n_1 P_{ij}^{\mu_1\mu_1}, n_2 P_{ij}^{\mu_1\mu_2}) \mid i, j = 1, \ldots, n_1\}$ reduces to $O(N^2)$. Similarly, the time complexity of $\{(n_1 P_{kl}^{\mu_2\mu_1}, n_2 P_{kl}^{\mu_2\mu_2}) \mid k, l = 1, \ldots, n_2\}$ is $O(N^2)$. In summary, the computational complexity of $BD_N$ is $O(N^2)$ for univariate data, and so are those of $BD_{S,N}^K$, $BD_{MS,N}^K$, and $BD_{M,N}^K$. As for $BCov_{\omega,N}^K$ ($K = 2$), by slightly modifying Algorithm 2, the time complexity of computing $\{N P_{ij}^{\mu_1}, N P_{ij}^{\mu_2} \mid i, j = 1, \ldots, N\}$ reduces to $O(N^2)$, and $\{N P_{ij}^{\mu} \mid i, j = 1, \ldots, N\}$ can be obtained within quadratic time by Algorithm 3. The key of Algorithm 3 is the inclusion-exclusion principle, which is also used in Heller, Heller, Kaufman, Brill, and Gorfine (2016). In summary, with Algorithms 2 and 3, the time complexity of $BCov_{\omega,N}^K$ ($K = 2$) is $O(N^2)$ for univariate data.
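The expanding-window idea for a fixed center index $i$ can be sketched in R as follows (our own illustration under the no-ties assumption; Algorithm 2 in the package is the refined version). Processing the $j$'s in order of increasing distance from $x_i$ makes the window $[x_i - r, x_i + r]$ grow monotonically, so the total work is $O(n)$.

# For a sorted vector x and a fixed i, compute counts[j] = n1 * P^{mu1mu1}_{ij}
# = #{t : |x[t] - x[i]| <= |x[j] - x[i]|} for all j in O(n) time.
ball_counts_fixed_i <- function(x, i) {
  n <- length(x)
  counts <- integer(n)
  lo <- i; hi <- i                 # current window [lo, hi] around i
  left <- i - 1; right <- i + 1    # next unprocessed j on each side
  counts[i] <- 1L                  # radius 0: only x[i] itself
  for (step in seq_len(n - 1)) {
    # pick the unprocessed j closest to x[i]
    take_left <- right > n ||
      (left >= 1 && x[i] - x[left] <= x[right] - x[i])
    j <- if (take_left) left else right
    r <- abs(x[j] - x[i])
    while (lo > 1 && x[i] - x[lo - 1] <= r) lo <- lo - 1
    while (hi < n && x[hi + 1] - x[i] <= r) hi <- hi + 1
    counts[j] <- hi - lo + 1L      # = r_ij - l_ij + 1
    if (take_left) left <- left - 1 else right <- right + 1
  }
  counts
}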

The Ball package
In this section, we introduce the Ball package, an R package implementing the Ball test statistics and procedures introduced in Section 2 as well as the algorithms illustrated in Section 3. The core of Ball is programmed in C to improve computational efficiency. Moreover, we employ parallel computing in the Ball test procedures to speed up computation: during the permutation procedure, multiple Ball test statistics are calculated concurrently with OpenMP, which supports multi-platform shared-memory multiprocessing programming at the C level. Aside from speed, the Ball package is concise and user-friendly. An R user can conduct the K-sample test and the (mutual) independence test via the bd.test and bcov.test functions, respectively. We supply instructions and primary usages of bd.test and bcov.test in Section 4.1. Additionally, in Section 4.2, two real data examples demonstrate how to use these functions to tackle data drawn from manifold spaces. The main arguments of the two functions are as follows.
• x: a numeric vector, matrix, or data.frame, or a list containing at least two numeric vectors, matrices, or data.frames.
• y: a numeric vector, matrix, or data.frame.
• distance: if distance = TRUE, x is treated as a distance matrix, or as a list of distance matrices in the test of mutual independence problem, and y is treated as a distance matrix (only relevant in the test of independence problem). Default: distance = FALSE.
• size: a vector recording the sample size of K groups. It is only available for bd.test.
• weight: a logical value or character string used to choose the form of $\hat\omega_k(X_{ik}, X_{jk})$. If weight = FALSE or weight = "constant", the result of the $BCov_N^K$ based test is displayed. Alternatively, weight = TRUE or weight = "probability" indicates that the probability weight is chosen, while weight = "chisquare" selects the Chi-square weight. Since $BCov_N^K$, $BCov_{\Delta,N}^K$, and $BCov_{\chi^2,N}^K$ are quite similar by definition and can be computed at the same time, bcov.test simultaneously computes all three statistics and their corresponding p values; users can retrieve the non-displayed statistics and p values from the complete.info element of the output without re-running bcov.test. At present, this argument is only available for bcov.test. Any unambiguous substring can be given. Default: weight = FALSE.
• num.threads: the number of threads used in the Ball test procedures. If num.threads = 0, all available cores are used. Default: num.threads = 0.
• kbd.type: a character string used to choose the K-sample BD statistic. Setting kbd.type = "sum", kbd.type = "maxsum", or kbd.type = "max" selects $BD_{S,N}^K$, $BD_{MS,N}^K$, or $BD_{M,N}^K$, respectively, to test the equality of distributions. Notice that the three K-sample BD statistics are very similar, their only difference being the aggregation strategy for the two-sample BD statistics; therefore, the bd.test function simultaneously computes $BD_{S,N}^K$, $BD_{MS,N}^K$, $BD_{M,N}^K$, and their corresponding p values. Users can retrieve the non-displayed statistics and p values from the complete.info element of the output without re-running bd.test. This argument is only available for the bd.test function. Any unambiguous substring can be given. Default: kbd.type = "sum".
If num.permutations > 0, the output is an htest class object, similar to the object returned by the t.test function, containing:
• statistic: the Ball test statistic value.
• p.value: the p value of the test.
• replicates: the number of permutation replications.
• size: a vector recording the sample sizes.
• complete.info: a list mainly containing two vectors, where the first vector holds $BD_{S,N}^K$, $BD_{MS,N}^K$, and $BD_{M,N}^K$ for bd.test, or $BCov_N^K$, $BCov_{\Delta,N}^K$, and $BCov_{\chi^2,N}^K$ for bcov.test, and the second vector holds the corresponding p values.
• alternative: a character string declaring the alternative hypothesis.
• method: a character string describing the hypothesis test.
• data.name: a character string giving the name and helpful information of the data.
If num.permutations = 0, only the Ball test statistic value is returned.
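For instance, assuming two samples x and y are at hand, all three K-sample statistics and their p values can be inspected after a single call:

R> res <- bd.test(x = x, y = y)
R> res$complete.info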
To give quick examples, we carry out the Ball tests on two synthetic datasets to check whether the null hypotheses can be rejected when the distributions are indeed different or the random variables are indeed associated.
The detailed R code is as follows.
R> library("Ball") R> set.seed(1) R> x <-rnorm(50) R> y <-rnorm(50, mean = 1) R> bd.test(x = x, y = y) The output of bcov.test shows that BCov K N is 6.38 × 10 −4 when the constant weight is used. The p value is 0.02, and at the usual significance level of 0.05, we conclude that the three univariate variables are mutually dependent.

Wind direction dataset
We consider the hourly recorded wind speed and wind direction on the Atlantic coast of Galicia in winter from 2003 to 2012, provided in the R package NPCirc (Oliveira, Crujeiras, and Rodríguez-Casal 2014). This dataset contains 19488 observations, each including six variables: day, month, year, hour, wind speed, and wind direction (in degrees). It is of interest to see whether there is any wind direction difference between the first and last weeks of the 2007-08 winter season. We select the wind direction records from November 1 to November 7, 2007, as the first-week data, and the records from January 25 to January 31, 2008, as the last-week data. The missing records in the two weeks are discarded.
R> library("Ball") R> data("speed.wind", package = "NPCirc") R> index1 <-which ( Each wind direction is one-to-one transformed to a two-dimensional point in the Cartesian coordinates, and then, the difference of any two points is measured by the great-circle distance which is programmed in the nhdist function in the Ball package. R> theta <-c(d1, d2) / 360 R> dat <-cbind(cos(theta), sin(theta)) R> dx <-nhdist(dat, method = "geo") In the final step, we pass the distance matrix and the sample size of two groups to the arguments x and size, and set distance = TRUE to declare that the object passed to the arguments x is a distance matrix.
R> size_vec <- c(length(d1), length(d2))
R> bd.test(x = dx, size = size_vec, distance = TRUE)

	2-sample Ball Divergence Test

data:  dx
number of observations = 335, group sizes: 168 167
replicates = 99
bd = 0.29114, p-value = 0.01
alternative hypothesis: distributions of samples are distinct

As can be seen from the output of bd.test, $BD_N$ is 0.2911 and the p value is 0.01; consequently, at the usual significance level of 0.05, we reject the null hypothesis. To further confirm our conclusion, we visualize the wind directions of the two groups in Figure 1 with the R package circular (Agostinelli and Lund 2017). Figure 1 shows that the hourly wind directions in the first week are concentrated around 90 degrees, whereas the wind directions in the last week are widely dispersed.

R> library("circular")
R> par(mfrow = c(1, 2), mar = c(0, 0, 0, 0))
R> plot(circular(c(d1), units = "degrees"), bin = 100, stack = TRUE,
+    shrink = 1.5)
R> plot(circular(c(d2), units = "degrees"), bin = 100, stack = TRUE,
+    shrink = 1.5)

Figure 1: The raw circular data plot of the hourly wind direction dataset. The left panel corresponds to the first-week wind directions in the 2007-08 winter, and the right panel corresponds to the last week's.

Brain fMRI dataset
We examine a public fMRI dataset from the 1000 Functional Connectomes Project. This project calls on principal investigators from its member sites to donate neuroimaging data so that the broader imaging community has complete access to a large-scale functional imaging dataset. Given the resting-state fMRI scans and demographics of 86 individuals donated by ICBM, it is of interest to evaluate whether age is associated with brain connectivity. To properly analyze the dataset, we preprocess the three fMRI scans of each individual with the nilearn Python module: for each scan, the time series of 116 brain regions are extracted and the corresponding 116 × 116 Pearson correlation coefficient matrix is computed; the three 116 × 116 matrices of each individual are then averaged element-wise, and the averaged matrix is saved to disk so that it can be analyzed with R. In the R environment, the collection of averaged matrices and the demographics are combined into a list object, which is saved to disk as the "niICBM.rda" file.
To achieve our goal, we compute the pairwise distance matrix of the averaged matrices and that of the ages, and save them in the R objects dx and dy, respectively. In this example, we use the Euclidean distance to measure the difference between ages, and the affine invariant Riemannian metric (Pennec, Fillard, and Ayache 2006), implemented in CovTools (Lee and You 2018), to evaluate the structural difference between the averaged Pearson correlation coefficient matrices. The output message shows that $BCov_{\Delta,N}^K$ is 0.0171 and the p value is smaller than the usual significance level 0.05; thus, we conclude that brain connectivity is associated with age. This result agrees with recent research showing that age strongly affects structural brain connectivity (Damoiseaux 2017).
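The calls behind this analysis can be sketched as follows; the field names of the list stored in "niICBM.rda" are hypothetical, and we assume CovTools::CovDist computes the affine invariant Riemannian distances for an array of symmetric positive definite matrices.

R> library("CovTools")
R> load("niICBM.rda")
R> # niICBM$spd: hypothetical 116 x 116 x 86 array of averaged matrices;
R> # niICBM$age: hypothetical numeric vector of ages.
R> dx <- CovDist(niICBM$spd, method = "AIRM")
R> dy <- as.matrix(dist(niICBM$age))
R> bcov.test(x = dx, y = dy, distance = TRUE, weight = "probability")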

Numerical studies
In this section, numerical studies are conducted to assess the performance of the Ball test procedures for complex data, including directional data in hypersphere spaces, tree-structured data in tree metric spaces, symmetric positive definite matrix data in the space of symmetric positive definite matrices, and functional data in $L_\infty$ spaces. In addition, a runtime analysis is provided in Section 5.3. For comparison, we consider energy distance and HHG for the K-sample test problem, and distance covariance, distance multivariance, and HHG for the test of mutual independence problem. The permutation technique helps us obtain the empirical distributions of these statistics under the null hypothesis and derive their p values. As suggested in Davison and Hinkley (1997), at least 99 and at most 999 random permutations should suffice, and hence we compute the p value of each test based on 399 random permutations. All models in Sections 5.1 and 5.2 are repeated 500 times to estimate Type-I errors and powers. In each replication, all methods use the same dataset and the same non-standard distance to make a fair comparison. The significance level is fixed at 0.05.

K-sample test
In this section, we investigate the performance of the test statistics in revealing distribution differences with two kinds of complex data: directional data and tree-structured data. They are frequently encountered by scientists interested in wind directions, marine currents (Marzio, Panzera, and Taylor 2014), and cancer evolution (Abbosh et al. 2017). To sample directional and tree-structured data, we use the rmovMF and rtree functions in the R packages movMF (Hornik and Grün 2014) and ape (Paradis, Schliep, and Schwartz 2018) to draw data from the von Mises-Fisher distribution $M(\mu, \kappa)$ and the random tree distribution $T(n, \{b_i\}_{i=1}^n)$, where $\mu$ and $\kappa$ are the direction and concentration parameters, while $n$ and $\{b_i\}_{i=1}^n$ are the number of tree nodes and the branch lengths of the tree. The dissimilarities between two directions and between two trees are measured by the great-circle distance and the Kendall-Colijn metric (Kendall and Colijn 2015), implemented in the nhdist and multiDist functions of the R packages Ball and treespace (Jombart, Kendall, Almagro-Garcia, and Colijn 2017), respectively.
We conduct the numerical analyses for directional data in Models 5.1.1, 5.1.3-5.1.5, and 5.1.9, and for tree-structured data in the other models. Models 5.1.1 and 5.1.2 are designed for Type-I error evaluation, while the other models are devoted to evaluating power. More specifically, Models 5.1.3-5.1.8 focus on the case where any two groups are different, while Models 5.1.9 and 5.1.10 pay attention to the case where only one group differs from the others. Without loss of generality, we let K = 4 for Models 5.1.1-5.1.8 and K = 10 for Models 5.1.9 and 5.1.10. Each group has the same sample size, ranging from 10 to 50.
• Model 5.1.2: Random tree distribution with fifteen nodes. $b_i^w, b_i^x, b_i^y, b_i^z$, $i = 1, \ldots, 15$, are independently sampled from the uniform distribution $U(0, 1)$.
Since only one group is different from the other groups, it suffices to specify the following models by describing the distributions of two groups.
The Type-I error rates and power estimates are presented in Figures 2 and 3. From Figure 2, all test methods control the Type-I error rates well around the significance level. Figure 3 shows that $BD_{S,N}^K$, $BD_{MS,N}^K$, or $BD_{M,N}^K$ outperforms energy distance and HHG in most cases. More specifically, $BD_{S,N}^K$ is generally superior to the other methods when any two groups are different, and $BD_{MS,N}^K$ has an advantage when relatively rare group distinctions increase the difficulty of the K-sample test problem. As for $BD_{M,N}^K$, it is more stable than $BD_{S,N}^K$ and $BD_{MS,N}^K$: it is better than $BD_{MS,N}^K$ when any two groups are different, and better than $BD_{S,N}^K$ when the group distinctions are relatively rare. In Models 5.1.4 and 5.1.5, the empirical powers of energy distance are low and non-increasing, yet the K-sample BD statistics and HHG perform well. This is not surprising, because the great-circle distance is not of strong negative type. We provide an example to illustrate this result. Without loss of generality, we simplify the distributions in Model 5.1.4 to equal-probability Bernoulli distributions on a circle, where $W, X$ take $(0, 1)$ and $(0, -1)$ while $Y, Z$ take $(1, 0)$ and $(-1, 0)$. Then the K-sample test problem can be considered as the two-sample test problem between groups $X$ and $Y$. For the two groups, the means of the great-circle distance within each group are both $\pi/2$, and so is the mean of the great-circle distance between the two groups. The energy distance between $X$ and $Y$,
$$\mathcal{E}(X, Y) = 2 E\, d(X, Y) - E\, d(X, X') - E\, d(Y, Y'), \qquad (4)$$
where $X'$ and $Y'$ are i.i.d. copies of $X$ and $Y$ (Székely and Rizzo 2013), is therefore zero, so energy distance fails to detect the distribution difference. On the contrary, $BD_N$ is larger than zero, since all observations in $B((0, 1), (0, 1)) \cup B((0, -1), (0, -1))$ come from group $X$ and all observations in $B((1, 0), (1, 0)) \cup B((-1, 0), (-1, 0))$ come from group $Y$. The result of Model 5.1.5 can be interpreted similarly.
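This degeneracy is easy to check numerically; the following self-contained R snippet evaluates Equation 4 for the simplified example and returns 0.

# Two groups on the unit circle: X on the poles, Y on the horizontal axis.
gc_dist <- function(u, v) acos(pmin(pmax(sum(u * v), -1), 1))
X <- list(c(0, 1), c(0, -1), c(0, 1), c(0, -1))
Y <- list(c(1, 0), c(-1, 0), c(1, 0), c(-1, 0))
avg_dist <- function(A, B)
  mean(outer(seq_along(A), seq_along(B),
             Vectorize(function(i, j) gc_dist(A[[i]], B[[j]]))))
# Energy distance (Equation 4): within-group and between-group means are
# all pi/2, so the result is exactly 0.
2 * avg_dist(X, Y) - avg_dist(X, X) - avg_dist(Y, Y)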

Test of mutual independence
In this section, we evaluate the performance of the test methods on detecting relationships among complex random objects. The complex random objects attracting our attention are symmetric positive definite matrices and functional curves, commonly encountered in contemporary statistical research, for instance, Dryden, Koloydenko, and Zhou (2009) and Wang, Chiou, and Müller (2016). We design Models 5.2.1-5.2.2 for the examination of Type-I errors and Models 5.2.3-5.2.10 for the assessment of power. We let the sample size increase from 40 to 120.
• Model 5.2.4: The distributions of $Z, \epsilon_1, \epsilon_2, \epsilon_3$ are the same as in Model 5.2.3.
• Model 5.2.5: $c_1, c_2, c_3$ are independent standard normal random vectors, and $\epsilon_1(t)$ and $\epsilon_2(t)$ are independent Gaussian processes.
The following four models are constructed for evaluating the power of the test methods in the test of mutual independence problem. To the best of our knowledge, only multivariance and Ball allow R users to perform the mutual independence test on datasets in metric spaces, and hence we only compare Ball and multivariance below.
• Model 5.2.7: The distributions of $Z, \epsilon_1, \epsilon_2, \epsilon_3$ are the same as in Model 5.2.3, and $\epsilon_4, \epsilon_5$ are independently drawn from the Pareto distribution with location parameter 0.8 and shape parameter 1.
• Model 5.2.9: $Z_1, Z_2$ are independently sampled from the binomial distribution $B(1, 0.5)$, and $Z_3 = I(Z_1 = Z_2)$. Let $c_0 = (0, 1, 0)$ and $c_1 = (0, 0, 1)$.
• Model 5.2.10: $X$ is sampled from the binomial distribution $B(1, 0.5)$, and $\epsilon_1(t)$ and $\epsilon_2(t)$ are independent Gaussian processes.
The Type-I error rates and empirical powers are displayed in Figures 4, 5, and 6. As shown in Figure 4, the Type-I error rates of all methods are reasonably controlled around the significance level. From Figure 5, both the BCOV statistics and HHG are competitive and generally exceed distance covariance. From Figure 6, the three BCOV statistics successfully detect the complicated mutual dependence among multiple random objects, and their empirical powers increase as the sample size grows. It is worth noting that Model 5.2.9 is an example of pairwise independence with mutual dependence; the success in revealing the mutual dependence in Model 5.2.9 certifies the power of the BCOV statistics.
To shed light on the performance differences among the three BCOV statistics, we compare their empirical powers in Models 5.2.3, 5.2.4, and 5.2.7. In Model 5.2.3, the lower bound of the eigenvalues of $X$ is linearly associated with that of $Y$, and similarly in Model 5.2.7, except that the noise in Model 5.2.7 has an infinite first moment. $BCov_{\chi^2,N}^K$ performs best in Model 5.2.3 on account of the nonlinearity of the space of symmetric positive definite matrices, which slightly strengthens the nonlinearity between $X$ and $Y$. In Model 5.2.7, $BCov_N^K$ is superior to the other methods owing to the approximately linear relationship among the random objects and its high robustness. As for Model 5.2.4, the lower bounds of the eigenvalues of $X$ and $Y$ have a strongly nonlinear relationship, and at this point $BCov_{\Delta,N}^K$ takes first place. It is also worthwhile to take a closer look at Models 5.2.6 and 5.2.10. In these two models, the empirical powers of distance covariance and distance multivariance stay at a low level as the sample size increases, because the $L_\infty$ norm is not of strong negative type. The following explains why distance covariance performs unsatisfactorily in Model 5.2.6. Without loss of generality, we re-define $Y(t) = 10 f(t; c)$, and denote $Y(t) \mid X = 0$ and $Y(t) \mid X = 1$ by $Y_1(t)$ and $Y_2(t)$. It is easy to verify that the distance covariance of $(X, Y(t))$ is a constant multiple of the energy distance between the groups $Y_1(t)$ and $Y_2(t)$. For the two groups, the means of the $L_\infty$ norm within each group are both 10, and so is the mean of the $L_\infty$ norm between the two groups. According to Equation 4, the energy distance between $Y_1(t)$ and $Y_2(t)$ is 0, and thus the distance covariance of $(X, Y(t))$ is 0, leading to the failure to detect the association. The performance of distance multivariance in Model 5.2.10 can be explained similarly.

Runtime analysis
We adopt Models 5.1.1, 5.2.1, and 5.2.7 from Sections 5.1 and 5.2 to assess the runtime performance of energy (1.7.5), multivariance (2.1.0), HHG (1.3.2), and Ball (1.3.8) using the microbenchmark package (Mersmann 2018). All experiments are conducted with 20 replications, and the averaged runtimes are visualized in Figure 7. The benchmark platform is 64-bit Windows with an Intel Core i7 @ 3.60 GHz.
From Figure 7, energy is the fastest package in the K-sample test and test of independence problems, and multivariance is the fastest package in the test of mutual independence problem. As the second fastest package, Ball is around four times faster than HHG in the K-sample test and test of independence problems when both use one thread, even though $BCov_{\chi^2,N}^K$ ($K = 2$) and HHG are asymptotically equivalent. Furthermore, the runtime of Ball can be cut down by around one third by doubling the number of threads.
In summary, if runtime is the primary concern, energy or multivariance may be a desirable choice. Otherwise, Ball is preferable due to its powerful performance on various kinds of complex data with only a modest increase in runtime, especially for the K-sample test and the test of independence problems.

Conclusion
We have designed a user-friendly R package, Ball, to help data scientists detect distribution distinctions and object associations for complex data in metric spaces. Equipped with novel algorithms, an efficient C implementation, advanced multi-threading techniques, and the elegant theoretical properties of the Ball test statistics, the Ball test procedures programmed in the Ball package can efficiently analyze complex data in metric spaces.
Future versions of the Ball package will endeavor to speed up the Ball Correlation based generic feature screening procedure (Pan et al. 2019). Furthermore, we intend to develop Python and Julia packages to help data scientists conduct the Ball test procedures and the Ball screening procedure in their most familiar programming languages.