Ordering Properties of the First Eigenvector of Certain Similarity Matrices

It is shown for coefficient matrices of Russell-Rao coefficients and two asymmetric Dice coefficients that ordinal information on a latent variable model can be obtained from the eigenvector corresponding to the largest eigenvalue.


Introduction
An important role in statistics and data analysis is played by similarity coefficients. A similarity coefficient is a measure of resemblance or association of two data vectors, such as score patterns, variables, and items. For example, in ecological biology similarity coefficients are used for measuring the degree of coexistence between two species types over different locations. In many research studies the data consist of binary (0, 1) vectors: presence or absence of disease; presence or absence of species characteristics; yes or no answers in questionnaires; pass or fail in high-stakes testing. For expressing the degree of resemblance of two binary vectors in a number, a variety of similarity coefficients has been proposed [1][2][3]. Examples are the Jaccard coefficient [4], the Russell-Rao coefficient [5], the Dice coefficient [6], and the simple matching coefficient [7,8]. In choosing a coefficient, a measure has to be considered in the context of the data-analytic study of which it is a part [9]. Because there are so many similarity coefficients for binary data to choose from, it is important that the different coefficients and their properties are better understood.
Instead of studying properties of individual coefficients [10][11][12][13] one may also study properties of coefficient matrices [14]. Coefficient matrices are used as input in various techniques of multivariate data analysis, including factor or component analysis [15,16], hierarchical cluster analysis, and techniques in classification and dissimilarity analysis [17]. Moreover, exploratory data-analytic methods such as principal coordinates analysis and (multiple) correspondence analysis can be defined as eigendecomposition of certain coefficient matrices [15,16,18]. It would be interesting to know what information, if any, is reflected in the eigenvectors of a coefficient matrix that is based on a similarity coefficient for binary vectors.
In this paper we show for several coefficient matrices that ordinal information on latent variable models can be obtained from the eigenvector corresponding to the largest eigenvalue. It is thus possible to uncover meaningful orderings of various models by using eigenvectors. The results are first of all of theoretical interest. They show that some coefficient matrices have more interesting eigenvectors than others. Coefficient matrices based on some coefficients may thus lead to more interesting data-analytic solutions than matrices corresponding to other coefficients. Furthermore, potentially, the results can enhance the interpretation of a data analysis that uses these coefficient matrices as input.
The paper is organized as follows. Notation and two latent variable models are introduced in the next section. In Section 3 several ordering properties of the eigenvector corresponding to the largest eigenvalue are presented. An illustration of the results is presented in Section 4. Section 5 contains a conclusion.

Latent Variable Models
Suppose the data consist of $m$ binary (0, 1) vectors of length $n$. It may be assumed that the scores in the binary vectors are realizations of a latent variable model. In this section we introduce two models in the context of nonparametric item response theory [19,20]. In item response theory the vectors are often viewed as items that, for instance, contain the responses (pass, fail) of $n$ subjects to a high-stakes test. The items will be indexed by $i$, $j$, and $k$.
Let $\theta$ denote a one-dimensional latent variable and let $g(\theta)$ be its probability density function. Let $p_j(\theta)$ denote the response function corresponding to the response 1 on item $j$. The unconditional probability of a response 1 on item $j$ is then given by

$$p_j = \int p_j(\theta)\, g(\theta)\, d\theta. \tag{1}$$

Next, assume local independence; that is, conditionally on $\theta$ the responses of a subject on the $m$ items are stochastically independent. The joint probability of items $j$ and $k$ for a value of $\theta$ is then given by $p_j(\theta)\, p_k(\theta)$. The corresponding unconditional probability is

$$p_{jk} = \int p_j(\theta)\, p_k(\theta)\, g(\theta)\, d\theta. \tag{2}$$

Throughout the paper we assume that $0 < p_j \le 1$.

Next, we define the latent variable models. Both models have monotone response functions and are frequently applied in the context of measuring ability. The first model is characterized by requirements (3) and (4). The first requirement is that the $p_j(\theta)$ are monotonically increasing on $\theta$; that is,

$$p_j(\theta_1) \le p_j(\theta_2) \tag{3}$$

for $\theta_1 < \theta_2$. The second requirement is that the items can be ordered such that the $p_j(\theta)$ are nonintersecting; that is,

$$p_j(\theta) \ge p_k(\theta) \tag{4}$$

for $j < k$. The case that assumes (3) and (4), together with the assumptions of local independence and a single latent variable, is called the double monotonicity model in nonparametric item response theory [19,20]. A well-known result is that if the double monotonicity model holds, then the items can be ordered such that we have

$$p_j > p_k \tag{5}$$

for $j < k$, and

$$p_{ji} \ge p_{ki} \tag{6}$$

for $j < k$ and $i \ne j, k$ [19,20].

The second model is characterized by requirements (3) and (7). The response functions $p_j(\theta)$ may satisfy various orders of total positivity [21]. If the functions $p_j(\theta)$ are totally positive of order 2, the items can be ordered such that

$$p_j(\theta_1)\, p_k(\theta_2) - p_j(\theta_2)\, p_k(\theta_1) \ge 0 \tag{7}$$

holds for $\theta_1 < \theta_2$ and $j < k$. Schriever [22] proved the following result for a set of response functions that are both monotonically increasing and satisfy total positivity of order 2. If the vectors are ordered such that (3) and (7) hold, then

$$\frac{p_{ji}}{p_j} \le \frac{p_{ki}}{p_k} \tag{8}$$

holds for $j < k$ and $i \ne j, k$.
We conclude this section with a parametric example that satisfies requirements (3), (4), and (7). A well-known model from the field of item response theory is the Rasch [23] model. A response function of this one-parameter logistic model is given by

$$p_j(\theta) = \frac{\exp(\theta - b_j)}{1 + \exp(\theta - b_j)}, \tag{9}$$

where $b_j$ is a location parameter. In the context of item response theory the parameter $b_j$ is usually called a difficulty parameter [19,20]. The functions $p_j(\theta)$ form a location family.
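The claim that the Rasch model satisfies the requirements can be checked numerically. The following is a minimal sketch, not part of the original analysis: it assumes a standard normal latent density, five hypothetical difficulty values, and a Gauss-Hermite rule for the integrals in (1) and (2). The names `rasch`, `marginals`, and `columns_monotone` are illustrative only.

```python
import numpy as np

def rasch(theta, b):
    """Rasch response function exp(theta - b) / (1 + exp(theta - b))."""
    return 1.0 / (1.0 + np.exp(-(theta - b)))

def marginals(b, n_nodes=80):
    """Return p (the p_j) and P (the p_jk, with p_jj = p_j).

    Assumption of this sketch: the latent density is standard normal.
    The integrals are approximated with a Gauss-Hermite rule.
    """
    nodes, weights = np.polynomial.hermite_e.hermegauss(n_nodes)
    weights = weights / weights.sum()            # weights of a probability measure
    R = rasch(nodes[:, None], np.asarray(b, dtype=float)[None, :])
    p = weights @ R                              # equation (1)
    P = R.T @ (weights[:, None] * R)             # equation (2), off-diagonal part
    np.fill_diagonal(P, p)                       # an item always agrees with itself
    return p, P

def columns_monotone(M, increasing, tol=1e-12):
    """True if every column is monotone once its diagonal entry is removed."""
    for i in range(M.shape[0]):
        steps = np.diff(np.delete(M[:, i], i))
        if increasing and np.any(steps < -tol):
            return False
        if not increasing and np.any(steps > tol):
            return False
    return True

# Hypothetical difficulties, ordered from easiest to hardest item.
b = [-1.5, -0.5, 0.0, 0.8, 1.6]
p, P = marginals(b)

holds_5 = bool(np.all(np.diff(p) < 0))                       # p_j > p_k for j < k
holds_6 = columns_monotone(P, increasing=False)              # p_ji >= p_ki
holds_8 = columns_monotone(P / p[:, None], increasing=True)  # p_ji/p_j <= p_ki/p_k
print(holds_5, holds_6, holds_8)
```

The diagonal of `P` is set to `p` because local independence applies to distinct items only; the product formula would otherwise give the integral of the squared response function.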

Ordering Properties
In this section we present ordering properties for three coefficient matrices. The coefficient matrices of size × are = ( ) , An element of the matrix is a Russell-Rao coefficient for two binary vectors and [5,10]. Some data-analytic properties of the matrix are discussed in Warrens [14]. The elements of the matrices and are conditional probabilities discussed and applied in Dice [6]. The harmonic mean of the two conditional probabilities is equal to the Dice coefficient [6]. Matrix is also called the conditional adjacency matrix in Post and Snijders [24].
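To make the three kinds of coefficients concrete, the sketch below computes them from a small binary data matrix. It is only an illustration under stated assumptions: rows are subjects, columns are items, the toy data are made up, and the names `rr`, `cond_row`, and `cond_col` are hypothetical labels rather than the notation of the paper. It also checks the remark that the harmonic mean of the two conditional probabilities equals the Dice coefficient.

```python
import numpy as np

def coefficient_matrices(X):
    """Return p_j, Russell-Rao, both conditional matrices, and Dice."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    p = X.mean(axis=0)                 # proportion of 1s per item
    if np.any(p == 0):
        raise ValueError("every item needs at least one response 1")
    rr = X.T @ X / n                   # Russell-Rao: proportion of joint 1s
    cond_row = rr / p[:, None]         # element (j, k) is p_jk / p_j
    cond_col = rr / p[None, :]         # element (j, k) is p_jk / p_k
    dice = 2 * rr / (p[:, None] + p[None, :])
    return p, rr, cond_row, cond_col, dice

# Toy data: 5 subjects, 3 items (made up for illustration).
X = [[1, 1, 0],
     [1, 0, 1],
     [1, 1, 1],
     [0, 1, 1],
     [1, 1, 0]]
p, rr, cond_row, cond_col, dice = coefficient_matrices(X)

# Harmonic mean of the two conditional probabilities (all p_jk > 0 here).
harmonic = 2 * cond_row * cond_col / (cond_row + cond_col)
print(np.allclose(harmonic, dice))
```

Note that the diagonal of the Russell-Rao matrix holds the proportions $p_j$, while both conditional matrices have ones on the diagonal.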
A specific result that will be used in the proofs of Theorems 2, 3, and 4 below is the Perron-Frobenius theorem [25,26]. More precisely, only the following weaker version of the Perron-Frobenius theorem will be used.
We first consider the matrix . Let be the eigenvector corresponding to the largest eigenvalue of the matrix . Theorem 2 shows that if the binary vectors can be ordered such that (3) and (4) hold, then this ordering is reflected in the corresponding elements of .

Theorem 2.
Suppose that of the vectors, which without loss of generality can be taken as the first , can be ordered such that (3) and (4) hold. Then the elements of of corresponding to these vectors satisfy 1 > 2 > ⋅ ⋅ ⋅ > > 0.
Proof. Since is nonsingular, is an eigenvector of corresponding to if and only if = −1 is an eigenvector of = −1 corresponding to . Under the conditions of the theorem, the elements of are positive and the elements of 2 are strictly positive. Application of Lemma 1 then yields that the eigenvector of (or 2 ) has strictly positive elements. The assertion then follows from the identity = −1 .
In the remainder of the proof we show that has positive elements and 2 has strictly positive elements. The matrix = −1 has elements for 1 ≤ < and 1 ≤ ≤ and = for ≤ ≤ and 1 ≤ ≤ . Under the conditions of the theorem properties (5) and (6) hold for the first items. By (6), we have ≥ +1, , and the matrix has positive elements except for , +1 for 1 ≤ ≤ − 1. However by (5), we have > +1 and it follows that for 1 ≤ ≤ − 1. Hence, the matrix = has positive elements. Moreover, because the elements in the last row and last column of are strictly positive, it follows that the elements of 2 are strictly positive.
An analogous result holds for the matrix . Let be the eigenvector corresponding to the largest eigenvalue of the matrix . Theorem 3 shows that if the binary vectors can be ordered such that (3) and (4) hold, then this ordering is reflected in the corresponding elements of of .

Theorem 3. Suppose that of the vectors, which without loss of generality can be taken as the first , can be ordered such that (3) and (4) hold. Then the elements of of corresponding to these vectors satisfy
Proof. The proof is similar to the proof of Theorem 2. The matrix = −1 has elements for 1 ≤ < and 1 ≤ ≤ and = (17) for ≤ ≤ and 1 ≤ ≤ . Under the conditions of the theorem properties (5) and (6) hold for the first items. By (6), we have ≥ +1, , and the matrix has positive elements except for , +1 for 1 ≤ ≤ − 1. But by (5), we have > +1 , and it follows that

Finally, Theorem 4 below presents an ordering property of the matrix . The ordering holds for a slightly stronger model than the one considered in Theorems 2 and 3. Theorem 4 shows that if the binary vectors can be ordered such that (3), (4), and (7) hold, then this ordering is reflected in the corresponding elements of of .

Theorem 4.
Suppose that of the vectors, which without loss of generality can be taken as the first , can be ordered such that (3), (4), and (7) hold. Then the elements of of corresponding to these vectors satisfy 0 < 1 < 2 < ⋅ ⋅ ⋅ < .
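The ordering properties in Theorems 2, 3, and 4 can be illustrated numerically. The sketch below is not the proof and rests on several assumptions: the latent distribution is a discretised standard normal, the items follow a Rasch model with hypothetical difficulties, and the two asymmetric matrices are taken to have elements $p_{jk}/p_k$ and $p_{jk}/p_j$. Under that reading, the first eigenvectors of the Russell-Rao matrix and of the $p_{jk}/p_k$ matrix should be strictly decreasing, and that of the $p_{jk}/p_j$ matrix strictly increasing.

```python
import numpy as np

def first_eigenvector(M):
    """Largest eigenvalue and its eigenvector, scaled to unit length
    and signed so that the elements sum to a positive number."""
    values, vectors = np.linalg.eig(M)
    lead = np.argmax(values.real)
    v = vectors[:, lead].real
    v = v * np.sign(v.sum())
    return values[lead].real, v / np.linalg.norm(v)

# Discretised standard normal latent distribution (assumption).
theta = np.linspace(-6.0, 6.0, 2001)
g = np.exp(-0.5 * theta**2)
g /= g.sum()

# Rasch items with hypothetical difficulties, easiest first.
b = np.array([-1.5, -0.5, 0.0, 0.8, 1.6])
R = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))

p = g @ R                              # p_j
P = R.T @ (g[:, None] * R)             # p_jk for j != k
np.fill_diagonal(P, p)                 # p_jj = p_j

_, x_rr = first_eigenvector(P)                  # Russell-Rao matrix
_, x_col = first_eigenvector(P / p[None, :])    # elements p_jk / p_k
_, z_row = first_eigenvector(P / p[:, None])    # elements p_jk / p_j

print(np.round(x_rr, 3))
print(np.round(x_col, 3))
print(np.round(z_row, 3))
```

Because the two asymmetric matrices are not symmetric, a general eigenvalue routine is used; their eigenvalues are nevertheless real, since each matrix is similar to a symmetric one.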

An Illustration
In this section we consider an example from educational testing to illustrate some of the results from Section 3. The data consist of responses of 1000 individuals to five items of the LSAT (Law School Admission Test). The test was designed to measure a one-dimensional latent variable. The example is part of a data set given by Bock and Lieberman [27]. The data set is distributed with the R package "ltm" written by Rizopoulos [28]. Requirements (3), (4), and (7) cannot be checked directly for real-life data. However, it can be shown that the Rasch model in (9) fits these data quite well. Using subroutines from the "ltm" package we fitted the Rasch model and the so-called two-parameter logistic model [19,20]. In the Rasch model the items are allowed to differ in location. In the more general two-parameter model the items are also allowed to differ in slope. For these data the two-parameter model has four additional parameters. The log likelihoods of the models are −2466.94 and −2466.65, respectively, and the corresponding likelihood ratio test has a value of $p = .967$. Thus, the extra slope parameters are statistically not warranted.
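The likelihood ratio test can be retraced from the two reported log likelihoods. The sketch below assumes the usual chi-square reference distribution with four degrees of freedom (one per extra slope parameter) and uses the closed-form survival function that exists for even degrees of freedom; `chi2_sf_even_df` is a hypothetical helper name. Since the log likelihoods are rounded to two decimals, the result (about .965) agrees with the reported value only up to rounding.

```python
import math

def chi2_sf_even_df(x, df):
    """P(chi-square with df degrees of freedom > x), valid for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    terms = sum(half**i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half) * terms

loglik_rasch = -2466.94      # reported, rounded
loglik_2pl = -2466.65        # reported, rounded
extra_parameters = 4

lr_statistic = 2.0 * (loglik_2pl - loglik_rasch)
p_value = chi2_sf_even_df(lr_statistic, extra_parameters)
print(round(lr_statistic, 2), round(p_value, 3))
```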
To study conditions (6) and (8), we suppose that the items are ordered on the proportions of correct responses, from easiest to hardest item (1, 5, 4, 2, and 3). In other words, in the following we assume that the items are ordered such that condition (5) holds.
To study condition (6) we consider the matrix of Russell-Rao coefficients of the five items in this order [5 × 5 matrix not reproduced]. The elements on the main diagonal are the proportions of correct responses. If we ignore the elements on the main diagonal it can be verified that the other four elements in each column of this matrix are strictly decreasing. Hence, condition (6) holds.
Since conditions (5) and (6) hold for all five LSAT items it follows from Theorem 3 that the ordering of the items is reflected in the eigenvector corresponding to the largest eigenvalue of the matrix of Russell-Rao coefficients. The largest eigenvalue is 3.191 and the associated eigenvector is given by (.516, .495, .446, .420, .336). The item ordering is thus reflected in the elements of the eigenvector.
To verify whether condition (8) holds we consider the corresponding matrix of conditional probabilities to which Theorem 4 applies [5 × 5 matrix not reproduced]. If we ignore the elements on the main diagonal it can be verified that the remaining four elements in the first, third, and fourth columns of this matrix are strictly increasing. Furthermore, the elements in the second and fifth columns are roughly increasing. In both columns there is one anomaly. We may conclude that condition (8) holds approximately. If the five LSAT items satisfy conditions (5) and (8) it follows from Theorem 4 that the ordering of the items is reflected in the eigenvector corresponding to the largest eigenvalue of this matrix. The largest eigenvalue is 4.106 and the associated eigenvector is given by (.424, .431, .446, .454, .478). The item ordering is thus reflected in the elements of the eigenvector.
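The steps of this illustration can be sketched for any binary data matrix. The LSAT data are not reproduced here, so the example below uses simulated Rasch responses with hypothetical difficulties and a fixed seed; its numbers will therefore differ from those reported above. The helper names `anomalies` and `first_eigenvector` are illustrative, and the conditional matrix is assumed to have elements $p_{jk}/p_j$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated stand-in for real test data (assumption): 5000 subjects, 5 Rasch items.
n = 5000
b = np.array([0.9, -1.2, 0.3, -0.4, 1.5])        # hypothetical, not ordered
theta = rng.standard_normal(n)
prob = 1.0 / (1.0 + np.exp(-(theta[:, None] - b[None, :])))
X = (rng.random((n, b.size)) < prob).astype(float)

# Order the items from easiest to hardest, so that condition (5) holds.
p = X.mean(axis=0)
order = np.argsort(-p)
X, p = X[:, order], p[order]

rr = X.T @ X / n                 # Russell-Rao coefficients, diagonal equals p
cond = rr / p[:, None]           # conditional probabilities p_jk / p_j

def anomalies(M, increasing):
    """Count adjacent off-diagonal column entries that break the expected order."""
    count = 0
    for i in range(M.shape[0]):
        steps = np.diff(np.delete(M[:, i], i))
        count += int(np.sum(steps < 0 if increasing else steps > 0))
    return count

def first_eigenvector(M):
    """Largest eigenvalue and its unit-length eigenvector with positive sum."""
    values, vectors = np.linalg.eig(M)
    lead = np.argmax(values.real)
    v = vectors[:, lead].real
    v = v * np.sign(v.sum())
    return values[lead].real, v / np.linalg.norm(v)

lam_rr, x_rr = first_eigenvector(rr)
lam_cond, z_cond = first_eigenvector(cond)

print("anomalies for condition (6):", anomalies(rr, increasing=False))
print("anomalies for condition (8):", anomalies(cond, increasing=True))
print(round(lam_rr, 3), np.round(x_rr, 3))
print(round(lam_cond, 3), np.round(z_cond, 3))
```

With sample data the conditions may hold only approximately, which is why the sketch counts anomalies instead of returning a strict yes or no.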

Conclusion
Similarity coefficients for binary vectors are frequently used in statistics for analyzing the structure between objects. Examples that are commonly used are the Russell-Rao coefficient [5] and the Dice coefficient [6]. Since the choice of a coefficient depends on the context of the data-analytic study, it is important that the different coefficients and their properties are well understood.

In this paper we showed that ordinal information on latent variable models is reflected in the eigenvector corresponding to the largest eigenvalue of the coefficient matrices with Russell-Rao coefficients (Theorem 3) and two asymmetric coefficients used in Dice [6] (Theorems 2 and 4). For other well-known coefficients like the Jaccard coefficient [4] and the simple matching coefficient similar ordering properties could not be found. The results suggest that the Russell-Rao coefficient and the Dice coefficients may lead to more clearly interpretable output if used as input in clustering methods or principal coordinates analysis. However, more research on this topic is needed.