AIJ: joint test for simultaneous detection of imprinting and non-imprinting allelic expression imbalance

Epigenetics is the study of heritable changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence. Genomic imprinting is an epigenetically regulated process by which imprinted genes are expressed in a parent-of-origin-specific manner. It can be confounded with a phenomenon, allelic expression imbalance (AEI), which, in this paper, refers to asymmetric expression of the two alleles of a heterozygous subject at a single nucleotide polymorphism not caused by imprinting (non-imprinting AEI). Since existing methods in the literature are not amenable to distinguishing imprinting from non-imprinting AEI for data without replicates, we propose AIJ, a joint test for simultaneous detection of imprinting and non-imprinting AEI that accounts for potential confounding using RNA-seq data based on a reciprocal cross design. Through a simulation study, we show that AIJ is more powerful compared to two frequently used methods that do not account for confounding. To illustrate the practical utility of AIJ, we applied the method to a mouse dataset and identified genes with the imprinting effect and/or non-imprinting AEI phenomenon, with some already confirmed in an existing database. The results are also largely consistent with a study on human data for a set of orthologous genes, affirming earlier conclusion in the literature that non-imprinting AEI events are evolutionarily conserved.


Introduction
Epigenetics is the study of heritable changes in gene expression or cellular phenotype caused by mechanisms without changing the underlying DNA sequence [1]. Genetic imprinting, a frequently studied epigenetic effect, is an epigenetically regulated process by which imprinted genes are expressed [23,28]. However, such methods are not amenable to joint imprinting and AEI analysis.
To tease out AEI from imprinting and to increase statistical power for analyzing non-replicated data, in this paper, we propose AIJ, AEI-Imprinting-Joint test, a simultaneous test for detecting genetic imprinting and AEI jointly based on mouse RNA-seq reads generated from a RCD experiment. Through a simulation study, we show that AIJ is more powerful compared to two commonly used methods that do not account for confounding. To illustrate the practical utility of AIJ, we applied the method to a mouse dataset and identified genes with imprinting and/or AEI.

Reciprocal cross design, notation, and existing hypotheses
We consider two pure mouse strains, denoted by S 1 and S 2 . Since pure strains have homozygous genotypes at any single nucleotide polymorphism (SNP) locus, for convenience, we simply use S 1 and S 2 to denote the alleles (at any SNP locus) as well, albeit a slight abuse of notation. Then we consider a cross in which S 2 is a female mouse while S 1 is a male mouse (top line of Table 1). Let a and b denote the number of S 1 and S 2 alleles (i.e. the numbers of reads containing the S 1 and S 2 alleles), respectively. Then, n 1 = a + b is the total number of reads that cover this SNP [7]. Consequently, p 1 = a/n 1 provides an estimate for p 1 , the relative expression level of the paternal allele S 1 . We also consider a cross in which S 1 is a female mouse while S 2 is a male mouse (bottom line of Table 1), a reciprocal of the first cross, hence the name reciprocal cross design. Using similar notation, we let c and d be the number of S 1 and S 2 alleles, respectively, and we use n 2 = c + d to denote the total number of reads that cover this SNP. Thenp 2 = c/n 2 provides an estimate for p 2 , the relative expression level of the maternal allele S 1 . Table 1. RNA-seq read distributions from a reciprocal cross design of pure strains.
Cross * S 1 Allele Reads S 2 Allele Reads Total Reads Proportion S 2 × S 1 a b n 1 = a + bp 1 = a/n 1 S 1 × S 2 c d n 2 = c + dp 2 = c/n 2 * Each crosse is arranged as female × male.
In Wang et al. [6], the intended use of the RNA-seq data from a RCD experiment was to study the imprinting effect, hence the null and alternative hypotheses were H0 : p 1 = p 2 versus H1 : p 1 p 2 .
(2.1) It is clear from the above setup that the test for imprinting is regardless of whether there is any AEI. When there are both imprinting and AEI but the imprinting effect is "weaker" in magnitude than AEI (i.e. both p 1 and p 2 are < 0.5 or > 0.5), then the test may suffer from power loss. If there is no imprinting but there is AEI, then there is no power to detect such a phenomenon since the test is not designed for it.
On the other hand, if one is interested in testing for AEI assuming no imprinting, then it is in fact sufficient to carry out the experiment running only one of the two reciprocal crosses specified in Table 1. Specifically, the appropriate hypotheses to test for AEI [20] are given in (2.2) and (2.3), for data generated from the S 2 × S 1 or the S 1 × S 2 crosses, respectively: H0 : p 1 = 0.5 versus H1 : p 1 0.5, (2.2) H0 : p 2 = 0.5 versus H1 : p 2 0.5. (2.3) Surprisingly, in Gregg et al. [7], which also used RCD to study imprinting effects, tests for both (2.2) and (2.3) were carried out to detect imprinting effects: a conclusion of the existence of imprinting is made only when both null hypotheses are rejected and when the estimated proportionsp 1 andp 2 are of different directionality (i.e.p 1 > 0.5 andp 2 < 0.5 orp 1 < 0.5 andp 2 > 0.5). Since a SNP with both imprinting and AEI may lead to both p 1 and p 2 being less than 0.5 or greater than 0.5, the test proposed by Gregg et al. [7] would have no power for detecting SNPs under such a scenario, which in fact does occur in real data frequently (see results in the real data analysis and discussion -Sections 4 and 5).

Hypotheses for joint testing of AEI and imprinting
For joint testing of imprinting and AEI as proposed in the current paper, the hypotheses of interest are H0 : p 1 = p 2 = 0.5 versus H1 : p 1 0.5 or p 2 0.5. (2.4) The above null hypothesis H0 stipulates no imprinting nor AEI, whereas the alternative hypothesis signifies either imprinting or AEI, or both. The alternative hypothesis can be further decomposed into three specific cases: H1.A : p 1 = p 2 but neither equal to 0.5 (no imprinting but AEI) H1.B : p 1 p 2 and one equals to 0.5 but not both (imprinting but no AEI) H1.C : p 1 p 2 and neither equal to 0.5 (both imprinting and AEI) To better understand H1.B, suppose p 1 = 0.5. Then, there is no distortion of the 1:1 expression ratio, indicating that there is no AEI of any kind. This, together with p 2 0.5, implies that there is imprinting but no non-imprinting AEI. For H1.C, on the other hand, p 1 p 2 implies that there must be imprinting effect, which, coupled with the fact that neither of the two crosses observes the 1:1 ratio, indicates that there is also non-imprinting AEI. Figure 1(a) is a schematic diagram depicting the division of the p 1 − p 2 plane into the H0, H1.A, H1.B, and H1.C (encompassing both H1.C1 and H1.C2) regions. Comparing the hypotheses specified in (2.4) with those in (2.2) and (2.3) combined, one can see that the null hypotheses are the same, whereas the alternatives in (2.2) and (2.3) jointly only cover part of the alternative hypothesis in (2.4); namely, they do not cover H1.B in (2.4). On the other hand, the decision rule in Gregg et al. [7] injects the restriction on the opposite directionality of p 1 and p 2 . Therefore, its actual alternative specifies parameters falling into the H1.C1 region only, leaving the rest of the area on the p 1 − p 2 plane as the de facto null. Thus, the actual null and alternative hypotheses tested in Gregg et al. [7] are not those expressed in (2.2) and (2.3) combined.
In the next three subsections, we will describe the tests used by Wang et al. [6] and Gregg et al. [7] (as they will be used for comparison), followed by the description of our proposed AIJ testing procedure. In describing each of these three methods, we will discuss their rejection regions as well as formulas for computing the p-values. Note that we selected to compare with these two tests since they are capable of handling non-replicated data. This issue will be elaborated in the Discussion (Section 5).

Storer-Kim test
The Storer-Kim (SK) test is an approximate unconditional test for the equality of two binomial proportions [29]. Compared to the Fisher's exact test, SK has been shown to be more powerful for study designs in which the conditional assumption is violated in general [30]; its greater power for analyzing NGS data has also been confirmed for a DNA methylation study [31]. Further, SK is computationally much less expensive than its exact version, and its empirical type I error rate rarely exceeds the nominal level [29]. These desirable properties led to its use by Wang et al. [6] for testing the hypotheses in (2.1).
Let X and Y be the random variables denoting the S 1 allele counts from the two crosses, with x and y denoting their realizations, respectively (note that the observed counts a and c in Table 1 are the specific realizations from the RCD experiment). Then X ∼ Bin(n 1 , p 1 ) and Y ∼ Bin(n 2 , p 2 ). The p-value of the SK test can then be found as follows: b(x, n 1 ,p)b(y, n 2 ,p)I(|Z(x, y, n 1 , n 2 )| ≥ |Z(a, c, n 1 , n 2 )|), wherep = (a + c)/(n 1 + n 2 ); b(x, n 1 ,p) and b(y, n 2 ,p) are the probability mass function of Bin(n 1 ,p) and Bin(n 2 ,p) evaluated at x and y, respectively; I(·) is the usual indicator function taking values of 1 or 0; and Z(x, y, n 1 , n 2 ) = (x/n 1 − y/n 2 )/ (1/n 1 + 1/n 2 )p(1 −p). For a preset significance level α, H0 is rejected if P S K ≤ α.
Given n 1 , n 2 , and α, the rejection region is: where f (n 1 , n 2 , α) is a function of the parameters. This rejection region is shown schematically in Figure 1(b) for a set of parameter values, which is comprised of the upper and the lower triangles (the regions shaded blue, yellow, brown, and gray -essentially regions in the x − y plane for which |x/n 1 − y/n 2 | is greater than a constant, with the boundaries for the two triangles being parallel lines). This rejection region makes intuitive sense for testing the hypotheses in (2.1): a SNP is declared to have an imprinting effect if the observed proportions of the S 1 allele in the two crosses are substantially different.

Two-Chi-Squares test
The Two-Chi-Squares (TCS) test is a procedure employed by Gregg et al. [7] to test the two sets of hypotheses specified in (2.2) and (2.3). Although TCS tests these two sets of hypotheses separately, each with the usual Chi-square test, the conclusion is drawn jointly based on the results from these two tests, plus an additional constraint as discussed earlier. Using the same notation as in Section 2.3, we note that X and Y follow binomial distributions Bin(n 1 , 0.5) and Bin(n 2 , 0.5) under the respective null hypothesis. Without loss of generality but noting the symmetry given the two-sided nature of the alternative hypotheses, we assume that the observed counts a and c are less than n 1 /2 and n 2 /2, respectively. Then the p-values for testing (2.2) and (2.3) are calculated, respectively, as where b(x, n 1 , 0.5) and b(y, n 2 , 0.5) are the probability mass function of Bin(n 1 , 0.5) and Bin(n 2 , 0.5) evaluated at x and y, respectively.
A SNP is concluded to have an imprinting effect if both P TCS 1 and P TCS 2 are less than a preset significance level α, and furthermore, (p 1 − 0.5)(p 2 − 0.5) < 0, wherep 1 = a/n 1 andp 2 = c/n 2 . The last expression implies that a SNP is said to have an imprinting effect if the maternal allele is significantly overexpressed while the paternal allele is significantly underexpressed, or vice versa. Note that this imposes a stronger condition than the usual definition of imprinting, and the conclusion is thus confounded with the existence of AEI. For instance, p 1 = 0.5 and p 2 < 0.5, or p 2 = 0.5 and p 1 < 0.5 are indicative of maternal and paternal imprinting, respectively, but the TCS testing procedure will fail to detect variants with such effects. Further, p 1 < p 2 < 0.5 may arise if both paternal imprinting and AEI (with the S 1 allele expressed less than the S 2 allele) exist.
Shown schematically in Figure 1(b), this rejection region is composed of the upper-left and lower-right rectangles (the regions shaded by yellow, brown, and red -essentially the yellow rectangles without the corners being cut off). Since two separate Chi-square tests are performed compared to a single test for KS (and AIJ to be discussed in the following subsection), we need to adjust the significance level in order for the tests to be comparable. Given the symmetry of the two tails of each Chi-square test, we have where α is the common nominal significance level for test (2.2) and (2.3). For example, if we set α = 0.05, then the overall significance level for the TCS test is α TCS = 0.00125. Therefore, in order to compare the power of the tests, we need to set the significance level for SK and AIJ to be α TCS as well.

AEI-Imprinting-Joint test
We propose an AEI-Imprinting-Joint (AIJ) test, which will not only address the potential confounding issue, but will also lead to an increase in statistical power for testing for imprinting regardless of whether AEI exists. To describe it intuitively, our proposed test rejects the null hypothesis H0 in (2.4) if eitherp 1 orp 2 is significantly away from 0.5. Using the same notation as above, under the null hypothesis, both X and Y follow independent binomial distributions Bin(n 1 , 0.5) and Bin(n 2 , 0.5), respectively; hence, the probability for X = x and Y = y is Let P (k) be the sorted P x,y 's from the smallest to the largest, k = 1, 2, · · · , (n 1 + 1) * (n 2 + 1). Given a significance level α, let s be the largest integer such that s k=1 P (k) ≤ α. Then, we define the rejection region of AIJ as follows: This rejection region, shown schematically in Figure 1(b), is the region in the x − y plane (composed of the areas shaded yellow, blue, and green) outside of the ellipsoid centered at (n 1 /2, n 2 /2). Therefore, an observed pair of value (a, c) from a RCD experiment is rejected if it is significantly away from the center of the x − y plane. Let Then the p-value is: where the sum is over all pairs of (x, y) in the set S . When H0 is rejected, further analysis may be carried out to delineate the three possible cases as detailed in H1.A, H1.B, and H1.C. We describe one simple procedure in the following; another potential approach is also proposed in the Discussion Section. We first construct the confidence interval (CI) for p 1 and p 2 (CI p 1 and CI p 2 ) as follows, noting the binomial distributions and the natural bound for a probability: If the rectangle formed by these CIs, CI p 1 × CI p 2 , intersects with the diagonal line of the p 1 − p 2 plane, then H1.A (no imprinting but AEI) is concluded. If the rectangle does not intersect with the diagonal line but intersects with either the p 1 = 0.5 or p 2 = 0.5 line (but not both), then H1.B (no AEI but imprinting) is concluded. Finally, if the rectangle does not intersect with the diagonal line, nor the p 1 = 0.5 or p 2 = 0.5 lines, then A 1C (both imprinting and AEI) is concluded.

Settings
We first compare the proposed joint test, AIJ, with the SK and the TCS tests through a simulation study. The simulated data sets were modeled after the Gregg et al. [7] RNA-seq data obtained from an RCD experiment to study murine embryonic day 15 (E15) brains. The Gregg [7] dataset consists of data for approximately 200,000 SNPs after quality control. The total RNA-seq read counts and the number of reads covering each SNP, (n 1 , n 2 ) and (a, c), are available. Further details about the dataset are provided in the application (Section 4). In our simulation, about 10% of the total read counts (20,000 pairs of (n 1 , n 2 )) were first randomly selected under each of the five scenarios specified in Table 2. Among them, 16,000 pairs (80%) were designated for studying the type I error rates, in which the observed counts a and c were generated under H0: p 1 = p 2 = 0.5, and the actual n 1 and n 2 . For the remaining 4,000 pairs, the observed counts a and c were generated under H1. To better understand the performance of the three comparison methods, we divided these 4,000 pairs into four regions (scenarios) representing the various parameter ranges under the three detailed cases of the alternative. These four regions are shown and labelled in Figure 1(a) as we described before, and their descriptions and pair distributions as provided in Table 2. The relative proportion of pairs distributed to these five scenarios (H0 plus the four under H1) are roughly according to the results from the real data analysis (Section 4). No imprinting and no AEI p 1 = 0.5 and p 2 = 0.5 16,000 H1.A No imprinting but AEI p 1 = p 2 0.5 1,000 H1.B Imprinting but no AEI p 1 = 0.5 or p 2 = 0.5 600 H1.C1 Imprinting and AEI Imprinting and AEI (p 1 − 0.5)(p 2 − 0.5) > 0 2100 a For H1.C2, it is obvious that we also assume that p 1 p 2 . b The number of SNPs is distributed roughly according to the data in [7].

Study of type I error rates
A simulated dataset under H0 is displayed in Figure 2(a). The two axes specify the ranges of p 1 and p 2 (both between 0 and 1). The value in each square displays the number of SNPs falling into it, with the total being 16,000. The range of p 1 (and p 2 ) was divided into nine intervals, with the interior boundary values starting at 0.15 and ending at 0.85 with an increment of 0.1. We see that the middle interval (0.45, 0.55) is further split into two, (0.45, 0.5) and (0.5, 0.55), to accommodate the nature of the TCS test, as the decision on whether a SNP has an imprinting effect is based on the opposite directionality of the estimatedp 1 andp 2 (one has to be greater than, and the other smaller than 0.5). We first note that for easy visualization of results, the block sizes were not drawn to proportion. We further note that, in the observed counts displayed, although we set p 1 = p 2 = 0.5, the distribution covers a wide range due to random sampling from the binomial distributions. For simplicity, we consider the middle four squares as constituting the null hypothesis, referred to as the H0 region (the area in red). As such, many SNPs are located outside of the region despite there are no imprinting nor AEI for these SNPs. We applied the three comparison methods, TCS, SK, and AIJ, to analyze this dataset by setting the significance level to be 0.00125 for all three tests, and the results are presented in Figure 2(b-d), respectively. We can see that the number of false rejections for TCS, SK, and AIJ are 10, 22, and 22, respectively, leading to the empirical type I error rates of 0.00063, 0.0014, and 0.0014 (displayed on the first row in Table 3). These results show that, while SK and AIJ appear to have the correct size, the TCS test seems to be quite conservative.

Study of power under four scenarios
A simulated dataset under H1.A is displayed in Figure 3(a). These data were simulated under binomial Bin(n 1 , p 1 ) and binomial Bin(n 2 , p 2 ), where p 1 = p 2 , representing the scenario where there is AEI but no imprinting (Figure 1(a)). The total number of SNPs studied under this scenario is 1000 ( Table 2). The results from applying the three tests to this dataset are shown in Figure 3(b-d). We can see that SK detected one significant SNP while TCS detected none. On the other hand, AIJ detected 645 significant SNPs, giving a 64.5% power. The alternative hypothesis under the H1.B case specifies that there is imprinting but no AEI (either p 1 = 0.5 or p 2 = 0.5 but not both; Figure 1(a)). A simulated dataset with 600 pairs of SNPs from the RCD experiment is displayed in Figure 4(a). The results from applying the three tests to this dataset are shown in Figure 4(b-d). We can see that TCS detected only 7 SNPs (power 1.2%), while SK was much more powerful, detected 187 (power 31.2%) significant SNPs. AIJ detected 261 (power 43.5%), which is a bit more powerful than SK, and much better than TCS which has practically no power. There are several regions in the p 1 − p 2 plane that represent the alternative hypothesis under scenario H1.C (i.e., having both imprinting and AEI). Since the two H1.C1 regions are different from the four H1.C2 regions in terms of their "proximity" to the other alternative scenarios (Figure 1(a)), they were studied separately. A simulated dataset for 300 SNPs in the H1.C1 regions is shown in Figure 5(a), and the results for the three tests are provided in Figure 5(b-d). A total of 149 (power 49.7%), 213 (power 71.0%), and 215 (power 71.7%) significant SNPs were detected by TCS, SK, and AIJ, respectively. These results show a marked improvement for both the TCS and SK tests, which is expected because the counts are mainly located in the top-left or bottom-right corners, where both SK and TCS were designed to detect (see rejection regions for these two tests in Figure 1(b)). Nevertheless, TCS continued to lag behind despite the much improved performance over those in the other alternative hypothesis scenarios. Finally, we considered the four H1.C2 regions. A simulated dataset for a total    of 2100 SNPs is given in Figure 6(a) as an illustration, and the results for the three tests are provided in Figure 6(b-d). A total of 1 (power 0.0005%), 381 (power 18%), and 1575 (power 75%) significant SNPs were detected by the TCS, SK, and AIJ, respectively. The superior performance of AIJ over the other two tests is not surprising either given the rejection regions of these three tests (Figure 1(b)). From the results described above and presented in Figures 3-6, AIJ is seen to have much higher overall power than the other two tests. To ensure that the performance of these tests are consistent from replicate to replicate, we performed these analyses for a total of 1000 replications for each of the alternative hypothesis scenarios considered. The results, expressed as average power and standard deviation (SD), are given in Table 3. One can see that the average powers for all scenarios are very close to the powers for one replicate reported in Figures 3-6, and all with small SD's. It is worth pointing out again that the better performance of AIJ is not at the expense of type I error rate, as it closely tracks the nominal level (Table 3).

Analysis of a mouse dataset
To further evaluate the methods on a real dataset, we considered the RCD experiment carried out by Gregg et al. [7] to identify imprinted genes. Illumina RNA-seq experiments were performed on murine E15 brain tissues obtained from the reciprocal cross progeny of CAST (CAST/EiJ) and C57 (C57BL/6J) mice. SNPs were identified by separately sequencing the CAST and C57 transcriptomes of the pure-bred parents, and the subsequent base calls were used to distinguish transcription from maternal and paternal allele. As such, for each SNP, the counts of the CAST allele transmitted from the male (a) and from the female (c) mouse were recorded ( Table 1). As in the simulation study, we set the nominal significance level for all three tests to be 0.00125 for comparison purpose. Among the more than 200,000 SNPs contained in the dataset, 1,915, 3,695, and 37,258 SNPs were inferred to be significant by TCS, SK, and AIJ, respectively. To gain a better understanding of some of the SNPs identified, we compare the results with a very small but well-curated database of confirmed imprinted genes [3]. In the database, there are 49 imprinted genes in mouse brain tissues. Among them, 23 are present in the E15 data of Gregg [7]; that is, the E15 data contain SNPs that belong to 23 of the 49 imprinted genes in the database. The names of these 23 genes, along with their directionality (maternally or paternally imprinted) from the Morison database, are provided in Table 4. For our analysis, a gene is concluded by a particular method to be imprinted if the null hypothesis was rejected for at least one SNP within the gene. Further, for the purpose of comparison, a SNP concluded to have a significant effect is said to have a paternal (maternal) imprinting effect ifp 1 <p 2 (p 1 >p 2 ). The results from the three tests are given in the the last three columns in Table 4. As we can see from the table, the AIJ test identified one more imprinted genes than SK and two more than TCS. Based on these results, two observations are noteworthy. First, although the difference in the numbers appears to be small, the results are not surprising. Since the confirmed genes collected in the databases were selected most likely based on traditional methods such as the TCS and SK, these methods are expected to performed well in this set of genes. As such, it is reassuring to see that AIJ performs equally well for this set of genes, and in fact slightly better. Second, the directionality of imprinting effects for the SNPs within a gene are consistent for most of the genes and are the same as the results in the database. For those few where not all the SNPs have the same directionality of imprinting effects, most of the SNPs agree with the results in the database. Given that AIJ detected many more significant SNPs than the other two tests, we are interested in investigating the properties of these SNPs that lead to such a result. A closer inspection shows that, for the AIJ test, about 3,000 of the SNPs detected to be significant fall into the H1.C1 region (as marked in Figure 1(a)), which is comparable to the number detected by SK. This is completely consistent with the results from the simulation study, which shows that SK and AIJ have similar power for detecting SNPs falling into the H1.C1 region. However, a super majority of the significant SNPs (about 33,000, or 89% of all detected) fall into the H1.A or H1.C2 regions (Figure 1(a) -AEI with or without imprinting), for which SK and TCS has little to no power, according to our simulation study (Table 3). Let us consider the CD81 gene as an example to demonstrate this point. This gene has been implicated to be maternally imprinted in non-brain, but not brain, tissues of mice [3], and several SNPs in this gene were detected only by AIJ. In particular, shown in Figure 1(b) is a SNP whose observed counts in the reciprocal experiment were (a, c) = (13, 21) with coverage (n 1 , n 2 ) = (56, 67), leading to an estimate ofp 1 = 0.23 andp 2 = 0.31. This pair of estimated values place the SNP in the green region of the p 1 − p 2 plane (the rejection region of AIJ only); therefore it was only detectable by AIJ, but not the other two tests. With a nominal significance level of α = 0.05, the confidence intervals for p 1 and p 2 were estimated to be (0.14, 0.32) and (0.22, 0.41), respectively. Since the rectangle formed by these two intervals intersect with the p 1 = p 2 line but neither the p 1 = 0.5 nor the p 2 = 0.5 lines, this SNP is deemed to exhibit AEI but no imprinting effect, leading to the conclusion for H1.B, which is consistent with the Morison database.

Discussion
There has been a great deal of interest in detecting imprinted genes as such genes play a crucial role in normal mammalian development as well as diseases. For mouse models, reciprocal cross design aided by RNA-seq has been used to study imprinting effect, and statistical methods have been employed for such tasks. However, as we have shown in this paper, with RNA-seq data from RCD, one cannot simply separate the effect of imprinting from the non-imprinting AEI phenomenon using these methods. To address this challenging issue, we have proposed AIJ, a statistical method that can jointly test for imprinting and non-imprinting AEI. Through simulation and real data analysis, we show that AIJ can be much more powerful than the two commonly used methods for imprinting testing that do not require replicated samples. Although it is not surprising that AIJ has power for detecting AEI when imprinting does not exist, AIJ also has much greater power for detecting imprinting in the pres-ence of non-imprinting AEI when the proportions of allelic expression of the allele of interest are both above, or below the expected 1/2 for both reciprocal crosses. Examining the range of the proportions under the various scenarios of the alternative and the rejection regions of the tests (Figure 1) clearly supports the above conclusion. Ube2d3 Tspyl4 Non-imprinting AEI (random monoallelic experssion) Ifitm2, Pip4k2a Zdhhc17 a Genes whose classifications are consistent between the AIJ results and those of Chuang are in bold.
To further interpret our results, we turn to a recent human study as little is available in the literature for differentiating AEI caused by imprinting or non-imprinting factors in mice. In this human study, Chuang et al. [11] used family trios to identify various causes of expression asymmetry. They considered imprinting, genetic variation, and random monoallelic expression as the underlying causes. For comparison purpose so that the terminologies are similar, asymmetry expression due to imprinting in the Chuang et al. study is simply termed imprinting here whereas that due to the other factors is referred to as non-imprinting AEI. Although the Chuang et al. study is on human data, we nonetheless believe that it is reasonable to use their results for comparison since the majority of the SNPs identified in both studies are non-imprinting AEIs and since non-imprinting AEIs are shown to be evolutionarily conserved [11]. Specifically, we look into 22 orthologous genes in the E15 mouse data that have been implicated by AIJ and were also identified to have imprinting effect or non-imprinting AEI phenomenon by Chuang et al. [11] using the human data. Of these 22 genes, the results from Chuang et al. are as follows: 2 were inferred to have imprinting effects on gene expression; whereas the remaining 20 were classified as non-imprinting AEI (17 were believed to be caused or likely caused by genetic variation whereas 3 were believed to due to random monoallelic expression). Of the 2 genes inferred to have an imprinting effect by Chuang et al., both were inferred to have non-imprinting AEI by AIJ, and we note that it appears that neither genes have been found to be imprinted when we searched the literature. In comparison, among the 20 genes that were seen to exhibit non-imprinting AEI in Chuang et al., 17, 2, and 1 were inferred to have non-imprinting AEI only, imprinting but no non-imprinting AEI, and both imprinting and non-imprinting AEI, respectively, by AIJ (Table 5). This comparison shows that the results from these two studies on the orthologous genes are extremely consistent, which can be taken as affirming the conclusion that non-imprinting AEIs are evolutionarily conserved. This observation, taken together with the fact that over 80% of the significant SNPs identified by AIJ exhibit non-imprinting AEI whereas the "vast majority" of those identified in Chuang et al. were concluded to be due to genetic variation, leads to the conclusion that AIJ is a promising methodology that can lead to biologically relevant results.
From a methodological standpoint, for the AIJ test, when the null hypothesis is rejected, how to proceed to reach a conclusion that can delineate the three scenarios deserves further discussion. As we describe in Section 2, our proposed procedure is to construct a confidence interval for each of the expression proportions of the allele of interest in both reciprocal crosses. The rectangle formed from these two intervals can then be used to make inference about the three alternative scenarios. On the other hand, one may directly construct a confidence region and make inference based on it directly without the need to resort to the additional ad hoc procedure to delineate the alternative when the null is rejected. Specifically, one may construct a 100(1−α)% confidence region for p 1 and p 2 centered around the observedp 1 andp 2 . Suppose that X follows a binomial distribution Bin(n 1 ,p 1 ) and Y follows a binomial distribution Bin(n 2 ,p 2 ), and that X and Y are independent. LetP xy be the probability that X = x and Y = y:P xy = To find the confidence region, letP (k) be the sortedP xy 's from the smallest to the largest, k = 1, 2, . . . , (n 1 + 1) * (n 2 + 1). Given a significance level α, s is the largest integer that satisfies s k=1P (k) ≤ α. Therefore, the confidence region is the collection of (x, y) corresponding to k = s + 1, . . . , (n 1 + 1) * (n 2 + 1). As an example, we consider SNP uc007iba.1 1091 in gene Grb10 from the E15 dataset [7] in which (a, c) = (66, 28) and (n 1 , n 2 ) = (110, 106), leading top 1 = 0.6 and p 2 = 0.26. The constructed 99% confidence region (white ellipsoid in Figure 1(a)) is centered around the estimatedp 1 andp 2 values, and the confidence region overlaps with the line p 1 = 0.5, but not the p 1 = p 2 line, nor the center of the p 1 − p 2 plane denoting H0. Thus, this SNP is likely to have imprinting but no non-imprinting AEI, consistent with the Morison database.
Although AIJ is proposed for analyzing data without replicated samples and is shown to be more powerful than two other methods that are also amenable to such data, for enrichment of discussion, we nonetheless also applied AIJ to another dataset that has two replicated samples and compared its performance with a joint testing method proposed for data with replications. Specifically, we considered the mice trophoblast stem cells (TSC) data, containing two replicated samples, from a reciprocal cross between CAST/EiJ and C57BL/6J analyzed in Takada et al. [8]. Therein, a method based on generalized linear model (GLM) was proposed to decompose cis-regulatory and imprinting effects, which requires replicated samples to estimate variability.
Using a nominal significance level of 0.00125, AIJ identified a total of 1875 genes that were considered to be imprinted or exhibit AEI, while GLM identified 1278. Among them, 381 and 447 pointed to the existence of both imprinting and AEI based on AIJ and GLM, respectively. Interestingly, the super majority of them, 330 for AIJ and 407 for GLM, fall into the four H1.C2 regions (Figure 1(a), where the estimated p 1 and p 2 are either both above 0.5 or both below 0.5), consistent with the result from the Gregg data for the same reciprocal design, albeit with their study focusing on brain tissues. These results, consistently identified by two rather different methods and across two different datasets, indicate that there are most likely genes for which the impact of AEI is stronger than that of imprinting on gene expression. On the other hand, AIJ identified a larger number of imprinted genes without AEI compared to GLM, which accounts for the larger number of genes identified by AIJ. Figure 7. Venn diagram depicting the intersecting regions representing genes identified by four analyses. Takada: the GLM method proposed in [8]; AIJ both: AIJ analyzing the combined data from the two replicated samples; AIJ sample1: AIJ analyzing only data from Sample 1; AIJ sample2: AIJ analyzing only data from Sample 2.
To dissect the impact of having replicated samples on AIJ, which was proposed for non-replicated data, and to study consistency, we also analyzed each of the two samples separately using AIJ (note that we could not perform the separate analysis using GLM). Our results (Figure 7) show that there are a large amount of overlaps. For example, 78% of those identified by Sample 1 was also identified by Sample 2, and not surprisingly, even a larger percentage (94%) of those identified by Sample 1 was also identified by AIJ when using the combined data from the two samples. On the other hand, one can see that AIJ and GLM identified 158 genes that were not implicated when only one of the two samples were analyzed, suggesting the increase in power for AIJ when using the combined data from both replicated samples. In summary, in this paper, we show that for a specific heterozygous SNP, the relative expression level of the two different alleles can be quantified to detect either imprinting or AEI based on a reciprocal cross design without replicated samples. The crucial feature is that the origin of the two different alleles can be clearly identified, and the relative expression level of the two parental alleles can be quantified. This RCD for mice can be used to study other organisms. For plants, Gehring et al.
[32] performed RNA-seq from embryo and endosperm derived from reciprocal crosses between two Arabidopsis thaliana accessions, and identified more than 200 loci that exhibit parentof-origin effects on gene expression. Therefore, the more powerful AIJ test proposed in this paper is also applicable there. However, for humans, an experimental cross is obviously infeasible, and thus a different design and an alternative data collection process are needed to understand the confounding effects of imprinting and AEI. To this end, a parents-child trio design may be considered [33]; however, the operational characteristics of our joint testing procedure (AIJ) in this setting remain to be studied.