A network-based covariance test for detecting multivariate eQTL in Saccharomyces cerevisiae

Background: Expression quantitative trait locus (eQTL) analysis has been widely used to understand how genetic variation affects gene expression in biological systems. Traditional eQTL analysis is performed in a pairwise manner, in which one SNP affects the expression of one gene. In this way, some associated markers found in GWAS have been linked to disease mechanisms through eQTL studies. In reality, however, a biological process is usually carried out by a group of genes. Although some methods have been proposed to identify groups of SNPs that affect the mean expression of genes in a network, changes in the co-expression pattern have not been considered. We therefore propose a procedure and algorithm to identify markers that affect the co-expression pattern of a pathway. Because two genes may be correlated differently under different isoforms, which is hard to detect with a linear test, we also consider a nonlinear test.

Results: When we applied our method to a yeast eQTL dataset profiled under both glucose and ethanol conditions, we identified a total of 166 modules, each consisting of a group of genes and one eQTL that regulates the co-expression pattern of that group of genes. We found that many of these modules have biological significance.

Conclusions: We propose a network-based covariance test to identify SNPs that affect the structure of a pathway. We also consider a nonlinear test, since two genes may be correlated differently under different isoforms, which is hard to detect with a linear test.

Electronic supplementary material: The online version of this article (doi:10.1186/s12918-015-0245-0) contains supplementary material, which is available to authorized users.

Now consider a Hilbert space F of functions from R^p to R. Then F is a reproducing kernel Hilbert space (RKHS) if for each x ∈ R^p the Dirac evaluation operator δ_x : F → R, which maps f ∈ F to f(x) ∈ R, is a bounded linear functional. To each point x ∈ R^p there corresponds an element φ(x) ∈ F such that ⟨φ(x), φ(x′)⟩_F = k(x, x′), where k : R^p × R^p → R is a unique positive definite kernel.
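As an illustration of the positive definiteness that characterizes a reproducing kernel, the following sketch (assuming the common Gaussian RBF kernel, which is not prescribed by the text) checks numerically that an RBF Gram matrix is positive semi-definite:

```python
import numpy as np

def rbf_kernel(X, Y, sigma=1.0):
    """Gram matrix of the Gaussian kernel k(x, x') = exp(-||x - x'||^2 / (2 sigma^2))."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))        # 20 points in R^3
K = rbf_kernel(X, X)                # K[i, j] = <phi(x_i), phi(x_j)>_F

# Positive definiteness of k means every Gram matrix is PSD
# (up to floating-point error in the smallest eigenvalue).
print(np.linalg.eigvalsh(K).min() >= -1e-10)  # True
```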
Hilbert-Schmidt Norm. Denote by C : G → F a linear operator. Then, provided the sum converges, the Hilbert-Schmidt (HS) norm of C is defined as

‖C‖²_HS := Σ_{i,j} ⟨C µ_j , ν_i⟩²_F ,

where ν_i and µ_j are orthonormal bases of F and G, respectively. It is easy to see that this generalises the Frobenius norm on matrices.
Hilbert-Schmidt Operator. A linear operator C : G → F is called a Hilbert-Schmidt operator if its HS norm exists. The set HS(G, F) of Hilbert-Schmidt operators from G to F is a separable Hilbert space with inner product

⟨C, D⟩_HS := Σ_{i,j} ⟨C µ_j , ν_i⟩_F ⟨D µ_j , ν_i⟩_F .

Tensor Product. Let f ∈ F and g ∈ G. Then the tensor product operator f ⊗ g : G → F is defined as

(f ⊗ g) h := f ⟨g, h⟩_G for all h ∈ G.

Moreover, by the definition of the HS norm, we can compute the HS norm of f ⊗ g via

‖f ⊗ g‖²_HS = ‖f‖²_F ‖g‖²_G ,
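In finite dimensions the tensor product f ⊗ g is the outer product f g^T and the HS norm reduces to the Frobenius norm, so the identity ‖f ⊗ g‖_HS = ‖f‖ ‖g‖ can be verified directly (a minimal sketch with arbitrary vectors, not part of the paper's method):

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.normal(size=5)   # element of F = R^5
g = rng.normal(size=4)   # element of G = R^4

C = np.outer(f, g)                  # tensor product operator f ⊗ g : G -> F
hs_norm = np.linalg.norm(C, 'fro')  # HS norm = Frobenius norm in finite dimensions

# ||f ⊗ g||_HS = ||f||_F * ||g||_G
print(np.isclose(hs_norm, np.linalg.norm(f) * np.linalg.norm(g)))  # True
```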
Mean Elements. We define the mean elements µ_x ∈ F and µ_y ∈ G with respect to p_x and p_y as

µ_x := E_x[φ(x)], µ_y := E_y[ψ(y)],

where φ is the feature map from X to the RKHS F, and ψ maps from Y to G. Finally, ‖µ_x‖²_F can be computed by applying the expectation twice via

‖µ_x‖²_F = E_{x,x′}[⟨φ(x), φ(x′)⟩_F] = E_{x,x′}[k(x, x′)].

Here the expectation is taken over independent copies x, x′ drawn from p_x.
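The double expectation has a simple empirical counterpart: the grand mean of the Gram matrix. A minimal sketch, assuming the linear kernel k(x, x′) = ⟨x, x′⟩ (so that φ is the identity and µ_x is just the sample mean vector):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 3))

K = X @ X.T            # linear-kernel Gram matrix, K[i, j] = <x_i, x_j>
norm_mu_sq = K.mean()  # empirical E_{x,x'}[k(x, x')], averaging over all pairs

# For the linear kernel this equals the squared norm of the sample mean exactly.
print(np.isclose(norm_mu_sq, np.linalg.norm(X.mean(0))**2))  # True
```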
Covariance Operator. The covariance operator associated with the measure p_x on (X, Γ) is a linear operator Σ_xx : F → F defined as

Σ_xx := E_x[(φ(x) − µ_x) ⊗ (φ(x) − µ_x)] = E_x[φ(x) ⊗ φ(x)] − µ_x ⊗ µ_x .

Similarly, Σ_yy : G → G is defined as

Σ_yy := E_y[(ψ(y) − µ_y) ⊗ (ψ(y) − µ_y)] = E_y[ψ(y) ⊗ ψ(y)] − µ_y ⊗ µ_y .
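Under the linear kernel the covariance operator is the ordinary covariance matrix, and the definition E[φ(x) ⊗ φ(x)] − µ_x ⊗ µ_x translates directly into its plug-in (biased) estimate. A sketch, again assuming the linear kernel for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(200, 4))

mu = X.mean(0)
# Plug-in estimate of Sigma_xx = E[x x^T] - mu_x mu_x^T
Sigma = (X.T @ X) / len(X) - np.outer(mu, mu)

# Agrees with numpy's population (biased) covariance matrix.
print(np.allclose(Sigma, np.cov(X.T, bias=True)))  # True
```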

Hilbert-Schmidt Different Covariance Criterion
Now we assume that X = Y and Γ = Λ, so φ = ψ.

Definition (HSDCC). Given separable RKHSs F, G and measures p_x, p_y over (X, Γ) and (Y, Λ), we define the Hilbert-Schmidt Different Covariance Criterion (HSDCC) as the squared HS-norm of the difference of the covariance operators Σ_xx and Σ_yy:

HSDCC(p_x, p_y, F) := ‖Σ_xx − Σ_yy‖²_HS .

To compute it we need to express HSDCC in terms of kernel functions. This is achieved by the following lemma.

Lemma 1 (HSDCC in terms of kernels).

HSDCC(p_x, p_y, F) = ‖Σ_xx‖²_HS − 2 ⟨Σ_xx, Σ_yy⟩_HS + ‖Σ_yy‖²_HS ,

where

‖Σ_xx‖²_HS = E_{x,x′}[k(x, x′)²] − 2 E_x[(E_{x′}[k(x, x′)])²] + (E_{x,x′}[k(x, x′)])² ,

⟨Σ_xx, Σ_yy⟩_HS = E_{x,y}[k(x, y)²] − E_x[(E_y[k(x, y)])²] − E_y[(E_x[k(x, y)])²] + (E_{x,y}[k(x, y)])² ,

and ‖Σ_yy‖²_HS is obtained from the first expression by replacing x, x′ with y, y′.

Proof: Expand the squared norm, substitute the definitions of Σ_xx and Σ_yy, and apply the tensor product identity ⟨f ⊗ g, u ⊗ v⟩_HS = ⟨f, u⟩_F ⟨g, v⟩_F together with the reproducing property ⟨φ(x), φ(x′)⟩_F = k(x, x′).

We then give the unbiased statistic for HSDCC(p_x, p_y, F) as in [2].

Figure 2 Comparison between Chen's linear method and other methods. The numbers of samples in the two classes are 40 and 60, respectively, and the number of variables is 50. Top left: the two covariance matrices have eight different elements, each with a magnitude generated from Unif(0, 4) · max_{1≤j≤p} σ_jj. Top right: the two covariance matrices have eight different elements, each with a magnitude generated from Unif(0, 400) · max_{1≤j≤p} σ_jj. Bottom left: the two covariance matrices have 500 different elements, each with a magnitude generated from Unif(0, 4) · max_{1≤j≤p} σ_jj. Bottom right: the two covariance matrices have 500 different elements, each with a magnitude generated from Unif(0, 400) · max_{1≤j≤p} σ_jj.
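The plug-in (biased) empirical version of HSDCC can be sketched with centered Gram matrices: the centering matrix H = I − (1/n)11^T turns a Gram matrix into the Gram matrix of the centered feature maps, so squared Frobenius norms of centered Grams estimate the HS norms in Lemma 1. This is a minimal sketch assuming a Gaussian kernel; it is the plug-in version, not the unbiased statistic of [2]:

```python
import numpy as np

def rbf(X, Y, sigma=1.0):
    """Gaussian-kernel Gram matrix between rows of X and rows of Y."""
    sq = (X**2).sum(1)[:, None] + (Y**2).sum(1)[None, :] - 2 * X @ Y.T
    return np.exp(-sq / (2 * sigma**2))

def hsdcc(X, Y, sigma=1.0):
    """Plug-in estimate of ||Sigma_xx - Sigma_yy||^2_HS."""
    n, m = len(X), len(Y)
    Hn = np.eye(n) - np.ones((n, n)) / n   # centering matrices
    Hm = np.eye(m) - np.ones((m, m)) / m
    Kxx = Hn @ rbf(X, X, sigma) @ Hn       # centered within-sample Grams
    Kyy = Hm @ rbf(Y, Y, sigma) @ Hm
    Kxy = Hn @ rbf(X, Y, sigma) @ Hm       # doubly centered cross Gram
    # ||Sigma_xx||^2 - 2 <Sigma_xx, Sigma_yy> + ||Sigma_yy||^2
    return (Kxx**2).sum() / n**2 - 2 * (Kxy**2).sum() / (n * m) + (Kyy**2).sum() / m**2

rng = np.random.default_rng(4)
X = rng.normal(size=(40, 5))
Y_diff = rng.normal(size=(60, 5)) * np.array([1, 1, 1, 1, 4.0])  # inflated variance

print(hsdcc(X, X))       # 0.0 — identical samples give a zero statistic
print(hsdcc(X, Y_diff))  # positive, since the statistic is a squared norm
```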