Differential network analysis from cross-platform gene expression data

Understanding how the structure of gene dependency network changes between two patient-specific groups is an important task for genomic research. Although many computational approaches have been proposed to undertake this task, most of them estimate correlation networks from group-specific gene expression data independently without considering the common structure shared between different groups. In addition, with the development of high-throughput technologies, we can collect gene expression profiles of same patients from multiple platforms. Therefore, inferring differential networks by considering cross-platform gene expression profiles will improve the reliability of network inference. We introduce a two dimensional joint graphical lasso (TDJGL) model to simultaneously estimate group-specific gene dependency networks from gene expression profiles collected from different platforms and infer differential networks. TDJGL can borrow strength across different patient groups and data platforms to improve the accuracy of estimated networks. Simulation studies demonstrate that TDJGL provides more accurate estimates of gene networks and differential networks than previous competing approaches. We apply TDJGL to the PI3K/AKT/mTOR pathway in ovarian tumors to build differential networks associated with platinum resistance. The hub genes of our inferred differential networks are significantly enriched with known platinum resistance-related genes and include potential platinum resistance-related genes.

in the main text. Results are averaged over 100 random generations of the data.   Figure S3: Performance of the compared methods on community network with p = 100, K = 3, τ = 10% and (a) n = 50, (b) n = 100, (c) n = 200. Each colored line corresponds to a fixed value of λ 2 (ω 2 for GGL), as λ 1 (ω 1 for GGL) is varied. Variables corresponding to the axes are explained in Table 1 Figure S4: Performance of the compared models on scale-free network with p = 100, K = 3, τ = 20% and (a) n = 50, (b) n = 100, (c) n = 200. Each colored line corresponds to a fixed value of λ 2 (ω 2 for GGL), as λ 1 (ω 1 for GGL) is varied. Variables corresponding to the axes are explained in Table 1 Table 1 in the main text. Results are averaged over 100 random generations of the data. HuEx (1206) Figure S6: Overlaps between the edges (and differential edges) detected by TDJGL from the three platforms for (a) platinumresistant tumors, (b) platinum-sensitive tumors and (c) differential networks.

Brief review of ADMM algorithms
In this section, we briefly review the standard alternating direction method of multipliers (ADMM) algorithms [1]. ADMM is a technique for solving optimization problem in the following form [2]: subject to X ∈ X . (1) ADMM is attractive when the proximal operator of f (X) + g (X) cannot be easily obtained, but the proximal operator of f (X) and the proximal operator of g (X) can be easily computed. The approach consists of the following three steps [2]: 1. Rewrite the problem (1) as where functions f and g are decoupled by introducing a new optimization variable, Z.

Form the augmented Lagrangian
where Λ is the Lagrange multiplier and ρ > 0 is a penalty parameter.
3. Iterate the following two steps until convergence (a) Update each primal variable in turn by minimizing the augmented Lagrangian (3) with respect to that variable, while holding all other variables. The update in the tth iteration are as follows: (b) Update the dual variable using a dual-ascent update:

ADMM algorithm for fused graphical lasso with weighted 1 penalty
Following in the method of [3], we use the ADMM algorithm to solve the fused graphical lasso with weighted 1 penalty (problem (5) in the main text) To solve the problem (4), we reformulate it by introducing new variables Z k1 and Z k2 , so as to decouple some of the terms in the objective function that are difficult to optimize jointly: The augmented Lagrangian to (5) is given by where U k1 and U k2 are dual variables and ρ severs as the penalty parameter. The ADMM algorithm updates each primal variable while holding the other variables fixed and then updates the dual variables using a dual-ascent update rule. We now derive the update rules for the variables.

Θ k1 and Θ k2 update
Before introducing the update rules for Θ k1 and Θ k2 , we first define the Expand operator: where UDU T is the eigenvalue decomposition of a symmetric matrix A. The Expand operator has been used to solve the graphical lasso problem in previous studies [2,4,3]. Note that Now it follows from the definition of the Expand operator that The update for Θ k2 can be derived in a similar way:

Z k1 and Z k2 update
Minimizing augmented Lagrangian (6) with respect to Z k1 and Z k2 can be written as follows: Now (11) is completely separable with respect to each pair of matrix elements (i, j). That is, we can solve for each (i, j): where a kc ij = z kc ij + u kc ij ρ . This is a special case of the fused lasso signal approximator [3,5] and it has a very simple closed form solution. When λ 1 = 0, the solution to (12) takes the form When λ 1 > 0, the solution to (12) can be obtained through soft-thresholding (13) by λ 1 ω ij . Here the soft-thresholding is defined as S(x, c) = sign(x) (|x| − c) + , where a + = max(a, 0).

Complete ADMM algorithm
Based on the augmented Lagrangian, the complete ADMM algorithm for (4) is given in Algorithm 1. We find it is useful in practice to vary the value of penalty parameter ρ for each iteration. Therefore, we update ρ in each iteration based on the primal residues and dual residues. We present the update rule for ρ at the end of this section. Please note that according to the ADMM algorithm, the estimatesΘ k1 andΘ k1 are not exactly sparse, whileẐ k1 andẐ k1 are sparse which are obtained through soft-thresholding.
• Output Θ k1 , Θ k1 , Z k1 and Z k1 and the dual residual [1] is defined as We consider the process converges if both primal residual and dual residual are sufficiently small [1]. More specially, we introduce small positive constants abs and rel , and declare P t and D t small if and We set abs = 10 −3 and rel = 10 −3 as suggested by Boyd et al. [1]. The choice of the value of abs depends on the scale of the variable values.

Varying penalty parameter
In practice, it is useful to use different penalty parameter ρ for each iteration, which might improve the convergence as well as make performance less dependent on the initial value of penalty parameter [1]. Therefore, we update the value of ρ in the tth iteration according to the primal residuals and dual residuals: where µ > 1, τ incr > 1 and τ decr > 1 are the adaptation parameters. We set µ = 10, τ incr = 2 and τ decr = 2 according to [1].

Complete algorithm of the TDJGL model
By taking Equation (4) into Equation (3) in the main text, the TDJGL model can be written as follows: Θ kc ∈ S p ++ , for k = 1, . . . , K and c = 1, 2.
Given the sample covariance matrices S kc c=1,2 k=1,...,K and the two parameters λ 1 and λ 2 , we can find the estimates of precision matrices Θ kc

Model selection
For the TDJGL model, parameter λ 1 controls the sparsity of the estimated networks, and parameter λ 2 has an influence on the sparsity of resulting differential networks. Therefore, the choice of λ 1 and λ 2 is critical. We determine the values of parameters in data-driven method via stability selection [6]. Stability selection, which seeks the parameters leading to most stable set of edges, has better results for network inference than other model selection methods including cross validation, Akaike information criterion and Bayesian information criterion [7,8,9,10]. We choose λ 1 and λ 2 so as to use the least amount of regularization that simultaneously makes network sparse and stable. Here we resort to a recently developed stability selection method called StARS [7]. Because λ 1 mainly influences the sparsity and stability of resulting gene networks, while λ 2 mainly controls the sparsity and stability of estimated differential networks, we determine their values separately. We first determine the value of λ 1 while setting λ 2 = 0. Then we determine the value of λ 2 while setting λ 1 with the value chosen in the previous step.
After determining λ 1 , we choose λ 2 from a given vector of regularization parameter Λ 2 according to stability of inferred differential networks. For now, we set λ 1 = λ (λ2) 1,opt which is determined above. We estimate 2K networks Ê kc s (λ 1 , λ 2 ) c=1,2 k=1,...,K for each D s and each λ 2 from Λ 2 . Then we construct K differential networks, DE k s (λ 1 , λ 2 ) k=1,...,K , based on the estimated networks. The optimal value for λ 2 is chosen according to the average variance over the differential edges inferred from sub-sampled data: In this study, we set the number of random sample sets N = 20 and the stability parameter β = 0.01.

Criteria for platinum response groups
The criterion that has been used in [11,12] is adopted to define platinum-based chemotherapy response groups. In particular, we download the clinical information (Biotab format) of ovarian tumors from the TCGA website. We obtain the drug information from the ''nationwidechildrens.

Comparison with other graphical lasso models on the ovarian cancer data
We compare TDJGL with FGL, GGL and GL on the ovarian cancer data. For FGL, we run it separately for each platform and each time it is applied across the two patient groups. For GGL, we run it separately for each patient group and each time it is applied across all the three platforms. For GL, we run it separately for each patient group and each platform type. In order to provide interpretable results, we select the tuning parameters of the compared methods to give the similar number of edges and differential edges as those of TDJGL. Unlike TDJGL and FGL, GGL and GL cannot control the similarities of precision matrices between different patient groups. They tend to identify too much differential edges. To better interpret the results of GGL and GL, we sort the absolute values of differential scores in decreasing order, and take the top #DE (the number of differential edges identified by TDJGL) edges for each model. A common challenge in evaluating gene network inference and differential network analysis using real data is the lack of the gold standards. That is, in our ovarian cancer data analysis, we cannot obtain the true gene networks in the platinumresistant tumors and the platinum-sensitive tumors. Therefore, it is difficult to compare different methods in terms of the accuracy of identifying group-specific gene networks and differential networks. In this study, we adopt an alternative way to evaluate performance. First, we compare the methods based on the overlaps between edges inferred from different platforms, which can assess the consistency. A method that produces a greater number of edges shared by different platforms is more consistent. Then, we compare the hub nodes in the differential networks in terms of known drug resistance-related genes and cancer-related genes. A method that works better in capturing known functionally important genes in the differential networks might have better performance in inferring the differential networks.
We observe that overlaps between edges (and differential edges) identified by FGL and GL from the different platforms are quite low (Figures S7 and S9). For GGL which encourages a similar network structure across all platforms, more than half of identified edges are shared by all the three platforms ( Figure S8 (a)-(b)). Because GGL does not consider the similarity of differential networks across platforms, a great number of differential edges detected by GGL are supported by only one platform ( Figure S8 (c)). As mentioned in the main text, both gene networks and differential networks inferred by TDJGL share a great number of edges across all the three platforms ( Figure S6). We also compare the hub nodes in the differential networks inferred by different methods. For FGL, GGL and GL, we consider the 18 genes which have the largest degree of connectivity as hub genes. From Table S1, we find that the set of hub genes determined by TDGJL includes more cisplatin resistance-related genes, drug resistance-related genes and cancer-related genes than those determined by the other three methods.  Figure S7: Overlaps between the edges (and differential edges) detected by FGL from the three platforms for (a) platinumresistant tumors, (b) platinum-sensitive tumors and (c) differential networks.
(a) (b) (c) Figure S8: Overlaps between the edges (and differential edges) detected by GGL from the three platforms for (a) platinumresistant tumors, (b) platinum-sensitive tumors and (c) differential networks. Figure S9: Overlaps between the edges (and differential edges) detected by GL from the three platforms for (a) platinum resistant tumors, (b) platinum-sensitive tumors and (c) differential networks.