Abstract
The Gaussian graphical model (GGM) is a probabilistic modelling approach used in systems biology to represent the relationships between genes by an undirected graph. In graphical models, genes are denoted by nodes and their interactions by edges between nodes. In this model, it is assumed that the structure of the system can be described by the inverse of the covariance matrix, \(\varTheta\), also called the precision matrix, when the observations are formulated via a lasso regression under the assumption of multivariate normality of the states. There are several approaches to estimating \(\varTheta\) in GGM; the best known are the neighborhood selection algorithm and the graphical lasso (glasso). Multivariate adaptive regression splines (MARS), on the other hand, is a non-parametric regression technique that can successfully model nonlinear and highly dependent data. Previous simulation studies have found that MARS can be a strong alternative to GGM if the model is constructed similarly to a lasso model and the interaction terms in the optimal model are ignored, so that the results are comparable with the GGM findings. Moreover, the major challenge in both modelling approaches has been found to be the high sparsity of \(\varTheta\) due to possible non-linear interactions between genes, in particular when the dimensions of the networks are realistically large. In this study, as the novelty, we suggest applying Bernstein operators, namely Bernstein and Szasz polynomials, to the raw data before any lasso-type modelling and the associated inference approaches, because our findings via GGM with small and moderately large systems show that the Bernstein polynomials can increase the accuracy of the estimates. Hence, in this work, we first apply these operators within the best-known inference approaches used in GGM under realistically large networks. Then, we assess these transformations for MARS modelling, as the alternative to GGM, under the same large complexity. In this way, we aim to propose these transformation techniques for all sorts of modelling under the steady-state condition of protein-protein interaction networks, in order to obtain more accurate estimates without additional computational cost. In the evaluation of the results, we compare the precision and F-measure values of the simulated datasets.
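As a rough illustration of the two operators named in the abstract and of the evaluation measures, the sketch below evaluates the classical Bernstein operator \(B_n(f;x)=\sum_{k=0}^{n} f(k/n)\binom{n}{k}x^{k}(1-x)^{n-k}\) on \([0,1]\) and the Szasz-Mirakyan operator \(S_n(f;x)=e^{-nx}\sum_{k\ge 0} f(k/n)(nx)^{k}/k!\) on \([0,\infty)\), then computes precision and F-measure for a recovered edge set. The function names, the truncation parameter `terms`, and the representation of edges as index pairs are assumptions made for this example; the abstract does not state how the chapter maps raw observations onto these operators, so this should not be read as the authors' procedure.

```python
import math


def bernstein(f, n, x):
    """Bernstein operator B_n(f; x) for x in [0, 1]."""
    return sum(
        f(k / n) * math.comb(n, k) * x**k * (1 - x) ** (n - k)
        for k in range(n + 1)
    )


def szasz(f, n, x, terms=200):
    """Szasz-Mirakyan operator S_n(f; x) for x >= 0.

    The infinite series is truncated after `terms` summands
    (an assumption of this sketch, adequate for moderate n * x).
    """
    total = 0.0
    weight = math.exp(-n * x)  # Poisson weight e^{-nx} (nx)^k / k! at k = 0
    for k in range(terms):
        total += f(k / n) * weight
        weight *= n * x / (k + 1)  # advance the weight from k to k + 1
    return total


def precision_f_measure(true_edges, est_edges):
    """Precision and F-measure of an estimated undirected edge set.

    Edges are given as sets of (i, j) tuples with i < j (assumed convention).
    """
    tp = len(true_edges & est_edges)   # correctly recovered edges
    fp = len(est_edges - true_edges)   # spurious edges
    fn = len(true_edges - est_edges)   # missed edges
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, 0.0
    return precision, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(bernstein(lambda t: t * t, 10, 0.3))
    print(szasz(lambda t: t * t, 10, 0.3))
    print(precision_f_measure({(0, 1), (1, 2), (2, 3)}, {(0, 1), (1, 2), (0, 3)}))
```

Both operators reproduce linear functions exactly and smooth everything else, which is the property that makes them candidates for a pre-processing transformation; in a full pipeline the transformed data would then be passed to a precision-matrix estimator such as glasso or neighborhood selection.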
Copyright information
© 2016 Springer International Publishing Switzerland
About this paper
Cite this paper
Ağraz, M., Purutçuoğlu, V. (2016). Transformations of Data in Deterministic Modelling of Biological Networks. In: Anastassiou, G., Duman, O. (eds) Intelligent Mathematics II: Applied Mathematics and Approximation Theory. Advances in Intelligent Systems and Computing, vol 441. Springer, Cham. https://doi.org/10.1007/978-3-319-30322-2_24
DOI: https://doi.org/10.1007/978-3-319-30322-2_24
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-30320-8
Online ISBN: 978-3-319-30322-2
eBook Packages: Engineering, Engineering (R0)