Abstract
We evaluated statistical approaches to facilitate and improve multi-stage designs for clinical proteomic studies that aim to move from laboratory discovery to clinical utility. To find the design with the greatest expected number of true discoveries under constraints on cost and false discovery, we optimized the operating characteristics of the multi-stage study as a function of the sample size and nominal type-I error rate at each stage. A nested simulated annealing algorithm was used to search the bounded space defined by these design parameters. The approach is shown to be feasible and to lead to efficient designs. We also investigated the use of biological grouping information in the study design, using synthetic datasets based on a cardiac proteomic study and an actual dataset from a clinical immunology proteomic study. Across the different protein patterns examined, performance improved when the grouping was informative, with little loss in performance when the grouping was uninformative.
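The optimization described above can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a single-level (not nested) simulated annealing search over a hypothetical two-stage design, and every modelling choice is an assumption made for illustration. These assumptions include a one-sided two-sample z-test, a common standardized effect size `delta`, independent stages, a cost proportional to the total number of samples, and a false discovery constraint approximated by the ratio of expected false discoveries to expected total discoveries. All function and parameter names (`power`, `evaluate`, `feasible`, `propose`, `anneal`, `budget`, `max_fdr`) are hypothetical.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution


def power(n, alpha, delta):
    """Power of a one-sided two-sample z-test with n samples per group (assumed model)."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(delta * math.sqrt(n / 2.0) - z_crit)


def evaluate(design, m=500, m1=50, delta=1.0, cost1=1.0, cost2=1.0):
    """Return (expected true discoveries, approximate FDR, cost) for a design.

    design = (n1, n2, alpha1, alpha2); m proteins, m1 of them truly differential.
    """
    n1, n2, a1, a2 = design
    true_disc = m1 * power(n1, a1, delta) * power(n2, a2, delta)
    false_disc = (m - m1) * a1 * a2            # nulls passing both stages
    fdr = false_disc / (true_disc + false_disc)
    cost = 2 * (n1 * cost1 + n2 * cost2)       # two groups at each stage
    return true_disc, fdr, cost


def feasible(design, budget, max_fdr):
    """Check the cost and false discovery constraints."""
    _, fdr, cost = evaluate(design)
    return cost <= budget and fdr <= max_fdr


def propose(design, rng):
    """Random neighbour: integer steps for sample sizes, log-scale steps for alphas."""
    n1, n2, a1, a2 = design
    n1 = min(max(n1 + rng.randint(-3, 3), 2), 200)
    n2 = min(max(n2 + rng.randint(-3, 3), 2), 200)
    a1 = min(max(a1 * math.exp(rng.gauss(0.0, 0.3)), 1e-4), 0.5)
    a2 = min(max(a2 * math.exp(rng.gauss(0.0, 0.3)), 1e-4), 0.5)
    return (n1, n2, a1, a2)


def anneal(start, budget, max_fdr, steps=5000, t0=1.0, cooling=0.999, seed=0):
    """Maximize expected true discoveries over feasible designs (geometric cooling)."""
    if not feasible(start, budget, max_fdr):
        raise ValueError("start design must satisfy the constraints")
    rng = random.Random(seed)
    current = best = start
    cur_score = best_score = evaluate(start)[0]
    temp = t0
    for _ in range(steps):
        cand = propose(current, rng)
        if feasible(cand, budget, max_fdr):    # infeasible proposals are rejected
            cand_score = evaluate(cand)[0]
            # Metropolis rule: always accept improvements, sometimes accept worse moves
            if cand_score >= cur_score or rng.random() < math.exp(
                (cand_score - cur_score) / temp
            ):
                current, cur_score = cand, cand_score
                if cur_score > best_score:
                    best, best_score = current, cur_score
        temp *= cooling
    return best, best_score


if __name__ == "__main__":
    design, score = anneal(start=(5, 5, 0.05, 0.001), budget=100.0, max_fdr=0.05)
    print("design (n1, n2, alpha1, alpha2):", design)
    print("expected true discoveries:", round(score, 2))
```

Rejecting infeasible proposals outright keeps every visited design within the cost and false discovery bounds; a penalty term in the objective would be an alternative way to handle the constraints.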
The authors wish to express their gratitude to Sharon Browning (Department of Statistics, University of Washington, Seattle), Rohan Ameratunga and Wikke Koopman (Lab PLUS, Auckland District Health Board, New Zealand), Patrick Gladding (North Shore Hospital, Auckland, New Zealand) and Jocelyne Benatar (Cardiac Vascular Research Unit, Auckland City Hospital) for useful discussions and involvement in the cardiac and immunology proteomics studies. The authors wish to acknowledge the financial support from Green Lane Research and Education Trust and A+ Charitable Trust for the cardiac and immunology proteomic studies. The authors also wish to acknowledge the contribution of the NeSI high-performance computing facilities and the staff at the Centre for eResearch at the University of Auckland (Gene Soudlenkov and Sina Masoud-Ansari). New Zealand’s national facilities are provided by the New Zealand eScience Infrastructure (NeSI) and funded jointly by NeSI’s collaborator institutions and through the Ministry of Business, Innovation and Employment’s Infrastructure programs (URL: http://www.nesi.org.nz). The authors also acknowledge the associate editor and two anonymous reviewers for their constructive comments.
©2013 by Walter de Gruyter Berlin Boston