NGS and Male Infertility: Biomarkers Wanted

Commentary. Pereira and Sousa; ARRB, 8(5): 1-4, 2015; Article no.ARRB.20263

Male factor abnormalities account for 30 to 50% of all infertility cases [1]. It is now clear that genetic abnormalities are a significant cause of male infertility, with about 20% of patients with sperm defects presenting gene mutations [2]. Next-generation sequencing (NGS) technologies have enabled a rapid increase in the number of genes known to be involved in male infertility, delivering an enormous amount of data quickly [3]. Currently, whole exome sequencing (WES) is the most widely used approach to study the genetics of infertility. Although WES covers only the coding regions of the genome, it is cost-effective and the data are comparatively easy to interpret. There is also a high likelihood of identifying significant variants, since approximately 85% of disease-causing mutations are thought to occur in gene coding regions [4]. Whole genome sequencing (WGS), on the other hand, provides the most complete information on the genetic variants in an individual, although some regions (repetitive or satellite) still cannot be sequenced by WGS. In addition, WGS remains too costly in money, time, and labour for most research and clinical laboratories; the data produced are technically demanding to handle, and their functional interpretation is challenging, since most of the information gathered has an unknown meaning [5]. Nevertheless, it is becoming clear that non-coding regions of the genome also play crucial roles in normal physiology and development, namely in testis and epididymis development and in spermatogenesis [6,7]. Non-coding regions of the genome have already proved useful for disease diagnosis (e.g. microRNA expression profiles can accurately identify the origin of some tumours, enabling their classification), prognosis (in some tumours), and therapy (RNA-based and RNA-targeted therapies) [8].
Thus, non-coding regions of the genome should not be neglected in biomedical research and, consequently, WES is not enough: there is an urgent need for WGS to become the standard approach. For that to happen, the inherent costs must decrease, and extremely high-performance computing and intensive bioinformatics support have to be developed.


INTRODUCTION
The standard clinical evaluation of infertile men, i.e. physical examination, clinical semen analysis (morphology, concentration, motility, etc.), and karyotype analysis, fails to identify the cause in 30-50% of infertility cases [1]. Therefore, more efficient biomarkers that allow an improved and faster diagnosis are needed. Microdeletions in the Y chromosome, such as those in the azoospermia factor (AZF) regions a, b, and c, were some of the first genetic markers identified and still serve as biomarkers for male infertility. With the use of the aforementioned NGS technologies, numerous genes involved in male infertility have already been identified [2,9]. However, before an identified mutation can be used as a biomarker, further experimentation is required to determine its relevance and validity [10]. This includes proving causality through functional tests and confirming the usefulness of the genetic panel in large-scale screening. Unfortunately, there is still a gap between the identification of gene mutations and the development and validation of new biomarkers. Hence, few genetic biomarkers are clinically available.
Many human diseases, including infertility, have an epigenetic etiology [11]. Several studies have shown an association between sperm DNA methylation and poor semen parameters [12,13], and have indicated the potential utility of DNA methylation in the evaluation of male infertility [14]. NGS has also been shown to be capable of characterizing DNA methylation patterns, post-translational modifications of histones, and nucleosome positioning on a genome-wide scale, for example with a method called MBD-isolated Genome Sequencing (MiGS) [15].
Although NGS is a valuable tool, it is important to keep in mind that a genetic mutation by itself may not be directly causal, as other factors such as alternative splicing and post-translational modifications might also be involved in human infertility [11,12,16,17]. Thus, to develop new biomarkers it is of utmost interest to combine functional genomic experiments using large-scale assays, such as NGS (for both DNA and RNA), with proteomic tools (like MALDI-TOF/TOF and LC-MS/MS), so that genes, transcripts, and proteins can be measured and tracked in parallel. In addition to these technical limitations in biomarker identification, the difficulty of obtaining government approval and financial support for a technology that is expensive and time-consuming is also slowing the transition of biomarkers to the clinic.
Beyond biomarker identification for diagnostic applications, NGS is also a valuable tool to help in the development of cell therapies. Although viable human sperm has not yet been derived from human embryonic stem cells (ESCs), remarkable progress has been made: for instance, primordial germ cell-like cells (PGCLCs) with a robust capacity for spermatogenesis have been generated in mice from ESCs and from induced pluripotent stem cells (iPSCs) [18]. In humans, the differentiation of ESCs and iPSCs has so far produced primordial germ cells, spermatogonial stem cells, spermatocytes, and haploid round spermatids, but not sperm [19-21]. Additionally, it is known that factors such as VASA, STELLA, and DAZL play a role in PGC formation [21,22], whereas factors such as DAZ, SCP3, and BOULE promote later stages of meiosis and the development of haploid spermatids [19,22]. We believe that NGS will, in the near future, help to identify the factors needed to reprogram human iPSCs into spermatozoa, by allowing the identification of the correct factors that can sustain meiosis and spermiogenesis in vitro. To that end, it is of utmost interest that research laboratories combine efforts to track the several steps of spermatogenesis using NGS techniques in order to identify its master regulators. The creation of male germ cells will not only provide better disease models but will also, within a couple of years, lead to a novel form of assisted reproductive technology, enabling infertile azoospermic patients to have their own genetically related children. Although the clinical applications of iPSCs have been criticized because of ethical, safety, and functionality concerns, approved clinical trials are ongoing. For instance, a study that is about to start aims to generate haploid germ cells from iPSCs (NCT01454765). Thus, we believe that these concerns will be addressed as scientific knowledge about iPSC differentiation advances.

CONCLUSION
To conclude, NGS technologies have broad applicability in functional genomics research, ranging from gene expression profiling, genome annotation, and small ncRNA discovery and profiling to the detection of aberrant transcription and epigenetic modifications. Combined with other high-throughput techniques, NGS will allow better diagnosis and more personalized medicine. However, it still faces some challenges, mainly because it is still expensive, requires robust bioinformatics tools and, most importantly, still lacks an efficient data analysis pipeline. But the field is growing fast and soon these limitations will be overcome.