Novel Tissue Types for the Development of Genomic Biomarkers

Imagine a simple clinical test that can not only diagnose a disease, but that can also identify the exact, personal therapeutic regime to cure it. Not only that, imagine tests that can accurately predict the potential of developing a disease and provide an individualized roadmap on how it will progress. Now imagine that all you had to do was to spit in a vial, or have a few hairs plucked for the analysis. While the promise of “personalized medicine” is technologically a reality, it relies on the development of disease and progression biomarkers.


Introduction
The ideal biomarker should have an analyte that is accessible using noninvasive protocols, inexpensive to quantify, specific to the disease of interest, and translatable from model systems to humans, and it should provide a reliable early indication of disease before clinical symptoms appear. Biomarkers that can be used to stratify disease and assess response to therapeutics are also medically valuable.
Although most current biomarkers utilize protein or metabolic analytes, it can be difficult to develop new protein-based biomarkers. This is due to the inherent complexity of the protein composition of biological samples, the assorted posttranslational modifications of proteins, and the low abundance of many proteins of interest in most biological samples (especially blood). Similarly, the detection of metabolic analytes is difficult due to the complex biological matrix from which they are measured.
Detecting specific nucleic acids, while not trivial, is generally much easier. Synthetic complementary oligonucleotides can deliver sufficient detection specificity in most cases, and PCR or other DNA amplification methods can be used to improve the detection limit. There are numerous examples of genomic biomarkers that have become powerful tools for molecular diagnostics and outcome prediction (Cronin et al., 2007; Guttmacher & Collins, 2002; Hamburg & Collins, 2010; Klein et al., 2009; Tainsky, 2009; L. J. van 't Veer et al., 2002). RNA and DNA biomarkers are used routinely for screening patients to diagnose and subtype disease, as well as to monitor therapy and predict progression. The discovery of microRNAs (miRNAs), and more recently of long non-coding RNAs (lncRNAs), has further increased their importance and broadened their clinical application (Gibb, Brown, & Lam, 2011; Laterza et al., 2009). Low complexity, no known post-processing modifications, simple detection and amplification methods, tissue-specific expression profiles, and sequence conservation between humans and model organisms make extracellular miRNAs ideal candidates for genomic biomarkers that reflect various physiopathological conditions of the body.
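The base-pairing rule that underlies detection with complementary oligonucleotides can be illustrated with a short sketch. The Python below is a minimal illustration rather than a probe-design tool: the function names and the example target sequence are hypothetical, and practical probe design also has to consider melting temperature, secondary structure and off-target matches.

```python
# Illustrative sketch only: function names and the target sequence are hypothetical.

# Map each base to its DNA complement; U (RNA) pairs with A, as T does.
_COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")


def reverse_complement(seq: str) -> str:
    """Return the DNA oligonucleotide that would hybridize to a DNA or RNA target."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases, a rough proxy for hybridization strength."""
    s = seq.upper()
    return (s.count("G") + s.count("C")) / len(s) if s else 0.0


if __name__ == "__main__":
    target = "AUGGCUAGCUUAGGCAUCGAUC"  # made-up 22-nt RNA target
    probe = reverse_complement(target)
    print(probe, f"GC fraction = {gc_fraction(probe):.2f}")
```

The same function accepts DNA or RNA input because both T and U are mapped to A; the returned probe is always written as DNA.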
Ideally, the most clinically powerful information would come directly from the tissue of interest. To understand cancer, one must look at malignant cells, much as one must analyze brain tissue to understand the complexities of neuroscience. However, many of these tissues are difficult to access or impossible to reach without potential injury to the patient. Alternative, or "surrogate", tissues can provide a means of assessing the genomic changes in the tissue of interest without fear of harming the donor. For example, surrogate tissues may contact the tissue of interest and retain sloughed cells, secreted molecules or the contents of dying cells. While these molecular signals may not exactly mirror the tissue of origin, in many cases they are reproducible and can clearly point to underlying biology.
Clinical material suitable for biomarker testing can be divided into two types. Type 1 tissues are those that require minimally invasive procedures to obtain, such as blood, cerebrospinal fluid and tissue biopsies. Type 2 tissues are those that can be obtained without any invasive means: hair, saliva, tears, epidermal cells, urine, etc. In some cases, acquisition of the material may not be passive. Examples of Type 1 and Type 2 samples are listed in Table 1. The easier it is to provide a sample for biomarker testing, the greater its utilization and utility will be. There are emerging data showing that many tissues and fluids that have been largely ignored hold numerous important analytes that can be exploited for biomarker development. Their relative ease of acquisition and rich genomic information make these surrogate tissues ideally suited for the development of new biomarkers. By casting a wider net over the potential sources of biomarkers, we can increase the odds of finding clinically important ones that will make predictive, personalized healthcare a reality (Hood & Friend, 2011).
In this review we provide examples of various surrogate tissues that are being utilized for the development of genomic biomarkers, and highlight important concepts for their successful collection and handling.

Whole blood
Peripheral blood remains the most commonly studied tissue due to the minimally invasive nature of sample collection and the vascularization of most tissues. Peripheral whole blood is a rich source of validated and potential biomarkers, whether they are protein, genomic, or metabolic in nature. While the methods for extraction and profiling of blood DNA are well established, the isolation of RNA and microRNA from whole blood, and studies on their transcript abundance (commonly called gene expression studies), still pose many technical challenges. These include transcriptomic changes induced by ex vivo handling and the interference of highly abundant globin mRNA.
Pre-analytical variables, such as the degradation of RNA by endogenous RNases and the unintended induction of individual genes after the blood draw, can lead to false assessment of potential markers. The introduction of blood collection systems containing stabilizing additives has significantly improved the RNA quantity and quality of blood samples (Rainen et al., 2002; Thach, 2003). RNA stabilization systems allow collected samples to be stored at more accessible temperatures before shipment to the laboratory for analysis, which reduces pre-analytical variability. A well-described method for RNA stabilization in human blood is the PAXgene™ system (Chai et al., 2005; Rainen et al., 2002). The Tempus™ Whole Blood RNA isolation system offers an alternative approach to peripheral blood RNA isolation that is also suitable for gene expression profiling (Asare et al., 2008). Recently RNAlater™, a common stabilization reagent for RNA in cells and tissues, has been successfully used for RNA stabilization in human peripheral blood. The downside of the latter method is that pre-filled RNAlater™ blood collection tubes are not currently available commercially.
All the described methods are able to stabilize transcript levels and yield total RNA of good quality and in appropriate quantities. However, RNA stabilization/isolation methods can critically impact differential expression results. For example, the failure of PAXgene™ to stabilize specific transcripts has been reported in several studies (Asare et al., 2008; Kågedal et al., 2005). Until broader studies are done, researchers should prevalidate the whole blood stabilization/isolation conditions with the transcripts of interest. We find that strict adherence to the manufacturer's protocol for collection and storage, including how the reagent is mixed with the blood at the time of collection, is critical to successful expression profiling.
The discovery of microRNAs has opened new opportunities for markers in the diagnosis of cancer (Wang et al., 2009). MicroRNAs are small (typically ~22 nt in size) regulatory RNA molecules that function to modulate the activity of specific mRNA targets and play important roles in a wide range of physiologic and pathologic processes (Mattick & Makunin, 2005). MicroRNAs are an ideal class of blood-based biomarkers for disease detection because: (i) miRNA expression is frequently dysregulated in disease, (ii) expression patterns of miRNAs are tissue-specific, and (iii) miRNAs have unusually high stability in most tissues and can be recovered from formalin-fixed, paraffin embedded samples.
Several studies have reported optimized isolation protocols to enhance the recovery of microRNAs from stabilized samples. For example, microRNAs have been isolated from PAXgene-stabilized blood in sufficient quantity and quality for downstream applications (Kruhøffer et al., 2007).
Another problem hampering the analysis of microarray gene expression data in whole blood is the presence of globin. Globin mRNA in red blood cells accounts for over 70% of all mRNA in whole blood and interferes with the accurate assessment of other genes (Field et al., 2007; Wright et al., 2008). Several approaches have been developed to mitigate this effect and have been tested in microarray experiments (Liu et al., 2006; Vartanian et al., 2009; Wright et al., 2008). A globin reduction technique based on biotinylated DNA capture oligos (Ambion GLOBINclear processing protocol) produced sensitive results but was the least reproducible of the methods tested (Vartanian et al., 2009). An alternative protocol using globin PNAs (peptide nucleic acid inhibitory oligos) proved to be the best in sensitivity and reproducibility, but was the most time-consuming and required the highest amount of total RNA input (Liu et al., 2006; Vartanian et al., 2009). A further approach was suggested by Eklund and colleagues (Eklund et al., 2006). NuGEN's Ovation WB sample preparation protocol, based on single primer isothermal amplification (SPIA), generates a cDNA target. The hybridization kinetics of cDNA targets are less affected than those of cRNA targets by the abundant globin RNA present in whole blood extracts. The high specificity and sensitivity of cDNA targets and the highly reproducible SPIA protocol have been shown to be as good as or better than other protocols at mitigating the interference of globin transcripts (Fricano et al., 2011; Li et al., 2008; Parrish et al., 2010). The strong performance of this technique and its relatively low input requirement (50 ng of total RNA) have made the NuGEN Ovation WB protocol the method of choice for gene expression profiling in the microarray community.
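To make the scale of the globin problem concrete, the following sketch estimates what share of the remaining mRNA pool is non-globin after a depletion step. It is simple arithmetic under stated assumptions: the 70% starting fraction comes from the text above, while the depletion efficiencies and the function name are hypothetical and do not describe the measured performance of any protocol.

```python
def non_globin_share(globin_fraction: float, depletion_efficiency: float) -> float:
    """Share of the remaining mRNA pool that is not globin after depletion.

    globin_fraction: fraction of all mRNA that is globin before depletion (0 to 1).
    depletion_efficiency: fraction of globin mRNA removed or blocked (0 to 1);
        a hypothetical parameter, not a published figure for any kit.
    """
    if not 0.0 <= globin_fraction <= 1.0 or not 0.0 <= depletion_efficiency <= 1.0:
        raise ValueError("both arguments must lie between 0 and 1")
    non_globin = 1.0 - globin_fraction
    residual_globin = globin_fraction * (1.0 - depletion_efficiency)
    total = non_globin + residual_globin
    if total == 0.0:
        raise ValueError("no mRNA remains (pure globin, fully depleted)")
    return non_globin / total


if __name__ == "__main__":
    # 0.70 is the lower bound quoted in the text; the efficiencies are illustrative.
    for efficiency in (0.0, 0.5, 0.9):
        share = non_globin_share(0.70, efficiency)
        print(f"depletion efficiency {efficiency:.0%}: non-globin share {share:.1%}")
```

Under these assumptions, even incomplete depletion shifts most of the usable signal toward non-globin transcripts, which is why the choice of reduction method matters for sensitivity.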

Serum and plasma
Both plasma and serum are widely used specimen types for molecular diagnostics. Nucleic acids that can be found in small amounts in cell-free preparations of whole blood are frequently called "circulating nucleic acids". To date, a number of studies have shown that plasma and serum nucleic acids can serve as tumor-specific and fetal-specific markers for cancer detection and prenatal diagnosis, respectively. For example, several studies reported increased concentrations of DNA in the plasma or serum of cancer patients, and this DNA shared some characteristics with the DNA of tumor cells (Leon et al., 1977; Stroun et al., 1989). Interestingly, DNA levels decreased by up to 90% after radiotherapy, while persistently high or increasing DNA concentrations were associated with a lack of response to treatment (Anker et al., 2001). RNA has also been found circulating in the plasma or serum of normal subjects and cancer patients (Feng et al., 2008; Tsui et al., 2002, 2006). The recent discovery that serum and plasma contain a large amount of stable miRNAs derived from various tissues/organs has led to multiple studies on circulating miRNA expression as well (Mitchell et al., 2008; Chen et al., 2008; Zhu et al., 2009).
Analysis of circulating nucleic acids, however, requires modified extraction methods when plasma or serum is the source material. First, plasma and serum have a very high concentration of protein that can interfere with sample preparation and detection techniques. Second, the yield of circulating nucleic acids from small-volume plasma or serum samples (< 1 mL) usually falls below the limit of accurate quantification by spectrometry and calls for an alternative way to assess the efficiency of nucleic acid recovery. Several serum/plasma extraction kits are now available commercially from Qiagen, Norgen and other companies. These kits address the problems mentioned above by employing column-based purification methods and various carriers. We suggest the use of carefully selected extraction spike-ins to allow researchers to evaluate the efficiency of circulating nucleic acid isolation.
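One way a spike-in could be used to estimate extraction efficiency is sketched below. This is an assumption-laden illustration, not the procedure of any particular kit: it assumes the spike-in is quantified by qPCR, that a reference measurement of the same spike-in amount without extraction losses is available, and that amplification is close to perfect doubling per cycle. The function and parameter names are hypothetical.

```python
def spike_in_recovery(cq_extracted: float, cq_reference: float,
                      amplification_factor: float = 2.0) -> float:
    """Estimate fractional recovery of a spike-in from qPCR quantification cycles.

    cq_extracted: Cq of the spike-in after it has gone through extraction.
    cq_reference: Cq of the same spike-in amount measured without extraction losses.
    amplification_factor: fold increase per cycle (2.0 assumes perfect doubling).
    """
    if amplification_factor <= 1.0:
        raise ValueError("amplification_factor must be greater than 1")
    return amplification_factor ** (cq_reference - cq_extracted)


def correct_for_recovery(measured_copies: float, recovery: float) -> float:
    """Scale a measured analyte amount by the estimated extraction recovery."""
    if recovery <= 0.0:
        raise ValueError("recovery must be positive")
    return measured_copies / recovery


if __name__ == "__main__":
    # Illustrative Cq values only; a one-cycle delay corresponds to half the material.
    recovery = spike_in_recovery(cq_extracted=26.0, cq_reference=25.0)
    print(f"estimated recovery: {recovery:.0%}")
    print(f"corrected copies: {correct_for_recovery(100.0, recovery):.0f}")
```

Reporting the recovery alongside each sample, rather than only the corrected value, makes it easier to spot extractions that failed outright.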

Circulating tumor cells
Circulating tumor cells (CTCs) are cells that have been sloughed off of primary tumors and circulate in the bloodstream. Their numbers can be very small (1-10 cells per mL of whole blood) and these cells are not easily detected. Even though CTCs were first observed by Thomas Ashworth back in 1869, technology with the requisite sensitivity and reproducibility to detect CTCs in patients with metastatic disease was developed only recently (Sleijfer et al., 2007). While the presence of circulating tumor cells can itself serve as a marker of poor clinical outcome, there is an opportunity to develop new biomarkers by studying gene or protein expression in these cells. Changes in the phenotype of tumor cells can occur after the original diagnosis, and resistance to a treatment can only be inferred after the treatment has failed. CTCs offer a tool to understand the complex biology of tumor cells without the need for invasive biopsies.
Recently, CTCs have been the target of multiple molecular profiling studies (Bosma et al., 2002; Punnoose et al., 2010; Smirnov et al., 2005; Tewes et al., 2009). mRNA expression and DNA mutations can be measured from captured CTCs. RT-PCR using a multi-marker panel of cancer-associated genes was found to be the most sensitive technique for the detection of CTCs in the blood of breast cancer patients (Bosma et al., 2002; Tewes et al., 2009). Another approach involves the analysis of CTC-enriched samples by microarray gene expression profiling, where numerous genes such as S100A14 and S100A16 have been detected (Smirnov et al., 2005).
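The low CTC counts quoted in this section have a direct consequence for sampling, which the sketch below illustrates. It assumes that cells are distributed randomly in blood, so that the number captured follows a Poisson distribution; the blood volumes, the capture efficiency parameter and the function name are hypothetical choices for illustration.

```python
import math


def prob_at_least_one_ctc(cells_per_ml: float, volume_ml: float,
                          capture_efficiency: float = 1.0) -> float:
    """Probability of capturing at least one CTC, assuming Poisson sampling.

    cells_per_ml: true CTC concentration in whole blood.
    volume_ml: volume of blood processed.
    capture_efficiency: fraction of CTCs the enrichment step retains (assumed).
    """
    if cells_per_ml < 0 or volume_ml < 0:
        raise ValueError("concentration and volume must be non-negative")
    if not 0.0 <= capture_efficiency <= 1.0:
        raise ValueError("capture_efficiency must lie between 0 and 1")
    expected_cells = cells_per_ml * volume_ml * capture_efficiency
    return 1.0 - math.exp(-expected_cells)


if __name__ == "__main__":
    # 1 cell per mL is the low end of the range given in the text;
    # the volumes and the 50% capture efficiency are illustrative.
    for volume in (1.0, 5.0, 10.0):
        p = prob_at_least_one_ctc(1.0, volume, capture_efficiency=0.5)
        print(f"{volume:4.1f} mL: P(at least one CTC) = {p:.3f}")
```

The model ignores cell clustering and sample loss during handling, so it should be read as an optimistic bound on detection rather than a prediction.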

Dried blood spots
The method of collecting capillary blood on filter paper was introduced in Scotland by Robert Guthrie in 1963 and has since become a mainstream approach for blood sample collection from newborns in more than 20 countries (Consultant Paediatricians and Medical Officers of Health of the SE Scotland Hospital Region, 1968; Scriver, 1998). These samples have proved invaluable in screening for congenital metabolic disorders. Dried blood spots (DBS) are easily acquired through a simple needle stick and transfer to paper cards, which are stored and handled at room temperature in ambient atmospheric conditions. This approach eliminates many costly, time-consuming, and unpleasant aspects of sample collection, and can also significantly reduce the cost of shipping samples. The collection of DBS samples requires very little infrastructure and can be done in resource-limited locations. Vidal-Taboada and colleagues even showed that both patients and investigators prefer this method of DNA collection and storage (Vidal-Taboada et al., 2006).
The limitation of small sample volume restricted the use of dried blood spots for the development of molecular diagnostics until recently. Advances in technology have overcome many of the problems with reduced sensitivity and specificity. For example, the development of whole genome amplification (WGA) protocols allows researchers to perform reliable genome-wide scans using archived residual blood samples from newborn screening programs, which are standard practice in several countries (Hollegaard et al., 2009). Several studies have shown that although RNA is considered too vulnerable to degradation by ribonucleases, it could be recovered from DBS samples that had been stored for 15-20 years and successfully amplified by reverse transcription-PCR (Karlsson et al., 2003; Zubakov et al., 2008). In addition, dried blood spots have recently become the sample type of choice for HIV screening in low-resource settings (Sherman et al., 2005; Uttayamakul et al., 2005).

Cerebrospinal fluid
Cerebrospinal fluid (CSF) is a cell-free, colorless liquid that occupies the subarachnoid space and the ventricular system around and inside the brain and spinal cord. It is usually obtained through lumbar puncture. CSF has been rediscovered in the post-genomic era as a great source of potential protein biomarkers for various diseases, as it bathes the brain and other neurological tissues. Analysis of CSF allows rapid screening, low sample consumption, and accurate protein identification by proteomic technology (Guerreiro et al., 2006; Zheng et al., 2003). Brain proteins in CSF are also important for diagnosis of noninflammatory CNS diseases. Examples of conditions in which these proteins are diagnostically relevant include degenerative diseases (Otto et al., 1997; Ranganathan et al., 2005), tumors (Zheng et al., 2003), hypoxias and brain infarction (Schaarschmidt et al., 1994).
Advancements in nucleic acid (NA) amplification techniques have transformed the diagnosis of bacterial and viral infections of the central nervous system. Because of their enhanced sensitivity, these methods enable detection of very low amounts of pathogen genomes in cerebrospinal fluid. Diagnosis of several viral CNS infections, such as herpes encephalitis, enterovirus meningitis and other viral infections occurring in human immunodeficiency virus-infected persons, is currently performed using cerebrospinal fluid (Cinque, Bossolasco, & Lundkvist, 2003). MicroRNAs are also becoming an important analyte in CSF for the identification of neurological disease (Baraniskin et al., 2011; Cogswell et al., 2008; De Smaele et al., 2010). For example, miRNAs isolated from the frozen cerebrospinal fluid of Alzheimer disease-affected (AD) and non-affected patients showed distinctly different expression profiles (Cogswell et al., 2008). Notably, miRNAs linked to immune cell functions, including innate immunity and T cell activation and differentiation, were up-regulated in AD.
Combining mRNA studies with protein expression analysis may provide a more global picture of the biological processes associated with CNS disorders. Information gathered could lead to the development of select biological indices (biomarkers) for guiding CNS diagnosis and therapy.

Saliva
Saliva is an easily obtainable tissue that has been used in forensics for decades (Sweet et al., 1997). More recently, molecular profiling kits for voluntary saliva collection have made saliva an increasingly useful clinical biomarker tissue. Collection is noninvasive, and with some of the newer collection kits (Oragene or Norgen products) samples can even be collected at home or in isolated locations. This ease of collection results in higher patient compliance. As is often the case with biological samples, DNA yield is usually donor dependent (van Schie & Wilson, 1997). It is possible that saliva samples could replace blood samples for DNA studies. A study in Australia and New Zealand compared 10 matched pairs of blood and saliva, as well as nearly 2000 samples of either blood (Australia) or saliva (New Zealand; Oragene collection system), for genotyping. This study was larger than the van Schie & Wilson study, and corroborated the donor dependency of DNA yield. Because of the larger sample number, the authors saw more sample variance. However, they also concluded that the variance had more to do with collection, processing and donor variability than with tissue type (Bahlo et al., 2010). Collection and processing methods can eventually be controlled. In most cases, 1 mL of saliva yielded at least 4 µg of DNA, which is enough for most molecular biology assays.

Skin tissue
Readily accessible and well tolerated when sampled by punch biopsy (Camidge et al., 2005), skin comprises various layers of cells, making it useful for phenotypic and histological studies. Moreover, as a constantly dividing tissue with cells at various stages of development, skin provides insight into important signaling networks such as EGF, Wnt and Notch, as well as cell proliferation (Phillips & Sachs, 2005).
Wee1 inhibitors have been examined as a way to bypass the G2 checkpoint, sensitizing p53-negative cells to DNA-damaging agents (Wang et al., 2001). In research conducted by Mizuarai et al., p53-negative rat skin xenograft tumors, p53-positive and p53-negative cultured cancer cells, and p53-positive rat skin tissues were subjected to gemcitabine alone or in combination with the Wee1 inhibitor MK-1775 (Mizuarai et al., 2009). Gene expression data identified five genes as potential biomarkers present in both tumor and skin.
Because of its strong potential as a surrogate tissue, it is important to address the storage and handling challenges faced when using skin. Owing to its protective nature, skin is rich in nucleases and difficult to homogenize. We have found that immediate preservation in RNAlater following the manufacturer's protocol (rather than flash-freezing) and thorough pulverization are paramount to extracting sufficient quantities of high-quality nucleic acid (data not shown).

Skin tissue alternatives
Synthetic skin is a relatively new surrogate tissue that lends itself to investigation of a wide variety of processes while reducing the need for volunteer recruitment or laboratory animal testing (Poumay & Coquette, 2007). For extracting nucleic acids, we have found that synthetic skin is less susceptible to nucleic acid degradation and more easily homogenized than real human skin (data not shown). Synthetic skin has recently been used to study processes such as wound healing (Koria et al., 2003), epithelial development (Taylor et al., 2009), effects of cosmetics on skin (Faller et al., 2002), and even differential gene expression in skin disorders.
Yao et al. identified the overexpression of type I IFN-inducible genes in psoriatic biopsies by comparing biopsies of normal, healthy donor skin and non-lesional skin with psoriatic donor skin (Yao et al., 2008). To better understand the degree of type I IFN-inducible gene overexpression in psoriasis, blood from healthy donors and normal keratinocytes (EpiDerm, MatTek, Inc.) were stimulated with various members of the type I IFN family. Ex vivo blood and in vitro keratinocyte data showed overall agreement in up-regulated type I IFN-inducible genes. While only 1% of upregulated probes from the stimulation study were overexpressed in non-lesional compared to normal skin, 11.7% of the upregulated probes were overexpressed in lesional compared to non-lesional skin, suggesting that type I IFNs may be a prospective target for psoriasis treatment.

Hair follicles
Hair follicles differ from skin and blood in that they contain stem cells, which control the growth and cycling of hair. These stem cells reside within the follicle in a region often called the bulge. This is what makes hair follicle gene expression particularly intriguing: because "stem cells in the epidermis and hair follicle serve as the ultimate source of cells for both of these tissues, understanding the control of their proliferation and differentiation is key to understanding disorders related to disruption in these processes" (Cotsarelis, 2006).
Owing to advances in hair follicle extraction, isolation, and amplification techniques, the relative ease of collection, and the abundance of hair on most people, hair follicles are increasingly being examined as an investigatory and clinical biomarker tissue. To date most research has been on diseases involving skin conditions (Ohyama et al., 2006). However, hair follicles are also being examined for markers to quantify exposure to pharmaceuticals (Reiter et al., 2008) or toxicity associated with certain drug targets (Kim et al., 2006).
Hair follicles are obtained by grasping the hair with tweezers as near to the scalp as possible and quickly pulling upwards. The follicle should be clearly present and should immediately be placed in the appropriate preservation solution. For those with longer hair, we have found it helpful to cut the hair close to the follicle before preservation. Although it is possible to obtain results with a single follicle or a few (3 follicles), it is often better to acquire a larger set (15 follicles) to ensure that the mass needed for evaluation will be met. The follicles for an experiment should be taken from a similar location for each extraction, as gene expression may differ slightly between hair locations (head, arm, and eyebrow). We recommend collecting hairs from behind the ear for most applications. Several preservation solutions are available, such as RNAlater (Ambion) or SD Lysis Buffer (Promega). Following preservation, follow the manufacturer's guidelines on storage and extraction/isolation of the RNA.

Feces
Often overlooked, stool is an important source of potential biomarkers for a number of clinical indications. While infection and various metabolic imbalances are easily identified from stool, feces can also yield RNA, DNA and miRNA for use in biomarker development. This is largely due to the shedding of epithelial cells in the gastrointestinal tract (Osborn & Ahlquist, 2005). With the use of highly sensitive detection techniques, one can identify genetic aberrations in the genomes of these cells and non-invasively understand or diagnose the pathology of the patient's disease. However, the extraction and purification of nucleic acids from feces is quite challenging due to their low abundance and the high level of contaminants like humic acid. Fortunately, there are a number of commercial kits available for fecal DNA isolation, and new techniques such as synchronous coefficient of drag alteration (SCODA) show promise in further purifying and concentrating this rare DNA (Broemeling et al., 2008; Marziali et al., 2005). Interestingly, both the amount and integrity of DNA in feces have been shown to identify colorectal cancer patients (Klaassen et al., 2003; Osborn & Ahlquist, 2005). A variety of mutations have been identified in the stool DNA of colorectal cancer patients. Genes identified with mutations include KRAS, TP53 and APC, among several others (Osborn & Ahlquist, 2005; Young & Bosch, 2011). The most interesting of these is the adenomatous polyposis coli gene (APC). Mutations in the APC gene have been shown to drive the growth of adenomas, and their identification in stool samples allows the detection of early-stage colorectal neoplasia (Jen et al., 1994; Traverso et al., 2002). Analysis of fecal DNA has also been used to identify pancreatic adenocarcinoma (Caldas et al., 1994).
The isolation and analysis of RNA from fecal samples has also gained a great deal of attention. While less stable than DNA, RNA provides a snapshot of the transcriptional activity of exfoliated cells, reflecting both genomic and environmental influences. Changes in gene expression may more fully reflect a target tissue's response to therapeutic agents. Alexander and Raicht demonstrated the ability to extract RNA from stool and suggested its use as a method for the early detection of colon tumors (Alexander & Raicht, 1998). One transcript with potential diagnostic value is cyclooxygenase 2 (COX-2), which can separate colorectal cancer patients from healthy subjects (Kanaoka et al., 2004). Still others are exploiting fecal RNA to better understand infant health (Chapkin et al., 2010; Davidson et al., 1995; Kaeffer et al., 2007).

Urine
Urine is an ideal source for the identification of new biomarkers as it is easily and noninvasively collected. It has long been a standard fluid for the measurement of metabolites, proteins, and infectious agents. Recent data have demonstrated that not only can these traditional analytes be identified, but RNA, DNA and miRNA can also be extracted and profiled. Although it is less stable than the other nucleic acids, mRNA can be detected in urine. Keller and colleagues have demonstrated that its stability in urine is likely due to protection of the mRNA in protein/lipid vesicles called exosomes (Keller et al., 2011; Nilsson et al., 2009). Further, mRNA patterns from urine sediments have been suggested for the development of ovulation and fertility biomarkers (Campbell & Rockett, 2006). miRNAs have also been uniquely identified in urine, and their stability has also been linked to exosomes (Record et al., 2011; Valadi et al., 2007). Differential detection of miRNAs in urine is showing promise in the non-invasive detection of lupus, nephropathy, renal allograft rejection and urothelial cancer (Lorenzen et al., 2011; Wang et al., 2010; Yamada et al., 2011).
Urinary DNA is a complex target, with both host and non-host DNA being present and clinically relevant. Patient DNA is readily extracted from urine, and its methylation patterns have been shown to have utility in the diagnosis of cancer and kidney injury (Chen et al., 2011; Kang et al., 2011). Microbial DNA can also be extracted from urine. Through the expanding discipline of microbial metagenomics, we now understand that the relative distribution of microbial DNA has important clinical utility (Nelson et al., 2010; Virgin & Todd, 2011). Improvements in next generation sequencing and microarray technology are showing that the interactions between microbial communities and their host are measurable and are correlated with the health of the host. Urine, like feces, has the potential to provide an easily accessed fluid type whose flora may provide an exquisitely sensitive measure of pathological state. For example, the microbiome of urine can be used to monitor asymptomatic sexually transmitted disease and is highly correlated with data generated from urethral swabs (Dong et al., 2011; Nelson et al., 2010). As more work is done in this field, it is likely that more examples will be uncovered.
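The "relative distribution" of microbial DNA mentioned above is typically summarized from per-taxon read counts. The sketch below shows one simple way this could be done, converting counts to relative abundances and computing the Shannon diversity index as a single summary number. The taxon labels, counts and function names are hypothetical, and real metagenomic pipelines involve additional steps (quality filtering, taxonomic assignment, normalization) that are not shown.

```python
import math
from typing import Dict, Mapping


def relative_abundance(read_counts: Mapping[str, int]) -> Dict[str, float]:
    """Convert per-taxon read counts to fractions of the total."""
    total = sum(read_counts.values())
    if total <= 0:
        raise ValueError("at least one read is required")
    return {taxon: count / total for taxon, count in read_counts.items()}


def shannon_index(read_counts: Mapping[str, int]) -> float:
    """Shannon diversity (natural log) of the community described by the counts."""
    fractions = relative_abundance(read_counts).values()
    return -sum(p * math.log(p) for p in fractions if p > 0)


if __name__ == "__main__":
    # Made-up counts for two samples; taxon labels are placeholders.
    sample_a = {"taxon_A": 900, "taxon_B": 80, "taxon_C": 20}
    sample_b = {"taxon_A": 340, "taxon_B": 330, "taxon_C": 330}
    for name, counts in (("sample A", sample_a), ("sample B", sample_b)):
        print(name, relative_abundance(counts), f"H = {shannon_index(counts):.3f}")
```

A community dominated by one taxon gives a low index, while an even community gives a higher one, so shifts in this value between visits can flag samples that deserve closer inspection.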

Nipple aspirate fluid
The breast is a complex organ whose architecture is intertwined with its biology. Even the structure of the nipple is multifaceted and not completely understood (Love & Barsky, 2004). However, it does provide unique access to fluid that can be leveraged for biomarker development. Nipple aspirate fluid (NAF) and ductal lavage contain cells that have been used for the diagnosis and monitoring of breast cancer (Lang & Kuerer, 2007; Li et al., 2005; Mendrinos et al., 2005; Sauter et al., 1997). NAF is generally obtained either through spontaneous emission or suction, while ductal lavage requires the use of a microcatheter to enter the duct orifice to rinse and collect fluid. Although more invasive, ductal lavage yields more cells (Dooley et al., 2001; Li et al., 2005). These cells originate from the ductal epithelium, and by studying them in the NAF we can glean important information about the active biology within the ducts without the risks associated with biopsy (Dooley et al., 2001; King & Love, 2006; Miller et al., 2006). Much of this work has focused on the early identification of neoplasia using proteomic or cytological analysis of the cells isolated from this fluid (Dooley et al., 2001; Harigopal & Chhieng, 2010; King & Love, 2006; Mendrinos et al., 2005; Wrensch et al., 1992; Wrensch et al., 2001). Recent work has focused on the genomic profiling of NAF cells in order to identify early biomarkers that may predict progression before morphological changes are evident. For example, the methylation of key tumor suppressor genes can be a highly effective means of predicting tumorigenesis. Preliminary work using NAF samples has demonstrated that this is a feasible biomarker for early cancer detection (Krassenstein, 2004). However, measuring the methylation status of key genes in NAF-derived cells is generally not a sensitive enough technique on its own to diagnose disease or predict progression (Euhus et al., 2007; Fackler et al., 2006; Locke et al., 2007).
Mitochondrial sequencing has been shown to be a sensitive way of identifying neoplastic tissues (Czarnecka et al., 2006). Mutations in the mitochondrial genome are often found at higher rates in neoplastic tissues than in normal tissues. It is likely that in many cases these mutations are directly linked to disease pathogenesis, while in others they may only be a consequence of other processes. Various groups have applied different techniques to sequence mtDNA from NAF. Zhu and colleagues showed that mutations in mtDNA can be detected non-invasively from NAF using sequencing (Zhu et al., 2005). Jakupciak and colleagues used a mitochondrial resequencing microarray and were able to demonstrate the detection of mutations and a high correlation to traditional sequencing methods. These methods show great promise for clinical use, although further work is required to validate the approaches. Interestingly, traditional methodologies for mtDNA sequencing, such as Sanger sequencing or hybridization-based resequencing, are substantially impacted by the presence of normal cells. This background of normal cells attenuates the positive mutational signals, leading to poor discrimination of bases. While Zhu and colleagues did not find this to be true in their study, it is likely that as next generation sequencing methodologies are applied to NAF profiling, we will be able to discriminate and quantify the differences between normal and tumor cells with high resolution (Zhu et al., 2005).
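The attenuation of mutational signal by a background of normal cells can be illustrated with simple arithmetic. The sketch below is only an illustration and is not taken from the cited studies. It assumes that normal and tumor cells carry equal numbers of mtDNA copies, that normal cells carry no mutant copies, and that the observed mutant fraction is therefore the product of the tumor-cell fraction and the mutant fraction within tumor cells. The function names and the detection limits passed to them are hypothetical values chosen for the example.

```python
def observed_mutant_fraction(tumor_cell_fraction, tumor_mutant_fraction):
    """Expected fraction of mtDNA copies carrying a mutation in a mixed sample.

    Illustrative assumptions: equal mtDNA copy number per cell in normal and
    tumor cells, and no mutant copies in normal cells.
    """
    for value in (tumor_cell_fraction, tumor_mutant_fraction):
        if not 0.0 <= value <= 1.0:
            raise ValueError("fractions must lie between 0 and 1")
    return tumor_cell_fraction * tumor_mutant_fraction


def is_detectable(mutant_fraction, detection_limit):
    """True if the mutant fraction reaches a (hypothetical) platform detection limit."""
    return mutant_fraction >= detection_limit


# Hypothetical sample: 10% tumor cells, mutation present in 80% of their mtDNA.
fraction = observed_mutant_fraction(0.10, 0.80)

# Hypothetical detection limits for a less sensitive and a more sensitive method.
seen_by_less_sensitive_method = is_detectable(fraction, 0.20)
seen_by_more_sensitive_method = is_detectable(fraction, 0.01)
```

Under these assumptions the mutation makes up roughly 8% of the mtDNA in the sample, so a method that needs a 20% mutant fraction would miss it while a method that resolves 1% would not. Real detection limits depend on the platform and protocol and would need to be established experimentally.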
Genomic and mitochondrial DNA statuses are important factors in understanding the genetic context of disease. However, tumorigenesis is a dynamic process that is influenced by heredity and environment. RNA profiling offers a measurable way of linking these factors. Due to their low numbers, breast fluid-derived cells are difficult targets for gene expression profiling. With recent advances in mRNA amplification methodologies, there are now tools that allow these studies (Van Gelder et al., 1990). For example, Single-Primer, Isothermal Amplification (SPIA) is one of several techniques that can amplify and label mRNA for microarray or RT-qPCR analysis (Kurn et al., 2005). Various studies have shown the utility of gene expression profiling in identifying tumor expression patterns that subclassify breast cancer and help to predict outcome (Cronin et al., 2007;Ma et al., 2003;van de Vijver et al., 2002). It is conceivable that these same transcript signatures could be obtained from cells isolated from ductal fluid.

Sample collection
The utility of a given sample to yield a clinically meaningful result is dependent on many factors. These include when and how samples were collected, the preservation method used to stabilize the analytes, shipping and storage effects, and the correct association of patient data with the sample. Variation in any of these areas can have a substantial impact on the usefulness of a sample.
There are conflicting data on the effect of the delay between sample collection and RNA extraction. Some studies report that any delay in getting the sample from the living state to a preserved state (frozen, formalin-fixed and paraffin-embedded (FFPE), or in RNAlater) will decrease the quality of the sample (Hong et al., 2010). Other studies indicate that there is a window of at least 16 hours after collection in which BioAnalyzer QC metrics do not show any degradation (Micke et al., 2006). In our experience, any interruption en route from collection to preservation could lead to degradation of the RNA (unpublished observation). Sample age after preservation also matters. Lisowski and colleagues found that as FFPE sample slices aged, signal intensity by in situ hybridization (ISH) was impacted. If they sliced from the block right before extracting RNA, the signal was clearer and stronger (Lisowski et al., 2001). The location from which a sample is taken can matter as well. While some tissues are considered homogeneous, studies by Irwin and Dyroff show that different sections of liver respond differently to drugs (Dyroff et al., 1986;Irwin et al., 2005).

Shipping and storage
With the advent of electronic tracking by the shipping industry, as well as a societal expectation of overnight shipments, samples can safely and quickly travel from a clinical site to a separate processing facility. FedEx pioneered the idea of hub shipments and overnight travel, but others have adopted and emulated their practices. Some couriers will replenish dry ice on shipments traveling more than 24 hours (World Courier). Coupled with this is the need for the initial shipper to pack the samples in such a fashion that they will be held at the correct temperature for at least 24 hours. Written or web-based guidance should be given to all collection sites, with explicit details on the size of shipping containers and the amount of dry ice to use, to ensure safe passage of the samples.

Sample handling and logistics - Barcoding and annotation
Clinical studies need the support of large numbers of samples to confirm the efficacy and safety of a drug. With the expanded usage of biomarkers in clinical trials, even more samples and patients may be needed to fully discover the population that will best be served by a given therapy. One clinical collection set can consist of as few as one sample or up to potentially 100 samples from a single patient in one day. The number of samples needed to generate statistically significant data will number in the tens of thousands across the different stages of a clinical trial. Clinical trial involvement necessitates scrupulous tracking of many details about each sample. Historically, this was all done on paper, but with increasing computing power and usage, samples can be tracked more effectively with well-built database systems. Effective use of computers also makes it possible to analyze samples across multiple trials, including comparing biomarkers for a more customized treatment approach. To accomplish this, companies are relying on electronic data capture systems such as LIMS (Laboratory Information Management System), EMR (Electronic Medical Records) or CTMS (Clinical Trial Management System) and barcodes on individual samples (Burczynski et al., 2005;B. Choi et al., 2005;Niland & Rouse, 2010).
There is more than one approach to connecting the annotation about a sample with an identifier on the sample container. Some systems rely on human readable text on the labels to tell the person handling the container what should be in it. There is the potential for error when depending on a human to read or type (Turner et al., 2003). Sometimes these labels with text also have a barcode on them. This type of barcoding system is referred to as an intelligent barcode system, only because there is specific sample information, other than the barcode, on the label. Other systems carry only a barcode on the label and rely fully on contemporary technology to track samples (naïve barcodes). With the naïve barcode system, the sample collector needs to be able to associate the sample with a related database. This can be done by the collector writing on a piece of paper, which is then entered into the database at a later time by a data entry clerk. Alternatively, technology may be fully leveraged by supplying the collection sites with barcode readers and access to the appropriate database, so that the barcode on the container can be associated with the given patient ID.
There are pros and cons for each of these barcoding methods. Having an intelligent barcode (pre-association of barcode with patient ID/time point) means that the person doing the collection needs only to find the correct label for the given sample, as the time point information should already be tracked in a database. If the labels are printed in a sequential fashion, then this may be simple. The drawback of this system is that if for some reason the correct label cannot be found, there is usually no means to associate a new label with the sample.

Generally, projects that use this kind of labeling do not have any computer connection from the collection sites to the database storing the sample information. Before the advent of ubiquitous computers and hand held devices, associating the sample label information to a matching piece of paper seemed an effective way to track samples.
The major drawback with the naïve barcode system (barcoded tubes that are associated at the point of collection with the sample) is that if the association of sample to barcode is not made by the collection site, then the container is just a tube of tissue, useless for further study. To effectively use the naïve barcode, sites benefit from having access to the database while collecting samples. This can be as simple as barcode scanners that allow some amount of data entry. In some instances, double barcode labels can be supplied to the sites: one is affixed to the form and one is placed on the tube, with the association into the database to be made later.
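The risk of an unassociated tube can be caught at sample receipt by reconciling the scanned barcodes against the associations recorded in the database. The following is a minimal sketch of that check, assuming the associations are available as a simple mapping from tube barcode to patient ID. The function name, barcodes, and patient IDs are hypothetical.

```python
def find_unassociated(received_barcodes, associations):
    """Return received tube barcodes that have no patient association.

    `associations` maps tube barcode -> patient ID, as recorded by the
    collection site (or entered later from a double-label form).
    Order of receipt is preserved.
    """
    return [barcode for barcode in received_barcodes if barcode not in associations]


# Hypothetical data: three tubes arrive, but only two were associated at the site.
associations = {"T0001": "PT-0042", "T0002": "PT-0043"}
received = ["T0001", "T0002", "T0003"]

orphans = find_unassociated(received, associations)
```

Here `orphans` contains only the tube that cannot be traced to a patient. Flagging such tubes on arrival gives the laboratory a chance to query the site while the paper form may still be recoverable.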
One method of association, which is a compromise between the intelligent barcode method and the naïve barcode method, is to group barcoded containers into a kit at a central laboratory assembly site. The kits are then shipped to the various collection sites. As the kit leaves the facility, the containers inside it still carry naïve barcodes, but at this point they are associated with a tube type and a destination. All of this information is tracked at the central laboratory, not on the containers. At the collection site, the kit is associated with a patient. This reduces the amount of data entry needed. The practice of associating the kit barcode with the patient ID at the site of collection allows some flexibility, while still keeping the tracking of the tubes within the kits organized. This method ensures the highest quality association between a given sample and the donor.
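The kit-based association can be sketched as two steps against a shared record: the central laboratory ties each tube barcode to a kit, tube type, and destination, and the collection site later ties the kit to a patient. The code below is a minimal in-memory illustration under those assumptions, not a description of any particular LIMS. The class names, fields, and identifiers are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Kit:
    """A kit assembled at the central laboratory (hypothetical record layout)."""
    kit_barcode: str
    destination: str
    tubes: Dict[str, str] = field(default_factory=dict)  # tube barcode -> tube type
    patient_id: Optional[str] = None  # filled in at the collection site


class KitRegistry:
    """In-memory stand-in for the central laboratory database."""

    def __init__(self) -> None:
        self._kits: Dict[str, Kit] = {}
        self._tube_to_kit: Dict[str, str] = {}

    def assemble_kit(self, kit_barcode, destination, tubes):
        """Central lab step: tie tube barcodes to a kit, tube type, and destination."""
        if kit_barcode in self._kits:
            raise ValueError(f"kit {kit_barcode} already assembled")
        for tube_barcode in tubes:
            if tube_barcode in self._tube_to_kit:
                raise ValueError(f"tube {tube_barcode} already belongs to a kit")
        self._kits[kit_barcode] = Kit(kit_barcode, destination, dict(tubes))
        for tube_barcode in tubes:
            self._tube_to_kit[tube_barcode] = kit_barcode

    def associate_patient(self, kit_barcode, patient_id):
        """Collection site step: one scan links every tube in the kit to a patient."""
        kit = self._kits[kit_barcode]
        if kit.patient_id is not None and kit.patient_id != patient_id:
            raise ValueError(f"kit {kit_barcode} is already linked to another patient")
        kit.patient_id = patient_id

    def lookup_tube(self, tube_barcode):
        """Return (patient_id, tube_type, destination) for a tube barcode."""
        kit = self._kits[self._tube_to_kit[tube_barcode]]
        return kit.patient_id, kit.tubes[tube_barcode], kit.destination


# Hypothetical usage: assemble a kit centrally, then associate it at the site.
registry = KitRegistry()
registry.assemble_kit("KIT-001", "Site 12", {"TUBE-A1": "RNA", "TUBE-A2": "plasma"})
registry.associate_patient("KIT-001", "PT-0042")
result = registry.lookup_tube("TUBE-A2")
```

In this sketch the site enters a single association per kit, and every tube in the kit inherits the patient ID through the kit record. Refusing to re-link a kit to a different patient is one possible safeguard. A real system would also need audit trails and a controlled way to correct errors.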
In addition, given the current increase of handheld scanners with WiFi access, immediate computer access is no longer a large barrier. Car rental agencies and store inventory systems have been using portable scanners to track inventory for decades; it should not be too difficult to adopt similar technology for use in clinical trial data collection. The New York subway system integrates data from barcoded tickets, generated from identified machines, all with customer anonymity, to track where passenger flow is most active. There are some groups who have started to study the benefits of this type of live data association in studies involving human donors or patients (Avilés et al., 2008). While it is not essential for the sites to have computer access, as the paper trail of requisition forms is still common, instant computer contact by the collection site does make the tracking easier. Handwriting barcodes and manual association outside of the database defeat the efficiency of the naïve barcode system, although downstream sample processing can make use of the barcoding system if there is a barcode and the association is made to the patient identifier.
In addition, there is an added benefit of naïve barcodes for double blind studies. Double blind studies mask the sample identity, including patient and treatment information. This is to prevent bias in the study and to protect the identity of the study patients. In the past, a two-tier system of identification numbers would cryptically hide the patient information from those involved in the collection or the analysis of the study. Only a select few would have access to source information about both the patient and the drug. Unique barcodes on the container, without any study information on the label, can provide a double blind labeling system, as long as the sample is always tracked in the LIMS.
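The blinding property described above can be sketched as a separation between what is keyed to the barcode for routine work and what is released only to authorized staff. The example below is a simplified illustration under that assumption. The class, its methods, and the boolean `authorized` flag are hypothetical stand-ins for the access controls a real LIMS would provide.

```python
class BlindedLims:
    """Toy LIMS in which analysts see only barcodes and assay results."""

    def __init__(self):
        self._identity = {}  # barcode -> {"patient_id": ..., "treatment": ...}
        self._results = {}   # barcode -> assay result

    def register_sample(self, barcode, patient_id, treatment):
        """Record the masked identity behind a barcode (restricted step)."""
        if barcode in self._identity:
            raise ValueError(f"barcode {barcode} is already registered")
        self._identity[barcode] = {"patient_id": patient_id, "treatment": treatment}

    def record_result(self, barcode, value):
        """Analyst step: store a result against the barcode only."""
        if barcode not in self._identity:
            raise KeyError(f"unknown barcode {barcode}")
        self._results[barcode] = value

    def blinded_results(self):
        """What analysts can see: barcodes and results, no patient or treatment data."""
        return dict(self._results)

    def unblind(self, barcode, authorized):
        """Release identity and result together, only to authorized staff."""
        if not authorized:
            raise PermissionError("unblinding requires authorization")
        record = dict(self._identity[barcode])
        record["result"] = self._results.get(barcode)
        return record


# Hypothetical usage with made-up identifiers and values.
lims = BlindedLims()
lims.register_sample("BC-1001", "PT-0042", "drug")
lims.record_result("BC-1001", 3.7)
analyst_view = lims.blinded_results()
unblinded = lims.unblind("BC-1001", authorized=True)
```

The analyst view holds nothing but the barcode and the measured value, which mirrors a label that carries no study information. In practice the authorization check would be tied to user roles and logged rather than passed as a flag.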

Conclusion
Technology has finally caught up with science fiction. The idea of a pin prick to divine one's future is fast becoming a reality. Science is moving medicine in a direction where patient care will be predictive and preventive, rather than watched from afar. Data-rich and highly sensitive techniques like microarray profiling, quantitative PCR, and Next Generation Sequencing are the genomics tools that are helping to drive these changes. However, to extract the greatest utility, tests need to be simple to complete, cost effective and as noninvasive as possible. Clinical impact is directly related to the availability and cost of a test. Consider the case of standard tumor biopsy. Depending on the disease and tumor location, a biopsy can be minor surgery involving a team of doctors, nurses, radiologists, and specialists. Recovery from a biopsy is often brief, but in some cases can lead to a costly overnight hospital stay. In the end, the actual cost of obtaining material for a test can be in the thousands of dollars, while the test itself may cost only a couple of hundred dollars. For many biomarkers, there is more cost associated with the acquisition of the sample than with the test itself. For this reason, it makes both clinical and financial sense to find ways to make sample acquisition more cost effective and less risky for the patient.
By studying often overlooked sample types, we may identify a treasure trove of clinically useful biomarkers. While not every surrogate tissue will yield a disease or response-specific biomarker, there is substantial data to justify the investigation. There is undeniable value in the use of biomarkers in drug development and patient care, but this value is tempered with the cost of sample acquisition. Developing methods for the acquisition of clinically useful and easily obtainable samples is important as we move from a drug discovery process that is focused on finding the right drugs to one that focuses on finding the right patients.

Acknowledgements
The authors are indebted to our many colleagues at the Covance Genomics Laboratory (formerly the Rosetta Gene Expression Laboratory) for years of dedication and collaboration. We also give thanks to our clients, who have continued to challenge the norm and helped us develop new and novel approaches for biomarker development.