Array-based Identification of Copy Number Changes in a Diagnostic Setting: Simultaneous gene-focused and low resolution whole human genome analysis

OBJECTIVES
The aim of this study was to develop and validate a comparative genomic hybridisation (CGH) array that would allow simultaneous targeted analysis of a panel of disease genes and low resolution whole genome analysis.


METHODS
A bespoke Roche NimbleGen 12x135K CGH array (Roche NimbleGen Inc., Madison, Wisconsin, USA) was designed to interrogate the coding regions of 66 genes of interest, with additional widely-spaced backbone probes providing coverage across the whole genome. We analysed genomic deoxyribonucleic acid (DNA) from 20 patients with a range of previously characterised copy number changes and from 8 patients who had not previously undergone any form of dosage analysis.


RESULTS
The custom-designed Roche NimbleGen CGH array was able to detect known copy number changes in all 20 patients. A molecular diagnosis was also made for one of the additional 4 patients with a clinical diagnosis that had not been confirmed by sequence analysis, and carrier testing for familial copy number variants was successfully completed for the remaining four patients.


CONCLUSION
The custom-designed CGH array described here is ideally suited for use in a small diagnostic laboratory. The method is robust, accurate, and cost-effective, and offers an ideal alternative to more conventional targeted assays such as multiplex ligation-dependent probe amplification.

The importance of gene deletion and duplication in the pathogenesis of disease has become increasingly evident over the last decade. These deletions/duplications range from intragenic changes that are too large to be detected by sequence analysis, to larger genomic rearrangements responsible for the microdeletion and microduplication syndromes, and finally to whole chromosome loss or gain as seen in the aneuploidies.
In the discipline of cytogenetics, molecular karyotyping using high-density oligonucleotide arrays has recently become the recommended first-line diagnostic test for patients with developmental delay/intellectual disability, autistic spectrum disorder, or multiple congenital anomalies, replacing more conventional techniques such as G-banded karyotyping. 1,2 Large deletions and duplications have long been recognised as playing an important part in the pathogenesis of several disorders traditionally diagnosed using molecular techniques, such as Duchenne muscular dystrophy and Charcot-Marie-Tooth disease type 1A. 3,4 In addition to these classical deletion/duplication disorders, the role of partial or whole gene deletions in the aetiology of a wide variety of single-gene disorders is becoming more apparent. A 2008 review of the entries in the online Human Gene Mutation Database showed that large deletions and duplications comprise 10% of the listed mutations, compared to 6% in 2003. 5,6 This number is likely to increase further as more individuals are subjected to dosage analysis as part of routine molecular diagnostics.
8][9] Each of these methods, however, is relatively expensive, principally as a result of the price of the probes, and in the case of MLPA and qPCR, is usually confined to a limited number of exons across a limited number of genes. 10,11 Finally, in the case of a small diagnostic laboratory, low sample throughput decreases cost-effectiveness, together with the attendant issue of maintaining staff proficiency in a range of dosage techniques.
In order to address the above difficulties, we designed a bespoke NimbleGen 12x135K comparative genomic hybridisation (CGH) array (Roche NimbleGen Inc., Madison, Wisconsin, USA). This array targets a panel of genes chosen to complement the sequencing assays offered in-house, as well as a number of other genes for which deletions and duplications are known to be implicated in a disease phenotype. In addition to this gene-focused coverage, the design of the array also involved low-density coverage of the entire human genome. Here, we report the use of this custom-designed array to analyse a series of 28 clinical samples in order to investigate the suitability of this approach for dosage analysis in the diagnostic environment.

Methods
A group of 20 individuals with a range of previously characterised copy number changes were selected for array comparative genomic hybridisation (aCGH) analysis. The patients, or parents in the case of neonates, provided informed consent for diagnostic testing; the New Zealand multiregion ethics committee has ruled that cases of patient management do not require formal ethics committee approval. The copy number changes included both cytogenetic and molecular abnormalities, and spanned a spectrum from aneuploidy to intragenic deletion, with three cases of aneuploidy, two of unbalanced translocations, three microdeletions, two microduplications, seven intragenic deletions, and three intragenic duplications. These changes had been identified using a range of techniques, including conventional and molecular karyotyping, FISH and MLPA [Table 1]. aCGH was also completed for an additional 8 individuals without known copy number changes for whom dosage analysis was desirable either for diagnostic purposes or for completion of family studies.

Applications to Patient Care
- The targeted CGH array with backbone format allows for diagnostic flexibility in a clinical laboratory setting.
- The added advantage of the approach described here is that it removes the need to batch the mutation screening of patients based on their clinical phenotype.

Peripheral blood ethylenediaminetetraacetic acid (EDTA) samples from each of these 28 individuals were submitted to the Diagnostic Genetics Department at LabPLUS, Auckland City Hospital, New Zealand, for either molecular or cytogenetic analysis, as clinically indicated.
Genomic deoxyribonucleic acid (gDNA) was extracted from peripheral blood leucocytes using the Gentra Puregene DNA Extraction Kit (QIAGEN, Germantown, Maryland, USA). In those samples referred for conventional karyotype or FISH analysis, classical phenol/chloroform extraction with ethanol precipitation was used to isolate DNA from cultured leucocytes, in order to provide a source of gDNA for molecular testing.
A primer design protocol was used to design primers flanking the region spanning exons 11-14 of the KCNH2 gene. 12,13 In brief, the messenger RNA (mRNA) sequence of interest was identified using the University of California Santa Cruz (UCSC) genome browser. 14 All primers were checked for single nucleotide polymorphisms using the software tool available from the National Genetic Reference Laboratory, Manchester, UK. 15 The primers were tailed with M13 sequences and were synthesised by Invitrogen Ltd., Renfrewshire, UK (primer sequences are available on request).
Polymerase chain reaction (PCR) amplification was performed in a total volume of 25 µL, containing 50 ng of gDNA, 0.20 µM of each primer, 1 mM of each dNTP, and 1.75 U of expand long template enzyme mix in buffer 2 (F. Hoffmann-La Roche Ltd., Basel, Switzerland). After an initial denaturation for 2 minutes at 94 °C, the PCR amplification included 10 cycles of 94 °C for 10 seconds, 60 °C for 30 seconds, and 68 °C for 2 minutes, followed by 20 cycles of 94 °C for 15 seconds, 60 °C for 30 seconds, and 68 °C for 4 minutes, and a final extension at 68 °C for 10 minutes. PCR products were separated on a 2% agarose gel and the lower band, corresponding to the allele carrying the deletion, was excised and purified using the Roche High Pure PCR Cleanup Micro Kit (Roche Applied Sciences, Roche Diagnostics, Penzberg, Germany). Bidirectional DNA sequencing was performed using M13 forward and reverse primers and Big-Dye Terminator, Version 3.0 (Applied Biosystems Ltd., Carlsbad, California, USA). Using an automated Clean-Seq procedure (Agencourt Bioscience Corp., Beverly, Massachusetts, USA), 20 µL of sequenced product was purified with the aid of an epMOTION 5075 liquid handling robot (Eppendorf, Hamburg, Germany). Using the Applied Biosystems model 3130xl genetic analyser (Applied Biosystems, Inc., Foster City, California, USA), 15 µL of purified product was then subjected to capillary electrophoresis.
Genes of interest, including those already sequenced in-house and those pertaining to common disorders known to frequently involve deletions/duplications (such as Duchenne muscular dystrophy), were selected and the appropriate NM accession numbers identified using the UCSC genome browser. The final gene list comprising 66 genes was forwarded to NimbleGen and formed the basis of their design for a 12-plex 135K oligonucleotide array (see Table 2 for gene list). Each probe was 60-85 bp in length and possessed similar isothermal characteristics. Exonic probes were designed to overlap by 25 bp in order to provide high resolution detection of deletions or duplications within the coding regions of the genes of interest. Intronic probes were spaced on average every 175 bp. To minimise the occurrence of false positive results due to a one-off failure of hybridisation to a particular probe, each gene-focused probe was spotted in duplicate. In addition to the targeted probes tiled over the genes of interest, approximately 75,000 'backbone' probes were also included. These probes were spaced across the entire genome (with a mean probe interval of 46 kbp) to provide low-density whole genome interrogation, as well as to increase the accuracy of data normalisation during the analysis procedure. Following completion of the design process, the array was manufactured by NimbleGen, Inc.
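The tiling arithmetic implied by this design can be illustrated with a short sketch. It is not the NimbleGen design algorithm: it assumes fixed-length 60 bp probes (the actual probes ranged from 60 to 85 bp), and the function names and the 200 bp exon used in the usage note are hypothetical.

```python
import math


def probes_to_tile(region_bp: int, probe_bp: int = 60, overlap_bp: int = 25) -> int:
    """Count fixed-length probes needed to cover a region end to end.

    Assumption: every probe has the same length and overlaps its
    neighbour by exactly overlap_bp, so each extra probe advances
    coverage by (probe_bp - overlap_bp).
    """
    if region_bp <= probe_bp:
        return 1
    step = probe_bp - overlap_bp
    return 1 + math.ceil((region_bp - probe_bp) / step)


def backbone_span_bp(n_probes: int = 75_000, mean_interval_bp: int = 46_000) -> int:
    """Approximate genomic span covered by evenly spaced backbone probes."""
    return n_probes * mean_interval_bp
```

Under these assumptions, a hypothetical 200 bp exon would need `probes_to_tile(200)` = 5 overlapping probes, and 75,000 backbone probes at a 46 kbp mean interval span about 3.45 Gbp, which is of the order of the whole human genome.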
A total of 250 ng of gDNA was processed according to the NimbleGen Array User's Guide: CGH and CNV Arrays, Version 6.0. In brief, extracted gDNA from samples and Promega controls was denatured in the presence of Cy3-labelled (test group) or Cy5-labelled (control group) random primers and incubated with the Klenow fragment of DNA polymerase, together with deoxyribonucleotide triphosphates (dNTPs) (5 mM of each dNTP), at 37 °C for 2 hours. The reaction was terminated by the addition of 0.5 M EDTA (21.5 µL), prior to isopropanol precipitation and ethanol washing. Following quantification, the test and sex-matched control samples were combined in equimolar amounts and applied to one of the twelve arrays on the microarray slide. Hybridisation was carried out in a NimbleGen Hybridisation Chamber for a period of 48 hours. Slides were washed and scanned using a NimbleGen MS 200 microarray scanner. Array image files (.tif) produced by the MS 200 Data Collection Software were imported into NimbleScan Version 2.6 for analysis. Each genomic region exhibiting a copy number change within one of the genes of interest was examined using the UCSC genome browser to determine the location and significance of the change. Data were filtered using the default log2 ratio thresholds recommended in the NimbleGen Array User's Guide: less than -0.2 for a deletion and greater than 0.2 for a duplication.
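The threshold filter described above can be sketched as follows. This is an illustration only and does not reproduce NimbleScan's segmentation; the function names are hypothetical, and the idealised ratios assume a diploid reference and a non-mosaic sample.

```python
import math

# Default thresholds quoted in the text from the NimbleGen user's guide.
DELETION_THRESHOLD = -0.2
DUPLICATION_THRESHOLD = 0.2


def expected_log2_ratio(test_copies: int, reference_copies: int = 2) -> float:
    """Idealised log2(test/reference) ratio for a given copy number."""
    return math.log2(test_copies / reference_copies)


def classify_segment(mean_log2_ratio: float) -> str:
    """Label a segment by comparing its mean log2 ratio with the thresholds."""
    if mean_log2_ratio < DELETION_THRESHOLD:
        return "deletion"
    if mean_log2_ratio > DUPLICATION_THRESHOLD:
        return "duplication"
    return "no change"
```

For example, the log2 ratios of 0.45 and -0.53 reported for patient 12 in Figure 1 would be labelled "duplication" and "deletion" respectively. The idealised values for three copies and one copy are about 0.585 and -1.0, so observed ratios are typically compressed towards zero.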
For MLPA, the SALSA MLPA P114 LQT kit (lot 0805) was purchased from MRC-Holland (Amsterdam, Netherlands). This mix contains probes for 17 exons of the KCNQ1 gene, 9 probes for the KCNH2 gene, 4 probes for the SCN5A gene, as well as 4 and 3 probes for KCNE1 and KCNE2, respectively. This kit also contains four control probes mapping to other autosomes. MLPA analysis was carried out according to the MRC-Holland protocol. Briefly, 125 ng of genomic DNA from each sample was diluted in 5 µL TE buffer and denatured at 98 °C for 5 minutes. MLPA buffer and probe mix (1.5 µL of each) were then added to allow the probes to anneal to their target sequences by heating at 95 °C for one minute and incubating for 16 hours at 60 °C. A buffer/ligase mixture (32 µL) was added to each sample and incubated at 54 °C for 15 minutes, followed by heating to 98 °C for 5 minutes. Ten microlitres of the ligation reaction were used for multiplex PCR amplification using a single universal primer pair suitable for all the probes in the kit. The SALSA polymerase was added at 60 °C, followed by 36 cycles of 95 °C for 30 seconds, 60 °C for 30 seconds, and 72 °C for one minute, and a final extension step of 72 °C for 20 minutes. One microlitre of each PCR product was mixed with 0.5 µL GeneScan 600 Liz size standard (Applied Biosystems, Ltd.) and 8.5 µL of deionised formamide, and 1 µL was injected into a 36 cm capillary (Applied Biosystems model 3130xl) at 60 °C. The electropherogram was analysed using GeneMapper software (Applied Biosystems Ltd.). For each sample, the relative peak area (RPA) was calculated and compared to 5 healthy controls using custom-designed software. The software calculates RPAs for each probe within the same test and compares each RPA to those obtained from the 5 controls.
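The custom software is not described further, so the following is only a minimal sketch of one common way to carry out such a comparison. It assumes that the RPA is each probe's peak area divided by the summed area of all peaks in the same sample, and that the comparison is a simple ratio to the mean control RPA. The function names and the probe labels in the usage note are hypothetical.

```python
from statistics import mean


def relative_peak_areas(peak_areas: dict[str, float]) -> dict[str, float]:
    """Normalise each probe's peak area to the total area in the same sample."""
    total = sum(peak_areas.values())
    return {probe: area / total for probe, area in peak_areas.items()}


def dosage_quotients(
    sample: dict[str, float], controls: list[dict[str, float]]
) -> dict[str, float]:
    """Ratio of the sample RPA to the mean control RPA, per probe.

    A quotient near 1.0 suggests normal dosage; values well above or
    below 1.0 suggest a gain or a loss. Cut-offs are left to the caller
    because the text does not state which were used.
    """
    sample_rpa = relative_peak_areas(sample)
    control_rpas = [relative_peak_areas(control) for control in controls]
    return {
        probe: rpa / mean(control[probe] for control in control_rpas)
        for probe, rpa in sample_rpa.items()
    }
```

With five identical hypothetical controls of 100 area units per probe and a sample whose `exon10` peak is 150, the quotient for `exon10` is about 1.33 while the unaffected probes fall to about 0.89. This illustrates why normalising to the total peak area slightly depresses the unaffected probes when one probe is gained.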

Results
We developed a custom-designed NimbleGen 12x135K CGH array that combines targeted high-density coverage of 66 genes of interest with genome-wide coverage to produce a low-resolution molecular karyotype. For the validation of this array we analysed 20 patients with known copy number abnormalities. The custom-designed NimbleGen CGH array was able to accurately identify these copy number changes in all 20 patients [Table 3].
The array results for patient 12 revealed an additional alteration that had not been recognised previously. Patient 12 is a member of a large pedigree with multiple members suffering from long QT syndrome (LQTS). Analysis using the MRC-Holland SALSA P114 LQT MLPA kit, which interrogates a limited number of exons of the KCNH2 gene (exons 1-4, 6, 7, 10, 11, 15), had identified a duplication of exons 10, 11, and 15 in all affected individuals [Figure 1, panel A]. 16 This duplication had therefore been the focus of predictive testing using MLPA for additional at-risk members of the family. The aCGH results clarified the extent of the duplication, showing not only that it involved a breakpoint within exon 7 and encompassed the whole of exons 8, 9, 10, 11, 14 and 15, but also that the genotype was more complex than previously thought. A critical micro-deletion encompassing exons 12 and 13 was detected [Figure 1, panels B and C].
Of the 8 patients who had not yet undergone any form of copy number analysis, 4 had a clinical diagnosis that had not been confirmed by sequence analysis of the implicated genes: two had a diagnosis of long QT syndrome, one of hereditary non-polyposis colorectal cancer (HNPCC), and one of maturity onset diabetes of the young (MODY). No copy number changes were identified in the panel of long QT syndrome genes in either of the long QT patients, nor within the MODY genes in the MODY patient. However, a large deletion involving exons 2-14 inclusive of the MSH2 gene was detected in the individual with a clinical diagnosis of HNPCC. Mutations in the mismatch repair gene MSH2 are known to be responsible for 40% of cases of HNPCC; 20% of these mutations involve exonic or full gene deletions. 17
The referral reason for aCGH analysis for the remaining 4 individuals without a known copy number change was to provide additional information for genetic counselling and family planning. Individuals 21 and 22 are the parents of patient 9, an eight-year-old girl with mild dysmorphic features and speech delay, who had been found to have a duplication involving the Williams-Beuren syndrome (WBS) critical region at 7q11.23 using an Affymetrix single nucleotide polymorphism (SNP) 6.0 array (Affymetrix, Santa Clara, California, USA). While a microdeletion of the WBS critical region results in a well-characterised pattern of facial dysmorphism, supravalvular aortic stenosis, connective tissue abnormalities, hypercalcaemia, and a recognisable behavioural phenotype, duplication of the same region results in a much less distinctive set of characteristics. 18 Foremost among these, as was seen in our patient, are mildly dysmorphic facial features and prominent speech delay. Parental transmission of the 7q11.23 duplication is relatively frequent in the WBS duplication syndrome, but reduced penetrance and variable expression mean that determination of carrier status based on phenotype alone is not simple. An approximately 1.5 Mb duplication of the WBS critical region was readily identified in the affected girl by our custom-designed NimbleGen CGH array, which agreed with the earlier Affymetrix SNP 6.0 array data, but was not detected in either of her parents. We therefore conclude that the genomic copy number change detected in patient 9 is a de novo event and that future pregnancies are not at high risk of this mutation event.
Individuals 23 and 24 are the parents of patient 7, a six-year-old boy who was referred for investigation of developmental delay and features consistent with autistic spectrum disorder. High-density Affymetrix SNP 6.0 microarray analysis had revealed several copy number changes in the child, including a deletion at chromosome 2p21, a duplication at chromosome 15q14, and a deletion at chromosome 16p11.2 (see Table 3 for full coordinates). Each of these changes was also identified by our NimbleGen custom CGH array, with only minor differences in breakpoint location, despite the difference in probe density [Table 3]. The 16p11.2 deletion is consistent with the phenotypic features in this case, as dosage changes at 16p11.2 have been described in association with autistic spectrum disorder. 19 The aCGH results confirmed that the chromosome 16p11.2 deletion is de novo and that each of the other two copy number changes is most likely to be benign, as each is inherited from one of his parents.
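The trio comparison used in these family studies can be sketched as an interval-overlap check. This is an illustrative sketch rather than the procedure used in the laboratory: the `CNV` type, the function names, and the 50% reciprocal-overlap cut-off are all assumptions.

```python
from typing import NamedTuple


class CNV(NamedTuple):
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    kind: str   # "deletion" or "duplication"


def reciprocal_overlap(a: CNV, b: CNV) -> float:
    """Smaller of the two fractions of each call covered by the other."""
    if a.chrom != b.chrom or a.kind != b.kind:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def classify_inheritance(
    child: CNV, mother: list[CNV], father: list[CNV], min_overlap: float = 0.5
) -> str:
    """Label a child's CNV by whether a matching call exists in either parent."""
    in_mother = any(reciprocal_overlap(child, c) >= min_overlap for c in mother)
    in_father = any(reciprocal_overlap(child, c) >= min_overlap for c in father)
    if in_mother and in_father:
        return "inherited (both parents)"
    if in_mother:
        return "maternal"
    if in_father:
        return "paternal"
    return "apparently de novo"
```

A call found in neither parent is labelled "apparently de novo" because an array comparison alone cannot exclude low-level parental mosaicism or confirm biological parentage.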

Discussion
The purpose of the work described above was to design and validate a CGH array that could be used as an alternative to MLPA, quantitative PCR, and customised FISH in the diagnostic genetics laboratory. [22][23][24][25] The array design we report here is ideally suited to a small diagnostic laboratory. It enables the simultaneous interrogation of a large number of genes using a process that eliminates the risk of false negatives inherent in PCR-based techniques due to the possibility of polymorphisms lying under primer binding sites. Twelve patient samples can be tested at once, reducing the overall cost of the assay. The overlapping probes tile the exons at a high density and allow changes involving the coding regions of the gene(s) of interest, including single exon changes, to be readily and reliably detected. This design feature is in contrast to some previously reported designs which could not reliably detect single exon changes due to insufficient probe coverage over affected regions. 23 The intron probes enable clarification of breakpoints, which is not possible with MLPA or qPCR, and the backbone probes facilitate the identification of larger genomic rearrangements, either as confirmation following high-density molecular karyotyping, or for carrier testing and family studies.

Conclusion
We have shown that our custom-designed NimbleGen CGH array can be used to accurately identify exonic deletions and duplications in a gene set of interest as well as offer a low resolution whole genome screen for larger genomic rearrangements. The technique is robust and cost-effective and allows for comprehensive analysis. This approach overcomes the problems associated with the use of expensive kits in the context of low sample throughput, and allows for consolidation of dosage analysis assays to a single validated technique.

Figure 1: Graphic representation of copy number changes in the KCNH2 gene in patient 12. (A) Dosage changes were detected using a multiplex ligation-dependent probe amplification (MLPA) approach. The graphic representation shows the increased dosage detected by probes that lie in exons 10, 11, and 15 of the KCNH2 gene. (B) Dosage changes were detected in the KCNH2 gene with a copy number gain (X3 copy number) defined by the chromosome 7 coordinates (NCBI36/hg18 assembly) 150,276,456-150,279,665 bp (within exon 7 to within exon 11, log2 ratio 0.45) and 150,250,593-150,275,172 bp (encompassing exons 14 and 15, log2 ratio 0.5), and an apparent 676 bp deletion (X1 copy number, log2 ratio -0.53) located at 150,275,345-150,276,020 bp (encompassing exons 12 and 13). (C) Transcripts expressed from the KCNH2 gene are shown, together with the distal exons of transcript 1 of the KCNH2 gene (RefSeq accession number NM_000238.3).

In patient 12, the micro-deletion encompassing exons 12 and 13 of the KCNH2 gene was detected by aCGH [Figure 1, panels B and C]. PCR and DNA sequencing determined the exact breakpoints of the 1041 bp deletion, the length of which compares favourably to the 676 bp copy number change detected by the array [Figure 2].
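The two deletion sizes follow directly from the hg18 coordinates given in the captions of Figures 1 and 2, as the short check below shows. The helper names are hypothetical and the coordinates are treated as 1-based and inclusive, which is an assumption consistent with the reported lengths.

```python
def interval_length(start: int, end: int) -> int:
    """Length of an inclusive interval, whichever way round it is written."""
    return abs(end - start) + 1


def contains(outer: tuple[int, int], inner: tuple[int, int]) -> bool:
    """True if the inner interval lies entirely within the outer one."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


# Chromosome 7 coordinates (NCBI36/hg18) quoted in the figure captions.
ARRAY_DELETION = (150_275_345, 150_276_020)      # aCGH call, Figure 1
SEQUENCED_DELETION = (150_275_335, 150_276_375)  # sequence-confirmed, Figure 2

array_bp = interval_length(*ARRAY_DELETION)          # 676
sequenced_bp = interval_length(*SEQUENCED_DELETION)  # 1041
array_call_inside = contains(SEQUENCED_DELETION, ARRAY_DELETION)
```

The array call lies entirely within the sequence-confirmed deletion. This is the pattern one would expect if an array can only place breakpoints at the outermost probes showing a reduced signal, although the text does not state this explicitly.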

Figure 2: Location and extent of the KCNH2 gene deletion in patient 12. A partial sequence of the KCNH2 gene is shown that encompasses exons 11 to 13, inclusive (in blue). The sequence-confirmed location and extent of the 1041 bp deletion detected in the genome of patient 12 is highlighted in yellow (chromosome 7: 150,276,375-150,275,335 bp; NCBI36/hg18 assembly).

Table 1: Copy number changes used to validate the Roche NimbleGen custom-designed comparative genomic hybridisation array

Table 2: Human disease genes selected for inclusion on the Roche NimbleGen custom-designed comparative genomic hybridisation array