High-resolution FISH on DNA fibers for low-copy repeats genome architecture studies

Low-copy repeats (LCRs) constitute 5% of the human genome. LCRs act as substrates for non-allelic homologous recombination (NAHR), leading to genomic structural variation. The aim of this study was to assess the potential of Fiber-FISH for the direct visualization of LCRs, to support investigations of genome architecture within these challenging genomic regions. We describe a set of Fiber-FISH experiments designed for the study of the LCR22-2. This LCR is involved in recurrent reorganizations causing different genomic disorders. Four fosmid clones covering the entire length of the LCR22-2 and two single-copy BAC clones, delimiting the LCR22-2 proximally and distally, were selected. The probes were hybridized in different multicolor combinations on DNA fibers from two karyotypically normal cell lines. We were able to identify three distinct structural haplotypes characterized by differences in copy number and arrangement of the LCR22-2 genes and pseudogenes. Our results show that multicolor Fiber-FISH is a viable methodological approach for the analysis of genome organization within complex LCR regions.


Introduction
It has been estimated that 5% of the human genome consists of segmental duplications or Low Copy Repeats (LCRs) [1]. LCRs are repetitive DNA elements from 1 to 400 Kb in length sharing a high level of sequence identity (>95%) [2]. Due to the high degree of homology between paralogous copies, they are considered highly dynamic regions leading to genomic instability by non-allelic homologous recombination (NAHR) [3,4].
As a result, LCRs are susceptible to structural and copy-number variation of their own genes and pseudogenes. This variation has been directly associated with the occurrence of some diseases [5][6][7] and with the formation of specific structural haplotypes, which have been linked to an increased susceptibility to secondary rearrangements of the region flanked by the LCR [8]. Rearrangements may be either somatic, causing sporadic disease in the individual, or in the germ line, leading to an increase in the risk of transmission to the offspring [8].
Some LCR haplotypes have been suggested to predispose to specific chromosomal rearrangements: 1) copy number variation (CNV) within some portions of the LCRs flanking the 7q11.23 region has been linked to the occurrence of deletions of the Williams-Beuren syndrome critical region [9], 2) variation in the copy-number and arrangement of the simple LCRs REPA and REPB at the chromosome region 17p11 is believed to confer different susceptibility to the formation of 17p isochromosomes [10], and 3) CNV of duplicated blocks within the BP1 and BP3 at 16p12.1 has recently been reported to predispose to deletions of this region [11].
Despite the recent advances in copy-number and structural variation detection [12], the repeated nature and often complex organization of LCRs hamper their analysis by standard methodologies such as array comparative genomic hybridization (array-CGH) and single nucleotide polymorphism (SNP) microarrays. Next-generation sequencing techniques have been successfully used to analyze the CNV of the whole genome [13], as well as of specific LCRs; this is the case of the LCR22 of the 22q11.2 region [14]. PCR-based techniques have also allowed the quantification of the number of repeats shaping specific LCRs (7q11.23; [9]).
Fluorescence in situ hybridization (FISH) provides an alternative approach for the analysis of the genomic architecture of LCRs. By enabling the direct visualization of target DNA sequences in situ, FISH not only allows copy number assessment, but also facilitates the identification of balanced structural variants such as inversions and translocations. In particular, FISH on stretched DNA fibers (Fiber-FISH) with closely spaced probes has been successfully applied in several high-resolution physical mapping studies [15][16][17] and as a validation technique in CNV studies [18][19][20][21][22][23]. Moreover, it has also been used to assess the number of paralogous copies of the simple LCRs REPA and REPB on the 17p11.2 region [24].
The pericentromeric area of chromosome 22 contains its own LCRs (LCR22). These LCRs are involved in recurrent reorganizations causing different genomic disorders [25]. Among them, DiGeorge/Velocardiofacial syndrome (DGS) is the most common deletion-caused syndrome in humans, with an incidence of 1 in every 4,000 newborns (OMIM 188400) [26]. DGS is mostly caused by 3 Mb hemizygous deletions involving the flanking LCR22-2 and LCR22-4. These LCRs are complex mosaics of genes, pseudogenes and other repetitive elements, partially formed by Alu-mediated recombination events during primate evolution [27]. The functional genes distributed along the LCR22s are USP18, BCR, GGT5 and GGT1. Duplications of these genes and their own pseudogenes during evolution shaped the LCR22s [28].
Genomics 100 (2012) 380-386
In this work, we applied Fiber-FISH to determine structural and copy number variants within the LCR22-2. The main objective of the study was to assess the ability of Fiber-FISH, as a high-resolution mapping technique, to resolve the genomic architecture of complex LCRs, and to establish its potential as a methodological approach to assess risk haplotypes for critical regions.

Experimental design
A set of Fiber-FISH experiments was performed to establish the LCR22-2 genomic architecture in two karyotypically normal cell lines (see section 4: Materials and methods). DNA fibers were stretched on slides as previously described [30]. The experiments were designed as follows:

Unequivocal identification of the specific LCR22-2 signals
To distinguish the specific signals of LCR22-2 from paralogous copies distributed in other LCRs on 22q, the control probes F9 (mapping just outside the LCR22-2, proximally) and A10 (mapping just outside the LCR22-2, distally) were co-hybridized with the fosmid clone K3 in two different FISH experiments. This allowed the identification of patterns, based on the number of K3 repetitions, to be used as an LCR22-2 reference in the following hybridizations.

Determination of the LCR22-2 architecture
Once the number of K3 copies was established, three dual-color Fiber-FISH experiments were performed by co-hybridizing K3 and L9, K3 and B22, and B22 and L21. These high-resolution mapping experiments allowed the assessment of the copy number and relative arrangement of the LCR22-2 genes.

Structure of the LCR22-2 in cell line A
A total of 81 informative fiber-FISH images were captured and analyzed to study the organization of the LCR22-2 in cell line A.
• F9 and K3: A larger signal corresponding to F9 followed/preceded by a consistent pattern of five repeats for K3 was observed. F9 was either separated from (55% of the fibers; Fig. 3a) or overlapping with (45% of the fibers; Fig. 3b) the first/fifth K3 repeat (20 informative fibers were analyzed).
• A10 and K3: A larger signal corresponding to A10 followed/preceded by a consistent pattern of five repeats for K3 was observed. A10 was either separated from (62%; Fig. 3c) or overlapping with (38%; Fig. 3d) the first/fifth K3 repeat (13 informative fibers were analyzed).
• K3 and L9: K3 displayed the same pattern previously described. Five signals were identified for L9, totally or partially overlapping with K3 (Fig. 3e) (17 informative fibers were analyzed).
• K3 and B22: Overlapping or partially overlapping signals from these two probes were observed. Results showed two different patterns for B22, with either two or three signals (52.4% and 47.6% respectively; Figs. 3f and g) (21 informative fibers were analyzed).
• B22 and L21: Two or three signals for B22 (60% and 40% respectively) were observed followed/preceded by one signal for L21 (Figs. 3h and i) (10 informative fibers were analyzed).
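The percentages above can be cross-checked against the reported fiber totals. The following is a minimal sketch, not part of the original analysis: the integer fiber counts per pattern are not stated in the text and are inferred here by rounding, and all names (`EXPERIMENTS`, `inferred_counts`) are illustrative.

```python
# Reported fiber totals and pattern percentages for cell line A.
# ASSUMPTION: per-pattern integer counts are not given in the text;
# they are inferred here by rounding total * percentage.
EXPERIMENTS = {
    "F9+K3 (separated / overlapping)": (20, [55.0, 45.0]),
    "A10+K3 (separated / overlapping)": (13, [62.0, 38.0]),
    "K3+B22 (two / three B22 signals)": (21, [52.4, 47.6]),
    "B22+L21 (two / three B22 signals)": (10, [60.0, 40.0]),
}
K3_L9_FIBERS = 17  # single pattern reported, no percentages to check


def inferred_counts(total, percentages):
    """Return integer fiber counts consistent with the reported percentages."""
    counts = [round(total * p / 100) for p in percentages]
    if sum(counts) != total:
        raise ValueError(f"counts {counts} do not add up to {total}")
    return counts


per_experiment = {
    name: inferred_counts(total, pcts)
    for name, (total, pcts) in EXPERIMENTS.items()
}
total_fibers = sum(total for total, _ in EXPERIMENTS.values()) + K3_L9_FIBERS

for name, counts in per_experiment.items():
    print(name, counts)
print("total informative fibers:", total_fibers)
```

Under this rounding assumption, every reported percentage pair maps to whole fibers, and the per-experiment totals add up to the 81 informative images reported for cell line A.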
To determine whether an inversion was the cause of the two different signal patterns or "Fiber-FISH haplotypes" observed when co-hybridizing the F9 and K3 clones (Figs. 3a and b) and the A10 and K3 clones (Figs. 3c and d), an additional three-color Fiber-FISH was performed using the probes F9, K3 and L21. Two different signal patterns with similar frequencies were observed: 1) F9 followed by K3 and L21 (42.8%; Fig. 4a), and 2) F9 followed by L21 and K3 (57.1%; Fig. 4b). These results strongly suggest the presence of an inversion involving most of the LCR22-2 in one of the two chromosome 22 homologs. In order to relate the number of B22 signals (Figs. 3f and g) to the inversion, a further three-color Fiber-FISH experiment was performed using the clones F9, B22 and L21. Two clone distributions were observed: 1) F9, L21 and two signals of B22 (38%; Fig. 4c), and 2) F9, B22 (three signals) and L21 (62%; Fig. 4d). These results suggest that the inversion segregates with the haplotype showing two signals for B22.
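The logic used to read the three-color F9, K3 and L21 fibers can be illustrated with a short sketch. This is not the authors' procedure, which was based on visual inspection of images; the list-of-probe-names representation and the function names are hypothetical. It assumes that a fiber may be read from either end, so each fiber is first oriented to start at the proximal marker F9.

```python
def collapse(signals):
    """Collapse consecutive repeats of the same probe (e.g. five K3 signals)."""
    order = []
    for probe in signals:
        if not order or order[-1] != probe:
            order.append(probe)
    return order


def classify(signals):
    """Classify a three-color fiber by the relative order of F9, K3 and L21.

    ASSUMPTION: 'signals' is a hypothetical list of probe names read along
    the fiber; the direction of reading is arbitrary.
    """
    order = collapse(signals)
    if order and order[-1] == "F9":
        order = order[::-1]  # orient the fiber so that F9 comes first
    if order == ["F9", "K3", "L21"]:
        return "reference"   # F9 followed by K3 and L21
    if order == ["F9", "L21", "K3"]:
        return "inverted"    # L21 now lies next to F9
    return "uninformative"


print(classify(["F9"] + ["K3"] * 5 + ["L21"]))
print(classify(["K3"] * 5 + ["L21", "F9"]))
```

The second call shows a fiber read in the opposite direction: after orientation its order is F9, L21, K3, which corresponds to the pattern interpreted in the text as an inversion.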

Structure of the LCR22-2 in cell line B
A total of 87 informative images were analyzed using the following two-color Fiber-FISH experiments.
• F9 and K3: As observed in cell line A, five contiguous signals for K3 were observed, all of them separated from the F9 signal (Fig. 5a) (18 informative fibers were analyzed).
• A10 and K3: As in cell line A, five contiguous signals for K3 were observed, all of them separated from a longer A10 signal (Fig. 5b) (18 informative fibers were analyzed).
• K3 and L9: As in cell line A, five contiguous signals for K3 and five partially overlapping L9 signals were detected (Fig. 5c) (20 informative fibers were analyzed).
• K3 and B22: Two signals for B22 were consistently found on the second and third/third and fourth K3 signals (Fig. 5d) (14 informative fibers were analyzed).
• B22 and L21: Two signals for B22 were observed followed/preceded by one signal for L21 (Fig. 5e) (17 informative fibers were analyzed).
These results allowed us to propose a model for the architecture of the LCR22-2 in cell lines A and B (Fig. 6).

Discussion
This work demonstrates for the first time the ability of Fiber-FISH, coupled with an accurate experimental design, to resolve the genomic architecture of complex LCRs. The strategy used in our study allows the identification of structural variation of different segments within the LCRs. The design consists of: 1) selecting specific clones covering the LCR, 2) selecting chromosomal markers to be used as positional references to facilitate the unequivocal identification of the LCR under investigation, and 3) developing and applying strict assessment criteria for the analysis of the Fiber-FISH hybridization patterns.
By direct visualization of haplotypic repeat patterns, Fiber-FISH allows both inter-chromosomal and inter-individual variability to be reliably ascertained, as our results on the LCR22-2 show (Fig. 6). Our observations suggest an arrangement of the LCR22-2 comprising five copies of the L9 and the K3, and either two or three copies of the B22, all of them closely localized to each other and repeated in a modular fashion (Fig. 6). Furthermore, we observed an inverted haplotype. The inversion involves most of the LCR22-2; accordingly, the L21 was positioned close to the clone F9. In addition, the three-color combinations of probes, 1) F9, K3 and L21, and 2) F9, B22 and L21, confirm that the signals analyzed unequivocally identify the LCR22-2.
Some LCR structural haplotypes have been suggested to increase the likelihood of misalignment and NAHR, thus increasing the risk of transmission of secondary disease-associated rearrangements to the offspring [9][10][11]. Moreover, some data demonstrated a different susceptibility to NAHR among individuals. Our group has recently reported increased rates of deletions of the 15q11-q13 region in spermatozoa of fathers of children affected by Prader-Willi syndrome [31], as well as increased rates of 7q11.23 and 22q11.2 deletions in spermatozoa of fathers of Williams-Beuren or DiGeorge/Velocardiofacial children respectively (unpublished data), thus suggesting the presence of predisposing haplotypes to NAHR in these subjects.
[Figure caption] a) Co-hybridization of F9 (red), K3 (green) and L21 (purple), signal distribution following the current human genome assembly; b) co-hybridization of F9 (red), K3 (green) and L21 (purple), signal distribution corresponding to an inversion regarding the current human genome assembly; c) co-hybridization of F9 (green), B22 (purple) and L21 (red), signal distribution following the current human genome assembly; d) co-hybridization of F9 (green), B22 (purple) and L21 (red), signal distribution corresponding to an inversion regarding the current human genome assembly.
The results obtained in this work demonstrate the potential of the Fiber-FISH methodology for the identification of predisposing LCR haplotypes in the flanking critical regions in parents of individuals affected by genomic disorders. This would allow the establishment of a direct relationship between specific LCR structural haplotypes and increased rates of NAHR in gametes.

Conclusion
Multicolor Fiber-FISH is a viable methodological approach for the analysis of genome organization within complex LCR regions.

Cell culture
Two karyotypically normal B-lymphoblastoid cell lines were used: 1) GM0171 (Human Genetics Collection, Health Protection Agency (HPA), U.K.; no longer available) and 2) DO208915 (European Collection of Cell Cultures, HPA), referred to in the manuscript as cell lines A and B respectively. Both cell lines harbor the VCFS/DiGeorge region, since they showed a normal hybridization pattern with A10 (a clone localized within the critical region).

Slide preparation
Two kinds of preparations were performed: 4.2.1) Metaphase chromosomes were obtained following standard procedures: 1 hour before harvesting, cells were treated with Colcemid (Invitrogen) at a final concentration of 0.2 μg/mL. They were then resuspended in hypotonic solution (0.075 M KCl) for 10 minutes at 37°C and fixed in methanol:acetic acid (3:1). 4.2.2) DNA fibers were stretched on slides as previously described [30].
Briefly, 2 mL of a cell culture were centrifuged and the pellets were washed in 1× PBS. Pellets were resuspended in 1× PBS to reach a final concentration of 2×10^6 cells/mL and spread on slides. Once the slides were mounted on the Shandon Sequenza Coverplates, DNA fibers were released by applying a lysis solution (0.07 M NaOH in ethanol). Finally, fibers were fixed in methanol.
In both cases, slides were kept at −20°C until processed.

Probes
All clones were kindly provided by the Wellcome Trust Sanger Institute (Cambridge, UK). Clone extraction was carried out using the QuickClean 5M Miniprep kit (GenScript) following the manufacturer's instructions.

Fluorescence in situ hybridization (FISH)
In two-color FISH experiments, clones were labeled by Nick-Translation (Abbott Molecular) either with Digoxigenin-11-dUTP (Roche) or Biotin-16-dUTP (Roche) (Table 1). In three-color experiments, Alexa594-dUTP (Invitrogen) was also used. Probes were ethanol precipitated with a mix of salmon testis DNA (GIBCO-BRL), Escherichia coli tRNA (Boehringer) and 3 M sodium acetate. Approximately 200 ng of labeled DNA probe and 4 μg of Cot1 competitor DNA (Invitrogen) were mixed and dried on a heating block at 60°C, and resuspended in 1× hybridization buffer (50% formamide, 1× SSC and 10% dextran sulfate) to a final concentration of 40 ng/μL. For Fiber-FISH experiments, twice the amount of labeled DNA probe and Cot1 competitor DNA was used (final concentration of 80 ng/μL of each probe).
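The probe quantities and concentrations given above are mutually consistent, as the following arithmetic sketch shows. The resuspension volume is not stated in the text; it is inferred here from the 200 ng and 40 ng/μL figures, and the assumption that Fiber-FISH probes were resuspended in the same volume is ours.

```python
def resuspension_volume_ul(probe_ng, final_ng_per_ul):
    """Buffer volume (in microliters) needed to reach a target concentration."""
    return probe_ng / final_ng_per_ul


# Standard FISH: ~200 ng of labeled probe at 40 ng/uL.
STANDARD_PROBE_NG = 200
STANDARD_COT1_UG = 4
standard_volume_ul = resuspension_volume_ul(STANDARD_PROBE_NG, 40)

# Fiber-FISH: twice the probe and Cot1 DNA.
# ASSUMPTION: same resuspension volume as in standard FISH (not stated).
fiber_probe_ng = 2 * STANDARD_PROBE_NG
fiber_cot1_ug = 2 * STANDARD_COT1_UG
fiber_conc_ng_per_ul = fiber_probe_ng / standard_volume_ul

print(f"inferred resuspension volume: {standard_volume_ul} uL")
print(f"Fiber-FISH probe: {fiber_probe_ng} ng, Cot1: {fiber_cot1_ug} ug")
print(f"Fiber-FISH probe concentration: {fiber_conc_ng_per_ul} ng/uL")
```

Doubling the probe mass in an unchanged volume reproduces the 80 ng/μL reported for Fiber-FISH.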
FISH was carried out following standard procedures [30]. Briefly, probes were denatured at 75°C for 5 minutes and pre-annealed at 37°C for 45 minutes. Slides were denatured in 70% formamide/2× SSC at 70°C for 1 minute and hybridized in a moist chamber at 37°C overnight. Slides were washed twice in 50% formamide/1× SSC and once in 2× SSC, for 5 minutes at 42°C, followed by 5 minutes in 1× PBS at room temperature. For fiber-FISH experiments milder washes were used: one wash in 50% formamide/1× SSC, followed by one wash in 2× SSC, both of them for 5 minutes at 42°C.

Image acquisition and data analyses
Image capture and analysis were carried out on a CytoVision system (Leica) consisting of an Olympus BX-51 epifluorescence microscope coupled to a JAI CVM4+ CCD camera.
Fiber-FISH analysis was performed by applying the following scoring criteria:
• Fibers were considered informative when at least two signals of different colors were observed overlapping or proximal in a consecutive fashion.
• Two or more signals of the same color were considered independent when they were separated by a distance twice the distance of every single bead-on-string.
• Signals were considered informative regardless of the size.

Authors' contributions
O.M. was responsible for conception and design, acquisition of data, data analysis and interpretation, writing the article and final approval. J.B. was responsible for conception and design, data analysis and interpretation, writing the article and final approval. E.A. was responsible for revision of data and interpretation and final approval. F.V. was responsible for revision of data and interpretation and final approval. E.V.V. was responsible for conception and design, data analysis and interpretation, and final approval.
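The scoring criteria can be expressed as a small sketch. This is an illustration only, since scoring in the study was done on images: the `Signal` representation, the position units and the `bead_spacing` parameter are hypothetical, and the independence rule is interpreted here as a gap of at least twice the typical spacing between beads within one signal. The requirement that differently colored signals be overlapping or proximal is not modeled.

```python
from dataclasses import dataclass


@dataclass
class Signal:
    color: str
    start: float  # position along the fiber, arbitrary units (hypothetical)
    end: float


def is_informative(signals):
    """A fiber needs at least two signals of different colors."""
    return len({s.color for s in signals}) >= 2


def count_independent(signals, color, bead_spacing):
    """Count same-color signals separated by >= 2x the bead-to-bead spacing.

    ASSUMPTION: this threshold is one reading of the published criterion.
    Signal size is deliberately ignored, as in the third criterion.
    """
    same = sorted((s for s in signals if s.color == color), key=lambda s: s.start)
    if not same:
        return 0
    count = 1
    previous_end = same[0].end
    for s in same[1:]:
        if s.start - previous_end >= 2 * bead_spacing:
            count += 1  # gap is large enough: a separate copy
        previous_end = max(previous_end, s.end)
    return count


fiber = [
    Signal("green", 0.0, 1.0),
    Signal("green", 1.5, 2.5),  # gap 0.5: same copy as the previous signal
    Signal("green", 5.0, 6.0),  # gap 2.5: independent copy
    Signal("red", 7.0, 9.0),
]
print(is_informative(fiber), count_independent(fiber, "green", bead_spacing=0.5))
```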