Alternative Splicing of Intron 23 of the Human Cystic Fibrosis Transmembrane Conductance Regulator Gene Resulting in a Novel Exon and Transcript Coding for a Shortened Intracytoplasmic C Terminus*

The cystic fibrosis transmembrane conductance regulator (CFTR) gene, the gene responsible for the lethal hereditary disorder cystic fibrosis, codes for a membrane protein functioning as a cAMP-regulated Cl- channel. Evaluation of human CFTR mRNA transcripts from epithelial and nonepithelial cells demonstrated a CFTR cDNA containing a 260-base pair (bp) insertion between the known CFTR exons 23 and 24, introducing a premature stop codon that would result in a CFTR protein shortened by 61 amino acids at the carboxyl terminus compared to that expected from the normal reported human CFTR coding sequences. Sequence analysis of intron 23 of the CFTR gene demonstrated that the 260-bp insertion (named exon 24a), a part of the reported intron 23 located immediately 5' to exon 24, is likely generated by an alternative splice acceptor site. The exon 24a+ CFTR mRNA transcripts represented 3-16% of the total CFTR transcripts in epithelial and nonepithelial cells. These observations suggest an unexpected plasticity of expression of the CFTR gene, where alternative splicing of precursor CFTR mRNA transcripts permits the use of an alternative exon derived from a genomic segment previously believed to function as an intron.

* The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked "advertisement" in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
The nucleotide sequence(s) reported in this paper has been submitted under accession number M96936.
Although the disease cystic fibrosis manifests clinically at a limited number of epithelial surfaces (1), expression of the CFTR gene is widespread (13) but very low (13-15). Expression of the CFTR gene can be modulated at several levels (14, 16-18). The 5'-flanking region has the characteristics of a promoter of a housekeeping gene supporting a relatively low rate of transcription (14, 19), but transcription can be regulated (14, 17, 18). CFTR mRNA transcripts have a half-life of 13-19 h, and CFTR mRNA stability can be altered (17).¹ Finally, CFTR mRNA transcripts can be alternatively spliced to remove exon 9 in phase, resulting in transcripts with a coding sequence shortened by 4.1% (20, 21).
The present study demonstrates additional, unexpected flexibility in CFTR gene expression: the alternative use of previously reported intron sequences as coding sequences to generate a novel exon (referred to as "exon 24a") and CFTR mRNA transcripts coding for a CFTR protein with all putative functional domains intact, but with a carboxyl-terminal intracytoplasmic tail of novel sequence that is 4.1% shorter than the CFTR protein predicted by the previously described 27-exon CFTR mRNA (3).

MATERIALS AND METHODS
Source of Cells and Cell Culture-All cells evaluated for expression of the CFTR gene were of human origin. The cell lines included the T84 and HT-29 human colon carcinoma cell lines (American Type Culture Collection (ATCC) CCL 248 and HTB 38, respectively), the HeLa cervical carcinoma cell line (ATCC CCL 2), and the U-937 histiocytic lymphoma cell line (ATCC CRL 1593). Human alveolar macrophages were obtained by bronchoalveolar lavage from normal nonsmoking volunteers (22). Neutrophils were purified from blood of a normal individual (23). All cultured cells except T84 cells were maintained in Dulbecco's modified Eagle's medium (Whittaker Bioproducts) supplemented with 10% fetal bovine serum, 2 mM glutamine, 50 units/ml penicillin, and 50 µg/ml streptomycin (all from Biofluids, Inc.). T84 cells were maintained in Dulbecco's modified Eagle's medium with 5% fetal bovine serum, 2 mM glutamine, 50 units/ml penicillin, and 50 µg/ml streptomycin. All experiments with adherent cell lines were carried out when the cells were 80-90% confluent. U-937 cells were evaluated during exponential growth. All cell populations were >95% viable as determined by trypan blue exclusion.
Evaluation of Exons 21-24 in CFTR mRNA Transcripts-Total cellular RNA was extracted from all cells by the guanidine thiocyanate-CsCl gradient method (24). For evaluation of CFTR mRNA transcripts in the region of exons 21-24, polymerase chain reaction (PCR) amplification of mRNA (after conversion to cDNA) and Southern hybridization were utilized (25, 26). Briefly, an equal amount of total RNA (5 µg) from each cell type was first incubated with Moloney murine leukemia virus reverse transcriptase (GIBCO/Bethesda Research Laboratories) and oligo(dT) primer to convert mRNA to cDNA (50 µl of total volume). PCR amplification was performed using 5 µl of each reverse transcription product as a template, Taq DNA polymerase (Perkin-Elmer/Cetus), and CFTR gene-specific primers in exon 21 (HCF12: 5'-AGTGGAGTGATCAAGAAATATGG-3') and exon 24 (HCF6: 5'-TCCACGAGCTCCAATTCCATGAGG-3') (see Fig. 1B). These primers were designed to amplify only CFTR mRNA (after conversion to cDNA) and to preclude amplification of potentially contaminating genomic DNA, which is too large (>15 kb in this region) to be amplified (2, 13-15). Following amplification, cDNA was subjected to agarose gel electrophoresis, transferred to nylon membranes (Nytran, Schleicher and Schuell), cross-linked by ultraviolet irradiation (Stratalinker, Stratagene), and hybridized with a nested CFTR cDNA probe labeled with [α-32P]dCTP (>3000 Ci/mmol, Amersham) by the random priming method (27). Hybridization and subsequent washing of the membranes were carried out as previously described (13, 14), and the membranes were evaluated by autoradiography.

¹ H. Nakamura, K. Yoshimura, G. Bajocchi, B. C. Trapnell, A. Pavirani, and R. G. Crystal, submitted for publication.
This is an Open Access article under the CC BY license.
Sequence of CFTR Transcripts-After initial studies demonstrated that the PCR product of CFTR mRNA transcripts in the region of exons 21-24 contained larger (approximately 860 bp) transcripts as well as the expected 603-bp transcripts (see Fig. 1C), the portion of an agarose gel corresponding to the larger transcripts of an alveolar macrophage sample was excised, and the DNA was purified. The DNA sample was used as a template for a second PCR amplification (30 cycles) with modified primers that are identical to HCF12 or HCF6 except for an additional XhoI or PstI restriction enzyme site at the 5'-end, respectively. After agarose gel electrophoresis, DNA at the expected position was recovered from the gel, purified, digested with XhoI and PstI, and inserted into the pBluescript II SK+ (pBS) vector at the corresponding cloning sites. The recombinant plasmid clone containing an approximately 860-bp insert (pPB254) was isolated, and the cloned fragment was sequenced by dideoxy chain termination (28). As detailed under "Results," the larger CFTR exon 21-24 transcript turned out to be identical to the published CFTR mRNA transcript (3), except for an additional 260-bp nucleotide sequence located in intron 23 (referred to as "exon 24a").
Northern Analysis for CFTR Transcripts Containing Exon 24a-Since two interpretations were possible regarding exon 24a (either a part of mature CFTR mRNA transcripts generated by alternative splicing or an intermediate product of a partially spliced intron 23), CFTR mRNA transcripts in total cellular RNA and poly(A)+ mRNA were evaluated by Northern analysis. Total RNA was extracted from T84 and HT-29 colon carcinoma cells as described above. Poly(A)+ RNA was further isolated from total RNA of both cell lines using an oligo(dT)-cellulose column (29). Total RNA (15 µg/each cell line) or poly(A)+ RNA (10 µg/each) was subjected to formaldehyde-agarose gel electrophoresis, transferred to the Nytran membranes, and hybridized with 32P-labeled probes generated by random priming. For evaluation of overall CFTR mRNA transcripts, a 4.5-kb cDNA probe was used (14, 16-18). For exon 24a+ CFTR transcripts, a nested 241-bp exon 24a cDNA probe was generated by PCR amplification using pPB254 as a template and two exon 24a-specific primers (CFEXXS1, 5'-TTAGTGAGATCTGGGACAGAAG-3') (see Fig. 3A). The proportion of exon 24a+ CFTR mRNA transcripts relative to the total CFTR mRNA transcripts was quantified by scanning the autoradiograms with an Ultroscan laser densitometer (Pharmacia LKB Biotechnology Inc.).
Evaluation of the CFTR Genome for the Source of Exon 24a-To evaluate the CFTR gene for sequences coding for exon 24a, the structure of intron 23 was evaluated. First, a 1.4-kb segment spanning exon 23 (partial), the whole of intron 23, and exon 24 (partial) of the CFTR gene was isolated (2, 30). Genomic DNA extracted from neutrophils of a normal individual was used as a template for PCR amplification (40 cycles) along with a set of primers with a XhoI or PstI restriction site at the 5'-end, respectively (HCF51X, 5'-CAGTTCTACTAAACCTCCCTGAAG-3'; CFEXXAS1, 5'-TGAGTCCTCGAGACAGTAATTCTCTGTGAACACAGG-3'; HCF72P, 5'-AGTCCTGCAGTCTCGTTCAGCAGTTTCTGGATGG-3'). Amplified DNA of the expected size was excised from an agarose gel and purified as described above, digested with both XhoI and PstI, and subcloned into pBS at the corresponding restriction sites. A recombinant plasmid clone containing the CFTR gene from exon 23 through 24 was isolated (pPB255) and evaluated by restriction endonuclease mapping and direct sequencing. Homology of the obtained DNA sequences to known sequences in GenBank was analyzed using the DNASIS program (Hitachi Software Engineering America). The sequences of splice sites around intron 23 and exon 24a were evaluated in comparison to known consensus motifs for splicing in vertebrate genes (31, 32).

RESULTS
Southern analysis of the PCR-amplified CFTR mRNA in the region of exons 21-24 showed the expected 603-bp segment (13-15) in T84 colon carcinoma cells, HeLa cervical carcinoma cells, and U-937 histiocytic lymphoma cells as well as in freshly isolated normal alveolar macrophages (Fig. 1). However, a higher molecular weight segment was also detected at approximately 860 bp in all cell types evaluated. The ratio of the higher molecular weight transcripts to total CFTR transcripts varied from 3 to 16%.
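The proportion reported above can be illustrated with a minimal sketch. It assumes that the two bands in a lane are quantified as integrated densitometric signals and that the exon 24a+ fraction is the upper-band signal divided by the sum of both bands. The function name `variant_fraction` and the example intensities are hypothetical; they are not data or procedures taken from this study.

```python
def variant_fraction(upper_band: float, lower_band: float) -> float:
    """Fraction of the total signal contributed by the upper (approximately 860-bp) band.

    Both arguments are integrated band intensities in the same arbitrary units.
    """
    total = upper_band + lower_band
    if total <= 0:
        raise ValueError("total signal must be positive")
    return upper_band / total


# Hypothetical band intensities (arbitrary units), not measured values.
example = variant_fraction(upper_band=8.0, lower_band=92.0)
print(f"{example:.1%}")
```

A ratio of this kind is only meaningful when both bands are detected by the same probe within the linear range of the film, which is an assumption of the sketch rather than a statement about the original measurements.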
Cloning of the cDNA of the higher molecular weight CFTR transcript and evaluation of its structure demonstrated that it represented an 863-nucleotide transcript with a 260-nucleotide additional sequence inserted between exons 23 and 24 (Fig. 2). The additional sequence in these CFTR transcripts was designated as exon 24a.
When the exon 24a-specific probe was used for Northern analysis with total cellular RNA and poly(A)+ RNA from T84 and HT-29 colon carcinoma cells, a distinct signal was observed in the poly(A)+-selected RNA fraction of both T84 and HT-29 cells (Fig. 3). The size of the signal detected by the exon 24a probe was slightly larger than the 6.5-kb transcripts observed in parallel hybridization of the RNA samples with a 4.5-kb cDNA. Although it is difficult to estimate the exact molecular weight of the larger transcripts because of compression in gel electrophoresis, the size of the signal with the exon 24a probe was in the range of 6.7-6.8 kb, consistent with that expected for CFTR transcripts with an additional 260-nucleotide segment. The same results were observed in two separate experiments.

FIG. 1. ... (3). Shown are the predicted domains of CFTR, including (amino- to carboxyl-terminal) membrane-spanning domains, nucleotide-binding fold 1, cytoplasmic R domain, membrane-spanning domains, and nucleotide-binding fold 2. Each domain is aligned above the mRNA exon sequences which encode the region. The translation start (ATG) and stop (TAG) codons are indicated in exons 1 and 24, respectively. The predicted intracytoplasmic "tail" of the protein is from residues T1387 to L1480. B, an enlarged region of exons 21-24 of CFTR mRNA transcripts. The stop codon (TAG) is in exon 24. This region (the mRNA segment from exon 21 through 24) was amplified (after conversion of mRNA to cDNA) using polymerase chain reaction with HCF12 and HCF6 primers as indicated. The expected size of the PCR product (603 bp) is indicated, as is the nested "exon 22-23" probe used for detection.
FIG. 2. ... The double-stranded DNA of plasmid pPB254 was analyzed by the dideoxy chain termination method. The sequence of exon 23 is identical to that reported for the normal CFTR gene (3). C, sequence of the junction of exons 24a-24. The sequence of exon 24a is identical to that of the 3'-end of intron 23 reported for the CFTR gene (30), and the sequence of exon 24 is identical to the reported sequence of this exon (3).
Analysis of the structure of intron 23 revealed a 1343-bp nucleotide sequence between the reported exons 23 and 24 (Fig. 4). The sequences close to either exon 23 or 24 were identical to the previously reported partial sequences of the 5'- or 3'-end of intron 23, respectively (Fig. 4B) (30). Surprisingly, the 260 nucleotides of exon 24a were located 5' to exon 24 in a contiguous fashion, i.e. the 3' part of its sequence (189 nucleotides) had been previously described as a part of the partially characterized intron 23 (30). Intron 23 also contained a partial Alu repeat sequence (a third of the conserved repeat consensus) (30). The nucleotide sequences immediately upstream of both exons 24a and 24 showed well conserved splice acceptor sites (Figs. 4B and 5). In addition, there were also well conserved splice branch sites, each containing a branchpoint (A nucleotide) 17 or 43 nucleotides from the junction of intron 23 and exon 24a or intron 23 and exon 24, respectively. Since intron 23 is a type 0 intron that does not interrupt the codon for leucine (L1414) in exon 23 (3, 30), either exon 24a or exon 24 could start coding for residue 1415 and subsequent amino acids. However, because of an aberrant stop codon at the 6th amino acid position of exon 24a, an exon 24a+ CFTR transcript would be translated to make a truncated CFTR protein product 61 amino acids shorter at the carboxyl terminus.

FIG. 3. ... A, schematic of exon 24a+ CFTR transcripts from exon 21 to 24. Shown is the 241-bp CFTR exon 24a-specific probe, made by PCR amplification using the pPB254 plasmid clone (see Fig. 2) as a template and primers specific for the exon 24a sequence (CFEXXS1 and CFEXXAS1), and 32P-labeled by random priming. B, Northern analysis of total cellular RNA and poly(A)+ RNA extracted from T84 and HT-29 cells. Shown are analyses with a 4.5-kb CFTR cDNA probe not containing the exon 24a sequence (14, 16-18) (lanes 1-4) and with the 241-bp exon 24a-specific probe described above (lanes 5-8). The size of 6.5-kb mature CFTR mRNA transcripts is indicated on the left. Note that the position of the signal detected by the exon 24a-specific probe is slightly higher than that detected by the 4.5-kb cDNA probe.
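The sizes quoted in this section are mutually consistent, as the following arithmetic sketch shows. All numeric inputs are taken from the text (603-bp PCR product, 260-nucleotide exon 24a, 6.5-kb mature transcript, L1414 as the last residue encoded by exon 23, a stop codon at the 6th codon of exon 24a, and L1480 as the last residue of the reported protein); the variable names are illustrative only.

```python
# Values quoted in the text; variable names are illustrative.
EXPECTED_PCR_BP = 603        # exon 21-24 PCR product without exon 24a
EXON_24A_BP = 260            # length of exon 24a
NORMAL_TRANSCRIPT_KB = 6.5   # mature CFTR mRNA
NORMAL_PROTEIN_AA = 1480     # reported CFTR protein ends at L1480
LAST_EXON23_RESIDUE = 1414   # L1414, the last codon of exon 23
CODING_CODONS_IN_24A = 5     # the stop codon is the 6th codon of exon 24a

# PCR product and transcript sizes when exon 24a is included.
pcr_with_24a_bp = EXPECTED_PCR_BP + EXON_24A_BP
transcript_with_24a_kb = NORMAL_TRANSCRIPT_KB + EXON_24A_BP / 1000

# Length of the predicted truncated protein and the size of the truncation.
truncated_protein_aa = LAST_EXON23_RESIDUE + CODING_CODONS_IN_24A
residues_lost = NORMAL_PROTEIN_AA - truncated_protein_aa
percent_shorter = 100 * residues_lost / NORMAL_PROTEIN_AA

print(pcr_with_24a_bp, "bp PCR product")
print(round(transcript_with_24a_kb, 2), "kb transcript")
print(truncated_protein_aa, "residues;", residues_lost, "lost;",
      round(percent_shorter, 1), "% shorter")
```

The computed transcript size falls within the 6.7-6.8 kb range estimated from the Northern analysis, and the computed truncation matches the 61-residue figure given above.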

DISCUSSION
Although the clinical manifestations of cystic fibrosis are localized to the surface of epithelia (1), the expression of the CFTR gene is widespread, with CFTR mRNA transcripts detectable in epithelial and nonepithelial cells (13). The extent of CFTR gene expression varies somewhat from site to site, but at most sites CFTR mRNA transcripts are in the low abundance class (13-15). Consistent with the relatively low, but widespread, expression, the 5'-flanking region of the CFTR gene has the characteristics of a housekeeping gene promoter (14, 19). Despite this, there is increasing evidence that CFTR gene expression can be modulated at the transcriptional and post-transcriptional levels (14, 16-18). The present study demonstrates an unexpected plasticity of CFTR gene expression: alternative splicing of the CFTR precursor mRNA transcript 5' to the normal intron 23-exon 24 boundary, resulting in the generation of a 260-nucleotide novel exon (exon 24a) from sequences previously believed to be part of intron 23. Interestingly, because of a premature stop codon after 5 amino acids coded by exon 24a, the CFTR protein predicted from the exon 24a+ CFTR transcript would be shorter by 61 amino acid residues at the intracytoplasmic tail compared to the normal CFTR protein (3).

FIG. 4. Sequence of the CFTR gene in the region from exon 23 to 24. A, schematic of the genomic structure of exon 23 (partial), entire intron 23, exon 24a, and exon 24 (partial). The 1.4-kb CFTR genomic DNA fragment (XhoI to PstI) was amplified by PCR using neutrophil DNA as a template and primers with a restriction enzyme linker site (XhoI or PstI). EcoRI and HindIII restriction sites in intron 23 are indicated as are the two predicted stop codons (TAA in exon 24a and TAG in exon 24). B, sequence of exon 23 (partial), intron 23, exon 24a, and exon 24 (partial) of the CFTR gene. The cloned and sequenced genomic segment is marked with the vertical line (|) at the 5'

Exon 24a: --- ATTTTGTTTTCAAAG (17)
Consensus: YNYTRAY YYYYYYYYYYYNYAG (18-50)
Exon 24: GTCTGAC -------- CAGCCATTTCCCTAG
FIG. 5. The likely splicing events involving the intron 23-exon 24a and intron 23-exon 24 boundaries. Top, schematic of splicing for exon 24a+ CFTR mRNA. The most carboxyl-terminal amino acid residue (Y1419) of the CFTR protein predicted from the exon 24a+ transcripts is indicated as are the consecutive stop codon (TAA), the reported stop codon (TAG) in exon 24, and the poly(A) tail (A)n. Middle, the consensus sequences for the splice event including the splice branch site, splice acceptor site, and the distance (number of nucleotides) between branchpoint and intron-exon boundaries (31, 32). Y represents either T or C, R represents either A or G, and N represents any nucleotide. The putative branchpoint adenosine nucleotide in the splice branch site is indicated by a dot. Bottom, schematic of splicing for exon 24a- CFTR mRNA with the nucleotide sequence of the putative splice branch site and splice acceptor site in the exon 24a region. The mismatched nucleotides from the consensus sequence are underlined. The most carboxyl-terminal amino acid residue of the reported CFTR protein (L1480) and the consecutive stop codon (TAG) in exon 24 are indicated as is the poly(A) tail (A)n.
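The comparison against the consensus described in this legend can be sketched as a position-by-position check using the ambiguity codes defined above (Y, R, N). This is an illustration, not the procedure used by the authors. The three site sequences are as transcribed from the figure panel in this copy of the text; the assignment of the 15-nucleotide sequence listed with a distance of 17 to exon 24a is an assumption based on the branchpoint distance given under "Results." The function name `mismatches` is hypothetical.

```python
# Ambiguity codes used in the consensus: Y = C or T, R = A or G, N = any base.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "R": "AG", "N": "ACGT"}


def mismatches(site: str, consensus: str) -> list[int]:
    """Return 1-based positions at which `site` does not satisfy `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return [pos
            for pos, (base, code) in enumerate(zip(site, consensus), start=1)
            if base not in IUPAC[code]]


BRANCH_CONSENSUS = "YNYTRAY"
ACCEPTOR_CONSENSUS = "YYYYYYYYYYYNYAG"

# Sequences as transcribed from the figure panel (assumed assignment, see lead-in).
exon24_branch = "GTCTGAC"
exon24_acceptor = "CAGCCATTTCCCTAG"
exon24a_acceptor = "ATTTTGTTTTCAAAG"

for label, site, consensus in [
    ("exon 24 branch site", exon24_branch, BRANCH_CONSENSUS),
    ("exon 24 acceptor site", exon24_acceptor, ACCEPTOR_CONSENSUS),
    ("exon 24a acceptor site", exon24a_acceptor, ACCEPTOR_CONSENSUS),
]:
    print(label, "mismatch positions:", mismatches(site, consensus))
```

Under these assumptions each acceptor site retains the invariant terminal AG and departs from the pyrimidine-rich consensus at only a few positions, which is the sense in which the text calls the sites well conserved.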
Splicing of precursor mRNA of vertebrate genes involves two cleavage-ligation reactions, resulting in the ligation of two exons. The process involves formation of a branched circular RNA, with the cleaved 5' terminus of the intron ligated to the adenosine nucleotide within the splice branch site located in the same intron near the splice acceptor site (31, 32). The sequences of splice donor, branch, and acceptor sites are well conserved (for reviews see Refs. 31 and 32; Fig. 5 shows the splice branch site and splice acceptor site consensus sequences). An alternative splicing phenomenon similar to that observed in the present study for the CFTR gene has been observed in human γ-glutamyl transpeptidase mRNA transcripts, with a 22-nucleotide insertion resulting in a premature stop codon and a shortened open reading frame compared to known cDNA sequences (33). Although it is conceivable that the observed exon 24a+ CFTR transcript might be an intermediate product during the process of stepwise removal of intervening sequence nucleotides to generate the fully processed mature mRNA transcript (as reported in the mouse and human β-globin as well as adenovirus 2 mRNA precursors (34-36)), this is highly unlikely, since the splice donor site in the hypothesized intermediate product (TG in exon 23 and CA in exon 24a) would not follow the well conserved splice acceptor site consensus (TG/GT) rule (31, 32). Furthermore, such intermediate forms of mRNA are generally very short-lived, and it would be difficult to detect such an intermediate by Northern analysis. In this regard, an exon 24a-specific probe clearly identified the poly(A)+ transcripts with a slightly higher molecular weight than the normal 6.5-kb CFTR mRNA using Northern hybridization, suggesting the presence of a relatively stable form of mRNA transcripts containing the exon 24a sequence.
The clinical relevance of the exon 24a+ CFTR transcript to the CF phenotype is unknown. It is not the result of a mutation, as it was observed in all cell samples. Furthermore, it is not localized to one cell type, as it is observed in epithelial and nonepithelial cells. It is possible that the CFTR protein product with a truncated carboxyl terminus coded by the exon 24a+ CFTR mRNA transcripts may function in a normal fashion as a cAMP-regulated Cl- channel, since the intracytoplasmic carboxyl terminus of CFTR protein is believed to be irrelevant to CFTR function and no mutation of the protein coding sequence in exon 24 has been reported among the individuals affected with CF (37). Alternatively, the exon 24a+ CFTR mRNA transcripts might code for a nonfunctional CFTR protein. Since the fraction of the exon 24a+ CFTR mRNA transcripts is relatively small, ranging from 3 to 16% of the total CFTR mRNA transcripts, this would not have significant consequences for a normal individual with no mutations in the CFTR gene. However, it may have important consequences in certain circumstances, such as compound heterozygotes with CF with one mild allele (37-40). In this context, the variable expression of the exon 24a+ CFTR transcript is likely one of many modulating factors causing the phenotypic diversity of CF individuals with the same genotype. Finally, it is important to consider the presence of the CFTR protein with a truncated carboxyl terminus generated by alternative splicing when detecting CFTR protein in cells or tissues, since an antibody raised to the very carboxyl terminus (e.g. the antibody to amino acids 1468-1480 (41)) would not interact with carboxyl terminus-truncated CFTR protein derived from 24a+ CFTR mRNA transcripts.