The Smooth Muscle Myosin Heavy Chain Gene Exhibits Smooth Muscle Subtype-selective Modular Regulation in Vivo *

Previous studies in our laboratory demonstrated that the transgene consisting of the −4.2 to +11.6 kilobase (kb) region of the smooth muscle (SM) myosin heavy chain (MHC) gene was expressed in virtually all SM tissue types in vivo in transgenic mice and that the multiple CArG elements within this region were differentially required in SMC subtypes, implying that the SM-MHC gene was controlled by multiple transcriptional regulatory modules. To investigate this hypothesis, we analyzed specific regulatory regions within the SM-MHC −4.2 to +11.6 kb region by a combination of deletion analyses of various SM-MHC transgenes as well as by DNaseI hypersensitivity assays and in vivo footprinting in intact SMC tissues. The results showed that SM-MHC transgene expression depended on a large number of required regulatory modules that were widely spread over the −4.2 to +11.6 region. Moreover, the results revealed several unexpected novel features of regulation of the SM-MHC gene including: 1) unique combinations of regulatory modules were required for SM-MHC expression in different SMC-subtypes; 2) repressor modules as well as activator modules were both critical for SMC specificity of the gene; 3) certain modules were required in certain contexts but were dispensable in others within a given SMC-subtype (i.e. the net activity of the module was determined by interaction between modules not simply by the sum of module activities); and 4) we identified a highly conserved 200-base pair transcriptional regulatory module at +8 kb that was required in the large arteries but dispensable in the coronary arteries and airways in transgenic mice and contained multiple potential cis-elements that were occupied by nuclear proteins in the intact aorta based on in vivo footprinting. Taken together, the results suggest a model of complex modular control of expression of the SM-MHC gene that varies between SMC subtypes.
Moreover, the studies establish the possibility of designing derivatives of the SM-MHC promoter that might be used for targeting gene expression to specific SMC subtypes in vivo.

Smooth muscle (SM) myosin heavy chain (MHC) is one of the best markers for smooth muscle cell (SMC) lineages (1). Its expression is regulated precisely and dynamically during SMC differentiation and also during the formation and development of vascular diseases such as atherosclerosis (2,3). Therefore, studies on the mechanisms that regulate SM-MHC gene expression would be of great importance not only for understanding transcriptional regulatory mechanisms in SMC differentiation but also for understanding the mechanisms of phenotypic modulation of SMCs in vascular diseases.
We previously demonstrated that the SM-MHC genomic region from −4.2 to +11.6 kb was capable of driving LacZ transgene expression in virtually all SMC subtypes in transgenic mice in vivo (4). In contrast, a LacZ transgene containing the −4.2 kb to +88 bp of the SM-MHC gene was not expressed in any SMCs in multiple transgenic lines, indicating an absolute requirement of the intronic sequence from +88 bp to +11.6 kb for SM-MHC expression in vivo. We have confirmed recently the rigorous SMC specificity of the SM-MHC −4.2 to +11.6 region in vivo using a Cre recombinase-activating reporter system in which transgenic mice carrying an SM-MHC −4.2 to +11.6 region Cre transgene were crossed to a floxed indicator line that showed Cre-induced activation of LacZ (5). Because this system provides an integral of promoter activity throughout development and maturation, it provides a very sensitive means of detecting even transient expression in non-SMCs. Transgene expression was tightly confined within the SMCs in this system, thus further demonstrating the SMC specificity of the SM-MHC −4.2 to +11.6 kb sequence (5).
As a first step to elucidate the transcriptional regulatory mechanisms of the SM-MHC gene, we previously analyzed the function of three conserved CArG elements within the 5′-flanking and first intronic sequences in transgenic mice (6). Results of these studies demonstrated that the three CArG elements were differentially required in SMC subtypes in vivo. For example, although CArG1 in the 5′-flanking sequence was required for all SMCs, the intronic CArG element was dispensable in all SMC subtypes other than the large arteries. These results implied that expression of the SM-MHC gene was regulated by multiple transcriptional regulatory regions including the 5′-flanking and intronic regions containing the CArG elements. However, these studies examined the function of individual cis-elements and did not define the precise regulatory regions within the −4.2 to +11.6 SM-MHC region required for SMC-specific gene expression in vivo. Similarly, although the previous studies showing that the proximal SM22α promoter was capable of driving transgene expression in arterial SMCs but not in venous or visceral SMCs suggest that this gene is controlled by multiple regulatory modules (7-9), the specific regulatory modules required for expression within SMC subtypes have not been identified.
To examine the hypothetical modular control mechanisms of the SMC-specific genes in vivo, we analyzed regions within −4.2 to +11.6 kb of the SM-MHC gene that were required for expression of the gene in vivo in transgenic mice. The results support a complex model in which the SM-MHC gene is controlled by multiple positive- and negative-acting regulatory modules that are widely distributed within the −4.2 to +11.6 kb sequence. Moreover, we present novel evidence showing that these regulatory modules function differentially in SMC subtypes. These results imply that the SM-MHC gene regulatory system consists of multiple modular regulatory domains that confer the capability for SMCs to respond to vastly divergent environmental cues in developmental space and time and under pathophysiological conditions.

MATERIALS AND METHODS
Plasmid Construction and Transfection-Deletion constructs of the −4.2/+11.6 LacZ were generated by restriction digestion and ligation. The integrity of the constructs was determined by restriction enzyme mapping and sequencing. The structure of the constructs is indicated schematically in Fig. 1. To generate the −1.3/+11.6 LacZ construct, a part of the 5′-flanking sequence was taken from the pCAT-1346 (10). For construction of the 3xHS7-TK LacZ plasmid, three copies of the StyI/StuI fragment (+8038 to +8327) containing the highly homologous sequence were subcloned in tandem 5′ of the minimal TK promoter in pTK LacZ (6).
The culture methods for rat aortic SMCs were described previously (10). For transfection, SMCs were plated at 20,000 cells/cm² in 6-well plates. DNA was transfected using Superfect (Qiagen) according to the manufacturer's recommendations on the day after plating. The cells were harvested 72 h after transfection. β-Galactosidase assays were performed as described previously (6). To eliminate errors in the production of plasmids and to eliminate variability of plasmid quality, at least two independently prepared plasmid DNA samples of independent clones were transfected in duplicate.
Transgenic Mice-Transgenic mice were produced using standard procedures (4). These mice were used to establish breeding founder lines (F0) or sacrificed for the analysis of reporter expression (transient transgenic lines). The analysis of transgenic mice was performed as described previously (4). All animal procedures used in these studies were reviewed and approved by the University of Virginia Animal Use and Care Committee. Transient transgenic lines were harvested at embryonic day 18.5. Because of the relatively weak activity of the wild-type SM-MHC −4.2/+11.6 LacZ construct in many SMC tissues including large blood vessels within the embryo (4), LacZ expression patterns of various transgene constructs were mainly compared using 4-8-week-old F1 mice from founder lines except as noted otherwise. Because of inherent variations in transgene expression among transgenic lines and minor variations in the preparations from a given line, multiple independent transgenic lines of each construct were examined for transgene expression (see Table I). A similar pattern of transgene expression was observed between independent founder transgenic lines.
In Vivo Footprinting-Dimethyl sulfate (DMS) treatment of intact tissues was described previously (12). Fat and matrixes were removed from tissues prior to DMS treatment. The intact rat aorta was incubated in 5 ml of PBS containing 8 µl of DMS for 2 min at 37°C with agitation. Endothelial and adventitial layers were removed from the aorta after DMS treatment. Genomic DNA was purified from the treated tissues and subjected to piperidine treatment as described previously (12).
Ligation-mediated PCR was performed as described previously (12) with minor modifications. In brief, 2 µg of DNA sample was subjected to first primer extension from biotinylated gene-specific primer 1. A 20-µl reaction containing 1× LA Taq buffer (Takara), 0.5 pmol of primer, and 2 µg of DNA was incubated at 95°C for 5 min using a thermal cycler. The temperature was decreased by 10°C/min and held at Tm − 5°C for 30 min. Five microliters of 1× LA Taq buffer, 0.25 mM each of dNTPs, and 2.5 units of LA Taq polymerase (Takara) was added and incubated for 10 min. Subsequent reactions were incubated at 76°C for 10 min. Primer extension products were then ligated to a double-stranded linker oligonucleotide by adding 45 µl of ligation mixture containing 7 µl of 10× T4 DNA ligase buffer (New England Biolabs), 0.7 µl of 100 mM ATP, 50 pmol of the double-stranded linker oligonucleotide, and 2 µl of T4 DNA ligase (New England Biolabs), and the reactions were incubated at 16°C overnight. The linker oligonucleotide was produced by annealing 5′-GCGGTGACCCGGGAGATCTGAATTCT-3′ and 5′-GAATTCAGATC-3′ (13). A T nucleotide was added to the original sequence of the former oligo (13) to facilitate ligation of Taq-amplified products. Primer extension products were purified using Dynabeads M-280 Streptavidin (Dynal). Forty microliters of Dynabeads was washed once with 100 µl of 2× binding and washing buffer (2 M NaCl, 10 mM Tris-HCl, pH 7.5, and 1 mM EDTA) using a magnetic separation stand (Promega). Beads were resuspended in 70 µl of 2× binding and washing buffer and then added to ligation reactions. The samples were mixed for 5 min at room temperature with constant rotation. Beads were washed two times with 100 µl of 2× binding and washing buffer and two times with 100 µl of 1× TE buffer (10 nM Tris-HCl, pH 8.0, 10 nM EDTA). Beads were resuspended in 40 µl of 0.15 M NaOH and incubated for 10 min at 37°C.
Beads were captured on a magnetic stand, and the supernatants were ethanol-precipitated using 250 µl of ethanol, 4 µl of 3 M sodium acetate, pH 5.2, and 2 µl of SeeDNA coprecipitant (Amersham Pharmacia Biotech). Precipitated DNA was washed twice with 70% ethanol and dissolved in 10 µl of distilled H2O. Exponential amplification was done in 50 µl of reaction containing 1× LA Taq buffer, 0.4 mM each of dNTPs, 200 nM gene-specific primer 2, 200 nM linker primer, and 2.5 units of LA Taq. The PCR conditions were 94°C for 2 min and 15-18 cycles of 94°C for 30 s, Tm − 5°C for 2 min, 76°C for 3 min, and 1 cycle of 76°C for 10 min. Fifteen microliters of PCR products were labeled for visualization by adding 5 µl of labeling mixture containing 1× LA Taq buffer, 20 nM 32P end-labeled gene-specific primer 3, 1.5 mM each dNTPs, and 1.25 units of LA Taq and incubated at 94°C for 2 min and 5-8 cycles of 94°C for 30 s, Tm − 5°C for 2 min, 76°C for 3 min, and 1 cycle of 76°C for 10 min. PCR products were ethanol-precipitated and resolved on denaturing 6% Long Ranger (BMA) gels. Sequencing ladders produced from the gene-specific primer 3 were run on the same gel as size markers (data not shown). The PCR primers for the antisense strand (forward primers) are: gene-specific primer 1, 5′-CAAAGCATTCAGTGTGGAAC-3′; primer 2, 5′-CTTCCTTGGCTTTGACCAAAGCAGTC-3′; and primer 3, 5′-TCCTTGGCTTTGACCAAAGCAGTCTTGTGC-3′. Primers for the sense strand (reverse primers) are: primer 1, 5′-AACATACTTTAGGCACATAG-3′; primer 2, 5′-GTCACTCAGAGGCCTAGTGTTTGTG-3′; and primer 3, 5′-CACTCAGAGGCCTAGTGTTTGTGTCATTCC-3′.
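The oligonucleotide design in this protocol can be checked with a few lines of Python. The sketch below is illustrative only and is not part of the published method. It assumes that the hyphen inside the printed long linker sequence is a typesetting line break rather than part of the sequence, and the function names are hypothetical. It tests whether the short linker strand base-pairs with the long strand and whether each gene-specific primer 3 is nested relative to primer 2.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Oligonucleotides as printed in the protocol, written 5' to 3'.
LINKER_LONG = "GCGGTGACCCGGGAGATCTGAATTCT"
LINKER_SHORT = "GAATTCAGATC"
ANTISENSE_PRIMERS = {
    2: "CTTCCTTGGCTTTGACCAAAGCAGTC",
    3: "TCCTTGGCTTTGACCAAAGCAGTCTTGTGC",
}
SENSE_PRIMERS = {
    2: "GTCACTCAGAGGCCTAGTGTTTGTG",
    3: "CACTCAGAGGCCTAGTGTTTGTGTCATTCC",
}


def annealing_site(long_oligo: str, short_oligo: str) -> int:
    """Index in long_oligo where short_oligo base-pairs, or -1 if it does not."""
    return long_oligo.find(reverse_complement(short_oligo))


def nested_offset(outer: str, inner: str) -> int:
    """Smallest 5' shift at which inner overlaps the 3' end of outer, or -1."""
    for shift in range(len(outer)):
        if inner.startswith(outer[shift:]):
            return shift
    return -1


site = annealing_site(LINKER_LONG, LINKER_SHORT)
# Bases of the long strand left unpaired 3' of the duplex.
overhang = LINKER_LONG[site + len(LINKER_SHORT):]
antisense_shift = nested_offset(ANTISENSE_PRIMERS[2], ANTISENSE_PRIMERS[3])
sense_shift = nested_offset(SENSE_PRIMERS[2], SENSE_PRIMERS[3])
```

Under these assumptions the short strand pairs with the 3′ portion of the long strand and leaves a single unpaired 3′ T, which is consistent with the statement that a T was added to the long oligonucleotide. In both primer sets, primer 3 begins two nucleotides downstream of the 5′ end of primer 2 and extends past its 3′ end.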

SM-MHC Expression Requires Additional Regulatory Regions in Addition to Known Regions within the 5′-Flanking Sequence-Results of our previous studies showed that expression of the SM-MHC gene in vivo depended on 1) the region from −4.2 to +11.6 kb and 2) three CArG elements located within two highly conserved regions of ~200 bp in the 5′-flanking sequence at −1.3 kb and the first intronic sequence at +1.6 kb (6). On this basis, we hypothesized that the SM-MHC gene might be regulated by multiple transcriptional regulatory subregions including these two CArG-containing regions. The overall goal of the present studies was to test this hypothesis and to identify specific transcriptional regulatory regions required for SM-MHC expression in vivo. Given the very large size of the SM-MHC regulatory region (~16 kb), we employed multiple experimental methods and strategies to characterize regulatory regions including 1) DNaseI hypersensitive assays and in vivo footprinting to identify regions that may contain transcriptionally active elements based on evidence of binding of nuclear proteins, 2) identification of regions that are conserved across species, 3) testing various SM-MHC promoter constructs based on transient transfection studies in cultured SMCs, and 4) testing selected promoter constructs in transgenic mice. However, our major focus was on testing candidate regions in vivo in transgenic mice, because cultured SMCs are known to be phenotypically modulated, and we have shown previously that results obtained in cultured SMCs are often invalid for predicting the activity of promoter constructs in vivo (4,14). Expression patterns of various SM-MHC transgenic constructs were examined mainly using 4-8-week-old F1 mice because of the late induction of SM-MHC −4.2/+11.6 LacZ transgenes during mouse development (4) and the inherent variability in transgenic expression observed in embryos that would confound direct comparisons of the activity of various test constructs.
To begin identification of specific regulatory regions required for SM-MHC expression in vivo, we first investigated whether subfragments containing known transcriptional regulatory regions were sufficient for transcription in vivo. We initially tested the sequence from −1342 bp to +11.6 kb (−1.3/+11.6 LacZ, Construct 2 in Fig. 1), because the 1346 bp of the 5′-flanking sequence contains the entire conserved 227-bp domain (−1321 to −1095) containing the CArG1 and CArG2 elements that we previously showed were required for expression within the context of the SM-MHC −4.2/+11.6 LacZ transgene (6). Moreover, in reporter assays using cultured SMCs, the −1.3/+11.6 LacZ construct showed activity equivalent to that of the −4.2/+11.6 LacZ (data not shown). In contrast, −1.3/+11.6 LacZ transgenic mice clearly showed weaker LacZ expression in many SMC subtypes as compared with −4.2/+11.6 LacZ transgenic mice (Fig. 2 and Table I). For example, the −1.3/+11.6 LacZ construct showed no expression in vascular SMCs except very weak expression in several small arteries (Fig. 2, compare panels 2a with 1a). Relatively strong transgene expression was observed in the trachea and main trunks of bronchi (Figs. 2, 2b, and 3e), but no expression was detected in airway SMCs within the smaller branches of the bronchial tree. The transgene was expressed strongly in the GI tract and bladder (Fig. 2, 2c-2e), although expression was somewhat uneven with some cells staining very intensely, whereas others were very weakly stained. Of particular significance, unlike the wild-type −4.2/+11.6 LacZ transgene, which is highly SMC-specific, the expression of the −1.3/+11.6 LacZ transgene was observed consistently in a fraction of cardiac muscle cells and various non-SMC mesenchymal cells (data not shown).
These results thus clearly demonstrate that the −1.3 to +11.6 region was not sufficient for SMC-specific expression in vivo and that the region from −4.2 to −1.3 contains regulatory sequences important for the expression in vascular and airway SMCs as well as elements required for restricting SM-MHC expression to SMCs.
The First 2.5-kb of the Intronic Sequence Containing the Intronic CArG Region Was Not Sufficient for SM-MHC Expression in Vivo-We next tested whether the sequence from −4200 to +2500 bp (−4.2/+2.5 LacZ, Construct 3 in Fig. 1) was sufficient to drive SMC-specific expression in vivo. The 2500 bp of the first intronic sequence was chosen based on the fact that it contains the intronic CArG region (6) and on results of tran- within the aorta (Figs. 2, 3a, and 3a). However, transgene expression in the airways was very strong and equivalent to that of the full-length transgenic lines (Fig. 2, compare 3b with 1b). The −4.2/+2.5 LacZ transgene was also expressed strongly in the GI tract. However, transgene expression in the GI tract differed among parts of the tissues (Figs. 2, 3d, and 3g) and did not show consistently strong staining as observed in −4.2/+11.6 LacZ transgenic mice. Strong transgene expression was also observed in cardiac and skeletal muscles (Figs. 2, 3a, and 3i). These data indicate that the SM-MHC −4.2 to +2.5 kb region is not sufficient for expression of the SM-MHC gene in vivo and that the region between +2.5 and +11.6 kb is required for expression in subsets of SMCs including vascular and bladder SMCs. Results also show that the region from +2.5 to +11.6 contains regulatory regions that suppress transgene expression in non-SMCs. Taken together, the results of our initial experiments show that SM-MHC expression in vivo requires regulatory modules located outside of the −1.3 to +2.5 kb region.
The Region from +2.5 to +5.3 Exhibited Repressor Activity in Transgenic Mice-Our initial results clearly demonstrated that the SM-MHC region from +2.5 to +11.6 contains regulatory regions required for SMC-specific expression of the SM-MHC gene in transgenic mice. Given the large size of this region and the relative lack of large regions that are highly conserved across species, we analyzed various 3′-deletion mutants of this region in cultured SMCs to identify possible regulatory regions that could be subsequently tested in transgenic mice. The results of transient transfection studies in cultured rat aortic SMCs showed that the construct containing the SM-MHC sequence from −4.2 to +5.3 kb (−4.2/+5.3 LacZ, Construct 4 in Fig. 1) had the strongest activity (65-fold activity over promoterless pAUG LacZ versus 57-fold activity of the −4.2/+11.6 LacZ), suggesting that the region from +2.5 to +5.3 might contain additional positive regulatory regions required for SM-MHC expression. However, in contrast to observations in cultured SMCs, addition of the region from +2.5 to +5.3 resulted in a marked reduction in expression of the transgene as compared with that of the full-length −4.2/+11.6 LacZ transgenic lines or the shorter −4.2/+2.5 LacZ transgenic lines (Fig. 2, compare row 4 with rows 1 and 3). For example, multiple −4.2/+5.3 LacZ lines showed no transgene expression in vascular SMCs or airway SMCs (Figs. 2, 4a and 4b, and 3b). Although some transgene expression was detected in the GI tract and bladder, LacZ staining was limited to a small fraction of SMCs (Fig. 2, 4c-4e). As such, transgene expression in the SMCs of the −4.2/+5.3 lines was markedly weaker than that of −4.2/+11.6 LacZ and −4.2/+2.5 LacZ transgenic mouse lines. Interestingly, although expression of the −4.2/+2.5 LacZ transgene was observed in non-SM tissues, this was not the case with multiple −4.2/+5.3 kb LacZ lines.
These results suggest that the +2.5 to +5.3 region contains a negative-acting regulatory region necessary for limiting SM-MHC expression within SMCs. However, expression in −4.2/+5.3 lines was not equivalent to that of −4.2/+11.6 LacZ transgenic mice, indicating that further regulatory regions are also located within the sequence from +5.3 to +11.6.
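The deletion logic used here, in which an activity difference between two transgenes is assigned to the sequence present in one but absent from the other, can be sketched as interval subtraction. The following Python example is illustrative and is not the authors' analysis. Construct boundaries are rounded to bp from the kb values quoted in the text, intervals are treated as half-open, and the names `CONSTRUCTS`, `subtract`, and `unique_to` are hypothetical.

```python
# Construct boundaries in bp relative to the transcription start site,
# rounded from the kb values quoted in the text (an approximation).
CONSTRUCTS = {
    "-4.2/+11.6": [(-4200, 11600)],
    "-4.2/+2.5": [(-4200, 2500)],
    "-4.2/+5.3": [(-4200, 5300)],
}


def subtract(intervals, remove):
    """Return the parts of `intervals` not covered by any interval in `remove`."""
    result = []
    for start, end in intervals:
        pieces = [(start, end)]
        for r_start, r_end in remove:
            next_pieces = []
            for p_start, p_end in pieces:
                # No overlap: keep the piece unchanged.
                if r_end <= p_start or r_start >= p_end:
                    next_pieces.append((p_start, p_end))
                    continue
                # Keep whatever sticks out on either side of the removed span.
                if r_start > p_start:
                    next_pieces.append((p_start, r_start))
                if r_end < p_end:
                    next_pieces.append((r_end, p_end))
            pieces = next_pieces
        result.extend(pieces)
    return result


def unique_to(longer: str, shorter: str):
    """Sequence present in construct `longer` but absent from `shorter`."""
    return subtract(CONSTRUCTS[longer], CONSTRUCTS[shorter])


candidate_repressor = unique_to("-4.2/+5.3", "-4.2/+2.5")
remaining_region = unique_to("-4.2/+11.6", "-4.2/+5.3")
```

With these inputs, `candidate_repressor` covers +2.5 to +5.3 kb and `remaining_region` covers +5.3 to +11.6 kb, matching the two regions named in the text. Representing each construct as a list of intervals also accommodates constructs with internal deletions.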
The Region from +5.3 to +11.6 kb Contained a DNaseI Hypersensitive Site That Was Required for SM-MHC Expression in Vivo-Because results of the preceding analyses of transgenic mice indicated that additional transcriptional regulatory regions were located within the +5.3 to +11.6 kb region, we sought to identify these possible regulatory regions. We first attempted to identify regulatory regions by reporter assays using cultured SMCs. However, the addition of various fragments of the region from +5.3 to +11.6 kb to the −4.2/+5.3 LacZ did not alter the reporter activity in cultured SMCs. The lack of an effect may be the result of phenotypic modulation of SMCs. Alternatively, it is possible that regulatory regions within the +5.3/+11.6 sequence might function only in the context of intact chromatin, as has been demonstrated for a number of transcriptional regulatory modules (15). In reporter assays using transient transfection of plasmid DNAs, the vast majority of plasmid DNA stays as episomal DNA and is not organized properly into a nucleosomal structure (16). Therefore, we elected to identify candidate transcriptional regulatory regions within the context of the endogenous SM-MHC gene in intact chromatin using DNaseI hypersensitivity assays. Because the opened chromatin structure frequently associated with transcriptionally active regions allows access of large DNaseI molecules, opened active regions are digested by DNaseI much more strongly than other silent regions and thus often show DNaseI hypersensitivity. Intact cultured SMCs were permeabilized and treated directly with DNaseI. Genomic DNA was purified, and the endogenous SM-MHC genomic region from −3.1 to +12 kb was examined for DNaseI hypersensitivity (Fig. 4). Multiple hypersensitive sites designated HS1-7 were detected (Fig. 4b). As expected, known transcriptional regulatory regions showed DNaseI hypersensitivity in these assays.
For example, HS1 corresponded to the highly conserved region containing 5′-flanking CArG elements and the GC repressor element at −1.3 kb, HS2 corresponded to the proximal promoter region, and HS4 corresponded to the highly conserved region containing the intronic CArG at +1.6 kb. These data show the presence of multiple DNaseI hypersensitive sites within the region from −3.1 to +12 and identified a novel candidate transcriptional regulatory region at +8 kb (HS7) within the region from +5.3 to +11.6 for further studies.
The HS7 DNaseI Hypersensitive Site at +8 kb Contained Multiple cis-Regulatory Elements-Results of analyses of the −4.2/+5.3;+7.5/+9 LacZ transgenic lines clearly indicated that the 1.5-kb from +7.5 to +9 kb contained a positive-acting regulatory region. We next characterized this putative transcriptional regulatory region. Sequence analyses of the rat and human genes revealed a highly conserved 200-bp domain at HS7. Outside of this region, the rat and human sequences are highly divergent.
We next analyzed nuclear protein binding to the HS7 by using in vivo footprinting methods. The DNA samples prepared from DNaseI-treated cultured SMCs and used for DNaseI hypersensitive assays were subjected to further analyses by ligation-mediated PCR. In these experiments, the level of DNaseI digestion of DNA was analyzed at the single nucleotide level in contrast to the much lower resolution (~100 bp) attainable by Southern analyses in DNaseI hypersensitivity assays (17). Because protein binding to DNA inhibits access of DNaseI to DNA, DNA sequences bound by nuclear proteins show a lower level of digestion, resulting in lower intensity bands (footprint) as compared with corresponding bands obtained with DNaseI-treated naked genomic DNA when visualized by ligation-mediated PCR. Thus, nuclear protein binding to DNA sequences can be analyzed within the context of the endogenous gene in intact chromatin in these experiments. As shown in Fig. 5a, multiple sequences were protected from DNaseI digestion (Ft1-Ft9), indicating nuclear protein binding to these sequences at HS7 of the endogenous SM-MHC gene in intact cultured SMCs.
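The footprint criterion described above, a band that is weaker in chromatin than in naked DNA, can be expressed as a per-position intensity ratio. The Python sketch below illustrates that comparison only and is not the quantification used in this study. The lane-sum normalization, the 0.5 threshold, and the function name `protected_positions` are assumptions made for the example, and the intensities in the usage lines are invented values, not data.

```python
def protected_positions(in_vivo, naked, threshold=0.5):
    """Return indices whose normalized in vivo band is weaker than naked DNA.

    Each lane is first scaled by its total signal so that loading
    differences between lanes do not masquerade as protection.
    All naked-DNA intensities are assumed to be nonzero.
    """
    if len(in_vivo) != len(naked):
        raise ValueError("lanes must cover the same positions")
    total_in_vivo = sum(in_vivo)
    total_naked = sum(naked)
    protected = []
    for index, (v, n) in enumerate(zip(in_vivo, naked)):
        ratio = (v / total_in_vivo) / (n / total_naked)
        if ratio < threshold:
            protected.append(index)
    return protected


# Invented intensities for four adjacent bands (illustration only).
example_in_vivo = [100, 20, 90, 110]
example_naked = [100, 100, 100, 100]
example_footprint = protected_positions(example_in_vivo, example_naked)
```

In this toy input only the second band falls below the threshold. Real gels would require background subtraction and replicate lanes before any position is called protected.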
To examine further whether these protected regions were also bound by proteins in SMCs of the intact aorta, the intact rat aorta was subjected to in vivo methylation footprinting. Because the intact aorta does not permit penetration of large DNaseI molecules, we employed an alternative method in which the intact rat aortas were treated directly with DMS, which preferentially methylates G residues (12). DNA was purified from DMS-treated aortas, and purified DNA was cleaved specifically at methylated G residues by piperidine treatment (13). The cleavage points of DNA were visualized using ligation-mediated PCR. As shown in Fig. 5b, multiple G nucleotides were partially protected from methylation by DMS in vivo, suggesting occupation of these G residues by DNA-binding proteins. Many G residues within the Ft1-Ft9 footprint regions identified in DNaseI in vivo footprinting of cultured SMCs were protected (Fig. 6), suggesting that at least some of these footprinting regions detected in cultured SMCs were also bound by proteins in the intact rat aorta.
To further characterize protein binding to Ft1-Ft9 footprint regions, these regions were examined by EMSAs using nuclear extracts prepared from cultured SMCs. All probes for Ft1-Ft9 regions formed specific DNA-protein complexes with SMC nuclear extracts in EMSAs (Fig. 7 and data not shown). Of particular interest, Ft3 contained a CArG-like element at +8132 bp, and Ft4 contained an E-box at +8159 bp. The CArG-like sequence within the Ft3 region contains a mismatch nucleotide (CCGTTTTTGG) compared with the CArG consensus, CC(A/T)6GG, and is well conserved between the rat and human genes. In EMSAs, both rat and human CArG sequences formed a major shift band that had a mobility identical to that of the major shift band of the well defined CArG element of the c-fos promoter, c-fos SRE (Fig. 7a). These shift bands were supershifted by the addition of anti-serum response factor (SRF) antibody, indicating that SRF was present in the DNA-protein complexes. However, a 50-fold molar excess of cold probes of the rat and human Ft3 CArG elements did not compete efficiently with SRF binding to the c-fos SRE probe as compared with c-fos SRE itself (Fig. 7a, lanes 8-10), suggesting a relatively low affinity to SRF of these CArG elements in EMSAs in vitro.
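The statement that the Ft3 element deviates from the CArG consensus at one position can be verified directly. The short Python sketch below counts mismatches against CC(A/T)6GG. The helper name `carg_mismatches` is hypothetical, and the second test sequence is an arbitrary consensus-conforming example rather than a site from this gene.

```python
import re

# CArG consensus as given in the text: CC, six A/T positions, GG.
CARG_CONSENSUS = re.compile(r"CC[AT]{6}GG")
ALLOWED = ["C", "C"] + ["AT"] * 6 + ["G", "G"]


def carg_mismatches(site: str) -> int:
    """Count positions in a 10-bp site that deviate from CC(A/T)6GG."""
    if len(site) != len(ALLOWED):
        raise ValueError("a CArG box is 10 bp long")
    return sum(base not in ok for base, ok in zip(site.upper(), ALLOWED))


ft3_mismatches = carg_mismatches("CCGTTTTTGG")      # Ft3 CArG-like element
ft3_is_consensus = bool(CARG_CONSENSUS.fullmatch("CCGTTTTTGG"))
example_is_consensus = bool(CARG_CONSENSUS.fullmatch("CCAAATATGG"))  # arbitrary
```

The single mismatch is the G at the third position, which falls inside the A/T-rich core of the element.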
E-boxes (CANNTG) are target sites for basic helix-loop-helix factors such as the MyoD family myogenic regulatory factors and are important for control of a number of skeletal muscle-specific genes (18). The E-box and its flanking sequence within the Ft4 footprint region are identical between the rat and human genes. In EMSAs, the probe for Ft4 formed a major shift band that was competed by a 50-fold molar excess of the cold self-probe and an oligonucleotide containing the muscle creatine kinase E-box sequence (Fig. 7b, lanes 2 and 4). Although the sequence of Ft4 (5′-TCAAGTG-3′ in the antisense strand) also contains the consensus sequence of homeobox proteins, Nkx2 factors (5′-TNNAGTG-3′) (19), the major band was not competed by a consensus Nkx2-1 binding site (lane 5) (19). Formation of the major shift band was inhibited by anti-upstream stimulatory factor (USF)-1 and USF-2 antibodies but not by antibodies against other basic helix-loop-helix transcription factors including E12 and E2A (lanes 6 and 7). Taken together, data of EMSAs demonstrate that the CArG element and E-box within HS7 can bind SRF and USF in vitro, respectively.
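That the seven-nucleotide Ft4 sequence satisfies both the E-box and the Nkx2 consensus can be shown by translating each degenerate motif into a regular expression. The Python sketch below handles only the symbols A, C, G, T, and N, and the function names are hypothetical. A sequence match of this kind says nothing about which factor binds, which is what the competition and antibody experiments address.

```python
import re

# Minimal IUPAC table: only the symbols used by the two motifs in the text.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def motif_pattern(motif: str) -> "re.Pattern[str]":
    """Compile a degenerate motif into a lookahead regex (finds overlaps)."""
    body = "".join(IUPAC[symbol] for symbol in motif.upper())
    return re.compile("(?=" + body + ")")


def find_motif(motif: str, sequence: str) -> list:
    """Return 0-based start positions of every match of motif in sequence."""
    return [m.start() for m in motif_pattern(motif).finditer(sequence.upper())]


FT4 = "TCAAGTG"  # Ft4 sequence as quoted in the text (antisense strand)
ebox_hits = find_motif("CANNTG", FT4)    # E-box
nkx2_hits = find_motif("TNNAGTG", FT4)   # Nkx2 consensus
```

The E-box occupies the last six nucleotides (CAAGTG) and the Nkx2 consensus spans all seven, so the two candidate sites overlap almost completely.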
Because the data of DNaseI hypersensitivity assays and in vivo footprinting suggest that HS7 may be transcriptionally active, we tested whether the HS7 region might exhibit enhancer activity in cultured SMCs. Despite observations of protein binding to this region of the endogenous SM-MHC gene within chromatin, neither 257 bp of highly conserved sequence (+8038 to +8294) nor the 1.5 kb from +7.5 to +9 kb significantly increased reporter activity when subcloned into the minimal TK-LacZ construct in transient transfection in cultured SMCs (data not shown). Enhancer activity was also not detected in Chinese hamster ovary (CHO), 10T1/2, and NIH3T3 cells (data not shown).
HS7 at +8 kb Was Required for Transcription in Vivo-Although the DNaseI hypersensitive assay and in vivo footprinting data indicated that HS7 was opened and bound by nuclear protein in the context of the endogenous SM-MHC gene within chromatin, transient transfection experiments using cultured SMCs did not detect the transcriptional activity of this region. To test whether HS7 might function in vivo in transgenic mice, the 200-bp sequence (+8095 to +8294) at HS7 was deleted from the −4.2/+11.6 LacZ construct (−4.2/+11.6 ΔHS7 LacZ, Construct 6 in Fig. 1). The deletion did not significantly change reporter activity in cultured SMCs (data not shown). However, the deletion resulted in differential reduction in reporter activity in SMC subtypes in vivo in transgenic mice (Fig. 2, row 6 versus row 1). Of particular note, reporter expression in large arteries was very weak (Figs. 2, 6a versus 1a, and 3, l versus k), although the expression in veins and smaller arteries was equivalent to that in the −4.2/+11.6 LacZ transgenic mice (Fig. 3, l versus k). Transgene expression was also detected easily in the coronary arteries in the −4.2/+11.6 ΔHS7 LacZ transgenic mice (Fig. 2, 6a, and 3j). Transgene expression in the airways was strong, whereas no expression was detected in pulmonary blood vessels (Fig. 2, 6b). Expression in the GI tract and bladder was strong but showed some uneven staining among SMCs (Fig. 2, 6c-6e). These results indicate that the conserved 200-bp region containing HS7 is absolutely required for expression in large arteries but dispensable in small arteries and airways. The region may also be required for maximal expression in the GI tract and bladder. However, compared with the expression patterns of the −4.2/+5.3 LacZ lines, the −4.2/+11.6 ΔHS7 LacZ transgenic mice clearly showed a higher level of LacZ expression in subsets of SMCs including visceral and small arterial SMCs.
These data suggest that there may be additional regulatory regions within the ϩ5.3 to ϩ11.6 sequence.
We next tested whether the HS7 region was not only necessary but also sufficient to drive transcription in vivo. Three copies of the conserved region were subcloned 5′ of a minimal TK promoter (3xHS7-TK LacZ), and transgenic mice were generated. Three independent founder lines showed no transgene expression in any cells. Taken together, these results indicate that the HS7 region is differentially required for SM-MHC expression in vivo but is not sufficient to drive transcription when coupled with a minimal TK promoter.
The Sequence from +5.3 to +11.6 kb Contained Additional Negative- and Positive-acting Regulatory Regions That Played Crucial Roles in SM-MHC Expression in Vascular SMCs-The results of analyses of transgenic mice carrying the HS7 deletion mutant (−4.2/+11.6 ΔHS7 LacZ) suggested that there might be additional regulatory regions within the sequences from +5.3 to +7.5 and +9 to +11.6. As such, we next tested the function of the sequence from +5.3 to +7.5 in transgenic mice. A construct containing the SM-MHC genomic sequence from −4.2 to +9 kb (−4.2/+9 LacZ, Construct 7 in Fig. 1) was generated and used to produce transgenic mice. Although we expected positive activity, transgenic mice carrying the −4.2/+9 LacZ transgene showed clearly weaker transgene expression in vascular SMCs than that seen in −4.2/+5.3;+7.5/+9 LacZ transgenic mice (Fig. 2, 7a and 7b versus 5a and 5b, and Fig. 3, d versus c). Only a minor fraction of vascular SMCs stained positively for LacZ in large and small arteries (Figs. 2, 7a and 7b, and 3d). In contrast, transgene expression in the GI tract and bladder was very strong and equivalent to that of −4.2/+11.6 LacZ and −4.2/+5.3;+7.5/+9 LacZ transgenic mice (Fig. 2, 7c-7e versus 1c-1e and 5c-5e). Transgene expression in the airways was also strong (Fig. 2, 7b). The weaker transgene expression observed in vascular SMCs of −4.2/+9 LacZ transgenic mice as compared with −4.2/+5.3;+7.5/+9 LacZ mice suggests that the sequence from +5.3 to +7.5 may contain a regulatory module that inhibits transgene expression, at least in vascular SMCs. In addition, the fact that the −4.2/+9 transgene was expressed at lower levels than the −4.2/+11.6 LacZ transgene in vascular SMCs suggests that the sequence between +9 and +11.6 may contain a positive-acting regulatory module that is required for SM-MHC expression in vascular SMCs.
Fig. 5. In vivo footprinting analyses of the HS7 region (+8 kb). a, DNaseI in vivo footprinting analyses using cultured SMCs. To identify protein binding sequences within the HS7 region of the endogenous SM-MHC gene in intact chromatin, intact cultured SMCs were treated directly with DNaseI. Protection from DNaseI digestion (footprints), which indicates protein binding to DNA sequences, was visualized using ligation-mediated PCR. The antisense (lanes 1 and 2) and sense (lanes 3 and 4) strands were amplified using forward and reverse primer sets, respectively. C and T indicate lanes of control naked DNA and cultured SMC DNA, respectively. Footprint domains are indicated by bars. Numbers 1-9 refer to the footprint regions Ft1-Ft9. b, DMS in vivo footprinting analyses using the intact rat aorta. To examine whether the footprint regions identified above were also bound by proteins in the rat aorta, intact rat aorta was treated directly with DMS, which preferentially methylates guanine residues. Methylated G residues were specifically cleaved by piperidine treatment. The levels of cleavage (i.e. the levels of methylation) were visualized using ligation-mediated PCR. The antisense (lanes 1 and 2) and sense (lanes 3 and 4) strands were amplified using forward and reverse primer sets, respectively. Protected guanine nucleotides are indicated by circles. These G nucleotides consistently showed protection in multiple ligation-mediated PCR analyses of multiple independently prepared DNA samples. Representative gel images are shown. A hypermethylated guanine nucleotide is indicated by an arrowhead.

[Figure legend, beginning missing] ... (Ft1-Ft9). Guanine residues that were protected or hypermethylated in DMS in vivo footprint analyses of the intact rat aorta are indicated by circles and a delta, respectively. Bold letters indicate nucleotides conserved between the rat and human genes. A CArG-like element and an E-box are boxed.
Expression of the SM-MHC Gene Is Regulated by Multiple Modular Control Regions That Exhibit SMC Subtype-selective Activity-Our previous studies of the function of multiple CArG elements in SM-MHC transcriptional control (6) and studies by others of SM22α promoter regions in transgenic mice (7-9) implied that transcriptional control depended on multiple control regions that varied between SMC subtypes. However, these studies failed to identify specific modules required for expression in diverse SMC subtypes. Results of the present studies provide clear identification of the specific genomic regions required for controlling the SM-MHC gene in different SMC subtypes. Furthermore, the present studies revealed novel and unexpected features of the SM-MHC regulatory system.
First, unique combinations of transcriptional regulatory modules that are spread widely over a very large genomic region are required for expression of the SM-MHC gene in different SMC subtypes. Indeed, deletion of any one of seven restriction fragments of the SM-MHC −4.2 to +11.6 kb region resulted in differential changes in transgene expression in SMC subtypes in vivo (Fig. 2), indicating very different roles of each region in different SMC subtypes. That is, expression of the SM-MHC gene in SMC subtypes is clearly not controlled by a single or a few SMC subtype-specific enhancer regions but rather by complex combinations of an unexpectedly large number of regulatory modules, the activity of which varies in different SMC subtypes.
Second, the activity of certain modules varies not only between SMC subtypes but also with genomic context. For example, the region +2.5 to +5.3 appeared to contain both positive- and negative-acting modules but exhibited predominantly repressor activity when added to the −4.2/+2.5 region (Fig. 2, compare −4.2/+5.3 LacZ with −4.2/+2.5 LacZ). However, this same region exhibited positive activity in vascular SMCs, particularly in coronary vessels, in the context of additional promoter-intronic regions (e.g. Fig. 2, compare −4.2/+2.5;+5.3/+11.6 LacZ with −4.2/+11.6 LacZ). Thus, the positive and negative activities of this region seemed dispensable in certain SMC subtypes in certain contexts but were required in other contexts. Likewise, although the 1.5-kb region from +7.5 to +9 clearly increased transgene expression in all types of SMCs when added to the −4.2/+5.3 kb LacZ construct (Fig. 2, compare row 4 with row 5), high expression in airway SMCs was observed despite the complete omission of this region in the case of the −4.2/+2.5 LacZ transgene construct (Fig. 2, 3b). These results indicate that expression of the SM-MHC gene is controlled by complex interplay between regulatory modules. That is, the net effect of a given module on overall transcription of the SM-MHC gene is determined not by the isolated activity of that module but by its interaction with other modules. In other words, the activity of a given module seems to be processed in reference to the activity of other modules, and overall activity is not simply the linear sum of all modules present.
Third, both negative-acting and positive-acting modules are required for SMC specificity of expression of the SM-MHC gene. For example, addition of the fragment +2.5/+5.3 eliminated the strong transgene expression observed in cardiac and skeletal muscle cells in the −4.2/+2.5 LacZ transgenic mice (Fig. 2, compare 3a with 4a), whereas elimination of the −4.2 to −1.3 region also resulted in expression in non-SMCs. These data suggest that negative-acting regulatory modules play a critical role in confining SM-MHC expression to SMCs.
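The pairwise transgene comparisons used throughout this analysis all rest on the same subtraction: the segment present in one construct but absent from another is the candidate location of the module responsible for the difference in expression. The following is a minimal Python sketch of that bookkeeping only, not a method used in the study. The names `CONSTRUCTS` and `subtract` are hypothetical, construct labels use ASCII hyphens, and coordinates are in kb as given in the text.

```python
from typing import Dict, List, Tuple

# (start_kb, end_kb) relative to the transcription start site
Interval = Tuple[float, float]

# Genomic segments carried by each LacZ transgene, as named in the text.
# This dictionary is an illustrative assumption, not data from the study.
CONSTRUCTS: Dict[str, List[Interval]] = {
    "-4.2/+11.6": [(-4.2, 11.6)],
    "-4.2/+9": [(-4.2, 9.0)],
    "-4.2/+5.3;+7.5/+9": [(-4.2, 5.3), (7.5, 9.0)],
    "-4.2/+5.3": [(-4.2, 5.3)],
    "-4.2/+2.5": [(-4.2, 2.5)],
}


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the parts of construct `a` that are absent from construct `b`."""
    remaining = list(a)
    for b_start, b_end in b:
        next_remaining: List[Interval] = []
        for start, end in remaining:
            # No overlap: keep the segment unchanged.
            if b_end <= start or b_start >= end:
                next_remaining.append((start, end))
                continue
            # Keep whatever sticks out on the left and right of `b`.
            if start < b_start:
                next_remaining.append((start, b_start))
            if b_end < end:
                next_remaining.append((b_end, end))
        remaining = next_remaining
    return remaining


if __name__ == "__main__":
    pairs = [
        ("-4.2/+9", "-4.2/+5.3;+7.5/+9"),
        ("-4.2/+11.6", "-4.2/+9"),
        ("-4.2/+5.3", "-4.2/+2.5"),
    ]
    for larger, smaller in pairs:
        extra = subtract(CONSTRUCTS[larger], CONSTRUCTS[smaller])
        print(f"{larger} minus {smaller}: {extra}")
```

For instance, subtracting the −4.2/+5.3;+7.5/+9 construct from the −4.2/+9 construct leaves only the +5.3 to +7.5 segment, the region inferred in the Results to contain a negative-acting module in vascular SMCs. The sketch identifies candidate segments only; as the discussion stresses, the activity of a segment depends on the modules around it and cannot be read off from a single comparison.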
Taken together, results of the present studies revealed the extremely complex modular structure of the SM-MHC cis-regulatory system in vivo. This multiplicity of the modular regulatory system of the SM-MHC gene is likely to enable the system to respond to vastly divergent local environmental cues in vivo between large and small arterial SMCs, vascular and intestinal SMCs, etc., and thus may contribute to the marked heterogeneity of SMCs (20). The modular system may also be required to respond to changing environmental cues during development and in pathophysiological conditions. The results are consistent with a growing body of evidence indicating that a multiple modular structure is a common feature of transcriptional regulatory systems of developmentally regulated genes (21,22). For example, several skeletal and cardiac-specific genes including MyoD, myogenin, Myf5, and Nkx2-5/Csx have been demonstrated to be regulated by multiple modules in vivo (21,23,24). However, the present studies are the first to provide clear evidence showing the complex modular regulatory nature of an SMC differentiation marker gene in vivo. Fig. 8 summarizes the structure and function of transcriptional regulatory regions within the −4.2 to +11.6 kb of the SM-MHC gene and illustrates the differential roles of SM-MHC fragments or regulatory modules that we observed in SMC subtypes in vivo. However, we want to emphasize that it is possible, and indeed likely, that some modules might function differently in different contexts (i.e. different combinations of modules) (25) and that functions of some modules might have been (a) masked by other modules, (b) not detected because of redundancy in function or by chance because of the location of restriction sites, and/or (c) not functional under the conditions of our experiments. That is, certain modules may be active only at specific stages of SMC development, during modulation of SMCs after vascular injury, or under other pathophysiological circumstances. In any case, it is clear that much further work will be necessary to identify specific functions of transcriptional regulatory modules. Nevertheless, the data presented in the present studies provide a strong foundation for extensive further characterization of mechanisms that control the SM-MHC gene in vivo.
[Figure legend fragment, text incomplete] ... cis-elements identified previously (see Fig. 1 ...
HS7 at +8 kb Is Evolutionarily Conserved and Required for SM-MHC Transcription in Vivo-Transgenic mouse lines carrying the −4.2/+11.6 ΔHS7 LacZ transgene clearly demonstrated the requirement of HS7 for expression of the SM-MHC gene in vivo. Furthermore, in vivo footprint assays demonstrated that this region contained multiple potential cis-elements occupied by proteins in both intact cultured SMCs and the rat aorta. However, this region did not exhibit enhancer activity in cultured SMCs when coupled with several promoter constructs including a minimal TK promoter and the −4.2/+5.3 LacZ. These data add to a growing body of evidence showing that promoter activity measured by transient transfection of reporter plasmids into cultured SMCs may not provide a valid index of regulation of endogenous SMC-specific genes. There are at least two possible explanations for these contradictory results about the function of HS7. First, because of the highly modulated phenotype of SMCs in culture, HS7 may not exhibit its optimum activity in cultured SMCs. It is well known that SMCs are highly modulated in culture, and virtually all SMC marker genes including the SM-MHC gene are down-regulated in cultured SMCs (1). Therefore, it is conceivable that the activity or expression of transcription factors necessary for optimum enhancer activity of HS7 might be suppressed.
Second, because plasmid DNA largely remains episomal when transiently transfected into cultured cells and is not organized properly into nucleosomal structure (16), reporter assays may not provide information regarding transcriptional control mechanisms that operate in the context of intact chromatin. Indeed, it has been demonstrated that transcriptional regulatory modules of several genes, including the two DNaseI hypersensitive sites of the β-globin locus control region, do not work as classical enhancers in conventional reporter assays, whereas they do exhibit enhancer activity when integrated into chromatin (15,16). As such, the contradictory results obtained from studies of the endogenous SM-MHC gene (DNaseI hypersensitivity assays, in vivo footprinting, and transgenic mice) and from transient transfection experiments suggest that HS7 may function only when integrated into chromatin. Indeed, HS7 was not detected by conventional transient transfection experiments in cultured SMCs, and its successful identification using methods that detect transcription factor binding to target sites in the context of intact chromatin exemplifies the necessity and strength of these novel methods in studies of complex transcriptional regulatory systems in vivo.
Although the results of −4.2/+11.6 ΔHS7 LacZ and −4.2/+5.3;+7.5/+9 LacZ transgenic mice clearly established an important role of HS7 in control of SM-MHC expression in vivo, the 3xHS7-TK LacZ transgene was not expressed in three independent transgenic mouse lines. Because of the small number of transgenic lines we have analyzed, we cannot clearly conclude that the 3xHS7-TK LacZ transgene is not functional in vivo. However, because of significant inherent limitations involved in minimal promoter transgene experiments, we decided not to produce more transgenic lines carrying the 3xHS7-TK LacZ construct. For example, the data in the present studies as well as previous studies in other cell systems (15,25,26) demonstrated that the functions of regulatory modules are highly dependent on the context in which they are tested, including chromatin structure, the presence of other regulatory modules, and the proper orientation, spacing, and order of modules (15,25,26). Most of these factors cannot be fully reconstituted in small synthetic transgenes using foreign minimal promoters. As such, although transgenic mice carrying minimal promoter constructs have provided valuable information regarding functions of isolated regulatory modules, the functions of modules observed in this type of experiment need to be interpreted very cautiously.
HS7 contained an E-box that was bound by USF in EMSA experiments using nuclear extracts prepared from cultured SMCs. We also detected USF binding activity to this E-box in nuclear extracts prepared from intact rat SM tissues.² Previous studies in our laboratory provided evidence for a role of USF in SM α-actin expression through E-boxes in the 5′-flanking sequences (27). Importantly, these E-boxes were required for transcription in vivo in transgenic mice (28). Recently, the osteopontin and APEG-1 genes have also been shown to depend on E-boxes in cultured SMCs (29,30). These studies support important roles of E-boxes in transcriptional regulation of SMC-specific genes. However, analysis of E-box-binding proteins using in vitro assays such as EMSAs and in vitro footprinting may be inconclusive, because binding activity of some basic helix-loop-helix factors including c-Myc cannot be detected using these in vitro assays even when these factors do bind target E-boxes in chromatin (31). In contrast, USF is known to be detected easily in nuclear extracts prepared from various types of cells, and thus in vitro binding assays are biased toward detecting USF (31). In fact, uncertainty about whether transcription factors detected by in vitro binding assays actually control endogenous target genes in chromatin is an inherent problem of this type of assay. Many studies have shown major differences in transcription factor binding to target sites between in vitro assays and endogenous genes in chromatin (31)(32)(33), and the results of in vitro binding assays alone cannot be used as evidence for functionality. As such, we cannot conclude that USF binds the E-box of the endogenous SM-MHC gene. Nevertheless, results of in vivo footprinting assays showing occupancy of the endogenous SM-MHC E-box provide novel evidence for the possible function of E-boxes in control of expression of SMC-specific genes.
Whereas the function of E-boxes and myogenic regulatory factors in control of skeletal muscle-specific genes and skeletal muscle differentiation has been extensively studied, little is known regarding the function of E-boxes in control of SMC-specific genes. Thus, it would be of great importance to identify the proteins that bind E-boxes of SMC-specific genes.
Development of SMC Subtype-selective Promoters-One of the most significant implications of the present studies is that they establish the feasibility of engineering derivatives of the SM-MHC genomic sequence that function only in subsets of SMCs (34). Such derivatives could be of major utility for targeting expression of therapeutic agents to specific SMC subtypes and/or for studying the function of candidate genes in SMC subtypes in vivo using targeted knockout or overexpression systems. For example, on the basis of results of the present studies and our previous studies showing selective functionality of the CArG elements in subsets of SMCs, we would predict that a derivative SM-MHC promoter construct containing the sequence −4.2/+2.5 plus +7.5/+9 and mutation of the intronic CArG should be selectively expressed in bronchial SMCs with little activity in large arteries and coronary arteries. However, as discussed above, because of the functional redundancy and interplay between modules, some regulatory modules might exhibit functions that were not detected in the present studies. Thus, SM-MHC promoter derivatives need to be tested empirically for their functions under the conditions in which they would be used (adults versus embryos and integrated versus episomal genes). Nevertheless, results of the present studies provide the framework for future studies of complex reconstitution and tailoring of SM-MHC regulatory modules.
In summary, we have mapped transcriptional regulatory regions within the SM-MHC −4.2 to +11.6 kb sequence and revealed the complex modular cis-regulatory program of the SM-MHC gene in vivo. SMC-specific expression of the SM-MHC gene is not controlled by a single regulatory region but by the complex interplay of multiple positive- and negative-acting regulatory modules in SMC subtypes in vivo. Because of the SM-MHC gene's unique degree of SMC specificity and tight regulation during SMC differentiation, these results are likely to be broadly applicable to defining the transcriptional circuitry that controls cell type-specific gene expression during SMC differentiation.