Characterization of genomic structure and polymorphisms in the human carbamyl phosphate synthetase I gene☆
Introduction
Human carbamyl phosphate synthetase I (CPSI) (EC 6.3.4.16) is the rate-limiting enzyme that catalyses the first committed step of the hepatic urea cycle. The urea cycle is responsible for the removal of waste nitrogen produced by endogenous and exogenous protein metabolism. CPSI is highly tissue specific, with function and production limited to the liver and, in lesser amounts, the intestine. The 165 kDa CPSI proenzyme is produced in the cytoplasm and transported into the mitochondria, where it is cleaved into its mature 160 kDa form. Mature CPSI enzyme and its cofactor N-acetyl glutamate (NAG) catalyse the conversion of ammonia and bicarbonate to carbamyl phosphate (CP) with the expenditure of two ATPs (Rubio and Grisolia, 1981, Rubio et al., 1981).
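The overall reaction described above can be written as a single balanced scheme. The LaTeX below is only a sketch of the textbook stoichiometry: proton balance is omitted, and the `amsmath` package is assumed for `\xrightarrow` and `\text`.

```latex
\[
2\,\mathrm{ATP} + \mathrm{HCO_3^-} + \mathrm{NH_3}
\;\xrightarrow{\ \text{CPSI, NAG}\ }\;
2\,\mathrm{ADP} + \mathrm{P_i} + \text{carbamyl phosphate}
\]
```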
The CPSI gene is a highly conserved ancient gene with representatives from each of the well-defined domains of Bacteria, Archaea, and Eukarya (Schofield, 1993). An analysis of the mammalian CPSI coding sequence indicates that the CPSI gene encodes a protein that has arisen from a fusion of loci from two separate ancestral subunits that are found in yeast and Escherichia coli (Nyunoya et al., 1985a, van den Hoff et al., 1995). In bacteria, the smaller enzyme subunit is responsible for the catalytic transfer of the amide nitrogen from glutamine to the catalytic center for CP synthesis situated on the larger enzymatic subunit (Nyunoya et al., 1985a). In contrast, the human and rat CPSI enzymes are unable to process glutamine because they lack the cysteine residue that is essential for amidotransferase activity in the yeast and the bacterial enzymes (Nyunoya et al., 1985b, Rubio, 1993). In addition to housing the active sites for CP synthesis and a set of duplicated ATP-binding domains, the carboxy end of the protein on the large subunit is the site of the binding domain for NAG. The binding of NAG to CPSI is theorized to cause a conformational change in the enzyme that exposes the ATP-binding domains (Rubio, 1993). Although studies attempting to localize the CPSI ATP-binding sites have obtained varied results (Nyunoya et al., 1985a, Powers-Lee and Corina, 1986, Powers-Lee and Corina, 1987), the CPSI enzyme does have a sequence bearing a high degree of homology to known ATP/bicarbonate binding domains and other ATP-binding sites. Despite functional studies of the CPSI enzyme, the exact function of the NH2-terminal portion of the enzyme is not known.
The degree of deficiency in CPSI enzyme function reflects the severity of the underlying molecular defect (Summar et al., 1995). CPSI deficiency (CPSID) is inherited in an autosomal recessive mode and presents as either a devastating metabolic disease in neonates or a more insidious late-onset condition. In the present study we have determined the intron–exon organization of the human CPSI gene. We have also identified 14 polymorphisms in the gene that encodes the CPSI enzyme, one of which has implications for environmental toxicity. In addition to contributing to an evolutionary understanding of this gene, the information presented will facilitate studies of CPSI mutations and their role in the disruption of normal urea cycle function.
cDNA sequence
RNA was extracted from a normal human liver and reverse transcribed using oligo(dT) and primers derived from rat CPSI sequence. Subsequent PCR was performed using primers derived from rat sequence, and the resulting fragments were cloned and sequenced. 5′ and 3′ sequences were obtained by screening a human hepatic cDNA library for partial clones spanning these regions. Using this sequence as a base we have determined the cDNA sequence from over ten individuals in order to arrive at a
Results
Utilizing various experimental protocols, we report for the first time the complete structure of the human CPSI gene and identify fourteen polymorphisms that are located in its coding and non-coding regions.
Organization of the human CPSI gene
Our intron and exon sequence data show that CPSI adheres to the consensus of a ‘GT’ at the 5′ end of each intron and an ‘AG’ at the 3′ end. Additionally, 37 of the 38 exons end in a ‘G’, a higher frequency than the average for a terminal ‘G’ in mammalian exons. Perhaps this observation is not too unusual given that the highly conserved nature of the CPSI gene extends between its structure and that of its known non-mammalian bacterial and yeast
Acknowledgements
The authors wish to thank the NIH (ES09915) and the National Urea Cycle Disorders Foundation for their support of this project.
References (14)
- et al. Characterization and derivation of the gene coding for mitochondrial carbamyl phosphate synthetase I of rat. J. Biol. Chem. (1985)
- et al. Domain structure of rat liver carbamoyl phosphate synthetase I. J. Biol. Chem. (1986)
- et al. Photoaffinity labeling of rat liver carbamoyl phosphate synthetase I by 8-azido-ATP. J. Biol. Chem. (1987)
- et al. Carbamyl phosphate synthetase I of human liver. Purification, some properties and immunological cross-reactivity with the rat liver enzyme. Biochim. Biophys. Acta (1981)
- et al. Cloning and sequence of cDNA encoding human carbamyl phosphate synthetase I. Gene (1991)
- A survey on intron and exon lengths. Nucleic Acids Res. (1988)
- et al. The gene coding for carbamoyl phosphate synthetase I was formed by fusion of an ancestral glutaminase gene and a synthetase gene. Proc. Natl. Acad. Sci. USA (1985)
☆ The full working draft sequence is BAC clone NH0349G04, accession number AC008172.1. The CPSI coding sequence derived from our laboratory is at accession number AF154830. A previously published coding sequence is at accession number NM_001875. Supplementary data includes a list of primer sequences for primers listed in Table 2. Primer sequences are: U1119 (TACTGCTCAGAATCATGGC), U2712 (AGAGTTGTCTGAACCAAGCA), U4295 (CGGAAGCCACATCAGACTGG), U4926 (AATGGTGATCAAGGTAGGAA), L5025 (TGTCCTGAGTTTGCAGATAG), L5277 (TGGAGAGTGTGACTCCATCT), U5195 (TGTGACAGAGGCATTTAGAG), L5547 (GGAATGAACCTTACTTCCAA).
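For readers who want to reuse the primers listed above, a quick sanity check of each sequence can be scripted. The Python sketch below is not part of the original work: it copies the eight primer sequences from the footnote and reports their length and GC fraction. The helper names `gc_fraction` and `summarize` are hypothetical, and no melting temperature is estimated because that would require assumptions about reaction conditions the source does not give.

```python
# Primer sequences copied verbatim from the footnote above.
# The helper names below are hypothetical and used for illustration only.
PRIMERS = {
    "U1119": "TACTGCTCAGAATCATGGC",
    "U2712": "AGAGTTGTCTGAACCAAGCA",
    "U4295": "CGGAAGCCACATCAGACTGG",
    "U4926": "AATGGTGATCAAGGTAGGAA",
    "L5025": "TGTCCTGAGTTTGCAGATAG",
    "L5277": "TGGAGAGTGTGACTCCATCT",
    "U5195": "TGTGACAGAGGCATTTAGAG",
    "L5547": "GGAATGAACCTTACTTCCAA",
}


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def summarize(primers: dict) -> dict:
    """Map each primer name to (length, GC fraction rounded to 2 places)."""
    return {
        name: (len(seq), round(gc_fraction(seq), 2))
        for name, seq in primers.items()
    }


if __name__ == "__main__":
    for name, (length, gc) in summarize(PRIMERS).items():
        print(f"{name}: length={length} nt, GC={gc:.2f}")
```

Length and GC fraction are only a first check for transcription errors; they do not replace validation against the deposited sequences.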