The A20 cDNA Induced by Tumor Necrosis Factor α Encodes a Novel Type of Zinc Finger Protein

Isolation of the full-length cDNA followed by sequence analysis revealed that A20 codes for a novel zinc finger protein. Structural features suggest that this putative protein, which contains multiple Cys2/Cys2 finger motifs, defines a novel class of zinc finger proteins. Southern analysis supports the existence of other members of the class.

Tumor necrosis factor α (TNF) is a cytokine secreted predominantly by cells of the monocyte lineage following activation. The importance of TNF in the development and maintenance of inflammation is suggested by the wide range of proinflammatory activities attributed to TNF and the variety of disease states with which it has been associated (1).
Among the many tissue types which are responsive to TNF, the vascular endothelium has a fundamental role in mediating the proinflammatory activities of TNF. Positioned at the interface between the bloodstream and solid body tissues, the endothelium actively controls the flux of leukocytes into an inflammatory nidus. The endothelium has been shown to respond to TNF by developing procoagulant activities (2-4) and by actively increasing its adhesive interactions with circulating leukocytes through the expression of adhesive molecules such as intercellular adhesion molecule-1 (5), endothelial leukocyte adhesion molecule-1 (6), and vascular cell adhesion molecule-1 (7). Our earlier work has shown that neutrophil chemotactic factor and monocyte chemotactic factor are expressed as primary response genes in endothelial cells.
In an effort to discover regulatory factors which function to initiate the cascade of endothelial responses to TNF, we previously used a strategy involving differential hybridization to identify several cDNA clones induced as primary response genes by TNF in human umbilical vein endothelial cells (8, 9). Primary response genes are those which are induced rapidly and profoundly by a stimulus in the absence of protein synthesis. In other systems it has been demonstrated that genes induced as part of this initial wave of expression encode regulatory proteins that are either transcriptional factors or paracrine factors. Thus, with this type of approach we expected to obtain clones which would encode, among other things, transcriptional factors regulating the expression of the many genes TNF is likely to influence. One of the clones obtained, designated A20, has a cognate mRNA approximately 4 kb in size which is rapidly (15 min) and transiently induced following TNF treatment. A20 was not identical to any known gene product based on sequence analysis of a partial length cDNA clone.
With this report we extend our characterization of the TNF-responsive A20 cDNA. Analysis of the full-length cDNA has revealed an open reading frame coding for a protein composed of 790 amino acid residues which is not identical to any other previously described protein. Within this putative protein, two types of repeated elements are found, including seven novel zinc finger motifs which may serve to define a new class of zinc finger proteins. Southern analysis suggests that the A20 gene is a member of a large family of related genes.
Blots were washed with increasing stringency ranging from 2 × SSC at 42 °C to 0.1 × SSC at 65 °C.

RESULTS AND DISCUSSION
The A20 cDNA clone isolated previously represented approximately 1.5 kb of the predicted 4.0-kb full-length mRNA. To obtain a full-length cDNA clone, TNF-induced cDNA libraries were screened using the 32P-labeled 1.5-kb partial length A20 cDNA as a probe. Hybridizing clones were isolated and the cDNA inserts sized by appropriate restriction mapping. This analysis yielded four additional cDNAs, each of which was sequenced to determine the overlap between the clones.
The nucleotide and predicted protein sequences of A20 are displayed in Fig. 1. The length of the cloned sequence (4440 nucleotides) is in good agreement with the size predicted for a full-length clone by Northern analysis. An open reading frame of 2370 nucleotides is indicated. The initiator methionine is encoded beginning at nucleotide 67, and this sequence is in complete agreement with Kozak's translation initiation consensus sequence (17). The cDNA encodes a protein predicted to contain 790 amino acids and contains a 5'-untranslated region of 66 nucleotides and a 3'-untranslated region of 2001 nucleotides. Within the 3'-untranslated region, four copies of the consensus polyadenylation signal AATAAA are found along with four copies of the sequence ATTTA, which is present in the mRNA of various oncogenes and cytokines and confers message instability (18, 19). Data base searches (as described under "Experimental Procedures") using the A20 protein sequence gave uniformly negative results, i.e. there were no statistically significant matches to any known finger protein or to any other protein. However, intrasequence comparisons (see below) showed that A20 has a highly significant internally repetitive structure.
FIG. 2. Program CMPSEQ84 (26) was used to compute a self-comparison matrix with a window of 43 residues using the PAM 250 values (17) for scoring. A plotting threshold of >2.5 standard deviations above the mean for all scores was used. This threshold corresponds to a p value of <0.001, although the highest scoring diagonals achieved scores with p values of <10^-…. For comparative purposes, the highest scoring repeats in Xenopus transcription factor IIIA have scores with p values of <10^-… (27).
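The coordinates quoted for the cDNA can be checked with simple arithmetic, and the 3'-untranslated region signals can be counted by a plain substring scan. The following is a minimal sketch, not the authors' procedure. It assumes that the 2370-nucleotide open reading frame excludes the stop codon, and the helper `count_motif` and the sequence `toy_utr` are hypothetical illustrations (the toy sequence is made up and is not the A20 sequence).

```python
# Coordinates quoted in the text (1-based nucleotide positions).
CDNA_LENGTH = 4440   # total length of the cloned sequence
ORF_START = 67       # first nucleotide of the initiator ATG
N_RESIDUES = 790     # predicted protein length
UTR3_LENGTH = 2001   # 3'-untranslated region

# Assumption: the stop codon is counted outside the 2370-nt open reading frame.
STOP_CODON = 3

utr5_length = ORF_START - 1    # nucleotides 1..66
orf_length = 3 * N_RESIDUES    # one codon per residue
total = utr5_length + orf_length + STOP_CODON + UTR3_LENGTH


def count_motif(seq, motif):
    """Count occurrences of motif in seq, allowing overlapping hits."""
    seq, motif = seq.upper(), motif.upper()
    last_start = len(seq) - len(motif)
    return sum(seq.startswith(motif, i) for i in range(last_start + 1))


# Made-up 3'-UTR fragment for illustration only (NOT the A20 sequence).
toy_utr = "GGATTTATTTACCAATAAAGGCATTTAGG"

print(utr5_length, orf_length, total)     # 66 2370 4440
print(count_motif(toy_utr, "AATAAA"))     # polyadenylation signal hits
print(count_motif(toy_utr, "ATTTA"))      # instability motif hits
```

Under the stated assumption the four segments add up to the 4440 nucleotides reported for the clone. Overlapping hits are counted on purpose, because adjacent ATTTA pentamers commonly share a nucleotide.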
A20 was compared with itself using the RELATE program (20), and a score of 20.3 S.D. units was obtained. For comparative purposes, an S.D. score of ≥3.0 is generally indicative of significant internal repeats, and RELATE analysis of the Xenopus transcription factor IIIA gives an S.D. score of 21.9 under similar conditions. Furthermore, segment displacement frequencies computed by RELATE indicated that the repeat unit was approximately 43-46 residues in length. To visualize the organization of repeats within the A20 protein, a self-comparison matrix was computed (Fig. 2). This analysis reveals that greater than 50% of the A20 protein is composed of repeated elements and that the COOH-terminal half of the protein contains nine tandem copies of this element forming a continuous domain extending from approximately residue 386 to the carboxyl terminus.
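The idea behind a windowed self-comparison matrix can be sketched in a few lines. This is an illustration only and not a reimplementation of CMPSEQ84 or RELATE: it scores window pairs by simple residue identity instead of the PAM 250 values used for Fig. 2, and the function names and the sequence `toy_protein` are hypothetical. Only the 2.5 standard deviation plotting threshold and the default window of 43 residues are taken from the Fig. 2 legend.

```python
from statistics import mean, pstdev


def window_scores(seq, window):
    """Score every pair of windows (i < j) by the number of identical residues.

    Identity scoring is a stand-in chosen for brevity; the published analysis
    used PAM 250 substitution scores.
    """
    n_windows = len(seq) - window + 1
    scores = {}
    for i in range(n_windows):
        for j in range(i + 1, n_windows):
            a = seq[i:i + window]
            b = seq[j:j + window]
            scores[(i, j)] = sum(x == y for x, y in zip(a, b))
    return scores


def significant_pairs(seq, window=43, n_sd=2.5):
    """Return window pairs scoring more than n_sd S.D. above the mean score."""
    scores = window_scores(seq, window)
    values = list(scores.values())
    if not values:
        return []
    cutoff = mean(values) + n_sd * pstdev(values)
    return sorted(pair for pair, score in scores.items() if score > cutoff)


# Made-up sequence with one 10-residue block repeated at offsets 10 and 30
# (0-based); it is NOT the A20 sequence.
toy_protein = "MKTAYIAKQR" + "CPNGCGHWLE" + "DFSVTPQRNM" + "CPNGCGHWLE" + "YKLIVA"

hits = significant_pairs(toy_protein, window=10)
print((10, 30) in hits)
```

Pairs that pass the cutoff line up along a diagonal offset by the repeat spacing, which is what the diagonals on a plotted matrix represent. Tandem repeats therefore show up as a series of parallel diagonals.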
Alignment of the repetitive sequences demonstrates a striking conservation of cysteine residues (Fig. 3B). The cysteines in these repeats form a pattern similar to the Cys2/Cys2 finger motif found in the Cx class of zinc finger proteins (21). Within the putative A20 protein, this finger motif is present 6 times in the form of Cys-X4-Cys-X11-Cys-X2-Cys and once in the form of Cys-X2-Cys-X11-Cys-X2-Cys (Figs. 1 and 3B). Besides the nearly invariant occurrence and positioning of cysteines within these repeats, other residues are conserved to a lesser extent. This is illustrated in Fig. 3C by consensus sequences derived from three different cut-off frequencies.
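A cysteine spacing pattern of this kind translates directly into a regular expression. The sketch below is a hypothetical helper and not the method used in the paper, where the repeats were located by self-comparison and aligned with MSA. The spacer lengths are parameters, with defaults set to the Cys-X4-Cys-X11-Cys-X2-Cys form, and `toy` is a made-up sequence rather than A20.

```python
import re


def finger_pattern(first_spacer=4, loop=11, last_spacer=2):
    """Build a regex for Cys-X(first)-Cys-X(loop)-Cys-X(last)-Cys.

    X is any residue. The lookahead wrapper lets overlapping motifs be found.
    """
    body = "C.{%d}C.{%d}C.{%d}C" % (first_spacer, loop, last_spacer)
    return re.compile("(?=(%s))" % body)


def find_fingers(protein, **spacing):
    """Return (1-based start, matched segment) for every motif occurrence."""
    pattern = finger_pattern(**spacing)
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein)]


# Made-up sequence carrying one motif with the default spacing (NOT A20).
toy = "MAS" + "C" + "KRGA" + "C" + "AFYGSPENKGF" + "C" + "TL" + "C" + "FLEQ"

print(find_fingers(toy))                   # [(4, 'CKRGACAFYGSPENKGFCTLC')]
print(find_fingers(toy, first_spacer=2))   # []
```

A variant finger with a different first spacer can be searched for by changing `first_spacer` while leaving the 11-residue loop and the final X2 spacer unchanged.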
A second unrelated repeat is also seen on the self-comparison matrix plot. Two copies of this element are found between residues 286 and 356 (Fig. 3A). Furthermore, the blank left lower and right upper quadrants on the matrix analysis reveal that the NH2-terminal domain (residues 1-285) contains no repetitive sequences and is unrelated to the COOH-terminal half of the molecule. These structural distinctions are consistent with the possibility that separable functional domains exist within the A20 protein and provide a basis for future mutational analysis.
FIG. 3. The alignments depicted in panels A and B were derived in the following manner. Start and stop positions for the tandem repeat blocks were estimated from the locations of diagonals on the self-comparison matrix in Fig. 2. Start and stop positions for individual repeat units were then estimated from segment displacements reported by the RELATE program, and individual repeats were aligned manually. Finally, to optimize the alignments, each individual repeat was treated as a separate sequence, and the entire set was simultaneously aligned using program MSA (16). Panel C illustrates the consensus sequences obtained from the repeats in panel B derived at three different cut-off frequencies. Amino acid residues are color-coded according to hydropathy index and charge: hydrophobic, green; acid/amide, red; basic, blue; small and neutral, black. Proline and cysteine are colored yellow and violet, respectively, to indicate their unique structural potentials.
FIG. 4. Southern blot demonstrating existence of A20-related genes. Human genomic DNA was digested with the indicated restriction enzymes, resolved on a 0.6% agarose gel, and transferred onto nitrocellulose. The blots were probed with a 32P-labeled fragment of the A20 cDNA which spanned from nucleotide 1371 to nucleotide 1793.
Comparison of the putative A20 finger protein with previously described Cx finger proteins (21-25) reveals two unique and important distinctions. The first is that A20 contains multiple repeated motifs of the Cys2/Cys2 type. Other previously described proteins of this class do not contain more than two nonequivalent fingers. The second distinction is the intercysteine spacing of X4, X11, X2. This is the first report of a Cys2/Cys2 motif in which the finger loop domain is composed of 11 amino acid residues. Furthermore, because an absolute characteristic of Cx finger proteins has been the presence of no more than two finger motifs, these findings suggest that A20 is a member of a novel class of finger proteins. This class would be defined by the presence of multiple related Cys2/Cys2 finger motifs, each containing a finger-loop domain of 11 amino acids.
The discovery of this novel type of finger motif suggests the possibility that the A20 gene is a member of a new family of zinc finger coding genes related to each other through conservation of this new Cys2/Cys2 motif. To test this hypothesis, Southern analysis of human genomic DNA was carried out using as a probe a fragment of cDNA between nucleotides 1371 and 1793, corresponding to amino acids 436 and 576 and including two finger domains. Hybridization followed by a low stringency wash (1 × SSC, 55 °C) resulted in the detection of multiple bands with a wide range of signal intensities in each restriction enzyme digestion (Fig. 4). At increased stringency (0.1 × SSC, 55 °C) many of the bands initially apparent were lost or were markedly reduced. At high stringency (0.1 × SSC, 68 °C) a single prominent band remained visible in each digest. This evidence supports the hypothesis that a family of genes exists which is related to A20 by nucleotide sequence homology through its finger motifs. Additionally, the presence of a single band at high stringency suggests that A20 itself is encoded by a single gene.
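Converting between cDNA nucleotide positions and residue numbers is a small calculation once the start of the reading frame is fixed. The sketch below assumes, as stated in the text, that the initiator codon begins at nucleotide 67 and that numbering is 1-based. The function names are hypothetical.

```python
ORF_START = 67  # first nucleotide of the initiator codon, from the text


def codon_span(residue, orf_start=ORF_START):
    """Return the (first, last) nucleotide positions of a residue's codon."""
    if residue < 1:
        raise ValueError("residue numbers are 1-based")
    first = orf_start + 3 * (residue - 1)
    return first, first + 2


def residue_at(nucleotide, orf_start=ORF_START):
    """Return the residue whose codon contains the given nucleotide."""
    if nucleotide < orf_start:
        raise ValueError("position lies upstream of the reading frame")
    return (nucleotide - orf_start) // 3 + 1


print(codon_span(436))   # (1372, 1374)
print(codon_span(576))   # (1792, 1794)
print(codon_span(790))   # (2434, 2436)
```

With this numbering the codons for residues 436 through 576 occupy nucleotides 1372 through 1794, and the last codon of the 790-residue protein ends at nucleotide 2436, which is consistent with a 66-nucleotide leader followed by a 2370-nucleotide reading frame.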
In summary, we have identified a novel TNF-inducible gene with structural properties defining a unique class of zinc finger proteins. Southern analysis suggests that other genes may exist which belong to this same class. Given the association between other zinc finger proteins and genetic regulation, it is plausible that A20 may function as a TNF-inducible transcriptional factor.