A novel allelic variant of the human TSG-6 gene encoding an amino acid difference in the CUB module. Chromosomal localization, frequency analysis, modeling, and expression.

Tumor necrosis factor-stimulated gene-6 (TSG-6) encodes a 35-kDa protein, which is comprised of contiguous Link and CUB modules. TSG-6 protein has been detected in the articular joints of osteoarthritis (OA) patients, with little or no constitutive expression in normal adult tissues. It interacts with components of cartilage matrix (e.g. hyaluronan and aggrecan) and thus may be involved in extracellular remodeling during joint disease. In addition, TSG-6 has been found to have anti-inflammatory properties in models of acute and chronic inflammation. Here we have mapped the human TSG-6 gene to 2q23.3, a region of chromosome 2 linked with OA. A single nucleotide polymorphism was identified that involves a non-synonymous G --> A transition at nucleotide 431 of the TSG-6 coding sequence, resulting in an Arg to Gln alteration in the CUB module (at residue 144 in the preprotein). Molecular modeling of the CUB domain indicated that this amino acid change might lead to functional differences. Typing of 400 OA cases and 400 controls revealed that the A(431) variant identified here is the major TSG-6 allele in Caucasians (with over 75% being A(431) homozygotes) but that this polymorphism is not a marker for OA susceptibility in the patients we have studied. Expression of the Arg(144) and Gln(144) allotypes in Drosophila Schneider 2 cells, and functional characterization, showed that there were no significant differences in the ability of these full-length proteins to bind hyaluronan or form a stable complex with inter-alpha-inhibitor.

Tumor necrosis factor-stimulated gene-6 (TSG-6) encodes an ~35-kDa secreted protein (1) that is likely to have roles in extracellular matrix remodeling and leukocyte migration (2,3). The production of TSG-6 is tightly controlled, with little or no constitutive expression in adult tissues, and is induced during inflammatory disease (e.g. arthritis; see below) as well as in normal "inflammation-like" processes (e.g. cumulus-oocyte complex expansion prior to ovulation (4,5) and cervical ripening (6)).
TSG-6 protein has been found in the synovial fluids of patients with different forms of arthritis, including osteoarthritis (OA) and rheumatoid arthritis (RA), but is not detectable in individuals without known joint disease (7); the highest levels of protein have been seen in RA, but similar levels were detected in some patients with OA (7,8). TSG-6 has been immunolocalized to both cartilage and synovium in OA and RA but was not detected in normal samples, making it likely that the source of this protein is the joint tissues themselves (3). For example, in OA cartilage, the majority of chondrocytes expressed TSG-6, usually with extensive staining in the surrounding matrix. In addition, TSG-6 has been detected in cartilage of the STR/ort mouse that develops a natural form of OA (9). Interestingly, TSG-6 protein was detectable in the chondrocyte pericellular matrix before OA lesions developed, with an up-regulation of TSG-6 mRNA as the disease progressed.
In vitro studies have shown that TSG-6 is secreted from a variety of cell types found in articular joints (including chondrocytes, synoviocytes, and vascular smooth muscle cells) in response to inflammatory mediators and growth factors (reviewed in Refs. 2,3). For example, TSG-6 expression (mRNA and protein) is induced in human chondrocytes, from macroscopically normal cartilage, by interleukin-1β (IL-1β), tumor necrosis factor, and transforming growth factor-β1 (10). An mRNA fingerprinting technique identified TSG-6 as one of the major species to be up-regulated (~6-fold) in response to IL-1β in chondrocytes obtained from OA patients undergoing joint surgery (11).
The mature human TSG-6 protein is 260 residues in length and consists of an N-terminal sequence of 19 amino acids followed by contiguous Link and CUB modules (residues 37-128 and 129-250, respectively, in the preprotein (1)) and a 27-amino acid C-terminal sequence. The Link module of human TSG-6, for which a tertiary structure has been determined (12), has been shown to interact with components of cartilage extracellular matrix, i.e. hyaluronan (HA), chondroitin-4-sulfate, and aggrecan (13,14); the position of the HA-binding site on the Link module has been localized recently by NMR spectroscopy (15) and site-directed mutagenesis (16). At present, the ligand binding specificity of the TSG-6 CUB module is unknown; in other proteins the CUB module (defined as Complement subcomponents C1r/C1s, Uegf, Bmp1 (17)) has been implicated in protein·protein (18) and protein·carbohydrate interactions (19).
Full-length human TSG-6 (expressed in insect cells) has been shown to be a potent inhibitor of neutrophil migration in a mouse air-pouch model of acute inflammation (20). Recombinant TSG-6 was also found to ameliorate collagen-induced arthritis in mice, where there was a large reduction in pannus formation and cartilage erosion (21), and to have a chondroprotective effect in a proteoglycan-induced model of arthritis (22). In vitro, TSG-6 can form a stable (probably covalent) ~120-kDa complex with inter-α-inhibitor (IαI); TSG-6·IαI complexes of this size have been detected in vivo (e.g. in RA synovial fluids (7) and in expanded cumulus-oocyte complexes (5, 23)), where they are likely to be involved in stabilizing extracellular matrices rich in HA. In addition, TSG-6 is able to potentiate the anti-plasmin activity of IαI (20), although it is not clear whether formation of the TSG-6·IαI complex is necessary for this activity. It has been hypothesized that the inhibition of neutrophil migration by TSG-6 is due to its effect on the plasmin network (2,20), but this remains to be established.
The TSG-6 gene has been mapped to human chromosome 2 but was not assigned to the p or q arms (24). Previously, as part of a genome-wide screen for OA susceptibility loci (25,26), we obtained evidence for a linkage across a broad region of chromosome 2q ranging from 2q14.1 through to 2q31, with a maximum multipoint LOD score of 1.22, which increased to 2.19 in families that were concordant for hip-only disease. Therefore, it was thought possible that TSG-6 could represent a susceptibility gene for OA, especially given its up-regulation in OA and its biological activities in vivo (described above).
Here we have mapped the TSG-6 gene to 2q23.3 by radiation hybrid mapping and subsequently typed a single nucleotide polymorphism (SNP), involving a non-synonymous G → A transition (that results in an Arg to Gln change in the CUB module), in a panel of 400 OA cases and 400 controls. The effect of this coding change on TSG-6 function was investigated by the expression of the full-length allotypes in insect cells and analysis of their relative abilities to bind HA and form a stable complex with IαI.
Cloning and Sequencing of Human TSG-6 -First-strand cDNA was reverse-transcribed from human primary osteoblast total RNA (kindly provided by Dr. Bev Fermor) and used to amplify full-length TSG-6 (nucleotides 1-831 in Ref. 1) for 30 cycles of 94°C, 57°C, and 72°C (1 min each) with Tli polymerase (Promega). The forward and reverse primers used, 5′-TATGGTACCATGATCATCTTAATTTACTTATTTCTC-3′ and 5′-ATATCTAGATTATTATAAGTGGCTAAATCTTCCAGC-3′, introduced KpnI and XbaI restriction sites (underlined), respectively, allowing the product to be cloned into KpnI/XbaI-cut pAcCL29-1 vector (27). Two internal sequence primers were used (5′-GCCTATTGCTACAACCC-3′ and 5′-GCCAGTAGCAGATTTGG-3′), in addition to the PCR primers, to generate contiguous sequence data on both strands of the insert from three separate clones; sequencing was performed on an Applied Biosystems 377 DNA analyzer.
Amplification-Refractory Mutation System-ARMS analysis, a PCR-based method for detecting single base changes (see Ref. 28), was carried out on genomic DNA from 38 HLA homozygous Epstein-Barr virus-transformed B lymphoblastoid cell lines (10th International Histocompatibility Workshop). Each DNA sample was analyzed in two (allele-specific) reactions, both of which contained control primers A and B (29), which amplify a 360-bp region of the human α-1-antitrypsin gene, and a common TSG-6 reverse primer (5′-TCTCCACAGTATCTTCCCACAAAGCCATGG-3′). In addition, one reaction included the "g-primer" (5′-GGAGTGTGGTGGCGTCTTTACAGATCCAAAGAG-3′) and the other the "a-primer" (5′-GGAGTGTGGTGGCGTCTTTACAGATCCAAAGAA-3′), which can each amplify a product of 236 bp, in combination with the reverse primer, if the appropriate sequence (G 431 or A 431 , respectively) is present. All reactions contained 1 μg of DNA, 1.5 mM MgCl2, 120 μM each dNTP, 1.25 mM reverse primer, 1.25 mM g-primer or a-primer, 0.12 mM control primers, and 1 unit of Taq Polymerase in a 50-μl reaction. Cycling conditions were: 94°C for 5 min, followed by 30 cycles of 1 min each at 94°C, 61°C and 72°C, followed by 10 min at 72°C. Reaction mixtures were analyzed by agarose gel electrophoresis.
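The allele specificity of the ARMS assay rests on the two forward primers differing only at their 3′-terminal base, which corresponds to nucleotide 431. The short Python sketch below simply compares the two primer sequences as printed above (with line-break hyphens removed); the function name `mismatch_positions` is a hypothetical helper introduced for illustration and is not part of the original method.

```python
# ARMS forward primers as printed in the text (line-break hyphens removed).
G_PRIMER = "GGAGTGTGGTGGCGTCTTTACAGATCCAAAGAG"  # 3' end matches G at nt 431
A_PRIMER = "GGAGTGTGGTGGCGTCTTTACAGATCCAAAGAA"  # 3' end matches A at nt 431


def mismatch_positions(p, q):
    """Return the 1-based positions at which two equal-length primers differ."""
    if len(p) != len(q):
        raise ValueError("primers must have equal length")
    return [i + 1 for i, (a, b) in enumerate(zip(p, q)) if a != b]


diffs = mismatch_positions(G_PRIMER, A_PRIMER)
print(len(G_PRIMER), diffs)
```

Under these sequences the only difference is the final base, so extension from the 3′ end is expected only when the template carries the matching nucleotide.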
Sequencing of PCR Products-A 229-bp PCR product was amplified from three genomic DNA samples (identified as either heterozygous GA, homozygous AA, or homozygous GG from ARMS analysis) using primers (forward: 5′-CAAAGGAGTGTGGTGGCGTCTTTAC-3′; reverse: 5′-CTTCCCACAAAGCCATGGACATCAT-3′) that flank nucleotide position 431 of TSG-6. The products were purified on agarose gels and sequenced using the reverse primer.
Homology Modeling-The CUB module from human TSG-6 (amino acids 129 -250 in Ref. 1 with an Arg at position 144) was modeled using the program Modeler4 (30) on the basis of the co-ordinates of three spermadhesins (boar PSP-I and PSP-II (31) and bovine aSFP (32)) and a multiple sequence alignment generated with Multalign (AMPS package (33)), to which minor adjustments were made by eye. One hundred independent models were generated, and Procheck (34) and WhatCheck (35) were used to analyze the six models with the lowest molecular probability density functions. The model with the best overall stereochemistry and energies was chosen for refinement. XPLOR version 3.8 (36) was used to add hydrogen atoms and disulfide bonds and to carry out energy minimization and molecular dynamic simulations with the CHARMm22 force field (37) as described previously (16).
Association Analysis of a TSG-6 Single Nucleotide Polymorphism (SNP) with Severe OA-Association analysis was performed on a proband/spouse case-control cohort; all cases (probands) were ascertained through the Nuffield Orthopaedic Centre in Oxford. Cases had undergone total joint replacement of the hip and/or the knee for primary OA. Patients who underwent total joint replacement secondary to other factors, such as fracture or RA, were excluded. The primary status was supported by clinical, radiological, operative, and histological findings and has been described in detail elsewhere (25). Four hundred cases were studied (Tables I and II). Controls (spouses) had not undergone any joint-replacement surgery or required clinical treatment for OA. During ascertainment, if a spouse had any evidence of symptomatic OA, then that spouse and their affected partner (the case) were excluded from the study. In this way, we identified a case group of severely affected individuals and an age-matched control group in which clinical OA had a very low frequency. All cases and controls were of Caucasian origin and informed consent was obtained from all subjects.
The SNP (at nucleotide 431) does not affect a restriction site. Thus, to create a restriction site we designed a PCR forward primer that spanned nucleotides 396-430 of human TSG-6 (1) in which a change from A to C was introduced at position 428. Therefore, the PCR product from the G 431 allele contains a BstUI site (CGCG) whereas the product from the A 431 allele does not, which allows the SNP to be typed by BstUI restriction analysis; there are no native BstUI sites in this region of the TSG-6 sequence. The sequence of the forward primer is 5′-ACGATAAAGGAGTGTGGTGGCGTCTTTACAGATCCAACGC-3′; the engineered change is indicated in boldface, whereas the underlined portion represents a sequence that was added to balance the number of AT to CG nucleotides. The reverse primer is the same as that used in the ARMS analysis. PCR amplification was performed as described above for the radiation hybrid analysis except that a final concentration of 2 mM MgCl2 and an annealing temperature of 58°C were used. The amplified product was digested with BstUI (New England BioLabs), using the manufacturer's recommended conditions, and analyzed by gel electrophoresis on a 3% (w/v) agarose gel. A heterozygous individual was always included in each set of samples to verify that the restriction enzyme was active.
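The logic of the engineered BstUI site can be illustrated with a minimal in silico digest. The sketch below is not the authors' procedure: it assumes that BstUI cuts its CGCG site bluntly in the middle (CG^CG), takes the 245-bp uncut product size from the legend to Fig. 1C, and pads the amplicon downstream of nucleotide 431 with N because that sequence is not given in the text. The helper names `amplicon` and `digest` are hypothetical.

```python
BSTUI_SITE = "CGCG"   # BstUI recognition sequence
CUT_OFFSET = 2        # assumed blunt cut in the middle of the site: CG^CG

# Forward primer as printed in the text (ends at nucleotide 430).
FORWARD_PRIMER = "ACGATAAAGGAGTGTGGTGGCGTCTTTACAGATCCAACGC"
AMPLICON_LENGTH = 245  # uncut product size quoted in the Fig. 1C legend


def amplicon(allele_base):
    """Primer + allele base at nt 431 + N padding (downstream sequence unknown)."""
    padding = "N" * (AMPLICON_LENGTH - len(FORWARD_PRIMER) - 1)
    return FORWARD_PRIMER + allele_base + padding


def digest(seq):
    """Return fragment lengths after cutting at every BstUI site."""
    fragments, start = [], 0
    pos = seq.find(BSTUI_SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(cut - start)
        start = cut
        pos = seq.find(BSTUI_SITE, pos + 1)
    fragments.append(len(seq) - start)
    return fragments


for base in "GA":
    print(base, digest(amplicon(base)))
```

With these assumptions only the G 431 product contains CGCG, and cutting it releases a 206-bp fragment, in agreement with the band sizes quoted for Fig. 1C.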
Allele and genotype distributions between cases and controls were compared by χ² test using standard contingency table analysis. For each stratification analysis, female cases were compared with female controls and male cases with male controls.
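For readers unfamiliar with contingency table analysis, the following sketch shows how a Pearson χ² statistic could be computed for a cases-versus-controls genotype table. It is an illustration only: the counts in `example` are hypothetical placeholders and are NOT taken from this study, and the function names are assumptions. For a 2 × 3 table there are 2 degrees of freedom, for which the χ² survival function has the closed form exp(−x/2).

```python
import math


def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence; assumes no empty row or column.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def p_value_2dof(stat):
    """Exact upper-tail probability for chi-square with 2 degrees of freedom."""
    return math.exp(-stat / 2.0)


# HYPOTHETICAL counts (not study data): rows = cases, controls; columns = GG, GA, AA.
example = [[10, 90, 300],
           [10, 110, 280]]
stat, dof = chi_square(example)
print(round(stat, 3), dof, round(p_value_2dof(stat), 3))
```

A 2 × 2 allele table would have 1 degree of freedom, for which a different tail formula (or a statistics library) would be needed.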
Expression of TSG-6 R and Q Allotypes in Schneider 2 Cells-The full-length coding sequence of human TSG-6 (A 431 allele encoding Gln at amino acid 144), including the signal sequence and stop codon, was excised from the pAcCL29-1 vector by digestion with KpnI/XbaI and cloned into the corresponding sites in the Drosophila Expression System (DES) vector pMT/V5-His B (Invitrogen). This plasmid (designated pDES_TSG6_Q; verified as having the expected DNA sequence) was used to generate a second DES vector, representing the G 431 allele (encoding Arg at residue 144), by mutagenesis using the Transformer site-directed mutagenesis kit (CLONTECH) according to the manufacturer's instructions (see Refs. 16,38). The mutation primer (5′-AGATCCAAAGCGAATTTTTAAATCTC-3′; mutated residue in boldface) and selection primer (5′-AGAGGGCCCTAGGTTCGAAGG-3′), which changes a unique SacII site in the polylinker of the DES vector to AvrII (underlined), were both synthesized with 5′-phosphate groups. A DES plasmid containing the desired mutation and no other changes was identified by DNA sequencing and denoted pDES_TSG6_R.
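The relationship between nucleotide 431 and residue 144 can be checked with simple codon arithmetic. The sketch below assumes that coding nucleotide 1 is the first base of the initiator codon, and it reads the Arg codon (CGA) from the mutation primer above (…AAAGCGAATT…); both points are inferences from the text rather than statements in it, and the helper `codon_index` is hypothetical.

```python
# Only the two codons needed here (standard genetic code).
CODON_TABLE = {"CGA": "Arg", "CAA": "Gln"}


def codon_index(nt_position):
    """Return (codon number, position within codon), both 1-based."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


codon_number, offset = codon_index(431)

g_codon = "CGA"  # G 431 allele, as inferred from the mutation primer
# Apply the G -> A transition at the affected position of the codon.
a_codon = g_codon[:offset - 1] + "A" + g_codon[offset:]

print(codon_number, offset, CODON_TABLE[g_codon], CODON_TABLE[a_codon])
```

Nucleotide 431 falls at the second position of codon 144, so the G to A transition converts CGA (Arg) to CAA (Gln), consistent with the Arg to Gln change described in the text.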
Stable transfectants were generated for each of the two DES plasmids by cotransfection of Schneider 2 cells (a cell line derived from Drosophila melanogaster (39)) with a vector carrying the hygromycin B resistance gene (pCoHygro), and hygromycin B-resistant cells were selected as a stable polyclonal population following the manufacturer's instructions. Heterologous protein expression was induced by adding CuSO4 to a final concentration of 500 μM ~5 h after cells had been transferred into serum-free medium. Culture supernatants (900 ml) were harvested after 66 h and clarified by centrifugation. To these, 20 ml of 50% (v/v) SP-Sepharose (Amersham Biosciences) in 300 mM sodium acetate, pH 4.0, were added; the mixture was incubated for 20 min, and the resin was then washed twice with the acetate buffer. The SP-Sepharose (in a sintered column) was washed with 50 ml of acetate buffer, and the recombinant protein was eluted with 5 × 10 ml of 20 mM MES·HCl, pH 6.5, 500 mM NaCl. Fractions 2 and 3 were combined and then loaded onto a Phenomenex 250 × 10-mm Jupiter C5 column (300 Å, 5 μm) at a flow rate of 3 ml/min. Initial conditions were maintained for 10 min, and then the protein was eluted by linear gradients of 20-55, 55-80, 80-95% B (in A) over 10, 20, and 5 min, respectively. The absorbance at 277 nm was monitored continuously, and TSG-6 was collected manually. The TSG-6 allotypes were then lyophilized, resuspended in water (at 1 mg/ml), and stored at −20°C; the protein concentrations were determined by amino acid analysis as described previously (16). The purified proteins were analyzed on 10% (w/v) Tris-Tricine SDS-PAGE (40) following reduction and alkylation with dithiothreitol and iodoacetamide, respectively.
Analysis of HA Binding-The HA-binding activities of the purified recombinant TSG-6 allotypes (human TSG-6R and TSG-6Q) were compared using a colorimetric assay that measures the binding of biotinylated-HA (12.5 ng/well) to protein-coated microtiter plates (0.25-32 pmol/well) at pH 6.0, as described previously (16); the interaction between TSG-6 and HA is maximal at pH 6.0 (14). All absorbance measurements (405 nm) were corrected by subtracting values from uncoated control wells.
Formation of TSG-6·IαI Complex-Purified TSG-6R and TSG-6Q (at 80 μg/ml final concentration, 2.7 μM) were each incubated in 20 mM HEPES·HCl, pH 7.5, 150 mM NaCl, 5 mM MgCl2 (final concentrations) with IαI (320 μg/ml final concentration, 1.3 μM), purified from human serum (41), at 37°C in a final volume of 50 μl; these conditions give maximum conversion of free TSG-6 into complex using purified IαI. At various time points (ranging from 30 s to 1 h) 2.5 μl of the TSG-6·IαI reaction mixtures was removed, treated immediately with gel sample buffer, and then analyzed by Western blotting; TSG-6 and IαI were incubated (for 60 min) individually under identical conditions as controls. Briefly, the samples were run under reducing conditions on 10% (w/v) Tris-Tricine SDS-PAGE and electroblotted onto Hybond-P membranes (Amersham Biosciences). Washes were performed in phosphate-buffered saline, containing 0.1% (v/v) Tween 20, after each stage of the Western blotting procedure, all of which were carried out at room temperature. The membranes were blocked overnight in phosphate-buffered saline containing 10% (w/v) milk powder, 0.1% (v/v) Tween 20, 0.02% (w/v) bovine serum albumin and washed thoroughly. They were then incubated with a 1:20,000 dilution of a rabbit anti-human antiserum (that recognizes the C-terminal 16 amino acids of human TSG-6 (6)) and, following washing, with a 1:5000 dilution of a horseradish peroxidase-conjugated donkey anti-rabbit IgG for 1 h each in 10% (v/v) blocking solution without bovine serum albumin. After a final wash, antibody binding was visualized by the enhanced chemiluminescence system, according to the instructions of the supplier (Amersham Biosciences).
... gene between the microsatellite loci D2S2275 and D2S2236 and at 2q23.3. Mouse TSG-6 has been mapped previously to the 28.4-31.6-centimorgan region of chromosome 2 (4), which is equivalent to 2q24.1-2q24.2 in humans (42).
Therefore, it can be seen that TSG-6 in the mouse and human are at similar but not identical chromosomal locations. In humans, the TSG-6 gene is within the 2q12-q35 region of chromosome 2 that we, and others, have found evidence for harboring an OA susceptibility locus (26,43).

Radiation Hybrid
Identification of an SNP in Human TSG-6 -Sequencing of TSG-6 inserts, derived from human osteoblast mRNA and cloned into the pAcCL29-1 vector, revealed a single nucleotide difference from the published sequence (1), with an A at position 431 rather than a G, leading to an Arg to Gln substitution at amino acid 144 in the sequence of the preprotein. ARMS analysis was then performed on 38 genomic DNA samples to investigate whether this sequence difference represented a genuine SNP or a PCR artifact. As shown in Fig. 1A, the ARMS assay can distinguish both G and A nucleotides at position 431. In this panel of samples, 37 out of 38 gave a product with the A-specific primer (1 GG, 11 GA, and 26 AA); direct sequencing of PCR products amplified from three samples (determined to be AA, GA, or GG) confirmed the accuracy of the ARMS typing (Fig. 1B). Therefore, it can be concluded that the sequence identified here (A 431 ) represents an allelic variant of the TSG-6 gene; this sequence has been deposited at the EMBL database (accession code AJ419936). It should be noted that data bank searching has revealed that the sequences AF086484 and XM_002762 also contain an A at this position, and these are identical to the sequence determined here, in the regions where they overlap (e.g. in XM_002762 this is the entire coding sequence).
PCR products amplified from the 5′-flanking regions of GG and AA homozygotes were found to have identical sequences (between −1316 and +68; numbered as in Ref. 24), indicating that there are no promoter polymorphisms in linkage disequilibrium with either of the 431 alleles; these sequences have been deposited at EMBL with accession codes of AJ413948 and AJ413949, respectively. There were some differences in the sequences determined here from that published previously by Lee et al. (24), with insertions of A, G, C, T, A, C, and TG between nucleotides −1292 and −1291, −867 and −866, −681 and −680, −678 and −677, −631 and −630, −472 and −471, and −53 and −52, respectively. However, none of these insertions are in regions that have been determined experimentally to be transcription factor-binding sites (44,45).
Modeling of the TSG-6 CUB Module-A model of the CUB module from human TSG-6 was constructed to estimate the structural location of residue 144 (Arg and Gln in the G and A alleles, respectively). Modeling was performed on the basis of the co-ordinates from three spermadhesins, each comprised of a single CUB module that has a structure related to the jellyroll fold (31,32), and the alignment in Fig. 2. In the alignment there are a number of insertions in the TSG-6 sequence (relative to the spermadhesins), but none of these occur in regions corresponding to secondary structural elements (31). The model that was chosen for energy minimization (from the 100 generated) had only 2.9% of residues in disallowed areas of the Ramachandran plot. In the final model the total energy and van der Waals terms were −1033 and −6 kJ/mol, respectively, indicating that the model was of good quality, with acceptable packing and backbone conformation.
From Fig. 2 it can be seen that the guanidinium group of Arg 144 is predicted to be on the surface of the CUB module; analysis with the program Naccess indicates that ~18% of the Arg side chain in the model structure is solvent-accessible. The Arg/Gln substitution at this position could influence the biological activities of TSG-6, because Arg side chains are longer than those of Gln and have a different charge state. In the model, Arg 144 is in close spatial proximity to the N-terminal amino acid of the CUB module (i.e. Asn 129 ; Cα-Cα distance of 5.1 Å), and thus to the C terminus of the preceding Link module (defined on the basis of the three-dimensional structure as residues 37-128 (12)). Therefore, it is possible that amino acid 144 could make contacts with amino acids on the Link module and modulate its functional activity. In this regard, we have shown previously that the binding of an HA octasaccharide to Link_TSG6 causes an alteration in side-chain dynamics of Asn 129 (15), although this residue is on the opposite face of the protein from the HA-binding surface (16). Residue 144 is also fairly close in space to two VXXDP motifs (residues 138-142 and 245-249; the Cα of Arg 144 is 11.8 Å and 6.6 Å from the Cα carbons of Val 138 and Val 245 , respectively). It has been suggested previously that such motif sequences may be involved in the formation of the stable 120-kDa complex between TSG-6 and IαI (see Ref. 46). Therefore, the TSG-6R and TSG-6Q allotypes could show differences in HA binding and complex formation with IαI.
FIG. 1. ... (g and a), one of which contained a primer that only amplifies G 431 (g-primer) whereas the other contained an A 431 -specific oligonucleotide (a-primer). The band at 360 bp is derived from the α-1-antitrypsin (ATT) gene and is included in each reaction as a PCR control. The 236-bp band is sequence-specific and indicates the presence of the G 431 and/or A 431 nucleotide in the TSG-6 sequence. In B, sequencing data (from the antisense strand) are shown for AA homozygous DNA (top panel), GA heterozygous DNA (middle), and GG homozygous DNA (bottom). The position of nucleotide 431 is indicated with an arrow. In C, BstUI restriction mapping is shown for three genomic DNA samples: PCR products that are refractory to digestion (245 bp) are derived from the A allele, whereas those that cut with this enzyme (206 bp) indicate a G allele.
Genotyping of OA Cases and Controls using BstUI Restriction Mapping-Initial analysis of 14 DNA samples genotyped using the ARMS assay (including AA, GA, and GG genotypes) determined that BstUI restriction mapping (where an engineered primer generates a BstUI site in the PCR product from the G but not the A allele; see Fig. 1C) produced identical results. The BstUI method is more suitable than ARMS for typing a large number of individuals, because only one set of primers and one gel lane is required per sample. We typed 400 OA patients and 400 controls for the TSG-6 SNP (Table III). The results from the control group showed that A 431 (identified here) is the major TSG-6 allele found in the Caucasian population with over 75% of individuals typed being homozygotes. Only 1.8% of the controls were homozygous for the G 431 sequence (identified previously (1)) indicating that this allele is relatively rare.
There was no significant difference (p ≤ 0.05) in the frequency of GG, GA, or AA genotypes between OA cases and controls. When we stratified the data (Table IV), there were no significant differences in the genotype or allele frequencies between female and male cases, between female and male controls, between female cases and female controls, or between male cases and male controls. There was an increase in the frequency of the AA genotype in knee cases (79.6%, versus 74.0% in hip cases and 74.0% in unstratified controls), and this was reflected by an increase in the frequency of the A 431 allele in the knee cases (89.4%, versus 86.1% in hip cases and 86.1% in unstratified controls); however, this was not significant (p = 0.27 for knee cases versus hip cases, and p = 0.25 for knee cases versus controls).
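The allele frequencies quoted above follow directly from the genotype frequencies, because each homozygote contributes two copies of an allele and each heterozygote one. As a consistency check, the sketch below recomputes the A 431 allele frequency in unstratified controls from the reported AA (74.0%) and GG (1.8%) genotype percentages, taking the GA percentage as the remainder; the function name is a hypothetical helper, and small deviations would be expected from rounding in the reported values.

```python
def allele_freq_from_genotypes(hom_allele_pct, hom_other_pct):
    """Allele frequency (%) from the two homozygote genotype percentages.

    The heterozygote percentage is taken as the remainder to 100%.
    """
    het_pct = 100.0 - hom_allele_pct - hom_other_pct
    return hom_allele_pct + het_pct / 2.0


# Unstratified controls: AA = 74.0%, GG = 1.8% (values quoted in the text).
a_allele_controls = allele_freq_from_genotypes(74.0, 1.8)
print(round(a_allele_controls, 1))
```

The result agrees with the 86.1% A 431 allele frequency reported for the unstratified controls.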
FIG. 2. Modeling of the CUB module from human TSG-6. In A, the TSG-6 CUB module (CUB_TSG6: residues 129-250 with an Arg at position 144) is aligned with bovine aSFP (residues 1-111), boar PSP-I (residues 1-108), and PSP-II (residues 3-113). Regions of β-sheet (b) secondary structure (SS) in the spermadhesins (31) are indicated below the alignment. Black and gray boxes with amino acids highlighted in white indicate sequence positions with identities or conservative replacements, respectively, in the four sequences. Gray boxes (with black text) denote sequence identities between TSG-6 and one or more of the spermadhesins. The position of residue 144 in TSG-6 is shown in white on a black circle. In B, the CUB_TSG6 model is shown in a Molscript (51) representation, where the right-hand molecule is rotated 90° relative to that on the left. The β strands, identified with DSSP (52), are numbered 1-9. The first β strand in CUB_TSG6 corresponds to strand 2 in the spermadhesins; residues 130-132 of TSG-6 are predicted to be a β-strand only when Cα geometry is used as the basis of defining the secondary structure. The N and C termini, which emerge from the same face of the molecule, are denoted by N and C, respectively. Arg 144 (R144; depicted in ball-and-stick mode) is the second residue of β strand 2 of CUB_TSG6. In C, the CUB_TSG6 model is shown in a spacefilling representation, in the same orientations as the structures in B.
Expression and Functional Analysis of the TSG-6R and TSG-6Q Allotypes-Preliminary experiments to express full-length TSG-6Q using a baculovirus system, with the pAcCL29-1 vector and the insect cell line Sf21, indicated that the protein was produced at low levels (~0.5 mg/liter, data not shown). Therefore, the entire coding sequence of TSG-6Q (i.e. including the signal sequence and stop codon) was subcloned into pMT/V5-His B (a DES vector), which was then used to generate a vector encoding the TSG-6R allotype by site-directed mutagenesis. Transient transfection of Schneider 2 cells indicated that both the TSG-6R and TSG-6Q proteins were secreted into the culture supernatant following induction with CuSO4 (as determined by Western blot analysis; data not shown). Stable transfectants of these allotypes were then established, and the recombinant proteins were expressed, as described under "Experimental Procedures." The proteins were initially purified from the culture supernatants using SP-Sepharose, in a similar way to that described by Wisniewski and colleagues (46) for baculovirus-expressed TSG-6R; SDS-PAGE analysis indicated that the majority of TSG-6R and TSG-6Q were each eluted in fractions 2 and 3 (data not shown). These fractions were then combined and run on reverse-phase high performance liquid chromatography. This second purification step led to TSG-6 proteins of high purity (Fig. 3), which were free of salt. Amino acid sequencing of the two allotypes expressed here revealed that they both had the same N terminus (i.e. WGFKDGIFHN) as that reported previously for protein produced in the baculovirus system (46). In addition, both the recombinant proteins have an apparent molecular mass of ~33 kDa, which is very similar to that reported for the native protein (7,46); matrix-assisted laser-desorption/ionization time of flight mass spectrometry on TSG-6Q indicates a mass of ~30.1 kDa (data not shown), and this value was used to determine protein concentrations in the functional assays. From Fig. 3 it can be seen that TSG-6Q and TSG-6R have the same apparent mass on SDS-PAGE, which indicates that there are no major differences in the glycosylation of the two allotypes. This is not surprising considering that neither of the two potential N-linked sites in TSG-6 is located in the CUB module (one is in the Link module and the other is in the C-terminal sequence). The recombinant TSG-6 allotypes described here are of high purity (estimated to be >98% pure) and were both expressed at about the same level (~3 mg/liter; determined on the basis of amino acid analysis).
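The use of the ~30.1-kDa mass to express protein concentrations in molar terms can be illustrated with a short unit conversion. The sketch below, with the hypothetical helper `micromolar`, converts the 80 μg/ml TSG-6 concentration used in the complex-formation assay (see "Formation of TSG-6·IαI Complex") into micromolar units; treating 30.1 kDa as exactly 30,100 g/mol is an assumption made for the arithmetic.

```python
def micromolar(ug_per_ml, mass_g_per_mol):
    """Convert a mass concentration in ug/ml to a molar concentration in uM.

    1 ug/ml equals 1 mg/liter; dividing by g/mol gives mmol/liter,
    and multiplying by 1000 gives umol/liter.
    """
    return ug_per_ml / mass_g_per_mol * 1000.0


# 80 ug/ml TSG-6 with the mass of ~30.1 kDa determined by mass spectrometry.
tsg6_um = micromolar(80.0, 30100.0)
print(round(tsg6_um, 1))
```

The value obtained rounds to the 2.7 μM stated for TSG-6 in the assay description.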
To see if the TSG-6 allotypes have any significant functional differences, their HA-binding functions and abilities to form stable complexes with IαI were analyzed. TSG-6R and TSG-6Q were found to exhibit similar HA-binding activities over a wide range of concentrations in a microtiter plate assay (Fig. 4). This assay has been used extensively to characterize the interaction of HA with the TSG-6 Link module (see Ref. 16). The binding of the biotinylated-HA was found to be greatly reduced by the presence of competing unlabeled HA (200-fold molar excess), indicating that the interactions are specific (data not shown). From Fig. 4 it can be seen that maximum binding is achieved in the assay when there is ~8 pmol of TSG-6/well. It should be noted that identical assays on the recombinant Link module of TSG-6 have shown that, in this case, maximum binding is seen with ~25 pmol/well (data not shown; see also Ref. 16). This might indicate that full-length TSG-6 has a higher affinity for HA than the isolated Link module, but further experiments will be required to test this possibility.
Previously, it has been demonstrated that TSG-6 forms a stable, probably covalent, complex of ~120 kDa with the serine protease inhibitor IαI (46). As can be seen from Fig. 5, incubation of TSG-6R or TSG-6Q with purified IαI gave rise to a species of ~120 kDa that is recognized by an anti-TSG-6 antibody. Similar levels of the ~120-kDa species were formed by both allotypes over a wide range of incubation times (from 30 s to 1 h); SDS-PAGE analysis of identical assays confirmed that there are no detectable differences in complex formation exhibited by the TSG-6Q and TSG-6R proteins (data not shown). In addition, assays conducted with different TSG-6 to IαI ratios indicate that these allotypes have identical complex-forming properties (data not shown). The experiment in Fig. 5 shows that, under the assay conditions (i.e. pH 7.5, 150 mM NaCl, 37°C), which are similar to physiological, the two TSG-6 allotypes are both able to form a TSG-6·IαI complex very rapidly, and this process is essentially complete within 30 s.

DISCUSSION
Here we have identified an SNP in the human TSG-6 gene (involving a non-synonymous G to A transition at nucleotide 431 of the published TSG-6 cDNA sequence (1)) that results in a coding change of amino acid 144 (Arg to Gln), located within the CUB module. Furthermore, we have shown that the sequence encoding Gln 144 represents the major allele in Caucasians. We have mapped TSG-6 to human 2q23.3 by radiation hybrid mapping, consistent with the assignment of TSG-6 to human chromosome 2 using somatic cell hybrids (24). This is within a region of chromosome 2q for which we have reported suggestive evidence for linkage to OA, i.e. 2q14.1 through 2q31 (25,47); other studies have found linkage to 2q12-q21 (48) and 2q23-q35 (43). Thus, TSG-6 could be considered a possible candidate for OA susceptibility, especially given that the expression of TSG-6 protein is up-regulated in cartilage of patients with OA (3). However, genotyping of a panel of 400 unrelated cases and 400 controls showed no evidence for an association between OA and the TSG-6 alleles when the data were analyzed either unstratified or stratified according to sex or the site of joint replacement (i.e. hip or knee). Therefore, the TSG-6 SNP identified here is not a marker for OA susceptibility in the patients that we have studied. This does not exclude the possibility that other, as yet unknown, common variants within the TSG-6 gene may account for at least some of the OA susceptibility that appears to reside in this part of the human genome.
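The correspondence between nucleotide 431 and residue 144 is simple arithmetic, which the following minimal sketch makes explicit. It assumes that nucleotide 1 is the A of the initiating ATG (coding-sequence numbering) and uses the standard genetic code; the function names are illustrative and not part of the study. The actual codon used at this position is not stated in this section, so the sketch only enumerates which Arg codons could yield Gln through a G to A change at the second base.

```python
# Illustrative sketch only. Assumes coding-sequence numbering in which
# nucleotide 1 is the A of the start codon, and the standard genetic code.

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def codon_location(nt_pos):
    """Return (codon_number, position_in_codon), both 1-based."""
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


def arg_to_gln_codons():
    """List (Arg codon, mutant codon) pairs where G->A at base 2 gives Gln."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == "R" and codon[1] == "G":
            mutant = codon[0] + "A" + codon[2]
            if CODON_TABLE[mutant] == "Q":
                hits.append((codon, mutant))
    return sorted(hits)


print(codon_location(431))   # nucleotide 431 -> codon 144, second base
print(arg_to_gln_codons())   # candidate codon changes consistent with Arg -> Gln
```

Under these assumptions nucleotide 431 falls at the second position of codon 144, and only the CGA and CGG arginine codons give glutamine (CAA or CAG) after this transition.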
It remains to be investigated whether the TSG-6 A 431 and G 431 alleles are associated with other disease processes. This would seem more likely if the resulting TSG-6R and TSG-6Q allotypes either exhibit functional differences or are expressed at different levels in response to particular stimuli. In regard to the latter possibility, characterization of a segment of the 5′-flanking region (corresponding to nucleotides −1324 to +68 in Ref. 24) amplified from homozygous genomic DNA samples showed that these sequences are the same in the two alleles, indicating that there are unlikely to be any differences in their levels of expression. A sequence identical to the ones described here (spanning nucleotides −1301 to −20 numbered as in Ref. 24) has also been isolated from human cervical muscle cell genomic DNA (6), suggesting that the differences in the 5′-flanking sequence derived from human peripheral blood (24) are more likely to be due to sequencing errors than to represent polymorphic variants.
Three-dimensional homology modeling of the TSG-6 CUB module, described in this study, predicts that the side chain of Arg 144 is at least partly solvent accessible. Although no function, or ligand-binding activity, has yet been ascribed to the TSG-6 CUB module, this result suggests that replacement of Arg 144 with a glutamine could lead to differences in functions described for the Link module or the full-length protein. To test this possibility, we expressed full-length TSG-6 proteins, corresponding to each of the allotypes (i.e. TSG-6R and TSG-6Q) in a eukaryotic expression system; the Drosophila system used directed expression into the culture supernatant by utilizing the native signal sequence of TSG-6. Both allotypes were expressed and secreted by the Schneider 2 cells at equivalent levels, and they could be purified to near homogeneity by a combination of ion exchange and reverse-phase high performance liquid chromatography. HA binding to TSG-6R and TSG-6Q was compared as was their ability to form a stable ~120-kDa complex with IαI; these unrelated properties of TSG-6 were thought most likely to be affected by the Arg to Gln substitution (see above) and were reasonably straightforward to investigate. Neither the HA-binding activity nor TSG-6·IαI complex formation was significantly different in the two allotypes.
It is not particularly surprising that a mutation within the CUB module has no effect on HA binding to TSG-6 as the functionally important amino acids, which we have identified previously (16), are clustered on the opposite face of the Link module from where the CUB module is attached. Moreover, the individually expressed Link module has been found to be capable of supporting high affinity HA binding (15,16). However, it remains to be established whether full-length TSG-6 has the same affinity for HA as the recombinant Link module.
The formation of complexes between TSG-6 and IαI may have an important role in the stabilization of tissues rich in HA by cross-linking of the polysaccharide chains (4). TSG-6·IαI complexes of ~120 kDa have been identified in the synovial fluids of arthritis patients (including OA) (7) and in the extracellular matrix of mouse cumulus cell oocyte complexes that expand prior to ovulation (5,23); it is becoming apparent that a number of TSG-6·IαI complexes of different composition and structure may exist (23). The ~120-kDa complex that we produced here, by incubation of recombinant TSG-6 and purified IαI in vitro, is not affected by the substitution in the CUB module. These experiments indicate that formation of this TSG-6·IαI complex is likely to be very rapid under physiological conditions. IαI is comprised of three protein chains (bikunin, heavy chain 1 (HC1), and heavy chain 2 (HC2)) that are held together by an unusual protein·glycosaminoglycan·protein linkage (49). The heavy chains are synthesized as pro-forms that each contain a VXXDP motif. During their secretory transport, the Asp-Pro bond is cleaved, which is necessary to reveal the α-carboxylic acid group of the Asp residues that can then form ester bonds to internal N-acetylgalactosamine residues of the chondroitin 4-sulfate chain (see Ref. 46). Wisniewski et al. (46) noted that TSG-6 contains such a motif near the C terminus of the CUB module (i.e. residues 245-249) and speculated that this could be involved in formation of the complex with IαI, in an analogous manner to the assembly of IαI; TSG-6 also has another VXXDP sequence at amino acids 138-142. However, it seems unlikely that either of these motifs (both of which are reasonably close to the site of the allelic difference; see above) is involved in the formation of the TSG-6·IαI complex produced here, as cleavage after their aspartic acid residues would release the C-terminal portion of TSG-6, which is recognized by our anti-TSG-6 antibody, i.e. such a complex would not be detected by this antiserum. However, this does not exclude the involvement of such motifs in the formation of other TSG-6·IαI complexes.
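The reasoning about VXXDP motifs can be illustrated with a short pattern search. This is a minimal sketch: the helper names are hypothetical, and the toy sequence is invented for illustration and is not the TSG-6 sequence. It reports 1-based motif positions and shows how cleavage of the Asp-Pro bond would separate the portion of the chain that is C-terminal to the motif.

```python
import re


def find_vxxdp(seq):
    """Return 1-based (start, end) positions of every V-X-X-D-P motif."""
    # A lookahead is used so that overlapping motifs would also be reported.
    return [(m.start() + 1, m.start() + 5) for m in re.finditer(r"(?=V..DP)", seq)]


def cleave_asp_pro(seq, motif_start):
    """Split seq at the Asp-Pro bond of a motif starting at motif_start (1-based).

    The N-terminal fragment ends with the Asp (motif position 4); the
    C-terminal fragment begins with the Pro (motif position 5).
    """
    cut = motif_start + 3  # number of residues up to and including the Asp
    return seq[:cut], seq[cut:]


# Made-up toy sequence (NOT TSG-6), containing two VXXDP motifs.
toy = "MAAVKLDPGGSTVQRDPAA"

motifs = find_vxxdp(toy)
print(motifs)
for start, _ in motifs:
    n_frag, c_frag = cleave_asp_pro(toy, start)
    print(start, n_frag, c_frag)
```

In this toy case, an ester bond formed through the newly exposed Asp would retain only the N-terminal fragment, which mirrors the argument that a complex assembled in this way would have lost the C-terminal epitope.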
TSG-6 has also been found to potentiate the anti-plasmin activity of IαI (20). It has been assumed that formation of the stable ~120-kDa complex is required for this activity. However, work in our laboratories has shown that complex formation is not necessary for TSG-6 to modulate the function of IαI in this way. 5 In this regard, the TSG-6R and TSG-6Q allotypes have identical abilities to potentiate IαI anti-plasmin activity. 6
Here we have found that human TSG-6 protein with a Gln at position 144 is the major form in Caucasians (the corresponding A 431 allele has a frequency of 87.0%). Interestingly, TSG-6 from mouse (4), rabbit (50), cow, 7 and pig (accession codes: BI342110 and BI467634) all have an Arg at this sequence position, as is found in the rare human allotype (with a G at nucleotide 431). The prevalence of the TSG-6Q allotype identified in this study may suggest that the A 431 allele results in an, as yet, uncharacterized functional change that confers a selective advantage, in the Caucasian population at least. Further comparative studies on the TSG-6R and TSG-6Q allotypes, and typing of different ethnic groups, should clarify this possibility.
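The reported allele frequency can be related to expected genotype proportions with a short calculation. The sketch below assumes Hardy-Weinberg equilibrium, which is an assumption made here for illustration and is not tested in this section; the function name is hypothetical.

```python
def hardy_weinberg(p_a):
    """Expected genotype fractions for a biallelic locus, given the A allele frequency.

    Assumes Hardy-Weinberg equilibrium (random mating, no selection, etc.).
    """
    q_g = 1.0 - p_a
    return {
        "AA": p_a * p_a,       # A(431) homozygotes (Gln/Gln)
        "AG": 2.0 * p_a * q_g,  # heterozygotes
        "GG": q_g * q_g,       # G(431) homozygotes (Arg/Arg)
    }


# A(431) allele frequency of 87.0%, as reported in the text.
expected = hardy_weinberg(0.870)
for genotype, fraction in expected.items():
    print(f"{genotype}: {fraction:.4f}")
```

With an A 431 frequency of 0.870, the expected fraction of A 431 homozygotes is 0.870 squared, i.e. about 0.757, in line with the statement that over 75% of individuals are A 431 homozygotes.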