Membrane Protein Profiling of Human Colon Reveals Distinct Regional Differences

The colonic epithelium is a highly dynamic system important for the regulation of ion and water homeostasis via absorption and secretion and for the maintenance of a protective barrier between the outer milieu and the inside of the body. These processes are known to gradually change along the length of the colon, although a complete characterization at the protein level is lacking. We therefore analyzed the membrane proteome of isolated human (n = 4) colonic epithelial cells from biopsies obtained via routine colonoscopy for four segments along the large intestine: ascending, transverse, descending, and sigmoid colon. Label-free quantitative proteomic analyses using high-resolution mass spectrometry were performed on enriched membrane proteins. The results showed a stable level for the majority of membrane proteins but a distinct decrease in proteins associated with bacterial sensing, cation transport, and O-glycosylation in the proximal to distal regions. In contrast, proteins involved in microbial defense and anion transport showed an opposing gradient and increased toward the distal end. The gradient of ion-transporter proteins could be directly related to previously observed ion transport activities. All individual glycosyltransferases required for the O-glycosylation of the major colonic mucin MUC2 were observed and correlated with the known glycosylation variation along the colon axis. This is the first comprehensive quantitative dataset of membrane protein abundance along the human colon and will add to the knowledge of the physiological function of the different regions of the colonic mucosa. Mass spectrometry data have been deposited to the ProteomeXchange with the identifier PXD000987.

The physiology and architecture of the human gastrointestinal tract differ along its axis, where the stomach and small intestine are responsible for digestion followed by nutrient absorption. The colon forms the last part of the digestive tract and is required for the reabsorption of the large volumes of fluid and ions from material that has passed through the small intestine. In addition, it also functions as a large anaerobic bioreactor in which the gut microbiota degrade host-indigestible polysaccharides and glycans into short-chain fatty acids that are used as an energy source (1). The human gut hosts 10^13 to 10^14 commensal bacteria, of which the majority are found in the colon (2). In this symbiotic system, direct interaction between bacteria and epithelium is prevented by the continuous secretion of a dense mucus layer. The organization and composition of this mucus layer vary along the digestive tract (3). The colon has a two-layered protective barrier system that can be up to 500 μm in thickness, of which only the outer layer is permeable to bacteria (4,5). Bacteria are suggested to be able to control mucus secretion and thereby balance the symbiotic relation between bacterial load and mucus production (6). The core protein of the colonic mucus is the MUC2 mucin, which has a net-like structural organization and dense complex O-glycosylation allowing it to withstand the harsh environment of the intestinal lumen (3). When O-glycans are lacking or truncated as, for example, in mice lacking the core-1 glycosyltransferase, the MUC2 protein loses part of its protective function, and the commensal flora will reach the epithelium and cause colitis (7). The O-glycosylation of MUC2 varies along the axis of the digestive tract (8,9). Terminal glycan epitopes are suggested to be responsible for the selection of our commensal microflora, resulting in distinct communities depending on the glycosyltransferases expressed (10).
The human colon can be divided anatomically from the ileocecal valve into a proximal part covering the cecum, ascending colon, and transverse colon and a distal region that includes the descending colon, sigmoid colon, and rectum. Although the overall function and architecture are considered similar throughout the whole colon, regional variation exists, as highlighted by the favored development of ulcerative colitis and colorectal cancer in the distal part. The origin of ulcerative colitis always involves the distal colon and progresses toward the proximal colon (11). In colorectal cancer, distinct variation exists between the molecular pathways underlying the development of tumors in the proximal and distal colon (12). One explanation for these regional variations might be the different embryological origins of the proximal and distal colon. The proximal colon originates from the embryonic midgut and is supplied by the superior mesenteric artery, and the distal colon originates from the hindgut and is supplied by the inferior mesenteric artery (13). Gene expression data do not support such a sharp border and suggest a gradual change in gene expression along the human colon (14,15). Little is known about the global proximal-distal variation in protein levels, and this has not been studied via proteomics approaches. Targeted studies at the protein level have shown that several transporters are differentially expressed, such as monocarboxylic acid transporter 1 and Na+/H+ exchanger 3 (NHE3) (16,17). Monocarboxylic acid transporter 1 is required for butyrate transport, and butyrate is found at the highest concentration in the proximal colon. NHE3 is also more highly expressed in the proximal colon, where most of the fluid reabsorption takes place (18). These results indicate that there is a direct correlation between colonic physiology and protein levels.
Most of the proteomics studies performed on human colon so far have focused on colorectal cancer (19,20). However, these studies have used cell lines or been limited to one colonic segment, frequently the distal colon, as a representation for the complete organ, neglecting regional variation. In this study, we used non-cancerous colonic tissue to demonstrate the variable levels of membrane proteins along the length of the normal human colon. The main focus was on plasma membrane proteins because of their role in maintaining important colon functions such as ion and water homeostasis and epithelial barrier functions (21,22), although the enriched membrane fraction analyzed contained most membrane proteins after the removal of nuclei and mitochondria. Mass spectrometry analyses were performed on ascending, transverse, descending, and sigmoid colon for both characterization of the protein composition and quantification using a label-free approach. We show that various biological processes differ between the distal and proximal colon, such as metabolism, antigen presentation, protein O-glycosylation, and ion transport. This extensive dataset emphasizes that the colon is a more dynamic organ than often assumed.

Isolation of Epithelial Cells from Human Colonic Biopsies-Macroscopically normal biopsies (~3-mm diameter) from ascending, descending, transverse, and sigmoid colon (two biopsies from each colon region) were obtained from four patients referred for routine colonoscopy for diagnostic purposes. Biopsies were frozen in liquid nitrogen and stored at −80°C until use. Approval was granted by the Human Research Ethical Committee, Gothenburg University, and written informed consent was obtained from all study subjects. Epithelial cells were isolated as described in Ref. 23, with slight modifications. Briefly, tissues were washed in PBS for 5 min and then incubated in PBS containing 3 mM EDTA and 1 mM DTT at 4°C for 60 min while gently shaken. The solution was replaced with fresh PBS, and epithelial cells were dissociated from the tissue by vigorous shaking for 30 s. The remaining tissue was removed from the solution using forceps, and cells were pelleted via centrifugation at 500 rpm (5415R, Eppendorf, Hamburg, Germany).
Membrane Protein Extraction, Digestion, and Peptide Fractionation-Pelleted cells were resuspended in 500 μl of 2 M NaCl, 1 mM EDTA in 10 mM HEPES, pH 7.4, containing complete protease inhibitor mixture (Roche) and lysed by means of tip-probe sonication (T-8, Turrax, IKA, Staufen, Germany). Membrane proteins were extracted as described in Ref. 24. Briefly, proteins were pelleted via centrifugation at 130,000 × g for 20 min in a tabletop ultracentrifuge (Optima MAX, Beckman Coulter, Fullerton, CA) and dissolved once in 0.1 M Na2CO3, twice in 1 mM EDTA, pH 11.3, and finally in 5 M urea, 100 mM NaCl, 10 mM HEPES, pH 7.4, with pelleting via ultracentrifugation between each step. The final pellet was washed twice with 1 ml of 0.1 M Tris/HCl, pH 7.6, and centrifuged for 10 min at 20,000 × g. Proteins were solubilized in 0.1 M DTT, 4% SDS, 0.1 M Tris/HCl, pH 7.6, added on 30,000-kDa cutoff filters (NanoSep, Pall, Ann Arbor, MI), and digested according to the filter-aided sample preparation method (25) using two-step digestion with endoproteinase Lys-C (Wako, Richmond, VA) overnight followed by trypsin (Promega, Madison, WI) for 4 h, both at room temperature. The concentration of eluted peptides was determined by means of Qubit fluorescent measurement (Invitrogen). 10 μg of each sample was fractionated on a ZIC-HILIC column (3.5 μm, SeQuant, Umeå, Sweden) packed in a fused silica capillary (150 mm × 0.32 mm inner diameter) connected to an Ettan LC (Amersham Biosciences). The following buffers were used: A, 5 mM ammonium acetate in 0.5% formic acid, 95% acetonitrile; and B, 5 mM ammonium acetate. Peptides were eluted using a gradient of 5% to 50% B, and the absorbance was monitored at 280 nm. Six fractions were collected, dried under vacuum, and reconstituted in 15 μl of 0.1% TFA.
Mass Spectrometry Analysis-Sample injection and nano-liquid chromatography were performed using an HTC-PAL autosampler (CTC Analytics, Zwingen, Switzerland) equipped with a Cheminert valve (0.25-mm bore, C2V-1006D-CTC, Valco Instruments, Schenkon, Switzerland) connected to an Agilent 1100 Series degasser and capillary pump (Agilent, Palo Alto, CA). Five microliters of the protein digest mixture was trapped on a fritted pre-column (4 cm × 100 μm inner diameter) packed with 2 cm of 5-μm Reprosil-Pur C18-AQ particles (Dr. Maisch, Ammerbuch, Germany) connected between two MicroTee connectors (Upchurch, Oak Harbor, WA) in a valve switching configuration. The analytical column consisted of a fused silica capillary (15 cm × 75 μm inner diameter, 10 μm tip, New Objective, Woburn, MA) packed with 3-μm Reprosil-Pur C18-AQ particles (Dr. Maisch). After sample loading in buffer A (0.2% formic acid), the peptides were separated using a piecewise linear gradient (10% to 40% B over 60 min and 40% to 70% B over 15 min) with mobile phase B (80% acetonitrile in 0.2% formic acid) at a flow rate of ~300 nl/min. Mass spectrometry analysis was performed on an LTQ-Orbitrap XL (Thermo) operated in a data-dependent mode automatically switching between scan modes, performing MS/MS on the six most intense ions per precursor scan. MS scans in the mass range of m/z 350-1600 were obtained in the Orbitrap at a resolution of 60,000 measured at m/z 400, using the lock-mass feature for internal calibration (m/z 371.101). MS/MS fragmentation scans were obtained in the ion trap using collision-induced dissociation of 30% followed by a 60-s exclusion time.
Data Analysis-Raw spectral data were processed using MaxQuant version 1.3.0.5 (26) and searched with the integrated database search engine Andromeda (27) against the human Swiss-Prot protein database (release 2013_3, 21,324 entries) combined with a database of common contaminants and concatenated with the same sequence database in reversed order for false discovery rate estimation. The following parameters were used for searches: (i) two missed cleavages, trypsin; (ii) precursor tolerance of 20 ppm in the first search used for recalibration, followed by 7 ppm for the main search and 0.5 Da for fragment ions; (iii) carbamidomethyl cysteine (fixed), oxidized methionine, and acetylated protein N-terminal (variable); (iv) a maximum of four modifications per peptide allowed; and (v) match between runs of 2 min. Relative protein quantification was performed based on the extracted ion chromatograms over the elution time window of each identified peptide. Peptide signals were combined for all identified charge states and variable modifications. The match-between-runs feature was used to determine whether peptides also occurred in the same retention time window in adjacent fractions, and the total sum was used for quantification. Identifications and quantifications were combined using the "identify" module in MaxQuant, applying a false discovery rate for both peptide and protein identifications of 1% based on the reversed peptide identifications (cutoff score > 50.83); protein identification was based on a minimum of one unique peptide, and proteins were grouped when based on the same set of peptides. Annotated spectra for all protein identifications based on a single peptide used for quantification are provided as supplemental Fig. S1. Non-unique peptides were strictly used for the quantification of the protein with the most identified peptides (26).
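The reversed-database strategy described above can be illustrated with a minimal sketch. It is not MaxQuant's implementation: the function name `score_cutoff_for_fdr`, the input layout, and the simple decoys/targets estimator are assumptions made for illustration only.

```python
def score_cutoff_for_fdr(matches, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR stays <= max_fdr.

    matches: iterable of (score, is_decoy) pairs, where is_decoy marks a hit
    against the reversed sequence database.
    The FDR at a cutoff is estimated as decoys / targets among all matches
    scoring at or above that cutoff (an assumed, simplified estimator).
    Returns None when no cutoff satisfies the threshold.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    best = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Accept this score as a cutoff only while the running estimate holds.
        if targets and decoys / targets <= max_fdr:
            best = score
    return best


# Hypothetical toy data: three forward hits and one reversed hit.
demo = [(90, False), (80, False), (70, True), (60, False)]
cutoff = score_cutoff_for_fdr(demo, max_fdr=0.01)
```

With the toy data, the reversed hit at score 70 pushes the running estimate above 1%, so the cutoff stays at the last score that preceded it.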
The intensity data were converted so that the sum of each protein over the four segments was 1, allowing for comparison between patient datasets. Additional available protein information was retrieved through the mapping feature of UniProt, and functional analyses were performed using the Database for Annotation, Visualization, and Integrated Discovery (DAVID) (28). Membrane proteins were predicted using TMHMM 2.0 (29). Protein abundance factors were calculated by dividing the summed peptide intensities for each protein by the number of theoretically observable peptides of all fully tryptic peptides between 700 and 2500 Da; missed cleavages were neglected, and only carbamidomethylation of cysteine was considered as a fixed modification (30). A paired t test was performed to determine statistical significance between the segments with a false discovery rate value of 0.05. Further data analysis and statistical analysis were performed using the R language and environment for statistical computing. The mass spectrometry data have been deposited to the ProteomeXchange Consortium (http://proteomecentral.proteomexchange.org) with the dataset identifier PXD000987.
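The per-protein normalization and the false discovery rate control mentioned above could look roughly like the following sketch. The authors worked in R and do not name the correction procedure; the Benjamini-Hochberg adjustment, the function names, and the example numbers here are assumptions for illustration.

```python
SEGMENTS = ("ascending", "transverse", "descending", "sigmoid")


def normalize_segments(intensities):
    """Scale one protein's four segment intensities so that they sum to 1."""
    total = sum(intensities[s] for s in SEGMENTS)
    if total <= 0:
        raise ValueError("protein has no signal in any segment")
    return {s: intensities[s] / total for s in SEGMENTS}


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order.

    Assumed procedure: the text states only that a false discovery rate
    value of 0.05 was applied.
    """
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonic adjusted values.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical intensities for one protein in one patient.
profile = normalize_segments(
    {"ascending": 4.0, "transverse": 3.0, "descending": 2.0, "sigmoid": 1.0}
)
# Hypothetical raw p values for four proteins.
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
```

Normalizing each protein within a patient removes differences in total signal between patient datasets, so only the relative distribution over the four segments enters the paired comparison.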
Secondary Alexa-555-conjugated anti-mouse or anti-rabbit antibodies (1:1000; Invitrogen) were used for detection. Nuclei were counterstained with Hoechst 33258, and sections were imaged using an LSM 700 Axio Examiner Z.1 confocal imaging system with identical settings for all sections.

RESULTS AND DISCUSSION
Characterization of the Colon Membrane Proteome-We aimed to characterize the epithelial cell membrane protein composition and segmental protein levels along the length of the normal human colon to gain insight into the dynamic and distinct regional protein level differences. The proteins were isolated from two 1-mm-sized routine biopsies of ascending, transverse, descending, and sigmoid colon segments, covering the full length of the colon. These biopsies were collected from four different patients without any known colon diseases and with macroscopically normal mucosa. The epithelial cells were isolated, and membrane proteins were extracted, digested using the filter-aided sample preparation method, and prefractionated offline using ZIC-HILIC chromatography prior to mass spectrometry analysis. An overview of the sample preparation method is presented in Fig. 1. The mass spectrometry analysis identified between 2598 and 2682 proteins per patient based on the combined identifications of the four segments with a false discovery rate of 1% at both protein and peptide levels. Of the total identifications, 87% were based on at least two unique peptides (peptide spectra matches) with an average of seven unique peptides per protein (median = 4). When we focused only on proteins identified in all patients, a total of 2508 unique proteins were selected for further data analysis (Fig. 2A, supplemental Table S1). 1729 proteins were identified in all patients and in all four segments. Only a small variation in identified proteins was observed, suggesting notable homogeneity in human colon membrane proteins. For 96% of the proteins in this group, identification was based on two or more unique peptides (median = 7), and the group showed strong correlation among the four segments. This group of proteins was used to quantify the membrane proteins (supplemental Table S2).
To address the similarity among the different patients and segment samples, we performed hierarchical clustering of the protein intensities for all shared identified proteins. The sigmoid samples and the ascending samples each formed a separate group containing all four patients, whereas the two central segments were mixed, with a higher correlation toward the sigmoid colon (Fig. 2B).
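A small sketch of how samples can be grouped by the similarity of their protein intensity profiles is given below. The text does not state the distance measure or linkage used; the choice of 1 minus the Pearson correlation with average linkage, the function names, and the toy profiles are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def average_linkage(profiles):
    """Agglomerative clustering of samples.

    profiles: dict mapping sample name -> list of protein intensities.
    Returns the merges as (cluster_a, cluster_b, distance) in merge order,
    where each cluster is a tuple of sample names.
    """
    names = list(profiles)
    dist = {
        frozenset((a, b)): 1.0 - pearson(profiles[a], profiles[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }

    def cluster_distance(c1, c2):
        # Average linkage: mean pairwise distance between cluster members.
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(name,) for name in names]
    merges = []
    while len(clusters) > 1:
        d, i, j = min(
            (cluster_distance(c1, c2), i, j)
            for i, c1 in enumerate(clusters)
            for j, c2 in enumerate(clusters)
            if i < j
        )
        merges.append((clusters[i], clusters[j], d))
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return merges


# Hypothetical profiles: two ascending and two sigmoid samples, four proteins.
toy = {
    "asc_p1": [1.0, 2.0, 3.0, 4.0],
    "asc_p2": [1.0, 2.0, 3.0, 5.0],
    "sig_p1": [4.0, 3.0, 2.0, 1.0],
    "sig_p2": [6.0, 3.0, 2.0, 1.0],
}
merges = average_linkage(toy)
```

In this toy case the two ascending samples merge first and the two sigmoid samples second, mirroring the kind of segment-wise grouping described for Fig. 2B.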
To assess the dynamic range of the analysis, we estimated the relative abundance of each protein based on the sum of peptide ion intensities per protein divided by the number of theoretical tryptic peptides (700-2500 Da) to normalize for varying protein length (30). The proteins were ranked depending on their estimated abundance and spanned over 5 orders of magnitude between the highest and lowest abundant protein, showing the depth of our analysis (Fig. 2C). A majority of the 25 most abundant proteins originated from the mitochondria, comprising parts of the ATP synthesis and cytochrome c oxidase complexes (Fig. 2D). The abundance of proteins belonging to the mitochondrial respiratory chain suggests that the highly biologically active colonic cells require vast amounts of energy. The bottom part of the abundance plot consists mainly of soluble proteins, as the applied sample preparation method favors hydrophobic proteins (Fig. 2E). The relative abundance estimation can therefore be used only for hydrophobic and transmembrane-spanning proteins.
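The abundance estimate described above, summed peptide intensity divided by the number of theoretically observable tryptic peptides, can be sketched as follows. The cleavage rule (after K or R, but not before P), the function names, and the example sequence are assumptions for illustration; the mass window, the absence of missed cleavages, and the fixed cysteine carbamidomethylation follow the text.

```python
# Monoisotopic residue masses in Da. Cysteine already includes the fixed
# carbamidomethyl modification (103.00919 + 57.02146).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 160.03065, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_peptides(sequence):
    """Fully tryptic peptides without missed cleavages.

    Assumed rule: cleave C-terminal to K or R unless the next residue is P.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        next_is_proline = not is_last and sequence[i + 1] == "P"
        if is_last or (residue in "KR" and not next_is_proline):
            peptides.append(sequence[start:i + 1])
            start = i + 1
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of the neutral peptide in Da."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def count_observable(sequence, low=700.0, high=2500.0):
    """Number of tryptic peptides whose mass falls inside the window."""
    return sum(low <= peptide_mass(p) <= high for p in tryptic_peptides(sequence))


def abundance_factor(summed_intensity, sequence):
    """Summed peptide intensity divided by the observable peptide count."""
    n = count_observable(sequence)
    if n == 0:
        raise ValueError("no observable tryptic peptides for this sequence")
    return summed_intensity / n


# Hypothetical sequence: only LLLLLLK (about 824.6 Da) lies in the window.
example = abundance_factor(8.0e6, "AKLLLLLLKGR")
```

Dividing by the observable peptide count keeps long proteins, which yield more peptides, from appearing more abundant than short ones with the same copy number.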
Enrichment of Membrane Proteins from Colon Epithelium-In standard proteomics workflows, membrane proteins are often underrepresented because of their amphiphilic properties. Various methods for enriching membrane proteins have been developed, and here we used an established method based on sodium carbonate washes at high pH combined with ultracentrifugation (32,33). We identified a total of 2508 proteins from the four colon segments, of which 1098 (44%) were predicted to contain transmembrane-spanning domains based on the TMHMM model (29), exceeding the 20% to 30% membrane proteins predicted in the human genome (34). The effectiveness of the membrane protein extraction was further evaluated by comparing the hydrophobicity of the identified proteins to a reference colon proteome extracted from the Human Protein Atlas (35). The proteins selected from the database had staining that was annotated as moderate or strong in colonic glandular cells (v. 12, 2013; 10,088 entries). The overall distribution of the identified proteins showed a general increase in hydrophobicity relative to this reference proteome (Fig. 3A). The percentage of identified proteins increased along the hydrophobic scale, which is consistent with the greater number of membrane proteins expected in this region. Increased amounts of membrane proteins were also observed when the number of identified proteins was compared with proteins predicted to contain transmembrane-spanning domains (Fig. 3B). On average we identified 54% of the proteins containing transmembrane-spanning domains in the reference proteome, compared with only 20% of the proteins without predicted transmembrane-spanning domains. Fifteen out of the 25 most abundant proteins contained transmembrane-spanning domains, whereas only 5 out of the 25 least abundant ones did (Figs. 2D, 2E). These numbers are likely an underestimation, as they do not take into account membrane-coupled and peripheral membrane proteins.
For example, the most abundant protein identified was ATP5H, which is part of the complex responsible for proton transport over the mitochondrial membrane. Overall, the high levels of identified proteins with high hydrophobicity indicate that the enrichment methodology was efficient, resulting in the identification of 1098 membrane proteins.
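The hydrophobicity comparison rests on the grand average of hydropathy (GRAVY), the mean Kyte-Doolittle hydropathy value over all residues of a protein. A minimal sketch is shown below; the function name and the example peptides are illustrative assumptions, not taken from the study.

```python
# Kyte-Doolittle hydropathy values per amino acid.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: mean hydropathy over all residues."""
    if not sequence:
        raise ValueError("empty sequence")
    return sum(KYTE_DOOLITTLE[r] for r in sequence) / len(sequence)


# Hypothetical examples: a hydrophobic stretch and a charged stretch.
hydrophobic = gravy("LIVFA")
charged = gravy("KRDE")
```

Positive values indicate an overall hydrophobic sequence, which is the direction in which the identified proteins were enriched relative to the reference proteome.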
Differences in Biological Processes from Ascending to Sigmoid Colon-To obtain a general overview of which biological processes are gradually changed along the colon and in which direction, we functionally classified significantly regulated proteins according to PANTHER (36). Significance was determined by performing a paired t test comparing the normalized intensities of the ascending and sigmoid segments. A total of 261 proteins (p < 0.1) were selected for functional enrichment analysis in which the entries for ascending and transverse were compared with those for descending and sigmoid (supplemental Table S2). The significance threshold was chosen to give sufficient input to reach significance in the enrichment analysis. A majority of the identified regulated processes were related to metabolism, protein synthesis, transport, and immunity (Fig. 4A). In the ascending and transverse colon segments, the majority of the processes were involved in metabolism, which could be assumed to rapidly decrease toward the distal colon as exemplified in Fig. 4B. Digestion and absorption of nutrients largely take place in the small intestine, whereas the colon is supposed to reabsorb fluid and ferment indigestible carbohydrates from the food and host. Our results suggest that the absorption of nutrients is not restricted to the small intestine and that the proximal colon is also active in the final stages of digestion (Fig. 4B).
The proteins involved in lipid metabolism are most abundant in the ascending colon and decrease gradually toward the sigmoid colon. Apolipoprotein B-100, required for the formation of chylomicrons by enterocytes (37), is a good example, as it decreased 1.6-fold toward the distal colon.
We also observed a rapid decline in proteins involved in MHC class I presentation of antigens (Fig. 4C). This trend was shared by 85% of all identified HLA proteins (14 entries) and included MHC class II proteins as well, suggesting that the ascending colon has a more active role in antigen presentation than the distal colon (supplemental Table S3). In germ-free mice, the MHC class proteins have been shown to be reduced, indicating a relation between MHC and the gut microbiota (38). The high abundance of these antigen-presenting molecules in the proximal colon could also suggest more contact between the luminal content and the epithelial surface, something that could be related to the more permeable mucus in the proximal part of the colon (39). Additionally, glycosyltransferases involved in initiating protein O-glycosylation are increased toward the distal colon, potentially resulting in a more dense glycosylation pattern on the proteins expressed in the distal colon (Fig. 4D).
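The paired comparison of ascending and sigmoid intensities used in this section can be sketched for the four-patient design as follows. The function names and the example values are hypothetical, and significance is judged here against tabulated two-sided critical values of the t distribution with 3 degrees of freedom rather than by computing exact p values.

```python
import math

# Two-sided critical values of Student's t for 3 degrees of freedom
# (four paired observations).
T_CRIT_DF3 = {0.10: 2.353, 0.05: 3.182}


def paired_t(a, b):
    """Paired t statistic for two equally long lists of measurements."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    if var == 0:
        raise ValueError("differences have zero variance")
    return mean / math.sqrt(var / n)


def is_significant(t_stat, alpha=0.10):
    """Compare |t| with the critical value; valid only for four pairs."""
    return abs(t_stat) > T_CRIT_DF3[alpha]


# Hypothetical normalized intensities of one protein in four patients.
ascending = [0.40, 0.38, 0.42, 0.36]
sigmoid = [0.10, 0.12, 0.09, 0.13]
t_stat = paired_t(ascending, sigmoid)
```

Pairing by patient means that only within-patient differences between segments contribute to the statistic, which matters with as few as four subjects.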
Membrane Proteins with Greatest Segmental Variation in Amount-The majority of proteins identified in all segments were found in similar amounts along the length of the colon. The distribution of ratios between ascending and sigmoid colon was normal, confirming this conclusion (Fig. 5A). Only a small group of proteins were present in significantly different amounts between contiguous segments (see columns AC-AE in supplemental Table S2), and we therefore chose to compare the two extremes, the ascending and the sigmoid colon. This selection gave 144 proteins that were significantly different between the two segments (p < 0.05). We performed hierarchical clustering analysis on the 105 proteins that were predicted to have transmembrane domains using TMHMM or that were annotated as membrane proteins according to UniProt annotation (29,40). Two distinct clusters were observed, one group containing 19 proteins with the highest levels in the sigmoid colon and one group of 86 proteins that were more abundant in the ascending colon (Fig. 5B). The direction of the heat map gradient suggests that most proteins were gradually changing along the length of the colon, with the majority of the proteins showing decreased levels toward the sigmoid colon.

FIG. 3. Enrichment of transmembrane-spanning proteins.
A, a reference proteome was obtained from the Human Protein Atlas containing all proteins expressed in the colonic epithelium (glandular cells) that had an antibody staining of moderate or strong intensity. The hydrophobicity of all proteins was calculated based on the grand average of hydropathy. Dark green bars represent the proteins identified via mass spectrometry, and those in light green represent the reference proteome. The secondary axis shows the percentage of proteins identified. B, transmembrane-spanning helices for all proteins were predicted using the TMHMM algorithm. All proteins containing 1 to 20 helices were compared between the identified and control datasets.

Transport Proteins along the Length of the Colon-In ascending colon, a large subset of significantly regulated proteins belonged to the solute carrier (SLC) transporter protein family involved in the transport of various molecules over the membranes (Fig. 5B). The amount of transporter SLC16A1, involved in butyrate transport, was found to be 2.5 times greater in the ascending colon than in the sigmoid colon, something that correlates with previous studies of mRNA expression levels (14). Transporters involved in the absorption of amino acids (SLC3A2 and SLC6A7) and glucose (Na+/glucose co-transporter SLC5A1) showed a similar protein pattern. The amount of SLC5A1 was 2.4-fold greater in the ascending colon than in the sigmoid colon, which also correlates with mRNA expression (41). We also found that the transport of bile acids via basolateral organic solute transporter subunit alpha (SLC51A) was increased 4.2-fold in the ascending colon (42). Ion transport is known to differ along the length of the colon, but how this relates to differences in protein levels of the involved transporters is not fully understood. Our dataset allowed us to analyze the segmental levels of transporters involved in cation transport mechanisms. Colonic ion absorption takes place via electroneutral or electrogenic uptake mechanisms. Electroneutral absorption operates via exchangers, such as the apical Na+/H+ exchangers (NHE3 (SLC9A3) and NHE2 (SLC9A2)) coupled to the Cl−/HCO3− exchanger (DRA, SLC26A3), resulting in apical uptake of NaCl. Absorbed Na+ ions exit the cell via the basolateral transporters Na+/K+-ATPase and NHE1 (SLC9A1), and the Cl− ions exit the cell via the Cl−/HCO3− exchanger AE1 (SLC4A1). Our results show that NHE3 levels were highest in the ascending colon and gradually decreased (5.8-fold) toward the sigmoid colon (Table I).
In contrast, all other transporters (NHE1, NHE2, DRA, and Na+/K+-ATPase) showed constant protein levels along the length of the colon. That the NHE3 and DRA levels did not follow each other suggests that coupling of these exchangers is not obligatory and that additional factors are involved in regulating NaCl absorption. One of the rate-limiting factors for DRA activity is the intracellular concentration of HCO3−. Studies have shown that DRA activity depends on the intracellular production of HCO3− through carbonic anhydrase 2, which converts H2O and CO2 to HCO3− (43). The pattern of carbonic anhydrase 2 followed that of NHE3, with the highest levels in the ascending colon (1.8-fold increase), suggesting that despite similar amounts of DRA along the colon, the transport activity might be higher in the ascending colon than in the sigmoid colon because of the greater intracellular production of HCO3−. Anion secretion (chloride and bicarbonate) creates the driving force for fluid secretion. Our analyses of the transporters involved in chloride secretion showed constant levels of both the apical CFTR chloride channel and the basolateral Na/K/Cl co-transporter NKCC1 that mediates uptake of Cl− along the length of the colon. In contrast, the Ca2+-dependent K+ channel KCNN4, whose activity is necessary for maintenance of the negative intracellular electrochemical gradient in favor of anion secretion, showed higher levels in the ascending colon. This suggests a stronger driving force for chloride secretion in this segment, as has been shown in functional studies (44,45). We also identified two members of the Anoctamin family (ANO9 and ANO10), a family that has been suggested to consist of Ca2+-activated chloride channels (46). ANO10 has been shown to form a functional channel, whereas the ion channel status of ANO9 is unknown (47,48). ANO9 showed greater amounts in the ascending colon, whereas ANO10 showed constant levels along the length of the colon.
Colonic bicarbonate secretion involves basolateral uptake of HCO3− via Na+/HCO3− co-transport (NBCe1, SLC4A4) and apical exits via the CFTR channel or via DRA. Our results showed constant amounts along the length of the colon for CFTR, DRA, and the basolateral NBCe1 (Table I). As for chloride, bicarbonate secretion requires activation of the K+ channel KCNN4, suggesting that bicarbonate secretion might also be higher in the ascending colon than in the sigmoid colon (49). Bicarbonate secretion mediated via CFTR, DRA, and NBCe1 is specific to the enterocytes (17), but the goblet cells have specific bicarbonate transport systems mediated via the electroneutral Na+/HCO3− co-transporter NBCn1 (SLC4A7) that have been suggested to be important for mucus formation. Our results showed higher levels of NBCn1 in the ascending colon, suggesting that goblet-cell-mediated bicarbonate transport is increased in the ascending colon relative to the sigmoid colon (Table I). The increased levels toward the sigmoid colon of the small GTPase RAB11B, which is responsible for the recycling of ion channels expressed at the apical membrane, can also influence ion transport (Fig. 5B) (50).
Glycosyltransferases along the Length of the Colon-In general we found a proximal-to-distal increase in amounts of enzymes involved in O-glycosylation (Fig. 4D). These enzymes take part in the biosynthesis of all colonic glycoproteins; however, by far the most abundant highly glycosylated protein is the MUC2 mucin (51). This mucin is produced by the goblet cells and is required for the formation of the protective mucus layer. The protein's central region is composed of two serine, proline, and threonine repeats (~2600 amino acids) that upon O-glycosylation contribute to 80% of the protein's molecular mass. Glycomic studies have shown that the glycan structures on the MUC2 mucin in the colon are complex but are relatively homogeneous within one segment (9,52). Complex mucin type O-glycosylation involves a broad range of glycosyltransferases, which can be found throughout the Golgi apparatus. As glycosyltransferases have a transmembrane domain, the membrane enrichment method used allowed relative quantification of the different transferases along the four segments, providing indirect support for the O-glycan composition on the MUC2 mucin (Fig. 6). The relative abundance of each glycosyltransferase was estimated based on the total peptide signal obtained for each protein normalized to the number of theoretically observable tryptic peptides (Fig. 7A).
The family of peptidyl GalNAc-transferases required for initiating the O-glycosylation contains 20 members (53). In the colon we identified eight members of this family (Fig. 6), of which two are active only on glycopeptides (T7 and T12). The other ppGalNAc-transferases all have distinct specificities. The two most dominant ppGalNAc-transferases identified were numbers 3 and 7, which we previously showed to be the most efficient in glycosylating MUC2-derived peptides (54). ppGalNAc-transferases 2, 3, 4, 5, and 7 all increased toward the sigmoid colon, which could reflect the fast turnover of MUC2 in the distal colon (55). The same set of ppGalNAc-transferases was previously found to be more abundant in relation to other glycosyltransferases in a study of colon tissue in relation to adenocarcinoma (56).
The glycosyltransferases B3GNT6 and GCNT3 are responsible for the extension of the attached GalNAc (Tn antigen) into the core 3 (GlcNAcβ1-3GalNAc) and core 4 (GlcNAcβ1-3(GlcNAcβ1-6)GalNAc) structures. These are the major core structures found in human colon MUC2, with core 3 and core 4 as the predominant forms in the ascending segment (9). We also identified GCNT1, responsible for core 2 synthesis. However, as the formation of core 2 requires the prior formation of core 1, and the enzyme responsible for core 1 synthesis was absent, there was no substrate for GCNT1. Further elongation of the core structures is achieved by the addition of GlcNAc and Gal, for which we identified the B3GNT3, B3GALT5, and B4GALT4 transferases, with B3GALT5 adding Gal on the 3-branch and B4GALT4 on the 6-branch.
The greatest variation in protein levels was observed for the glycosyltransferases required for adding terminating residues. The fucosyltransferases FUT3, -5, and -6, required for the incorporation of Fuc on GlcNAc (Fucα1-3,4GlcNAc), showed a distinct reduction in levels from the ascending toward the sigmoid colon, which is reflected in the low level of fucosylation in the sigmoid colon (52). The signals for FUT3 and FUT5 were combined because of their high sequence homology. Still, FUT3/5 was 14-fold more abundant than FUT6 (Fig. 7A), giving a clear indication that FUT3 and FUT5 are major contributors to colon fucosylation.
The sialyltransferases ST6GALNAC1 and ST3GAL4 and the sulfotransferase CHST5, which is responsible for sulfating GlcNAc, increased toward the sigmoid colon. This is in line with the previously observed high acidity of the O-glycan structures in the distal colon (9). The Sdᵃ/Cad epitope (NeuAcα2-3(GalNAcβ1-4)Gal), typical for the human colon, was most abundant in the distal colon (9). This epitope is synthesized by ST3GAL4 adding NeuAc, followed by B4GALNT2 adding GalNAc; the latter enzyme absolutely requires NeuAc on its substrate. ST3GAL4 showed increased levels in the distal direction and could thus account for the distal gradient of the Sdᵃ/Cad epitope. That B4GALNT2 was most abundant in the proximal colon might seem counterintuitive, but sufficient enzyme was apparently also present in the distal parts, as this enzyme was relatively abundant throughout the length of the colon (Fig. 7A).
Immunohistochemistry of sections from ascending and sigmoid colon for two of the significantly regulated glycosyltransferases, B4GALNT2 and GCNT3, confirmed our proteomic observations. A reduction in staining intensity was observed between the ascending and sigmoid colon sections for all four patients, indicating reduced amounts of these proteins in the distal segment (Fig. 7B).
These results are the first global overview of the glycosyltransferases responsible for O-glycosylation in the human colon and their variation along the colonic axis. The identified glycosyltransferases correlate directly with the O-glycan structures previously determined, and the estimated relative abundance of each identified transferase indicates which enzyme is responsible for each step. In cases where, unlike here, the glycan structures have not been determined, this type of information could be used to predict the oligosaccharide structures produced.

CONCLUSIONS
Our focus in this study was on the membrane proteins of colonic epithelial cells. Major cellular events, such as ion homeostasis and epithelial barrier function, are membrane associated. The complexity of the biological samples and the unfavorable properties of membrane proteins limit the in-depth analysis of these proteins unless specific enrichment strategies are used. We successfully used a simple enrichment method compatible with the small amounts of starting material available from millimeter-sized human colonic biopsies. More laborious membrane-enrichment protocols are not compatible with such small amounts of starting material. The presented results show that simple preparation methods combined with today's sensitive mass spectrometers are sufficient to provide in-depth knowledge of low-abundance proteins such as ion channels and glycosyltransferases.
To the best of our knowledge, this is the first comprehensive profile of membrane protein levels along the length of the healthy human colon emphasizing regional heterogeneity. This information can be used to obtain a better understanding of how various biological processes of the colonic epithelium vary along the proximal-distal axis. Additionally, the dataset can be used for comparisons between healthy and diseased tissue specimens, such as samples from patients with inflammatory bowel disease or colon cancer. We further confirmed the presence of proteins responsible for phenomena previously recorded only indirectly, either as ion transport activities measured via electrophysiology or as the end products of glycosyltransferase activities.