Re-evaluation of lysyl hydroxylation in the collagen triple helix: lysyl hydroxylase 1 and prolyl 3-hydroxylase 3 have site-differential and collagen type-dependent roles in lysine hydroxylation

Collagen is the most abundant protein in humans and is heavily post-translationally modified. Its biosynthesis is complex and requires three different types of hydroxylation (two for proline and one for lysine), which take place in the rough endoplasmic reticulum (rER). These processes involve many enzymes and chaperones, collectively termed the molecular ensemble for collagen biosynthesis. However, the function of some of the proteins in this molecular ensemble is controversial. While prolyl 3-hydroxylase 1 and 2 (P3H1, P3H2) are bona fide collagen prolyl 3-hydroxylases, the function of prolyl 3-hydroxylase 3 (P3H3) is less clear. A recent study of P3H3 null mice demonstrated that this enzyme had no activity as a prolyl 3-hydroxylase but may instead act as a chaperone for lysyl hydroxylase 1 (LH1). LH1 is required to generate hydroxylysine for crosslinking within collagen triple helical sequences. If P3H3 is an LH1 chaperone that is critical for LH1 activity, P3H3 and LH1 null mice should have a similar deficiency in lysyl hydroxylation. To test this hypothesis, we compared lysyl hydroxylation in type I and V collagen from P3H3 and LH1 null mice. Our results indicate that LH1 plays a global role in lysyl hydroxylation in the triple helical domain of type I collagen, whereas P3H3 is indeed involved in lysyl hydroxylation, particularly at crosslink formation sites, but is not required at all lysyl hydroxylation sites in the type I collagen triple helix. Furthermore, although type V collagen from LH1 null mice surprisingly contained as much hydroxylysine as type V collagen from wild type, the amount of hydroxylysine in type V collagen was clearly suppressed in P3H3 null mice. In summary, our study suggests that P3H3 and LH1 likely have two distinct mechanisms to distinguish crosslink formation sites from other sites in type I collagen and to recognize different collagen types in the rER.
Author summary

Collagen is one of the most heavily post-translationally modified proteins in the human body, and its post-translational modifications provide biological functions to collagen molecules. Among these modifications, crosslink formation on the collagen triple helix adds important biomechanical properties to the collagen fibrils and is mediated by hydroxylation of very specific lysine residues. LH1 and P3H3 show a similar role in lysine hydroxylation at specific residues in the crosslink formation sites of type I collagen. Conversely, they have very distinct roles in lysine hydroxylation at other residues in the type I collagen triple helix. Furthermore, they demonstrate preferential recognition and modification of different collagen types. Our findings provide a better understanding of the individual functions of LH1 and P3H3 in the rER and also offer new directions for studying the mechanism of lysyl hydroxylation followed by crosslink formation in different tissues and collagens.


Collagen is not only the most abundant protein, but is also one of the most heavily post-translationally modified proteins in the human body [1,2]. These post-translational modifications (PTMs) play essential roles in providing biological functions to collagen molecules. Before collagen is incorporated into extracellular matrices (ECMs), PTMs fall into two distinct classes: those occurring on the unfolded state (a single α-chain) in the rough endoplasmic reticulum (rER) and those occurring on the folded state (triple helical structure) in the Golgi and the ECM space [3,4]. Interestingly, in some cases the extent of the PTMs in the Golgi and the ECM space is governed by the PTMs in the rER. Crosslink formation is an important PTM that occurs on the collagen triple helical structure and adds important biomechanical properties to the collagen fibrils [5,6]. However, the pathway of crosslink formation in type I collagen depends on the presence or absence of lysyl hydroxylation in both the collagenous and the telopeptide regions [7,8]. Additionally, O-glycosylation, which occurs after lysyl hydroxylation, is involved in crosslink formation, and its amount depends on the type of tissue, the rate of triple helix formation and the presence or absence of ER chaperones [9-12]. Thus, PTMs of unfolded α-chains in the rER are critical for quality control of the collagen ultrastructure.

Collagen biosynthesis, including PTMs, is complex and involves many enzymes and chaperones, which are collectively termed the molecular ensemble [13]. Because some of the enzymes can modify only unfolded, and not triple helical, collagen chains, the time that collagen chains remain unfolded in the rER is a critical factor for correct PTMs, and foldases control the rate of triple helix formation [14-17]. Three hydroxylations (proline 3-hydroxylation, proline 4-hydroxylation and lysine hydroxylation) are fundamental PTMs that occur before triple helix formation [3,13]. Interestingly, the function of prolyl 3-hydroxylase 3 (P3H3) is controversial. It has been suggested that this protein has no prolyl hydroxylase activity and instead acts as a chaperone for lysyl hydroxylase 1 (LH1) [18]. LHs hydroxylate specific lysine residues in both the collagenous and telopeptide regions, and the three isoforms (LH1, 2 and 3) have been proposed to play specific roles based on collagen sequences [16]. LH2 is specific for hydroxylating the telopeptide [16], and LH1 has been suggested to hydroxylate the triple-helical regions [13,19-21]. Studies of patients with LH1 mutations and of LH3 mutant mice indicate that both LH1 and LH3 could have substrate preferences (e.g. LH1 prefers type I/III collagen and LH3 prefers type II/IV/V collagen) [22-25]. However, these preferences could not fully explain the diversity of lysyl hydroxylation in tissues of the LH1 null mouse model and in type I collagen from different tissues of Ehlers-Danlos Syndrome (EDS)-VIA patients [23,26]. Few analyses have investigated the level of lysyl hydroxylation in purified collagens from mutant or LH knockout models using both qualitative and quantitative measurements.

In this study, we aimed to re-evaluate the role of LH1 in collagen triple helices and to test whether P3H3 is essential for LH1 activity. To achieve this, we compared the levels of overall lysyl hydroxylation and PTM occupancy at individual sites between collagens extracted from P3H3 or LH1 null mice.

Basic characterization of P3H3 null mice - P3H3 null mice were generated by Ozgene as shown in Figure 1A, with more detailed information in the methods section. Figure 1B displays the result of PCR genotyping, showing that the P3H3 null allele product is smaller than WT due to the deletion of Exon 1. To confirm that the expression of P3H3 protein was abolished, Western blotting was performed using a whole kidney lysate, and the protein signal corresponding to the MW of P3H3 (79 kDa) was absent in the P3H3 null lysate (Figure 1C). As previously reported [18], P3H3 null mice were viable, and we did not observe any obvious growth or skeletal phenotypes by growth curves and X-ray images, respectively (Figure 1D and E). LH1 null mice were generated and characterized previously [26].

Biochemical characterization of purified type I collagen from different tissues of P3H3 and LH1 null mouse models - To enable qualitative and quantitative analyses, type I collagen was purified from tissues by pepsin treatment followed by sodium chloride precipitation. We analyzed three different tissues (tendon, skin and bone) from each mouse model. We evaluated the level of PTMs by comparing migration using SDS-PAGE [27] and determined the thermal stability using circular dichroism (CD) spectra [28,29] of purified type I collagen from the different tissues of P3H3 null and LH1 null mice (Figure 2). Type I collagen from both P3H3 null and LH1 null skin migrates slightly faster and shows a lower melting temperature than WT, whereas type I collagen from tendon and bone shows no clear difference between WT and nulls in gel migration or melting temperature (Figure 2). This suggests that skin is the most affected tissue in both P3H3 null and LH1 null mice.

Quantitative analysis to determine the total level of post-translational modifications of type I collagen in P3H3 null and LH1 null mice - Amino acid analysis (AAA) was used to quantify the total number of PTMs in the purified type I collagens. Neither P3H3 nor LH1 null mice had changes in proline hydroxylations (prolyl 3- and 4-hydroxylation); however, both strains had interesting changes in lysyl hydroxylation (Figure 3 and Table 1). LH1 deficiency significantly decreased the amounts of hydroxylysine in tendon, skin and bone, although to a slightly lesser extent in bone. In contrast, P3H3 deficiency had a much smaller effect on lysyl hydroxylation than LH1 deficiency, with skin showing a greater reduction of hydroxylysine than tendon and bone. Next, we determined the occupancy of O-glycosylation of hydroxylysine in tendon and skin by liquid chromatography-mass spectrometry (LC-MS). The calculated value of galactosyl hydroxylysine (GHL) does not show any significant difference in skin but does slightly increase in tendon for both P3H3 null and LH1 null mice (Figure 4 and Table 2).

Interestingly, the magnitudes of change in unmodified hydroxylysine and glucosylgalactosyl hydroxylysine (GGHL) seem to be correlated in both tendon and skin. For example, in both P3H3 null and LH1 null type I collagen from tendon, unmodified hydroxylysine was decreased whereas GGHL was increased by a similar magnitude (Figure 4 and Table 2). Although the direction of the effect is opposite, the same observation is also seen in skin of P3H3 null and LH1 null type I collagen (Figure 4 and [...] Table 3 and Table 4). In LH1 null tissues, the levels of lysyl hydroxylation and subsequent O-glycosylation were significantly decreased at all lysyl hydroxylation sites in both tendon and skin. In P3H3 null tissues, we confirmed the reduction of lysine modifications at K87 in both the α1 and α2 chains of type I collagen as previously reported [18], and a large reduction was also found at α1 K930 and α2 K933, which are near the carboxy-terminus of the triple helical domain and are involved in crosslink formation. The other sites α1 K99, α1 K174, α2 K174 and α2 K219 also showed a clear reduction; however, there was no notable decrease in the level of lysyl hydroxylation at the sites in the middle of the triple helix of the α1 chain (α1 K219 and α1 K564). In summary, LH1 might play a global role in lysyl hydroxylation at all sites in the triple helical domain of type I collagen, whereas the role of P3H3 could be restricted to specific sites.
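
Site occupancy, as discussed above, can be pictured as the share of each modification state (Lys, Hyl, GHL, GGHL) at a given lysine site. The following is a minimal sketch under the assumption that occupancy is each form's fraction of the summed signal for that site; the function name, the normalization and the example intensities are hypothetical and are not taken from this study's methods or data.

```python
# Illustrative only: the intensities below are hypothetical placeholders,
# not measurements from this study.

def site_occupancy(signals):
    """Return each modification state's fraction of the total signal at one site."""
    total = sum(signals.values())
    if total <= 0:
        raise ValueError("total signal must be positive")
    return {form: value / total for form, value in signals.items()}

# Hypothetical signal for the four states of a single lysine site.
hypothetical_site = {"Lys": 10.0, "Hyl": 20.0, "GHL": 30.0, "GGHL": 40.0}

occupancy = site_occupancy(hypothetical_site)
for form, fraction in occupancy.items():
    print(f"{form}: {fraction:.0%}")
```

Because the fractions at a site sum to one, a decrease in one state (for example unmodified Hyl) must be balanced by an increase in the others, which is the kind of compensating shift between Hyl and GGHL described in the text.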

Qualitative and quantitative characterization of skin type V collagen from WT, P3H3 null and LH1 null mice - Type V collagen is heavily lysyl hydroxylated and O-glycosylated, and is abundant in skin compared to tendon and bone [33]. We isolated type V collagen from skin of P3H3 and LH1 null mice and subjected it to gel migration analysis. Type V collagen from LH1 null mice did not show a clear difference in gel migration, whereas type V collagen from P3H3 null mice appeared to migrate faster than WT (Figure 6A). To confirm these observations, we determined the level of PTMs in both P3H3 null and LH1 null type V collagens by AAA. The ratio of prolyl hydroxylations is slightly different between control animals from the P3H3 and LH1 mouse strains (Figure 6B). One potential explanation is that the analyses were done on skin from 2- to 5-month-old and 10-week-old mice for P3H3 and LH1 mice, respectively (more detailed information in the methods section). Similar to type I collagen, neither P3H3 nor LH1 null mice had changes in proline hydroxylations (prolyl 3- and 4-hydroxylation). Surprisingly, P3H3 null mice, but not LH1 null mice, had reduced levels of lysyl hydroxylation in type V collagen isolated from skin (Figure 6B and Table 1); however, the occupancy of O-glycosylation on hydroxylysine was not changed (Table 2). The reduced lysyl hydroxylation in P3H3 null mice influenced the thermal stability of type V collagen, and CD melting curves showed only one of the two thermal transitions seen in WT (Figure 6C). Since lysine residues at the Yaa position of collagenous Gly-Xaa-Yaa triplets are extensively glycosylated in type V collagen [33], site-specific characterization of lysine modifications was difficult due to missed cleavage at hydroxylysine glycosides by trypsin [34]. We were able to analyze two sites, α1(V) K84 and α2(V) K87 (Figure 7 and Table 5), that are involved in crosslink formation [35]. At both sites, in the absence of P3H3 the level of GGHL was decreased and the magnitude of this reduction corresponds to that of the increase in unmodified lysine; however, the change at α1(V) K84 was not statistically significant (Table 5). This suggests that P3H3 could consistently play an important role in lysyl hydroxylation and/or subsequent O-glycosylation at crosslink formation sites. In LH1 null mice, there was a marginal change at α1(V) K84; however, α2(V) K87 was clearly affected.

A potential explanation is that the α2-chain of type V collagen is classified as an A-clade chain, a group that includes both the α1- and α2-chains of type I collagen, whereas the α1-chain of type V collagen belongs to the B-clade [36,37]. LH1 seems to hydroxylate α2(V) K87 preferentially. Nevertheless, the ratio of two α1-chains to one α2-chain in type V collagen could mask the effect of impaired LH1 activity, which would explain why no distinct difference in type V collagen between WT and LH1 null was observed in Figure 6B. In summary, P3H3 is required for proper lysyl hydroxylation of type I and type V collagen, whereas LH1 is dispensable for lysyl hydroxylation in skin type V collagen.
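
The masking argument above can be made concrete with a small calculation. The following is a minimal sketch, not the analysis used in this study: it assumes a heterotrimer of two α1(V) chains and one α2(V) chain, and the per-chain hydroxylysine counts and the size of the α2-specific loss are hypothetical placeholders chosen only for illustration.

```python
# Illustrative only: every number below is a hypothetical placeholder,
# not a measured value.

def trimer_hydroxylysine(per_chain, copies):
    """Total hydroxylysine per molecule: residues per chain times chain copies, summed."""
    return sum(per_chain[chain] * copies[chain] for chain in copies)

# Two alpha1(V) chains and one alpha2(V) chain per molecule, as stated in the text.
copies = {"alpha1(V)": 2, "alpha2(V)": 1}

# Hypothetical assumption: both chains carry the same number of hydroxylysines in WT.
wt = {"alpha1(V)": 30.0, "alpha2(V)": 30.0}

# Hypothetical assumption: loss of LH1 removes 20% of hydroxylysine from alpha2(V) only.
lh1_null = {"alpha1(V)": 30.0, "alpha2(V)": 30.0 * 0.8}

wt_total = trimer_hydroxylysine(wt, copies)
null_total = trimer_hydroxylysine(lh1_null, copies)
relative_loss = (wt_total - null_total) / wt_total

print(f"WT total: {wt_total:.1f}, LH1 null total: {null_total:.1f}")
print(f"Relative loss in whole-molecule hydroxylysine: {relative_loss:.1%}")
```

Under these assumptions, a 20% loss confined to the α2(V) chain appears as a loss of only about 7% in the whole molecule, which illustrates how a chain-specific effect could be diluted in a bulk measurement such as AAA.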

Electron microscopy showed that the average diameter of collagen fibrils is similar between P3H3 null (84.9 ± 35.6 nm) and WT (85.1 ± 25.9 nm) (Figure 8D); however, the distribution of fibril diameters was broader in P3H3 null (Figure 8E), as was also reported for LH1 null skin [26]. In summary, a precise number of PTMs in the rER is required to maintain an appropriate ultrastructure in collagen-rich tissues.
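
One simple way to express the broader distribution reported above is the coefficient of variation (standard deviation divided by mean). The sketch below uses only the mean ± S.D. values quoted in the text; choosing the coefficient of variation as the summary statistic is an illustrative assumption, not a measure reported in the study.

```python
# Mean and S.D. of collagen fibril diameters (nm) as quoted in the text.
fibril_diameters = {
    "WT": {"mean": 85.1, "sd": 25.9},
    "P3H3 null": {"mean": 84.9, "sd": 35.6},
}

def coefficient_of_variation(mean, sd):
    """Relative spread of a distribution: S.D. expressed as a fraction of the mean."""
    if mean <= 0:
        raise ValueError("mean must be positive")
    return sd / mean

cv = {
    genotype: coefficient_of_variation(stats["mean"], stats["sd"])
    for genotype, stats in fibril_diameters.items()
}

for genotype, value in cv.items():
    print(f"{genotype}: CV = {value:.2f}")
```

With nearly identical means, the larger relative spread for P3H3 null (about 0.42 versus about 0.30 for WT) reflects the wider range of fibril diameters rather than a shift in their average size.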

In the rER, many enzymes and post-translational modifiers interact with molecular chaperones via either strong or weak affinity interactions to improve their functions [38,39]. In particular, the molecular ensemble for collagen biosynthesis consists of a variety of protein-protein interactions [13,40]. When an interaction is impaired, as found in genetic disorders, the magnitude of the impact depends on what type of protein-protein interaction is disrupted. Prolyl 3-hydroxylase 1 (P3H1), cartilage-associated

To evaluate the correlation between LH1 and one of the LH1-associated proteins, P3H3, we conducted quantitative analyses and directly compared the level of PTMs between WT, P3H3 null and LH1 null mouse tissues. Our results suggest that if P3H3 acts as an LH1 chaperone, this chaperone function is not required for all LH1 sites, as very specific sites related to crosslink formation in type I collagen were affected (Figure 5). Figure 9 represents the magnitudes of change of unmodified lysine residues at individual lysyl hydroxylation sites in three different null mouse models compared to WT. These observations imply that both P3H3 and CypB play important roles for the function of LH1 and that a lack of even one of the components attenuates the amounts of hydroxylysine in the crosslink formation sites. Conversely, other lysyl hydroxylation sites demonstrate very diverse effects between P3H3 null, LH1 null and CypB null mouse tissues. There are specific patterns that are changed in each null mouse model. Modified lysine residues were hardly found in LH1 null, whereas P3H3 null showed normal or slightly increased unmodified lysine residues. In contrast, CypB null showed normal or decreased unmodified lysine residues, despite an increase in unmodified lysine residues at crosslink formation sites similar to that in P3H3 and LH1 nulls. Moreover, additional sugar attachments are found at other lysyl hydroxylation sites (e.g. K174

Additionally, size exclusion chromatography demonstrated that P3H3 and SC65 were possibly associated in a tight interaction like the P3H1/CRTAP/CypB complex; however, neither LH1 nor CypB was part of this tight interaction [48]. We imagine that a very precise molecular interplay is required, particularly around crosslink formation sites, and that this cannot be explained simply by chaperone effects and/or complex formation. Collectively, we would like to term this precise mechanism a "local molecular ensemble".

We looked for specific binding or enhancer sequences in type I collagen near lysyl hydroxylation sites based on our results (Figure 10). As a previous report suggested [32], the KGH sequence occurs at or near crosslink formation sites to provide preferential interaction sites for the CypB-involved SC65/P3H3 ER complex to facilitate LH1 activity. This hypothesis is possible, but cannot explain the reduction of

Here we note that a previous report showed a difference in PTMs at α1(V) K87 [18]; however, the actual residue 87 is arginine instead of lysine, as we showed above, and this is also confirmed by database entries (UniProt entry numbers: P20908 for human and O88207 for mouse; NCBI accession number: XP_024855494 for bovine).

The previous studies of LH1 null and EDS type VIA patients suggested that tissue-specific and collagen type-specific lysyl hydroxylation could exist, since different magnitudes of reduction in lysyl hydroxylation were found between tissues [22-25]. Potential explanations were suggested, such as the complexity of different collagen types between tissues, a distribution in the expression and/or protein levels of LH isoenzymes, or compensation by the two other LH isoenzymes, which could hydroxylate peptides containing the sequences of hydroxylation sites in triple helices of type I and type IV collagen [9,16,51]. Here, we show that purified type I collagen without telopeptide regions from tendon, skin and bone of LH1 null mice is differentially modified between tissues (Figure 3). We also find a decrease in the total amount of hydroxylysine in type I collagen from P3H3 null mouse tissues; however, the decrease was smaller than in LH1 null mouse tissues (Figure 3). In contrast to LH1 null and P3H3 null, CypB null mice showed a tissue-dependent alteration in the total amount of hydroxylysine: hydroxylysine was reduced in type I collagen from tendon and skin, while an increase was observed in bone [31,32,47]. Therefore, lysyl hydroxylation in the triple helical domain of type I collagen is likely sensitive to disruption of LH1 and LH1-associated proteins. In contrast to the results for type I collagen, there was, interestingly, no significant difference in type V collagen from LH1 null skin [26]. Surprisingly, P3H3 null showed an obvious reduction of lysyl hydroxylation in type V collagen from skin as well as in type I collagen (Figures 3 and 6). Considering that in the lung and kidney of LH3 mutant mice the amount of hydroxylysine was not changed in type I collagen-rich fractions but was reduced by 30% in type IV and type V collagen-rich fractions [24], we suggest that LH1 and LH3 have substrate specificity for at least type I collagen and type V collagen, respectively. In addition, given that the LH3 mutant mice only showed a decrease of 30% in type IV and type V collagen-rich fractions, that type V collagen from P3H3 null showed a decrease in hydroxylysine content, and that the hydroxylysine content of bone type I collagen was decreased in P3H3 null despite increasing in CypB null, we speculate that P3H3 might have lysyl hydroxylation activity, that is, act as an LH4-like enzyme in the rER. Although P3H3 belongs to the prolyl 3-hydroxylase family, no substrate or enzyme activity has been identified. P3H3 could interact with both LH1 and LH3; however, this is unlikely because the phenotype of P3H3 null mice is very mild. Thus, direct prolyl and lysyl hydroxylase activity assays are needed to determine whether P3H3 acts as a prolyl and/or lysyl hydroxylase. However, attempts to produce the necessary quantities of recombinant protein have not succeeded.

In conclusion, P3H3 and LH1 play critical roles in hydroxylating lysine residues at crosslink formation sites in type I collagen, whereas they likely have distinct mechanisms to modify other sites in type I collagen and to recognize different collagen types in the rER.

Values are given as means ± S.D. Biological replicates were n = 4 for all tissues and genotypes.

[...] were obtained using mass spectrometry. Note: Lys, unmodified lysine; Hyl, unmodified hydroxylysine; GHL, galactosyl hydroxylysine; GGHL, glucosylgalactosyl hydroxylysine; Lys-Lys, unmodified lysine and unmodified lysine; Lys-Hyl, unmodified lysine and unmodified hydroxylysine; Hyl-Hyl, unmodified hydroxylysine and unmodified hydroxylysine.

Values are given as means ± S.D. Biological replicates were n = 3 for P3H3 WT and null, n ≥ 3 for LH1 WT and n = 4 for LH1 null.

[...] The mouse Leprel2 gene was eliminated using FRT sites and loxP sites by an FLPe recombinase followed by a Cre recombinase in ES cells, respectively. (B) PCR genotyping of P3H3 wild type [...]

Figure 10. The sequence alignment surrounding lysyl hydroxylation sites in type I and type V collagen between human, mouse and bovine. The sequences are aligned ± 12 residues from the lysine residue that is modified to hydroxylysine, which is highlighted in yellow and bold. Glycine residues in GXY repeats and the residues not conserved between human, mouse and bovine are highlighted in cyan and green. Arginine residues in RGXY sequences are also highlighted in magenta because these residues are critical for Hsp47 to bind to collagen triple helices [53]. Uniprot entry numbers are as follows: human COL1A1 (P02452), mouse COL1A1