Crystal Structure of a Proteolytic Fragment of the Sensor Histidine Kinase NarQ

Abstract: Two-component signaling systems (TCSs) are a large and important class of sensory systems in bacteria, archaea, and some eukaryotes, yet their mechanism of action is still not fully understood from the structural point of view. Many TCS receptors are elongated flexible proteins with transmembrane (TM) regions, and are difficult to work with. Consequently, truncated fragments of the receptors are often used in structural studies. However, it is not fully clear whether the structures of the fragments correspond well to their native structures in the context of full-length proteins. Recently, we crystallized a fragment of Escherichia coli nitrate/nitrite sensor histidine kinase, NarQ, encompassing the sensor, TM, and HAMP domains. Here we report that a smaller proteolytic fragment consisting of the sensor and TM domains can also be crystallized using the in meso approach. The structure of the fragment is similar to the previously determined one, with minor differences in the vicinity of the truncation site. The results show that the crystallization of such sensor–TM fragments can be accomplished and can provide information on the packing of transmembrane helices, albeit limited, and that the proteolysis may or may not be a problem during crystallization.


Introduction
Two-component signaling systems (TCSs) are a large and important class of sensory systems in bacteria, archaea, and some eukaryotes [1][2][3][4]. As their name implies, they generally consist of two parts: (i) a sensory protein or a complex of proteins, often embedded in the membrane, and (ii) a response regulator (RR) protein, which is phosphorylated or dephosphorylated depending on the nature of the signal. Phosphorylation is achieved by a histidine kinase (HK), which phosphorylates its own histidine residue and then transfers the phosphate to the RR protein. The most common Class I HKs act independently, whereas Class II HKs act as a part of large arrays formed by chemoreceptors and accessory proteins. Class I HKs and chemoreceptors often employ the same sensory domains and, in the case of transmembrane (TM) sensors, the cytoplasmic HAMP domain as a signal transduction module [5]. Overall, HKs have a modular architecture; some minimalistic sensors consist of only three domains: sensor, dimerization/histidine phosphotransfer (DHp), and catalytic (CA), whereas more elaborate receptors have more domains, with TM, HAMP, PAS, and GAF domains interspersed between the sensor and DHp/CA domains [6][7][8][9]. Sensor proteins are inherently dynamic and can adopt multiple conformations; the presence of multiple domains and the transmembrane helices further complicates structural studies. Consequently, no atomistic high-resolution structure of a full-length transmembrane TCS sensor protein has yet been determined experimentally, although several models were built using a combination of crystallography, electron microscopy, electron tomography, and molecular modeling [10][11][12]. In the absence of reliable models of full-length sensors in different signaling states, our understanding of the mechanisms of signal transduction in TCSs, especially across membranes, remains limited [5,8,9].
One of the popular approaches in structural studies of TCS proteins is the generation of truncated fragments encompassing one or more domains. Many structures of sensor domains of both chemoreceptors and histidine kinases in different signaling states have been determined, as well as those of intracellular modules such as HAMP, PAS, GAF, DHp, and CA [5][6][7][8][9]. Consequently, signal generation in the sensor domain and regulation of the catalytic domain are relatively well understood. In contrast, little direct structural information is available for the transmembrane fragments [5]. Experimentally determined atomic-resolution structures obtained in the native state are available for only two proteins: Escherichia coli nitrate/nitrite sensor histidine kinase NarQ [13] and Natronomonas pharaonis sensory rhodopsin transducer HtrII [14,15]. Although the mechanism of transmembrane signal transduction most likely involves piston-like motions of the TM helices [5], it is not clear whether this mechanism is general and how it is accomplished in proteins from different subfamilies.
Recently, we crystallized a fragment of E. coli nitrate/nitrite sensor histidine kinase, NarQ, encompassing the sensor, TM, and HAMP domains and lacking the S-helix, GAF-like, DHp, and CA domains [13]. The protein was crystallized in three different forms (symmetric apo, symmetric holo, and asymmetric holo), which provided important structural data on TM signaling. Here we report that a smaller proteolytic fragment consisting of the sensor and TM domains could also be crystallized using the in meso approach and discuss the implications of this observation.

Cloning, Protein Expression, and Purification
Cloning, protein expression, and purification were performed exactly as described previously [13]. In short, the protein was expressed in E. coli using an auto-inducing medium [16] and purified using metal affinity (Ni-NTA) and size-exclusion chromatography. Protein-containing fractions were pooled and concentrated to 30 mg/mL for crystallization.

Mass Spectrometry
The matrix used for mass spectrometry contained 10 mg/mL sinapinic acid in a 50/50 v/v mixture of water and acetonitrile with 0.1% trifluoroacetic acid. The sample was diluted 1:2 in the matrix and was deposited directly on the target. The sample was analyzed using a MALDI-TOF mass spectrometer (Autoflex, Bruker Daltonics, Bremen, Germany) operated in linear positive mode.

Crystallization
The crystals were grown using the in meso approach [17][18][19], similarly to our previous work [13]. The solubilized protein in the crystallization buffer was added to the monooleoyl-formed lipidic phase (Nu-Chek Prep, Elysian, MN, USA). Crystallization trials were set up using the NT8 robotic system (LCP version, Formulatrix, Bedford, MA, USA). The crystals were grown at 22 °C and reached the final size of ~100 µm within 3 months. The crystals were obtained using the precipitant solution consisting of 1 M KH₂PO₄/Na₂HPO₄ pH 5.2 and 5 mM NaNO₃. Before harvesting, the crystals were incubated for 5 min in the respective precipitant solutions supplemented with 20% glycerol. All crystals were harvested using micromounts (MicroLoops HT, MiTeGen, Ithaca, NY, USA), then flash-cooled and stored in liquid nitrogen.
Crystals 2020, 10, 149

Acquisition and Treatment of Diffraction Data
The diffraction data were collected at 100 K at the European Synchrotron Radiation Facility (ESRF) beamline ID23-1 [20] equipped with a PILATUS 6M-F detector (Dectris, Baden-Daettwil, Switzerland). The data collection statistics are reported in Table 1. In all cases, the diffraction was anisotropic as determined by decay of the CC1/2 values in 20° cones along the reciprocal cell directions [21]. Diffraction images were processed using XDS (version from 15 October 2015) [22]. XSCALE (version from 15 October 2015) [22] was used to merge different datasets and to scale the data for the phasing steps. POINTLESS (version 1.9.16) [21] and AIMLESS (version 0.3.11) [21] were used to merge, scale, assess the quality, convert intensities to structure factor amplitudes, and generate Free-R labels.
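For readers unfamiliar with the CC1/2 statistic used above to assess anisotropy, the following is a minimal sketch of its basic definition: the Pearson correlation between the mean intensities of two random half-datasets. It is not the procedure implemented in AIMLESS or XSCALE; in particular, it does no binning by resolution shell and no selection of reflections within cones around the reciprocal axes. The function names and the input layout (a dictionary from reflection index to a list of repeated measurements) are assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def cc_half(observations, seed=0):
    """CC1/2 for a dict mapping a unique reflection to its repeated intensities.

    The measurements of each reflection are split at random into two halves,
    and the correlation is taken between the per-reflection mean intensities
    of the two halves. Reflections measured only once are skipped.
    """
    rng = random.Random(seed)
    half_a, half_b = [], []
    for measurements in observations.values():
        if len(measurements) < 2:
            continue
        shuffled = list(measurements)
        rng.shuffle(shuffled)
        middle = len(shuffled) // 2
        half_a.append(sum(shuffled[:middle]) / middle)
        half_b.append(sum(shuffled[middle:]) / (len(shuffled) - middle))
    return pearson(half_a, half_b)
```

To mimic the directional analysis described in the text, one would apply such a function separately to subsets of reflections, for example those lying close to a*, b*, or c* within a given resolution shell, and compare how quickly the value decays with resolution in each direction.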

Structure Determination and Refinement
The structure was solved using molecular replacement with MOLREP (version 11.2.08) [23] and the sensor and transmembrane domains from the structure of the sensor-TM-HAMP fragment (Protein Data Bank Identifier 5IJI) [13] as a search model. The model was refined manually using Coot (version 0.7.2) [24] and REFMAC5 (version 5.8.0073) [25]. The refinement statistics are summarized in Table 1.

Expression and Proteolysis of a NarQ Fragment
Similarly to our previous work [13], we overexpressed the fragment of E. coli NarQ encompassing the sensor, TM, and HAMP domains (residues 1-230), and a 6xHis tag at the C-terminus, in its native host, E. coli. Following expression, the protein was solubilized and purified using immobilized metal affinity chromatography and size exclusion chromatography. Following purification, the protein migrated as two bands in SDS-PAGE gels: a major band corresponding to the expected molecular weight (MW), and a minor band at a lower MW (Figure S1). The major band could be stained using anti-His tag antibodies, whereas the minor band could not. Matrix-Assisted Laser Desorption/Ionization-Time Of Flight (MALDI-TOF) mass spectrometry revealed that the sample contained two species, the major one with MW of 26,751.5 Da, close to the expected MW of 26,780 Da, and the minor one with MW of 20,507 Da (Figure S2). Consequently, we assumed that the minor species corresponded to a proteolytic fragment of NarQ missing the HAMP domain and C-terminal His tag (residues 1-181 or 1-182, expected MW 20,407 or 20,535 Da, respectively) that was copurified in heterodimers with the intact His-tagged construct. The sample was used for in meso crystallization without further purification.
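The comparison between observed and expected masses quoted above can be laid out as simple arithmetic. The sketch below uses only the values given in the text; the function and variable names are illustrative, and the calculation says nothing about the mass accuracy of the instrument.

```python
# Masses in Da, taken from the text above.
OBSERVED_MAJOR = 26751.5
EXPECTED_FULL = 26780.0
OBSERVED_MINOR = 20507.0
EXPECTED_FRAGMENTS = {"1-181": 20407.0, "1-182": 20535.0}


def deviation(observed, expected):
    """Return (observed - expected in Da, the same difference in percent)."""
    delta = observed - expected
    return delta, 100.0 * delta / expected


def closest_candidate(observed, candidates):
    """Return the name of the candidate whose expected mass is nearest."""
    return min(candidates, key=lambda name: abs(observed - candidates[name]))


if __name__ == "__main__":
    delta, percent = deviation(OBSERVED_MAJOR, EXPECTED_FULL)
    print(f"intact construct: {delta:+.1f} Da ({percent:+.2f}%)")
    for name, mass in EXPECTED_FRAGMENTS.items():
        delta, percent = deviation(OBSERVED_MINOR, mass)
        print(f"residues {name}: {delta:+.1f} Da ({percent:+.2f}%)")
```

The intact construct deviates by -28.5 Da from its expected mass, and the minor species deviates by +100 Da from the 1-181 candidate and by -28 Da from the 1-182 candidate. By this measure alone the 1-182 candidate is nearer, although the text leaves both possibilities open.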

Crystallization of a Proteolytic NarQ Fragment
While trying to crystallize NarQ in the apo form, we performed an extensive search for crystallization conditions. One of the identified conditions resulted in crystals that diffracted to a resolution of ~2 Å. Detailed analysis revealed that the crystals belong to the space group I4, and that their diffraction is anisotropic: cross-correlation CC1/2 reached 0.8 at 2.3 Å along the reciprocal axes a* and b*, and at 2.6 Å along axis c*. The data collection statistics are summarized in Table 1. The structure could be solved using molecular replacement with the structure of the sensor and TM domains from the previously determined structure of the sensor-TM-HAMP fragment of NarQ (PDB ID 5IJI [13]) as a search model, and revealed the positions of residues 13 to 170 comprising the sensor and TM domains. The NarQ fragments form membrane-like layers in the crystals, as observed in type I crystals of other proteins grown in meso [18,19]. However, unlike in the previously obtained NarQ crystals, where neighboring proteins packed head-to-tail, in space group I4 the proteins packed head-to-head and in the same orientation (Figure 1a). While some residual densities were observed in the interlayer space, no polypeptide chains could be traced there, and thus the nature of the contacts between the layers was not clear. Anisotropy of diffraction and faster decay of the diffraction quality along the reciprocal axis c* likely reflected the worse ordering in the direction normal to the layers.

Absence of the HAMP Domain in the Crystallized Fragment
No residual electron densities that could be ascribed to the HAMP domain were observed in the present data. Moreover, while there was enough space for the remaining TM helix residues (7-12 and 171-175) between the adjacent layers, there was not enough space for either a folded or an unfolded HAMP domain (Figure 1a). The unit cell size in the direction normal to the membrane was ~183 Å (~91.5 Å per layer), as opposed to 118-120 Å per layer in the previously obtained crystals of the sensor-TM-HAMP fragment (space groups I2₁2₁2₁, F222, and P2 [13]). Given that the protein sample that was used for crystallization contained a proteolyzed fraction, we concluded that the crystals likely corresponded to the proteolytic fragments. This was surprising, because homodimers of the proteolytic fragments could not be purified using metal affinity chromatography due to the lack of His tags, and thus could not be present in the original sample used for crystallization. Therefore, either the exchange of the protomers within the dimers could occur in the lipidic cubic phase during crystallization, or proteolysis was also happening after the setup of the crystallization trials. Neither possibility can be excluded, as the I4 crystals appeared much more slowly, in ~3 months, compared to the I2₁2₁2₁, F222, and P2 crystals of sensor-TM-HAMP fragments that appeared in 1-2 weeks in similar conditions [13]. Another peculiarity was that the protein was proteolyzed despite being native to the expression host, E. coli, and no functional role for such proteolysis is apparent at the moment.

Structure of the Proteolytic NarQ Fragment
The determined structure reveals the positions of residues 13 to 170 and comprises the sensor and TM domains. The sensor domain is in the ligand-bound conformation with the nitrate ion bound in its pocket (Figure 1b). Overall, the structure is similar to the symmetrical holo structure determined previously (Figure 2a; the positions of the membrane boundaries were calculated using the PPM server [26]), with a root-mean-square deviation (RMSD) of Cα positions of 1.0 Å and an RMSD of the sensor domain Cα positions (residues 37-145) of 0.24 Å.
Whereas the sensor-proximal ends of the TM helices are well-ordered and positioned similarly to those in the bigger fragment, the ends distant from the sensor domain are progressively disordered, having weaker electron densities and elevated B-factors. A small deviation from the symmetrical holo structure is observed at the cytoplasmic side, which could be best described as a diagonal scissoring-like displacement of ~1.5 Å (Figure 2b,c). This is likely a consequence of the lack of the HAMP domain and highlights the fact that scissoring-like motion is permitted in NarQ [5,13].
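As an illustration of how Cα RMSD values such as those quoted above are defined, the sketch below computes RMSD = sqrt((1/N) Σ |r_i − r'_i|²) over residues present in both models. It is a simplified example, not the procedure used in this work: it assumes that the two models have already been superimposed by some other tool (the text does not state which program or alignment was used), it reads fixed-column PDB-format ATOM records only, and the function names are hypothetical.

```python
import math


def parse_ca(pdb_text, chain=None):
    """Map residue number -> (x, y, z) for CA atoms in PDB-format text."""
    coords = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM") or line[12:16].strip() != "CA":
            continue
        if chain is not None and line[21] != chain:
            continue
        residue_number = int(line[22:26])
        coords[residue_number] = (
            float(line[30:38]),
            float(line[38:46]),
            float(line[46:54]),
        )
    return coords


def ca_rmsd(model_a, model_b, first=None, last=None):
    """RMSD over CA atoms shared by two pre-superimposed models.

    model_a and model_b map residue numbers to coordinates; first and last
    optionally restrict the comparison to a residue range (inclusive).
    """
    shared = [
        r for r in sorted(model_a.keys() & model_b.keys())
        if (first is None or r >= first) and (last is None or r <= last)
    ]
    if not shared:
        raise ValueError("no residues in common in the requested range")
    total = sum(math.dist(model_a[r], model_b[r]) ** 2 for r in shared)
    return math.sqrt(total / len(shared))
```

Restricting the range, for example `ca_rmsd(a, b, first=37, last=145)` for the sensor domain residues named in the text, corresponds to the second of the two values reported above. Note that the result depends on how the superposition was done, so numbers from this sketch are comparable to the published ones only if the same alignment is used.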
Recently, Pollard and Sourjik found that similar sensor-TM constructs of the E. coli chemoreceptor Tar (but not Tap) are able to cluster due to the interactions mediated by the TM helices [27]. We did not observe any interactions between the TM domains belonging to different NarQ dimers (Figure 1a); however, this likely reflects the fact that sensor histidine kinases do not require clustering for function, as opposed to bacterial chemoreceptors.

Impact of Truncation on the Structure
Several structures of bacterial nitrate sensors employing a similar fold are currently available and allow us to estimate the effects of truncation. First, Cheung and Hendrickson determined the structure of the isolated sensor domain of the E. coli sensor histidine kinase NarX, closely related to NarQ, both in the ligand-free and the ligand-bound forms [28]. Second, Boudes et al. determined the structure of the full-length soluble transcription antiterminator protein NasR, which revealed that its NIT domain is structurally and functionally similar to the dimeric sensor domain of NarX [29]. Later, we determined the structure of a transmembrane fragment of NarQ [13]. Finally, Martín-Mora et al. determined the structure of the sensor domain PilJ of the Pseudomonas aeruginosa chemoreceptor McpN, which turned out to be similar to that of the previously characterized nitrate-binding proteins [30].
In McpN, NarX, and NarQ, the nitrate-binding domain is found in similar structural contexts (proximal to a bundle of four TM helices), whereas in NasR the context is different (the protein is soluble). Consequently, we compared only the homodimeric nitrate-bound structures of McpN, NarX, and NarQ. In all of these structures, the residues in the ligand-binding region are the best ordered, whereas the residues closer to the truncation site are progressively disordered (Figure 3). Positions of the sensor-proximal parts of the helices TM2 in the NarX structure mirror those in the NarQ structures, and positions of the TM helices in the structure of the sensor-TM fragment mirror those in the structure of the sensor-TM-HAMP fragment. Still, the termini at the truncation sites lack some of the native interactions; they are less ordered and could be affected by crystal contacts. Consequently, the information on their conformations should be used with care.
Figure 3. (a) Sensor domain of McpN [30]. B-factor range is 11-98 Å², average is 23 Å². (b) Sensor domain of NarX, PDB ID 3EZH [28]. B-factor range is 13-86 Å², average is 30 Å². (c) Sensor and TM domains of NarQ, present work. B-factor range is 20-192 Å², average is 71 Å² (20-122 Å² and 50 Å² for the sensor domain, respectively). (d) Sensor, TM, and HAMP domains of NarQ, PDB ID 5IJI [13]. B-factor range is 15-99 Å², average is 34 Å². Whereas the residues in the ligand-binding region are always well-ordered, the residues closer to the truncation site are progressively disordered.
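B-factor ranges and averages like those listed in the caption of Figure 3 can be extracted from a coordinate file with a few lines of code. The following is a minimal sketch under stated assumptions: it reads fixed-column PDB-format ATOM records, ignores HETATM records, and averages over atoms. Whether the published values were computed per atom or per residue, and over which atom selection, is not stated in the text, so this is not necessarily how they were obtained. The function name is hypothetical.

```python
def b_factor_stats(pdb_text, first=None, last=None):
    """Return (minimum, maximum, mean) B-factor over ATOM records.

    first and last optionally restrict the calculation to an inclusive
    residue-number range, e.g. the sensor domain only.
    """
    values = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        residue_number = int(line[22:26])
        if first is not None and residue_number < first:
            continue
        if last is not None and residue_number > last:
            continue
        # Columns 61-66 of an ATOM record hold the temperature factor.
        values.append(float(line[60:66]))
    if not values:
        raise ValueError("no ATOM records in the requested range")
    return min(values), max(values), sum(values) / len(values)
```

Calling the function once for the whole model and once with `first=37, last=145` (the sensor domain residue range used in the text for the RMSD calculation) would give a whole-model and a sensor-domain summary analogous to the pair of values reported for panel (c).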

Implications for Crystallization of TCS Sensors
Compared to the crystallization of soluble proteins, the crystallization of membrane proteins is hampered by the fact that the TM region of the protein must be embedded in the membrane, membrane mimic, or a detergent during purification and crystallization. For efficient crystallization, the protein molecules should form a three-dimensional network of contacts. Larger membrane proteins can be crystallized while being solubilized in detergent, similarly to soluble proteins. However, for smaller proteins, detergent micelles surrounding them shield the proteins from each other and preclude the formation of stable contacts. As a viable alternative, the in meso approach has been developed, reliant on the three-dimensional membranous cubic phases of special lipids that provide the medium for the proteins to diffuse and crystallize [17][18][19]31]. The approach has its own limitations, as the protein should fit into the cubic phase and should not be too big.
Full-length transmembrane TCS sensors reach hundreds of angstroms in length, and probably are too flexible for crystallization. Choosing the site of truncation to generate a crystallizable construct is not always easy and straightforward. The two major considerations are (i) that the fragment should not be too flexible, and (ii) that the structure of the fragment should generate the required information. If only the information on the ligand binding mode is being sought, then the crystallization of a solitary sensor domain is sufficient. However, for studies of the signal transduction, not only the structures of the individual domains in the signaling and inactive states of the protein should be determined, but also the structures of the linkers between them. The structure of the sensor-TM linker was the same in the proteolytic fragment studied here and in the previously determined structure of the sensor-TM-HAMP fragment. The structure of the TM-HAMP linker was the same in the previously determined structure and the structure of the TM-HAMP fragment of the protein Af1503 [32]. Yet, for crystallization, a fragment with the intact domains at both sides of the membrane is preferable, so that stable contacts can be formed between the proteins from different membranous layers: the sensor-TM-HAMP fragment of NarQ crystallized better and the crystals diffracted to a higher resolution [13] compared to the sensor-TM fragment presented here.

Conclusions
Our results clearly show that minimalistic TM fragments encompassing sensor and TM domains could be crystallized and could provide some information on the packing of the receptor transmembrane helices. The inclusion of a cytoplasmic domain would be preferable to fix the cytoplasmic ends of the TM helices in the correct arrangement. Partial proteolysis and the resulting impurity in the sample can have consequences for the crystallization of the original construct, as evidenced by the previously obtained structures [13]. The crystals of a proteolytic fragment could also be obtained from the same sample, but the information gained from the structure of this fragment would be inferior compared to the information gained from structures of bigger fragments.