1 Introduction

Photoproteins provide organisms with the ability to control many aspects of biology in response to light by absorbing photons and coupling the resulting photophysical or photochemical events to cellular signal transduction pathways. Dedicated photosensory structures such as the camera eyes of vertebrates, cephalopods, and box jellyfish are a familiar case, but many organisms lacking such structures nevertheless exhibit photosensory responses [1,2,3,4]. For example, plants can regulate initial development (morphogenesis), sexual development (flowering), tropic movements, chloroplast development, and the transcriptional control of thousands of genes in response to light [5,6,7,8,9,10,11,12,13]. Similarly, cyanobacteria can regulate movement, cell–cell aggregation, and expression of light-harvesting pigments and chromophores in response to the color and direction of light [14,15,16,17,18,19,20,21,22,23,24,25]. Non-photosynthetic organisms can also exhibit striking light responses, such as control of fruiting body formation in Stigmatella aurantiaca by red and far-red light [26] and control of sporulation in filamentous fungi by blue and red light [27,28,29]. A broad range of photoreceptors have evolved to meet these needs, typically utilizing small organic molecules as chromophores. Light absorption by such chromophores can trigger a variety of photochemical reactions, such as photoisomerizations about formal double bonds [30,31,32,33,34,35,36,37,38].

Photon absorption promotes the photoprotein chromophore into an excited electronic configuration, and subsequent de-excitation triggers cascading structural changes in the chromophore and surrounding protein matrix that alter the signaling state of the whole protein and thus provide a biological readout of the light environment [39,40,41,42]. The absorption spectrum of the holoprotein (protein–chromophore complex) determines the action spectrum for a given photobiological effect, providing a biological response to colors of light that are resonant with that absorption spectrum. However, such systems can also detect light intensity, day/night cycles, and even polarization of the incident light [4, 43,44,45,46,47]. As with light-harvesting proteins in photosynthesis, light absorption also poses a challenge for photoreceptors themselves. A photon at 750 nm, at the far-red edge of the visible spectrum, provides more than 35 kcal/mol of energy (~ 1.65 eV) when absorbed. This amount can exceed the free energy of unfolding for globular proteins [48]. To function as a biologically relevant, reusable sensor, the protein scaffold must channel this energy into de-excitation pathways that avoid irreversible denaturation and provide a biological function.
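The photon energy quoted above can be checked with a short calculation. The sketch below is illustrative only: it converts a wavelength into the energy of one photon (in eV) and of one mole of photons (in kcal/mol) using E = hc/λ. The function names are our own, and the 650 nm and 540 nm entries are arbitrary red and green examples rather than values taken from the text.

```python
# Exact SI constants (2019 redefinition) and the thermochemical kilocalorie.
PLANCK_J_S = 6.62607015e-34            # Planck constant, J*s
LIGHT_SPEED_M_S = 299_792_458.0        # speed of light, m/s
JOULES_PER_EV = 1.602176634e-19        # elementary charge, J per eV
AVOGADRO_PER_MOL = 6.02214076e23       # Avogadro constant, 1/mol
JOULES_PER_KCAL = 4184.0               # thermochemical kcal, J


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of a single photon, in electronvolts (E = h*c/lambda)."""
    wavelength_m = wavelength_nm * 1e-9
    return PLANCK_J_S * LIGHT_SPEED_M_S / wavelength_m / JOULES_PER_EV


def photon_energy_kcal_per_mol(wavelength_nm: float) -> float:
    """Energy of one mole of photons, in kcal/mol."""
    joules_per_photon = photon_energy_ev(wavelength_nm) * JOULES_PER_EV
    return joules_per_photon * AVOGADRO_PER_MOL / JOULES_PER_KCAL


# 750 nm is the far-red example from the text; the others are illustrative.
for nm in (750, 650, 540):
    print(f"{nm} nm: {photon_energy_ev(nm):.2f} eV, "
          f"{photon_energy_kcal_per_mol(nm):.1f} kcal/mol")
```

At 750 nm this gives about 1.65 eV, or roughly 38 kcal/mol, consistent with the statement that such a photon provides more than 35 kcal/mol. Shorter wavelengths deliver proportionally more energy.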

The photoreceptors of the phytochrome superfamily illustrate these processes. Different members of this family exhibit wavelength responses peaking at 378–764 nm, [49, 50] spanning the visible spectrum. These proteins use linear tetrapyrrole (bilin) chromophores that are covalently attached to conserved Cys residues and bound within a conserved pocket in a GAF (cGMP-specific phosphodiesterase/cyanobacterial adenylate cyclase/formate hydrogen lyase transcription activator FhlA) domain [51]. Light absorption by the bilin triggers photoisomerization of the 15,16–double bond between rings C and D, which flips the D-ring within the pocket [30, 31, 34, 36, 38, 52,53,54,55]. This ring flip triggers rearrangement of the protein–chromophore interactions and typically results in formation of a photoproduct with distinct spectral and biochemical properties relative to the dark-adapted state [5, 56, 57]. Such photoproducts can decay rapidly or persist for weeks, allowing this family to sense temperature or light intensity as well as light color [4, 58]. In land plants, the dark-adapted state Pr absorbs red light, whereas the Pfr photoproduct absorbs far-red light. Plant phytochromes thus can measure the ratio of red to far-red light, providing an assay for depletion of photosynthetically active red light by neighboring plants [5, 9]. Phytochromes are also found in eukaryotic algae, fungi, and other eukaryotes, as well as in both photosynthetic and nonphotosynthetic bacteria [59]. Photoisomerization of the bilin chromophore competes with alternate de-excitation mechanisms, including fluorescence and radiationless decay [60,61,62].

Phytochromes have also diversified considerably in cyanobacteria. Cyanobacterial members of this superfamily typically incorporate biliverdin IXα (BV) [63,64,65,66] or phycocyanobilin (PCB) [67,68,69] chromophores. In most phytochromes, the bilin-binding GAF domain is part of a larger knotted PAS-GAF-PHY photosensory core module requiring approximately 500 amino acids for self-assembly with chromophore and for fully reversible photoconversion [5, 70]. Cyanobacterial genomes encode such photosensors, but they also contain knotless GAF-PHY phytochromes [5, 71,72,73,74,75,76,77] and GAF-only cyanobacteriochromes (CBCRs) [56, 57, 78]. Different CBCRs regulate the transition between motile and sessile growth, phototaxis, and complementary chromatic acclimation (CCA or CA), in which cyanobacteria express different light-harvesting pigments and proteins under different colors of incident light [14, 15, 20,21,22,23, 25, 79,80,81,82]. CBCRs require < 200 amino acids for self-assembly and photoconversion [83,84,85]. CBCR domains are typically found as photosensory components in larger proteins. Tandem CBCR arrays are also common, associated with C-terminal signaling MCP (methyl-accepting chemotaxis protein; Fig. 1A) or histidine kinase domains [45, 49, 82, 85]. Oddly, at least one CBCR lineage is instead found as a full-length protein of < 180 amino acids, comprising only an isolated bilin-binding GAF domain, but is nevertheless conserved in a number of cyanobacterial genomes [86].

Fig. 1

Photoconversion of bilin chromophores in cyanobacteriochromes (CBCRs). A Jellybean domain diagrams are shown for full-length Npun_R6012 (NpPixJ) and Npun_F2164 (NpPtxD) [82, 85]. CBCR domains are colored by photocycle. For Npun_F2164, photocycles are NpF2164g2, UV/blue (insert-Cys subfamily); NpF2164g3, violet/orange (insert-Cys subfamily); NpF2164g4, red/green; NpF2164g5, red-inactive; NpF2164g6, red/green; NpF2164g7, orange/green [45, 49]. For Npun_R6012, all three CBCR domains exhibit red/green photocycles [45]. NpF2164g1 and NpR6012g1 are homologous to the non-bilin-binding GAF1 of AnPixJ [85]. The red-inactive GAF5 domains of NIES21_09470 and WP_017322544 (this work) are found in equivalent primary sequence contexts to NpF2164g5, but their neighboring domains have not been characterized. Domain abbreviations: GAF, cGMP-specific phosphodiesterase/cyanobacterial adenylate cyclase/formate hydrogen lyase transcription activator FhlA; MCP, methyl-accepting chemotaxis protein. B Red/green CBCRs reversibly photoconvert between a C5–Z,syn C10–Z,syn C15–Z,anti red-absorbing dark-adapted state (left) and a twisted C5–Z,syn C10–Z,syn C15–E,anti green-absorbing photoproduct (right) via 15,16–photoisomerization of the PCB chromophore [36, 38, 54, 55]. C Photoconversion of wild-type NpR6012g4 is shown [90]. Absorption spectra for the 15Z state and the 15E photoproduct are shown in blue and orange, respectively. The 15Z − 15E difference spectrum is shown in green (zero change, black dashed line). D Absorption spectra and difference spectrum are shown for wild-type NpF2164g5 [45], using the color scheme of C. Illumination does not result in photoproduct formation (orange circles)

CBCRs have diversified into a range of subfamilies that can detect light from 378 – 741 nm via a range of tuning mechanisms [49, 54, 55, 65, 66, 81, 87,88,89,90,91,92,93]. Prototypical members of one abundant CBCR subfamily, the red/green CBCRs, exhibit a red-absorbing 15Z PCB chromophore in the dark-adapted state (Fig. 1B) [45, 85]. Photoconversion of this state with red light yields a green-absorbing 15E photoproduct (Fig. 1B, C). Such red/green CBCRs can function as sensors of light quality (color) or intensity and belong to the eXtended Red/Green (XRG) lineage, which also includes several other subfamilies with diverse photocycles [45, 46, 49, 66, 94,95,96]. Determination of solution and crystal structures for two prototypical red/green CBCR domains in both photostates elucidated conserved structural changes that occur upon photoconversion [36, 38]. Site-directed mutagenesis of one of these, NpR6012g4, has also identified conserved residues that are essential for spectral tuning of the green-absorbing photoproduct state [90]. Studies on NpR6012g4, AnPixJg2, and slr1393g3 implicated residues required for tuning the red-absorbing state, but these requirements are apparently somewhat more variable [33, 90, 97]. Such canonical red/green CBCRs, belonging to the XRG lineage and exhibiting red/green photocycles, are one of the more thoroughly studied CBCR types. Nevertheless, this subfamily also includes the closely related NpF2164g5 and AnPixJg3, which exhibit red-absorbing dark states that fail to undergo photoconversion (Fig. 1D) [45, 85, 98].

NpF2164g5 and AnPixJg3 remain folded after illumination, so light energy is channeled into non-destructive de-excitation pathways; however, they lack the biologically relevant photosensory de-excitation pathway of photoisomerization. Once the relevant protein–chromophore interactions are identified, red/green CBCRs thus provide an experimentally accessible system for studying how such interactions channel the excited state into different de-excitation pathways within a single protein scaffold. Such studies can also provide new insights for protein engineering efforts. There is growing interest in developing red-responsive tools based on CBCRs [46, 99,100,101,102]. The mechanisms that determine whether a given domain undergoes photoconversion or exhibits other de-excitation mechanisms can therefore inform the choice of initial mutations for developing a fluorescent or photoactive reporter, which could then be subjected to further optimization.

In the current work, we have identified protein–chromophore interactions controlling photoisomerization in red/green CBCRs. Using phylogenetic analysis, we demonstrate that NpF2164g5 belongs to a lineage of CBCR domains from apparent orthologs of Npun_F2164/PtxD, which is required for phototaxis in Nostoc punctiforme [82]. Characterization of recombinant CBCR domains after co-expression in E. coli cells engineered to produce various bilins [103,104,105] demonstrates that this lineage is a conserved group of red-inactive CBCRs from candidate PtxD orthologs (riPD-CBCRs). We then introduce similar behavior into NpR6012g4 with only three substitutions and restore photoconversion in NpF2164g5 with three converse substitutions. Fluorescence and circular dichroism (CD) spectroscopy provide evidence that the 15Z states of riPD-CBCRs exhibit subtle but significant differences from those of photoactive red/green CBCRs, and that engineered red-inactive CBCRs mimic that inactive state. Our work thus identifies protein–chromophore interactions that determine the de-excitation mechanisms employed by bilin chromophores in this subfamily of photoproteins, suggesting the existence of similar gating residues in other photoreceptor families.

2 Results

2.1 A lineage of red-inactive CBCRs

CBCRs exhibit astonishing spectral diversity, and this diversity is amenable to protein engineering. Site-directed mutagenesis has identified substitutions that can produce spectral changes of over 300 nm [86, 96, 106], changes in chromophore protonation state [81], and changes in chromophore specificity [65, 66, 91]. This plasticity also can make structure/function studies challenging, because even closely related proteins can exhibit markedly different behavior. For example, Npun_R6012 and AnPixJ are candidate phototaxis receptors with identical domain architectures [85]. Each has three red/green CBCR domains: GAF2, GAF3, and GAF4. NpR6012g4 is a well-characterized protein with a stable photoproduct state [36, 45, 54, 55, 90, 96, 98, 107,108,109,110], but the equivalent domain (AnPixJg4) instead gives rise to a very unstable photoproduct and hence can function as a sensor of light intensity [46]. AnPixJg2 is also well-characterized [33, 46, 65, 85, 111,112,113,114,115,116], whereas NpR6012g2 is unstable and is only amenable to basic characterization [45]. AnPixJg3 is inactive, whereas NpR6012g3 undergoes facile photoconversion [45, 85]. The difficulty in predicting such properties from primary sequence of closely related proteins makes it difficult to identify amino acids conferring those properties, particularly for cases in which only a single sequence is associated with a property of interest. For example, identification of residues that allow photoisomerization in NpR6012g4 or block it in NpF2164g5 would be challenging: to date, comparison with other red/green CBCRs shows that NpF2164g5 exhibits a unique combination of high fluorescence and no photoisomerization [45, 85, 97, 117].

We, therefore, began by seeking possible NpF2164g5 orthologs from an updated phylogenetic analysis of the XRG lineage (Fig. 2) [66, 96]. For this analysis, BLAST searches were used to identify CBCRs of interest. A multiple sequence alignment with 175 sequences was used to infer a maximum-likelihood phylogeny in PhyML-Structure as described in “Methods”, with AnPixJg2 (PDB ID 3W2Z), NpR6012g4 (6BHN and 6BHO), slr1393g3 (5DFX and 5M82), and JSC1_58120g3 (6XHH) as structural references [33, 38, 66, 118]. The resulting phylogeny placed NpF2164g5 within a small group of CBCR domains from apparently orthologous proteins (Figs. 1A, 2). All these proteins include seven GAF domains, with the N-terminal one predicted to be non-bilin-binding [85], and with C-terminal HAMP and MCP domains. The full-length Npun_F2164 protein (NpPtxD) is the major photoreceptor for positive phototaxis in Nostoc punctiforme hormogonia [82]. NpPtxD has six CBCR domains in tandem, all of which have been characterized (Fig. 1A) and all of which belong to the XRG lineage. NpF2164g2 and NpF2164g3 belong to the insert-Cys subfamily, and NpF2164g7 is an intensity sensor with peak responsiveness to orange light [45, 49]. NpF2164g5 itself is sandwiched between two photoactive red/green CBCRs, NpF2164g4 and NpF2164g6, which have similar photocycles but vary in photoproduct stability [45]. These three domains form a clade in the phylogenetic analysis, within which each CBCR clusters with equivalent GAF domains from apparently orthologous PtxD sensors. As such, NpF2164g5 groups with other GAF5 domains from proteins with equivalent domain structures, NpF2164g4 clusters with other GAF4s, NpF2164g6 with other GAF6s, and these together form a GAF4/5/6 clade (Fig. 2). This is consistent with evolution of these domains via internal duplication of a CBCR GAF domain within an ancestral phototaxis sensor, thus identifying potential NpF2164g5 orthologs. However, a similar pattern is seen for NpR6012g3 and AnPixJg3 (Fig. 2), despite the different behaviors of these domains [45, 85], so the putative NpF2164g5 orthologs might still exhibit photoconversion.

Fig. 2

Phylogenetic analysis of XRG CBCRs. A phylogenetic tree is shown for XRG CBCRs. The recently described DPYLoar lineage, the insert-Cys CBCRs, and the outgroup of DXCF CBCRs are indicated [49, 66, 96]. Proteins used in this study are indicated in bold. Known photoactive CBCRs are in blue, and known red-absorbing, photochemically inactive CBCRs are in red. Clades of interest are highlighted, including those containing AnPixJg2 and NpR6012g4, as is the conserved cluster of red-inactive CBCRs from candidate PtxD orthologs (riPD-CBCRs)

We, therefore, characterized two such domains: NIES21_09470g5 (hereafter, N21g5) is the fifth GAF domain encoded in the candidate phototaxis receptor BAY15132 (or NIES21_09470) from Anabaenopsis circularis NIES-21, and WP_017322544g5 (hereafter, 1732g5) is the fifth GAF domain in the candidate phototaxis receptor WP_017322544 from unclassified cyanobacterial species strain PCC7702. N21g5 incorporated PCB less efficiently than NpF2164g5 (Table 1), but this protein had the expected red-absorbing peak and failed to undergo photoconversion (Fig. 3A). 1732g5 incorporated PCB more efficiently than N21g5 but exhibited similar behavior otherwise (Fig. 3B and Table 1). Both proteins had peak wavelengths blue-shifted relative to photoactive red/green CBCRs but close to that of NpF2164g5 (Table 1). Crucially, both proteins are red-inactive CBCRs from candidate PtxD orthologs, defining a conserved riPD-CBCR lineage that is distinct from their photoactive closest relatives.

Table 1 Spectroscopic characterization of CBCRs with PCB chromophores

Fig. 3

Characterization of red-inactive CBCRs from candidate PtxD orthologs (riPD-CBCRs). A Absorption spectra and difference spectrum are shown for N21g5 using the scheme of Fig. 1D. B Absorption spectra and difference spectrum are shown for 1732g5 using the scheme of Fig. 1D. C Normalized spectra are shown for the chromophore bands of NpF2164g5 (dark blue) and PZO58049g2 (red). D Absorption spectra and difference spectrum are shown for PZO58049g2 using the scheme of Fig. 1D

2.2 Engineering a red-inactive NpR6012g4 variant

The conserved behavior of the riPD-CBCR lineage allowed us to seek conserved protein–chromophore interactions that dictate this behavior. We first looked for residues conserved in these three proteins and proximal to the chromophore in the 15Z state. Five such residues were identified, which correspond to Tyr624, Gly632, His688, Phe695, and Leu714 in the photoactive CBCR NpR6012g4 (Fig. 4, left). For NpF2164g5, these residues correspond to Phe894, Val902, Tyr958, Ile965, and Phe984, respectively. Based on the NMR structure of NpR6012g4, these residues are indeed proximal to the chromophore in the 15Z red-absorbing state (Fig. 4, right): Tyr624, His688, and Leu714 are all within 4 Å of PCB, and the conserved hydrogen bonding between His688 and the chromophore is thought to play an important role in the red/green photocycle [33, 36, 38, 116]. Gly632 and Phe695 contact the Trp655 “lid Trp” known to be important for spectral tuning in AnPixJg2 and NpR6012g4 [33, 90]. Phe695 and Phe634 are also critical for spectral tuning of the photoproduct state, and Phe695 is also in direct contact with the chromophore D-ring in that photostate [36, 90]. We, therefore, sought to evaluate whether these residues determine whether a given red/green CBCR is photoactive or photoinactive.
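The correspondence between hallmark positions in the two proteins can be written down as a small lookup, which makes the substitution nomenclature explicit. The following is a minimal sketch: the residue pairs are transcribed from the paragraph above, whereas the helper names and the reverse (NpF2164g5-numbered) substitution names are our own derivation from that alignment, not labels taken from a table.

```python
# One-letter codes for the amino acids named in the text.
ONE_LETTER = {"Tyr": "Y", "Gly": "G", "His": "H", "Phe": "F",
              "Leu": "L", "Val": "V", "Ile": "I"}

# Aligned hallmark positions, transcribed from the text:
# (residue in photoactive NpR6012g4, residue in red-inactive NpF2164g5).
HALLMARK_PAIRS = [
    (("Tyr", 624), ("Phe", 894)),
    (("Gly", 632), ("Val", 902)),
    (("His", 688), ("Tyr", 958)),
    (("Phe", 695), ("Ile", 965)),
    (("Leu", 714), ("Phe", 984)),
]


def substitution(native, donor):
    """Name the change that puts the donor's amino acid at the native position."""
    (native_aa, position), (donor_aa, _) = native, donor
    return f"{ONE_LETTER[native_aa]}{position}{ONE_LETTER[donor_aa]}"


def inactivating(pairs):
    """riPD-CBCR residues introduced into NpR6012g4 (NpR6012g4 numbering)."""
    return [substitution(active, inactive) for active, inactive in pairs]


def reactivating(pairs):
    """Converse changes in NpF2164g5 (NpF2164g5 numbering); derived, hypothetical names."""
    return [substitution(inactive, active) for active, inactive in pairs]


# The three positions aligning with Gly632, His688, and Leu714.
TRIPLE = [pair for pair in HALLMARK_PAIRS if pair[0][1] in {632, 688, 714}]

print(inactivating(HALLMARK_PAIRS))
print(inactivating(TRIPLE))
print(reactivating(TRIPLE))
```

Restricting the lookup to the positions aligned with Gly632, His688, and Leu714 yields the G632V, H688Y, and L714F substitutions in NpR6012g4 numbering.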

Fig. 4

Sequence analysis of riPD-CBCRs. (left) Regions of interest are shown for CBCRs (photoactive red/greens, names in black; riPD-CBCRs, names in coral). Phe634 and Trp655 (cyan) are implicated in NpR6012g4 spectral tuning [90]. Five other ‘hallmark’ riPD-CBCR residues align with Tyr624 (periwinkle), Gly632 (mauve), His688 (mauve), Phe695 (periwinkle), and Leu714 (mauve) in NpR6012g4. Selected sequences from the alignment used for phylogenetic analysis were extracted without changing the alignment, and gap-only columns in the resulting alignment were removed. (right) The solution structure of NpR6012g4 in the 15Z state (PDB accession 6BHN) [36] is shown. Tyr624, Gly632, His688, Phe695, and Leu714 are highlighted in the same color scheme as in the alignment; numbering for NpF2164g5 is in parentheses

Previous studies have shown that the Y624H variant of NpR6012g4 exhibited normal red/green photoconversion, and the equivalent Phe894 residue of NpF2164g5 is found not only in riPD-CBCRs but also in the known photoactive red/green CBCRs NpR5113g2 and NpR4776g3 [45, 90], so it was clear that a Phe in this position is not critical for the photoinert Pr state of NpF2164g5. Similarly, both F695L and F695W variants of NpR6012g4 also exhibit normal forward photoconversion, not a surprising result given that the Ile965 residue of NpF2164g5 is found in the photoactive CBCR NpR4776g3 [45, 90]. Phe894 and Ile965 of NpF2164g5 thus seem unlikely candidates for gating photoisomerization: they are present in naturally occurring red/green CBCRs, and substitutions at these positions do not impede photoisomerization in NpR6012g4. The remaining riPD-CBCR candidate residues for blocking photoconversion correspond to G632V, H688Y, and L714F variants of NpR6012g4. Two of these variants have been previously characterized [90]. H688Y NpR6012g4 exhibited modest spectral shifts in both photostates, but photoisomerization proceeded efficiently. The same substitution is also present in more distantly related, photoactive CBCRs in the XRG lineage, such as the insert-Cys CBCRs and the hybrid CBCR Anacy_3174g6 [49, 96]. L714F NpR6012g4 also exhibited facile photoconversion and a normal red/green photocycle [90]. Again, these residues seem unlikely candidates as gating residues on their own.

We, therefore, tested the remaining candidate residue by characterizing the G632V variant of NpR6012g4. This protein exhibited a largely normal red-absorbing dark state (Fig. 5A), but its apparent first-order photoconversion (formally an approach to photoequilibrium) was notably slower than that of wild-type NpR6012g4 (Fig. 5B), consistent with a lower photochemical quantum yield. The G632V variant also exhibited a broadened and red-shifted photoproduct (Fig. 5A), indicating that Gly632 modulates both forward photoconversion and photoproduct tuning. A Val residue placed at this position in riPD-CBCRs thus seemed likely to play an important role in gating photoconversion but was not sufficient to block photoconversion by itself.
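The apparent first-order behavior described here corresponds to a single-exponential approach to photoequilibrium, A(t) = A∞ + (A0 − A∞)·exp(−kapp·t). The sketch below is illustrative and is not the fitting procedure used for Fig. 5B: it takes the two kapp values reported in the Fig. 5 caption, compares them, and shows how kapp could be recovered from a trace by a log-linear least-squares fit. The synthetic trace, its A0 and A∞ values, and all function names are assumptions made for the example.

```python
import math

# Apparent rate constants from the Fig. 5 caption (per minute).
K_APP_PER_MIN = {"wild-type": 2.6, "G632V": 0.13}


def absorbance(t_min, a0, a_inf, k_app):
    """Single-exponential approach from a0 to the photoequilibrium value a_inf."""
    return a_inf + (a0 - a_inf) * math.exp(-k_app * t_min)


def half_time_min(k_app):
    """Time for the remaining change in absorbance to fall by half."""
    return math.log(2) / k_app


def fit_k_app(times_min, absorbances, a_inf):
    """Estimate k_app as minus the slope of ln(A - a_inf) versus time.

    Assumes a_inf is known and every absorbance lies above it.
    """
    ys = [math.log(a - a_inf) for a in absorbances]
    n = len(times_min)
    t_mean = sum(times_min) / n
    y_mean = sum(ys) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_min, ys))
    sxx = sum((t - t_mean) ** 2 for t in times_min)
    return -sxy / sxx


# Hypothetical noise-free trace for the G632V variant (a0 and a_inf are invented).
times = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
trace = [absorbance(t, a0=1.0, a_inf=0.2, k_app=K_APP_PER_MIN["G632V"]) for t in times]

fold_slower = K_APP_PER_MIN["wild-type"] / K_APP_PER_MIN["G632V"]
print(f"G632V is {fold_slower:.0f}-fold slower than wild-type")
for name, k in K_APP_PER_MIN.items():
    print(f"{name}: half-time {half_time_min(k):.2f} min")
print(f"recovered k_app: {fit_k_app(times, trace, a_inf=0.2):.3f} per min")
```

With the caption values, the G632V variant approaches photoequilibrium 20-fold more slowly than wild type. Attributing that ratio to quantum yield assumes matched illumination and optical density, as in the experiment described.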

Fig. 5

Engineering a red-inactive variant of NpR6012g4. A Absorption spectra and difference spectrum are shown for the G632V variant of NpR6012g4 using the scheme of Fig. 1C. B The time course of photoconversion under red light (650 ± 20 nm) is shown for wild-type and G632V NpR6012g4 (purple circles and green squares, respectively) at approximately equal optical density. The reaction was monitored by measuring absorbance at 652 nm, and data were fit to a single exponential (wild-type, kapp = 2.6 min⁻¹; G632V, kapp = 0.13 min⁻¹). C Absorption spectra and difference spectrum are shown for ri-NpR6012g4 (G632V H688Y L714F triple substitution: Table 2) using the scheme of Fig. 1D. D Absorption spectra and difference spectrum are shown for ri-AnPixJg2 using the scheme of Fig. 1D. E The absorption spectrum of ri-JSC1_58120g3 is shown. F Absorption spectra and difference spectrum are shown for ri-AM1_C0023g2 using the scheme of Fig. 1D. Amino acid substitutions used to construct candidate red-inactive sequences are presented in Table 2. All proteins were co-expressed with cyanobacterial enzymes for biosynthesis of PCB [103, 105]

Based on these results, we hypothesized that a combination of riPD-CBCR ‘hallmark’ substitutions would be required to ablate photoconversion in NpR6012g4. Some combination of G632V, H688Y, and L714F seemed more likely than combinations including Y624F and F695I due to known sequence variation as described above. Further support for this working hypothesis was obtained by mining the natural diversity of CBCRs. Extensive BLAST searches for CBCRs related to the recently described DPYLoar lineage of XRG CBCRs [66] identified one CBCR domain that possessed three out of five riPD-CBCR 'hallmark' residues, equivalent to Val902, Tyr958, and Phe984 in NpF2164g5 (or Gly632, His688 and Leu714 in NpR6012g4). This protein, GAF2 of PZO58049 from Phormidesmis priestleyi (hereafter, PZO58049g2), was not closely related to the riPD-CBCR lineage (Fig. 2; PZO58049g2 in red). PZO58049g2 was, therefore, characterized after recombinant expression with co-synthesis of PCB. PZO58049g2 exhibited a red-absorbing 15Z dark state with a similar peak wavelength to that of NpF2164g5 but with a broadened lineshape (Fig. 3C). PZO58049g2 also failed to exhibit photoconversion (Fig. 3D). The lack of photoconversion in this distantly related protein provided compelling support for a possible role of these three residues in gating photoconversion for XRG CBCRs.

We, therefore, expressed and characterized the trebly substituted G632V H688Y L714F variant of NpR6012g4. Similar to natural riPD-CBCRs (636–640 nm), this protein exhibited a red-absorbing 15Z dark state with peak absorption at 642 nm (Fig. 5C; Table 1) that was blue-shifted relative to wild-type NpR6012g4 (652 nm: Table 1). Illumination of G632V H688Y L714F NpR6012g4 (hereafter, ri-NpR6012g4 for “red-inactive” NpR6012g4) with red light failed to generate significant photoproduct (Fig. 5C), as was the case for naturally occurring riPD-CBCRs (Fig. 3 and Table 1). These three riPD-CBCR hallmark substitutions thus ablated photoconversion in NpR6012g4 with PCB chromophore.

2.3 Photoconversion and chromophore specificity in red/green CBCRs

Chromophore specificity can be quite variable in the XRG CBCR lineage. For example, red/green CBCRs from Acaryochloris marina, such as AM1_C0023g2 [119], can incorporate either PCB or BV, with the latter giving a red-shifted far-red/orange photocycle due to its longer conjugated system. DPYLoar proteins such as JSC1_58120g3 provide a contrast to this: they cannot bind PCB at all and are specific for BV or 181,182–dihydrobiliverdin (DHBV) [66]. We, therefore, tested the extent to which ablation of photoisomerization by the three riPD-CBCR hallmark residues would hold true for other bilin chromophores.

We first tested NpF2164g5 and ri-NpR6012g4 co-expressed in cells producing phytochromobilin (PΦB), the bilin chromophore produced by land plants [5, 120, 121]. Both proteins were able to incorporate this chromophore, and the resulting red-shifted adducts were unable to undergo photoconversion (Fig. 6). The observed red shift with PΦB is expected for red/green CBCRs [45], because this chromophore has a longer conjugated system relative to PCB. We next introduced the three riPD-CBCR hallmark substitutions into other CBCRs. Equivalent substitutions (Table 2; hereafter, ri-alleles) were introduced into the red/green CBCRs AnPixJg2, AM1_C0023g2, and JSC1_58120g3 for comparison with the respective wild-type proteins [66, 85, 119]. In contrast to wild-type AnPixJg2, ri-AnPixJg2 exhibited extremely poor chromophorylation (Table 1). Nevertheless, the small amount of chromophore present exhibited a red-absorbing 15Z state that was blue-shifted relative to wild type and failed to undergo photoconversion (Fig. 5D; Table 1). An even more severe effect was observed for ri-JSC1_58120g3, which was effectively unchromophorylated (Fig. 5E). By contrast, ri-AM1_C0023g2 exhibited robust incorporation of PCB, exhibiting a blue-shifted peak absorption relative to wild-type (634 nm; Table 1). This variant also failed to form a photoproduct with PCB (Fig. 5F), consistent with a general inhibitory effect of these substitutions on photoconversion when chromophorylation is not entirely lost.

Fig. 6

Characterization of riPD-CBCRs with phytochromobilin chromophores. A Absorption spectra and difference spectrum are shown for NpF2164g5 incorporating phytochromobilin using the scheme of Fig. 1D. B Absorption spectra and difference spectrum are shown for ri-NpR6012g4 incorporating phytochromobilin using the scheme of Fig. 1D. C Normalized absorption spectra are shown for NpF2164g5 with PCB (dark blue) and PΦB (dashed mauve). D Normalized absorption spectra are shown for ri-NpR6012g4 with PCB (dark blue) and PΦB (dashed mauve)

Table 2 Amino acids targeted in protein engineering

We also examined the behavior of ri-AM1_C0023g2 with BV as chromophore, because the wild-type CBCR can bind either chromophore. A ri-AM1_C0023g2 preparation co-expressed with the cyanobacterial heme oxygenase gene ho1 for BV synthesis [105] exhibited two inactive bands in the red to far-red region of the spectrum, suggesting some type of heterogeneity (Fig. 7A; Table 3). We next tested additional heme oxygenases to see whether this behavior was reproducible. Expression of ri-AM1_C0023g2 with an α-specific variant of Pseudomonas aeruginosa HemO (or PigA) [122] resulted in lower chromophorylation (Table 3; Fig. 7A), and co-expression with the candidate heme oxygenase POZ53545 from the methanotroph Methylovulum psychrotolerans (strain Sph1T) [123, 124] resulted in higher chromophorylation (Table 3; Fig. 7A). All three preparations of ri-AM1_C0023g2 exhibited two bands in the red to far-red region (Fig. 7A; Table 3), and neither peak was affected by illumination (Fig. 7B). Interestingly, acid denaturation of ri-AM1_C0023g2 incorporating BV revealed only a single band (Fig. 7C), with peak wavelengths and properties similar to those of covalent 15Z BV adducts of AM1_C0023g2 and JSC1_58120g3 (Table 3) [66]. We conclude that both PCB and BV adducts of ri-AM1_C0023g2 are photoinactive CBCRs. Taken together, these experiments demonstrate that riPD-CBCR hallmark variants of red/green CBCRs may exhibit severe chromophorylation defects but are generally unable to undergo photoconversion.

Fig. 7

Characterization of ri-AM1_C0023g2 with biliverdin chromophore. A Detailed views of the chromophore bands are shown for 15Z absorption spectra of ri-AM1_C0023g2 after co-expression with the heme oxygenase ho1 from Synechocystis (bronze), with a variant of Pseudomonas aeruginosa HemO (PigA) that is specific for production of biliverdin IXα (α–HemO, dark purple) [122], and with POZ53545 from Methylovulum psychrotolerans Sph1T (coral). B A detailed view is shown for photochemical characterization of ri-AM1_C0023g2 after co-expression with POZ53545 (15Z, blue; + far-red light, brick red circles; + subsequent red light, open triangles; total difference spectrum, mauve). Far-red light was supplied with LEDs (728 nm, Sanyo). Red light was supplied with a filtered xenon lamp (670 ± 20 nm; see “Methods”). C Absorption spectra and difference spectrum are shown for ri-AM1_C0023g2 after co-expression with variant HemO and concentration. Protein was then illuminated with far-red light (not shown), denatured (orange circles), and illuminated with white light (blue trace, 15Z form). A detail view of the chromophore bands is presented, with a color scheme equivalent to that of Fig. 1D

Table 3 Characterization of biliverdin adducts of ri-AM1_C0023g2

2.4 Restoration of activity in naturally occurring riPD-CBCRs

We next tested whether converse substitutions are sufficient to restore photoisomerization in natural riPD-CBCRs. We, therefore, constructed “reactivated” alleles of NpF2164g5 and N21g5 (hereafter, ra-NpF2164g5 and ra-N21g5; Table 2). Although chromophorylation was substantially reduced relative to wild type for both proteins (Table 1), both exhibited photoactive red-absorbing states that yielded yellow-absorbing photoproducts (585–590 nm; Fig. 8A–C) upon illumination with red light. Acid denaturation and subsequent photoconversion of the ra-NpF2164g5 photoproduct confirmed that photoisomerization of the PCB chromophore had been restored (Fig. 8D). These three residues are thus able to function to gate photoisomerization, by either blocking isomerization or permitting it.

Fig. 8

Characterization of reactivated variants of riPD-CBCRs. A Absorption spectra and difference spectrum are shown for ra-NpF2164g5 using the scheme of Fig. 1C. B The time course of forward photoconversion is shown for concentrated ra-NpF2164g5 by monitoring depletion of the 15Z state (654 nm, blue circles) and rise of the 15E state (586 nm, orange squares) under red light (650 ± 20 nm). C Absorption spectra and difference spectrum are shown for concentrated ra-N21g5 using the scheme of Fig. 1C. A detail view of the chromophore bands is presented. D Absorption spectra and difference spectrum are shown for concentrated ra-NpF2164g5. Protein was illuminated with red light, denatured (orange trace), and then illuminated with white light (blue trace, 15Z form). A detail view of the chromophore bands is presented, with a color scheme equivalent to that of Fig. 1C

2.5 Conserved properties of riPD-CBCRs

NpF2164g5 exhibits higher red fluorescence and stronger CD signals relative to the 15Z states of photoactive red/green CBCRs [45]. We, therefore, characterized a range of CBCRs using these techniques to assess whether these behaviors are conserved in the riPD-CBCR lineage and in the synthetic ri-alleles. Wild-type NpR6012g4 exhibits an unusually high quantum yield for forward photoisomerization [98, 107, 109, 110], confounding its characterization using static fluorescence techniques. Wild-type AM1_C0023g2 with PCB chromophore was better behaved (Fig. 9A). This CBCR had an estimated fluorescence quantum yield of 5.2% (Table 4), higher than that observed in photoactive phytochromes [104]. As observed in NpF2164g5, the riPD-CBCRs N21g5 and 1732g5 exhibited bright fluorescence with higher fluorescence quantum yields (Fig. 9B, C; Table 4). Engineered riPD-CBCR variants also faithfully replicated these properties (Fig. 9D–F; Table 4). In the course of these studies, we also observed prep-to-prep variation in the fluorescence quantum yield of NpF2164g5. Whereas a previous preparation exhibited comparable fluorescence to the engineered red-fluorescent Y176H variant of Cph1 [45, 104], more recent preparations of NpF2164g5 exhibited even higher relative fluorescence. The preparation used in the current work was found to have a quantum yield of 22% when compared to YH Cph1 and to a commercial reference dye (Fig. 9F; Table 4), values comparable to those obtained with other red-inactive CBCRs (Table 4). No NpF2164g5 preparation exhibited photoconversion.

Fig. 9

Characterization of riPD-CBCRs using fluorescence spectroscopy. Normalized absorption and fluorescence spectra are shown for AM1_C0023g2 (A), N21g5 (B), 1732g5 (C), ri-NpR6012g4 (D), and ri-AM1_C0023g2 (E). Absorption, dashed dark purple; fluorescence excitation, bronze; emission, coral. Excitation wavelength for emission spectra was 625 nm (A, C–E) or 630 nm (B). Emission wavelength for excitation spectra was 670 nm (A, D, E), 675 nm (B), or 665 nm (C). F Fluorescence quantum yields were estimated using a modification of the ratio method previously used for Y176H Cph1 [104] as described in “Methods”, shown here for NpF2164g5 (dark blue circles) and ri-AM1_C0023g2 (brick red squares) with Alexa647 (orange triangles) as reference

Table 4 Characterization of CBCRs using fluorescence spectroscopy

Our previous characterization of red/green CBCRs using CD spectroscopy revealed considerable variation in the strength of the red band in the dark-absorbing state but not the photoproduct [45]. This work was not quantitative due to variations in expression levels and protein stability that required microcuvettes for some samples, so we examined a panel of nine photoactive red/green CBCRs in the dark state under consistent conditions to provide a better reference set (Table 5). Six of these CBCRs exhibited robust formation of thermally stable photoproducts that also allowed quantitation of the photoproduct without contaminating signals. NpR6012g4 was measured repeatedly to provide an internal control for day-to-day variation. This proved negligible for these samples, with a mean relative CD of −18 ± 1 mdeg/Abs for the red band and −61 ± 1 mdeg/Abs for the green band (error reported as one standard deviation with n = 6). As expected [45], the rotational strength of the red-absorbing 15Z band varied considerably between different photoactive red/green CBCRs, whereas the rotational strength of the 15E photoproduct was more consistent (Fig. 10, Fig. S1; Table 5). In contrast, the red-absorbing band of riPD-CBCRs gave stronger signals with less variation between proteins (Fig. 11; Table 5). Engineered ri-NpR6012g4 and ri-AM1_C0023g2 authentically replicated this behavior (Fig. 11; Table 5). Hence, engineered red-inactive variants not only prevent forward photoconversion but also mimic the slight blue shift (Table 1), higher fluorescence (Table 4), and stronger CD (Table 5) of naturally occurring riPD-CBCRs in their red-absorbing 15Z states.
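The relative CD values quoted above (in mdeg/Abs) are ratios of peak CD signal to peak absorbance, summarized across repeated measurements as a mean and one sample standard deviation. The following is a minimal sketch of that bookkeeping in Python. It is not the analysis code used in this work; the function name and all numerical values are hypothetical placeholders chosen only for illustration.

```python
from statistics import mean, stdev


def relative_cd(peak_cd_mdeg, peak_absorbance):
    """Return relative CD (mdeg/Abs): peak CD signal divided by peak absorbance."""
    if peak_absorbance <= 0:
        raise ValueError("peak absorbance must be positive")
    return peak_cd_mdeg / peak_absorbance


# Hypothetical repeated measurements of one band as (peak CD in mdeg, peak absorbance).
# These numbers are placeholders and are NOT data from this study.
replicates = [(-10.0, 0.50), (-12.6, 0.60), (-7.6, 0.40), (-10.0, 0.50)]

values = [relative_cd(cd, absorbance) for cd, absorbance in replicates]
mean_rel = mean(values)   # mean relative CD across replicates
sd_rel = stdev(values)    # sample standard deviation (n - 1 in the denominator)

print(f"relative CD = {mean_rel:.1f} +/- {sd_rel:.1f} mdeg/Abs (n = {len(values)})")
```

Normalizing by absorbance in this way removes the dependence on protein concentration, which is what allows samples measured at different concentrations to be compared.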

Table 5 Characterization of CBCRs using CD spectroscopy
Fig. 10

Characterization of photoactive red/green CBCRs using CD spectroscopy. CD spectra are shown for the 15Z (blue) and 15E (orange) states of AnPixJg2 (A), NpF2164g4 (B), NpR6012g4 (C), and NpR5113g2 (D). Absorption and CD spectra for other proteins in Table 5 are available in Fig. S1

Fig. 11

Characterization of red-inactive CBCRs using CD spectroscopy. CD spectra are shown for the 15Z states of N21g5 (A), 1732g5 (B), ri-NpR6012g4 (C), and ri-AM1_C0023g2 (D)

3 Discussion

In this work, we have identified protein–chromophore interactions that control whether red/green CBCRs are able to undergo photoisomerization. Photoisomerization in phytochromes and CBCRs can proceed via conventional passage through a conical intersection during de-excitation, but this process can also yield a high-energy intermediate on the ground-state surface that itself can generate photoproduct, resulting in a higher photochemical quantum yield [107, 110]. Other de-excitation mechanisms are also present in this photoreceptor superfamily, including fluorescence and excited-state proton transfer [62, 125, 126]. It is thus striking that only three substitutions can change the fate of the excited population in CBCRs such as NpR6012g4 and NpF2164g5. Nevertheless, we have demonstrated that introduction of three residues found in riPD-CBCRs is sufficient to ablate photoconversion in photoactive red/green CBCRs (Fig. 7) and that the converse substitutions restore photoconversion in naturally occurring riPD-CBCRs (Fig. 8). Furthermore, only one CBCR domain has been found to date that has these three residues and is not a candidate NpF2164g5 ortholog (i.e., a GAF5 within a domain architecture equivalent to that of Fig. 1F). That domain, PZO58049g2, is also a red-inactive CBCR (Fig. 3D). We thus conclude that these three residues effectively block photoisomerization in red/green CBCRs.

We hypothesize that this effect arises from subtle differences in the chromophore-binding pocket that constrain chromophore motions in riPD-CBCRs. Both absorption and CD spectroscopy implicate the existence of a conserved, distinct ground-state geometry in riPD-CBCRs. Natural and engineered riPD-CBCRs show slight but significant blue shifts for both the red-absorbing band and the Soret transition in the near-UV relative to photoactive red/green CBCRs (Fig. 12A). Similarly, the relative rotational strength of the 15Z red-absorbing band in riPD-CBCRs (calculated as [peak CD signal]/[maximum absorbance]) is significantly stronger than that of photoactive red/green CBCRs (Fig. 12B), an effect not seen for the Soret transition (Fig. 12C). The high fluorescence seen in riPD-CBCRs should be associated with a longer-lived excited state, as is the case for engineered fluorescent Cph1 and for NpF2164g5 itself [62, 98, 127]. However, a distinct, sterically constrained ground state could also populate a distinct excited-state geometry with a higher barrier for isomerization. Such an effect could arise due to corresponding changes in the excited-state potential energy surface or by populating a distinct region on a surface similar to that of photoactive CBCRs. In either case, this effect would disfavor a major de-excitation pathway, slowing the decay of the excited state. Future studies will be needed to test this hypothesis, with a particular need for vibrational studies that are sensitive to slight differences in ground-state chromophore geometry [112, 128,129,130].

Fig. 12

Characteristic spectral parameters of riPD-CBCRs. A 15Z peak absorption wavelengths for the Soret band (y-axis) and red band (x-axis) are plotted for the 15Z photostates of photoactive red/green CBCRs (blue circles) and riPD-CBCRs (red squares). The blue shift in peak wavelength was statistically significant for both bands (p < 0.01; see “Methods”). B Relative CD intensity (CD/Abs, Table 5; y-axis) is plotted against peak absorption wavelength (Table 1; x-axis) for the 15Z red band of photoactive red/green CBCRs and riPD-CBCRs. The difference in relative CD intensity for the two types of CBCR was statistically significant (p < 0.01). C Parameters are plotted as in B for the 15Z Soret band. The difference in relative CD intensity for the two types of CBCR was not statistically significant (p = 0.107). Dashed lines indicate mean values, and error arcs are drawn at two standard deviations

We have also demonstrated that it is possible to restore photoconversion in the naturally inactive NpF2164g5 and N21g5. The resulting ra-CBCR alleles undergo photoisomerization but do not restore a normal red/green photocycle (Fig. 8). However, this is to be expected: in the absence of photoisomerization, there is no selective pressure to maintain spectral tuning of the photoproduct. Consistent with this interpretation, the observed photoproducts have peak absorption at 585–590 nm. These states are red-shifted relative to the observed variation in red/green CBCRs (528–556 nm) but lie within the range observed for NpR6012g4 variants that exhibit defective photoproduct tuning (568–594 nm) [45, 90]. It may be possible to restore a green-absorbing photoproduct with additional engineering, but this is beyond the scope of the present investigation. These reanimated proteins may also provide a tool for testing the in vivo function of domains such as NpF2164g5 by restoring photoconversion in the inactive domains of authentic, full-length phototaxis receptors such as Npun_F2164 (NpPtxD) [82].

We have also demonstrated that the three substitutions used to generate riPD-CBCR alleles are not specific for the PCB chromophore, an important prerequisite for potential applications in systems that do not contain this chromophore or CBCRs that do not bind this chromophore. Both engineered ri-NpR6012g4 and intrinsically inactive NpF2164g5 remain inactive with PΦB (Fig. 6). Similarly, ri-AM1_C0023g2 retains affinity for BV chromophore and generates two inactive species (Fig. 7). These species cannot be distinguished using an acid denaturation assay, indicating that they are both some type of covalent 15Z BV adduct. Known crystal structures for BV-binding CBCRs have demonstrated that adduct formation can occur at either C31 or C32 of the A-ring endo-vinyl moiety [65, 66]. The two bands observed in ri-AM1_C0023g2 incorporating BV could thus differ at the site of covalent attachment. Alternatively, these bands could arise due to heterogeneous protein–chromophore interactions, as has been observed in NpR6012g4 [36, 90], or due to differences in stereochemistry at a C31 linkage that might alter the conformation of the A-ring.

We also found that chromophorylation of ri-AM1_C0023g2 exhibited considerable variation apparently dependent on the heme oxygenase used to convert heme into BV during recombinant co-expression in E. coli. This result implicates a potential confounding variable in future applications of CBCRs in heterologous systems: the efficiency of chromophorylation can vary considerably depending on the heme oxygenase that is present in the target cell. Optimization for expression in a given system may thus require provision of an appropriate heme oxygenase or specific optimization for the available heme oxygenases. Hence, this work provides new insights into structure/function relationships of CBCRs and into the potential payoff and pitfalls of developing CBCR-based reagents for optogenetics and synthetic biology.

4 Methods

4.1 Phylogenetic analysis

CBCR sequences were identified using BLAST [131] searches against the GenBank and DOE-IMG databases. Multiple sequence alignments were performed in MAFFT [132] (v7.450; command-line settings --genafpair --maxiterate 16 --clustalout --reorder). For maximum-likelihood phylogenetic analysis using structural information, the resulting alignment was processed with an in-house script to remove positions having ≥ 5% gaps and to add secondary structure and solvent accessibility assignments using STRIDE [133] with AnPixJg2 (PDB ID 3W2Z), NpR6012g4 (6BHN and 6BHO), slr1393g3 (5DFX and 5M82), and JSC1_58120g3 (6XHH) as reference structures [33, 36, 38, 66]. The final alignment had 175 species and 157 characters. A phylogeny was then inferred in PhyML-structure using the six-matrix EX_EHO model for positions of defined structure and the LG substitution matrix for positions with no structural information, with support assessed using the Shimodaira–Hasegawa approximate likelihood ratio test (SH-aLRT) method (command-line settings -m EX_EHO -M PART -a e -c 4 -v e -o tlr) [118, 134]. The resulting tree was processed using FigTree (https://github.com/rambaut/figtree/releases). Structural figures were prepared using VMD and Tachyon [135, 136].
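The gap-filtering step described above can be illustrated with a short Python sketch. The in-house script itself is not reproduced here, so the function name, the dictionary-based alignment representation, and the choice of gap characters are all assumptions made for illustration. The sketch covers only the removal of positions with ≥ 5% gaps and does not attempt the STRIDE annotation step.

```python
def drop_gappy_columns(alignment, max_gap_fraction=0.05, gap_chars="-."):
    """Remove alignment columns whose gap fraction is >= max_gap_fraction.

    `alignment` maps sequence names to aligned sequences of equal length
    (a hypothetical in-memory representation, not a specific file format).
    """
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if not seqs:
        return {}
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all aligned sequences must have the same length")

    kept_columns = []
    for i in range(length):
        n_gaps = sum(seq[i] in gap_chars for seq in seqs)
        # Keep the column only if strictly fewer than 5% of sequences are gapped.
        if n_gaps / len(seqs) < max_gap_fraction:
            kept_columns.append(i)

    return {
        name: "".join(seq[i] for i in kept_columns)
        for name, seq in zip(names, seqs)
    }


# Toy alignment with hypothetical sequences. With only three sequences,
# any column containing a gap exceeds the 5% threshold and is removed.
toy = {"seqA": "MK-LV", "seqB": "MKALV", "seqC": "M-ALV"}
filtered = drop_gappy_columns(toy)
print(filtered)
```

With 175 sequences, as in the final alignment, a 5% threshold corresponds to removing any position in which nine or more sequences carry a gap.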

4.2 Protein expression and purification

Amino acid regions and locus information are reported in Table 6. Expression of intein-CBD fusion constructs for purification of NpR6012g4 and NpF2164g5 was carried out as described previously [45, 109]. Variants of NpR6012g4 were constructed using the Quikchange procedure (Agilent), and the resulting proteins were purified following the procedures used for the wild-type proteins. The 1732g5, PZO58049g2, and the trebly substituted ra-NpF2164g5 variant were obtained as synthetic genes from Genscript with NcoI and XmaI sites permitting expression as intein-CBD fusion proteins. They were then purified following the same procedure.

Table 6 Amino acids used in expression constructs

CBCRs AM1_C0023g2, N21g5, 1732g5, and their variants were acquired as synthetic genes from Genscript and were cloned into pET28-RcaE by replacing the RcaE ORF for expression and purification as previously described [81]. Wild-type AnPixJg2 was expressed and purified following the same procedure, using the previously described His-tagged fusion construct [85] (generous gift of Prof. Rei Narikawa, Shizuoka University). N21g5, ra-N21g5, ri-AnPixJg2 and ri-AM1_C0023g2 were obtained as synthetic genes from Genscript, and those proteins were expressed and purified following the same procedure used for other His-tagged proteins.

All CBCRs were purified after co-expression with enzymes providing chromophore biosynthesis. PCB and PΦB were generated for intein-CBD constructs using pPL-PCB and pPL-PΦB, respectively [103, 104]. For His-tagged constructs, PCB was produced using pKT271 [105]. For testing incorporation of BV into His-tagged ri-AM1_C0023g2, three approaches were used. In one approach, plasmid pACYC-Duet-PaHemO was constructed using an NdeI–XhoI fragment containing an α-specific variant of Pseudomonas aeruginosa HemO (also known as PigA) [122] from a pET21-based expression construct (generous gift of Prof. Angela Wilks, University of Maryland) that was cloned into pACYC-Duet (Novagen), retaining a 3′ sequence encoding an in-frame C-terminal S-tag. In another approach, additional SpeI and PstI sites were added to the pKT271 derivative Bam-pKT [121] to yield Spam-pKT. This facilitated cloning of alternate heme oxygenases or bilin reductases into the pKT271 context. We then introduced the candidate heme oxygenase POZ53545 from the genome of Methylovulum psychrotolerans Sph1T, a methanotrophic psychrophile [123, 124], into Spam-pKT using NdeI and XhoI sites. This plasmid and pKT271 were then cut with XhoI and SalI to excise the pcyA gene, resulting in plasmids having either POZ53545 or ho1 from Synechocystis sp. PCC6803 as heme oxygenases without a ferredoxin-dependent bilin reductase. Co-expression with ri-AM1_C0023g2 was carried out as for other His-tagged proteins. In brief, fresh transformants were used to establish 100 ml starter cultures in lysogeny broth (LB) supplemented with 20 µg/ml chloramphenicol and kanamycin. After overnight growth, 5 ml starter culture was inoculated into 1 l of the same medium supplemented with 84 mg amino-levulinic acid (ALA, Frontier Scientific) and grown to OD600 of 0.8 at 37 °C in the dark with shaking (200 rpm). Both plasmids were then induced by addition of 1 ml IPTG (Teknova) from 1 M stock solution to each flask. Shaking was reduced to 120 rpm and temperature was reduced to 20 °C for overnight growth. Cells were harvested by centrifugation, and pellets were stored at −80 °C to await purification. Purification of the resulting BV adducts was carried out as for other His-tagged constructs.

4.3 Spectroscopic characterization of purified CBCRs

Absorption spectra were acquired on a Cary 50 spectrophotometer in TKKG buffer (25 mM TES-KOH pH 7.8, 100 mM KCl, 10% (v/v) glycerol) at 25 °C. Photoconversion was triggered in the absorption cuvette using 728 nm LEDs (Sanyo) at room temperature or using a red laser pointer (632.8 nm, 2 mW), a green laser pointer (532 nm, 2 mW), or a xenon source equipped with band-pass interference filters from CVI Melles Griot (500 ± 10 nm, 550 ± 35 nm, 600 ± 20 nm, 650 ± 20 nm, 670 ± 20 nm) or Chroma (580 ± 20 nm) at 25 °C. For denaturation assays [49, 83, 84, 137, 138], a 100 µl aliquot of protein was added to 1 ml of 7 M guanidinium chloride/1% HCl (v/v). Denatured samples were illuminated using the xenon lamp equipped with a 320 nm long-pass filter. CD spectra were acquired on an Applied Photophysics Chirascan using a 2 nm bandwidth as baseline-corrected, unsmoothed single scans. To assess statistical significance for peak wavelengths and relative rotational strengths, an unequal-variance (Welch's) t test was used, with n = 9 for photoactive red/green CBCRs and n = 5 for riPD-CBCRs, all using PCB chromophore. One riPD-CBCR, ri-AnPixJg2, was omitted from this analysis due to poor chromophorylation. Chemical configuration for all holoproteins was assigned using the denaturation assay; in some cases, native samples were concentrated using MicroCon 10 K centrifugal concentrators prior to photochemical characterization and subsequent denaturation.
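A t test that does not assume equal variances is commonly known as Welch's t test. The sketch below shows how its test statistic and the Welch–Satterthwaite degrees of freedom can be computed with the Python standard library. It is an illustration only: the analyses in this work were performed in Kaleidagraph, the function name is hypothetical, and the example values are placeholders rather than measured data. Group sizes are chosen to match the n = 9 and n = 5 used here.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance two-sample t test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Hypothetical relative CD values (mdeg/Abs); placeholders, NOT data from this study.
photoactive = [-20.0, -35.0, -18.0, -42.0, -25.0, -30.0, -15.0, -38.0, -28.0]  # n = 9
red_inactive = [-60.0, -64.0, -58.0, -62.0, -61.0]                             # n = 5

t_stat, dof = welch_t(photoactive, red_inactive)
print(f"t = {t_stat:.2f}, df = {dof:.1f}")
```

Converting t and df into a p value requires the cumulative t distribution, which the standard library does not provide. If SciPy is available, `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p value directly.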

Fluorescence spectra were acquired on a QM-6/2005SE fluorimeter equipped with red-enhanced photomultiplier tubes (Photon Technology International 814 Series). Fluorescence quantum yields were estimated using a ratio approach [104]. In brief, dilution series were prepared for one or more unknown samples and one or more reference samples of known quantum yield. Reference samples were Y176H Cph1 [104], NpF2164g5 [45], and commercial Alexa647 (Thermo Fisher). Absorbance and emission spectra were then taken for each sample and concentration, using a constant excitation wavelength for emission spectra. Numerically integrated emission was plotted versus absorbance at the excitation wavelength for each sample and fit by linear regression. In this treatment, the fitted slope is proportional to the quantum yield, allowing the fluorescence quantum yield of an unknown sample to be estimated from the ratio of its slope to the slope of the reference(s). All statistical and regression analyses were performed in Kaleidagraph.
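The ratio approach described above can be summarized in a few lines of Python. This is a minimal sketch under stated assumptions rather than the procedure as run in Kaleidagraph: the function names are hypothetical, the regression is ordinary least squares with a free intercept, and every number in the example (including the reference quantum yield) is a placeholder, not a value from this study.

```python
def integrate_trapezoid(x, y):
    """Numerically integrate y over x (e.g. an emission spectrum) by the trapezoid rule."""
    return sum(
        (x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0 for i in range(len(x) - 1)
    )


def fit_slope(x, y):
    """Return the ordinary least-squares slope of y versus x (intercept is fitted)."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need at least two matched (x, y) points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def estimate_quantum_yield(slope_unknown, slope_reference, qy_reference):
    """Scale the reference quantum yield by the ratio of fitted slopes."""
    return qy_reference * slope_unknown / slope_reference


# Hypothetical dilution series: absorbance at the excitation wavelength and
# integrated emission (arbitrary units), as integrate_trapezoid would return
# for each spectrum. All values are placeholders.
absorbance = [0.01, 0.02, 0.03, 0.04]
reference_emission = [100.0, 200.0, 300.0, 400.0]
unknown_emission = [50.0, 100.0, 150.0, 200.0]
qy_reference = 0.30  # assumed quantum yield of the hypothetical reference

qy_unknown = estimate_quantum_yield(
    fit_slope(absorbance, unknown_emission),
    fit_slope(absorbance, reference_emission),
    qy_reference,
)
print(f"estimated quantum yield = {qy_unknown:.2f}")
```

Fitting a slope across a dilution series, rather than taking a single-point ratio, makes the estimate less sensitive to a constant background offset in either the absorbance or the emission measurement.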