Characterisation of the salmon cystic fibrosis transmembrane conductance regulator protein for structural studies

The cystic fibrosis transmembrane conductance regulator protein (CFTR) is a chloride channel highly expressed in the gills of Salmo salar, with a role in osmoregulation. It shares 60% identity with the human CFTR channel, mutations to which can cause the common genetic disorder cystic fibrosis (CF). The expression and localisation of salmon CFTR have been investigated, but the isolated protein has not been extensively characterised. Here we present a protocol for the purification of recombinant salmon CFTR, along with biophysical and structural characterisation of the purified protein. Salmon CFTR was overexpressed in Saccharomyces cerevisiae, solubilised in the detergent LPG-14 and purified by nickel-affinity and size-exclusion chromatography. Prior to size-exclusion chromatography, samples of salmon CFTR had low purity and contained large quantities of aggregated protein. Comparison with size-exclusion chromatography profiles of other CFTR orthologues, which showed less evidence of aggregation, suggested that salmon CFTR has lower intrinsic stability than human and platypus CFTR. Nonetheless, repeated size-exclusion chromatography allowed monodisperse salmon CFTR to be isolated, and multi-angle light scattering was used to determine its oligomeric state. The monodispersity of the sample and its oligomeric state were confirmed using cryo-electron microscopy and small-angle X-ray scattering (SAXS). These data were also processed to calculate a low-resolution structure of salmon CFTR, which showed similar architecture to other ATP-binding cassette proteins.


Introduction
Cystic fibrosis transmembrane conductance regulator (CFTR; ATP-binding cassette (ABC) sub-family C, member 7) is an integral membrane protein expressed mainly in vertebrate epithelia [1,2]. CFTR is a chloride channel with complex regulation and a role in osmoregulation in several tissues (for reviews see [3,4,5]). In humans, CFTR mutations are associated with cystic fibrosis (CF), the most common lethal genetic disease in populations of European descent (for reviews see [6,7,8]). CFTR is predominantly expressed in the apical membrane of epithelia, where it provides a pathway for anion movement and regulates fluid flow.
In salmon (and other teleost fish), CFTR is expressed in specialised cells of the gills called ionocytes, where it aids in osmoregulation during adaptation to salt water [9]. Immunohistochemistry has shown that CFTR is localised to the apical crypt of ionocytes in saltwater-adapted fishes [10]. Salmon have two isoforms of the CFTR gene (I and II) coding for proteins of 1518 and 1519 amino acids respectively (Table 1) [11]. These protein isoforms have 95.3% identity to one another and 59.5-60% identity with human CFTR (Table 1). Expression of the two isoforms is regulated by cortisol [12], and quantitative PCR studies have shown that the relative amounts of mRNA for each isoform alter when salmon adapt between freshwater and marine environments [13]. Beyond this, the regulation and function of the salmon CFTR (sCFTR) protein have not been subject to extensive study.
Both due to its role in CF, and in order to better understand its fundamental function, elucidating the structure of CFTR is an important goal. The molecular structure of the full-length human CFTR (hCFTR) protein has been successfully studied by low-resolution electron microscopy (EM) methods [14,15,16,17], while high-resolution structures have been solved only for the soluble nucleotide binding domains (NBDs) [18,19]. Further details on the structure of the intracellular domains of CFTR have been obtained by small-angle X-ray scattering (SAXS): the description of the ATP regulation of the NBDs [20], the effect of a potentiator on the NBD stability [21] and three-dimensional models of the native and the phosphorylated R-region [22,23]. Defining the structure of full-length CFTR at high resolution is complicated for a number of reasons. Most methods for high-resolution structural study demand large amounts of pure, monodisperse protein. CFTR is a high molecular mass (~168 kDa) membrane protein, which poses considerable problems for its expression, purification and solubilization. Expression levels in mammalian cell lines are relatively low, and though stably transfected cell lines are now available they are expensive to maintain. CFTR is difficult to extract from the membranes of insect cells [24], while attempts to express large membrane proteins in bacteria have met with little success, probably due to the size of the gene and the different mechanisms in prokarya and eukarya for post-translational modifications. For this reason, expression in yeast may provide a useful compromise between cost, yield and function of the protein.
For large membrane proteins, moving from purified protein to diffracting protein crystals remains challenging, and several approaches to solving this problem have been suggested [25,26,27,28]. Targets for crystallization are increasingly identified using high throughput approaches. This has allowed the parameters defining good protein targets and purification strategies to be methodically defined [27,29,30,31]. Considering more options (i.e. screening multiple gene constructs, orthologues from diverse species and even homologues from different phyla) allows the most productive targets to be selected. Such an approach was used successfully in determining a structure of mouse P-glycoprotein at 3.8 Å resolution [32]. Therefore, one rationale for considering diverse orthologues of CFTR is to identify one that may be more easily purified to monodispersity for structural studies.
It has been hypothesized that the tendency of a protein to form large aggregates is related to its intrinsic stability [33,34,35], and there is considerable evidence that monodisperse protein samples are more likely to crystallise [26,35,36,37]. Techniques based on the scattering of electromagnetic radiation are also sensitive to heterogeneity due to the high intensity of scattering from large aggregated complexes, which may obscure signal from monomeric protein. For EM studies heterogeneity does not prevent data acquisition but can hinder structure determination due to the automation of particle selection, which will not discriminate between aggregated and non-aggregated material. Therefore, if possible, protein for structural studies should be monodisperse. The monodispersity of a protein sample is routinely monitored in size-exclusion chromatography (SEC) [29,38], during which aggregated material can be removed. Alternatively, monodispersity can be monitored using light scattering methods. Combining these approaches with monitoring the UV absorption and refractive index (RI) of the protein as it elutes from the SEC column can be used to calculate the molecular weight and detergent content of a protein sample; this enables its oligomeric state to be more accurately estimated than from UV absorbance measurements alone [38,39,40]. Such methods can be used, for example, to compare various orthologues of CFTR to identify one that is amenable to purification in large quantities.
Another possibility is the introduction of specific mutations to human CFTR to improve its solubility and stability in solution, and increase the likelihood of producing monodisperse samples after purification. This was used to great effect in determining structures of the NBD-1 of CFTR, as well as for several G-protein coupled receptors (GPCRs), hitherto intractable to crystallization [41,42,43,44]. Indeed, screening multiple orthologues of CFTR to identify residues that increase their intrinsic stability compared to human CFTR may be a productive approach to producing a human-like CFTR that can be used for structural studies. Such work has already begun by comparing the amino acid sequence of chicken CFTR, known to have high thermal stability, with that of human CFTR. Certain key residues have been identified that appear to contribute to the stability of chicken CFTR, yet are not conserved in humans [45].
In this study, we have expressed the salmon CFTR I isoform (Uniprot accession number Q9W750) in Saccharomyces cerevisiae in order to purify and characterise the isolated protein. Expression of CFTR in yeast was previously reported [46], but until recently has not been extensively pursued as a source of purified protein. CFTR is difficult to solubilise from non-mammalian (i.e. insect and yeast) cell membranes [24,46,47,48], and is sensitive to protease degradation. In order to overcome these difficulties we have used a protease-deficient yeast strain (FGY217) and codon-optimised the CFTR gene for expression in yeast, according to a strategy developed by Urbatsch and co-workers [49]. The construct also includes a poly-histidine tag for purification, and a small ubiquitin-like modifier (SUMO) tag and a green fluorescent protein (GFP) tag to promote trafficking and to allow detection of the CFTR. A new protocol for preparation of recombinant full-length CFTR in Saccharomyces cerevisiae now allows purified CFTR to be obtained in milligram quantities [50,51]. It has enabled more extensive study of CFTR, and is a quantity sufficient to attempt structural analyses like SAXS, cryo-electron microscopy and 3D crystallisation. We have used a combination of biophysical and structural techniques to determine ab initio structures of salmon CFTR at low resolution, and to assess its suitability as a candidate for higher-resolution structural analysis.

Expression and purification of salmon CFTR
A vector carrying the GAL promoter, the codon-optimized Salmo salar CFTR I gene, N- and C-terminal polyHistidine tags, an N-terminal SUMO tag and a C-terminal green fluorescent protein tag was transfected into the FGY217 strain of S. cerevisiae and used for GAL-induced protein expression in 20 L batches [50]. Crude microsomal membranes were isolated using a glass bead mill as previously described [50]. CFTR was solubilised by LPG-14 and purified by Ni-affinity and gel permeation chromatography as described in Pollock et al. (2014) [51], with minor alterations. Namely, the solubilization buffer contained 20 mM Tris pH 8, 50 mM NaCl, 10% glycerol, 1% LPG-14 and 1 mM dithiothreitol (DTT), and solubilization time was increased to 2 h. The protocol for Ni-affinity chromatography was unchanged, but the buffer for SEC was altered to contain 0.04% (w/v) LPG-14. In preparative SEC, protein eluting from the Superose 6 10/300 column was monitored with 215, 254 and 280 nm UV detectors (ÄKTA purifier, GE Healthcare, Buckinghamshire, UK), and an in-line fluorescence detector (Shimadzu UK, Buckinghamshire, UK) at 485/512 nm (excitation/emission). Fractions of 0.5 mL were collected and analysed using a fluorescence spectrophotometer (Cary Eclipse, Agilent, Abingdon, Oxfordshire, UK) with a quartz cuvette to estimate the concentration of the GFP-CFTR fusion protein, and by SDS-PAGE to estimate purity of the eluted fractions. Absolute yield of protein was estimated by image analysis of SDS-PAGE gels (ImageJ, [52]) or using fluorescence spectrometry at 10 °C of the GFP-CFTR fusion protein. Fluorescence spectrometry is a rapid and non-destructive method for estimation of the GFP-CFTR concentration, but must be regarded as an estimate due to the sensitivity of the fluorescence signal to changes in buffer conditions and temperature.

Size-exclusion chromatography and multi-angle laser light scattering (SEC-MALLS)
CFTR purified as described above underwent further analysis by SEC coupled with multi-angle laser light scattering (MALLS) to estimate its molecular mass. Material eluting between 9.5 and 11 mL after preparative SEC was concentrated to 0.5 mL total volume (approximately 0.5 mg/mL) and re-injected into the Superose 6 column. Material eluting from the column was analysed by three detectors: UV (λ = 280 nm), refractive index (Optilab rEX, λ = 623 nm) and light scattering (Dawn Heleos, λ = 658 nm, scattering angles between 13 and 157.8°). The refractive index increment (dn/dc) value used for molecular mass estimation was 0.180 mL/g.

Bioinformatics
The ProtParam tool (http://web.expasy.org/protparam/) was used to estimate molecular weight and extinction coefficient of the polyHis-SUMO-sCFTR_I-GFP construct based on its amino acid sequence. The ClustalOmega tool (http://www.ebi.ac.uk/Tools/msa/clustalo, [53,54]) was used to align the amino acid sequences of sCFTR I and II (Uniprot accession numbers Q9W750, Q9IAR8) and the human, platypus and chicken CFTR (Uniprot accession numbers P13569, Q07DZ6 and A0M8U4 respectively) using default settings. ClustalOmega produced sequence alignments and calculated an identity matrix of the sequences.
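As an illustration of what an entry in such an identity matrix represents, the following minimal Python sketch computes percent identity for one pair of already-aligned sequences. The function name `percent_identity`, the toy sequences and the choice to count only columns without gaps are assumptions made for this example; the exact denominator used by ClustalOmega may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: gaps are written as '-' and gapped columns are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Toy aligned fragments (hypothetical, not taken from any CFTR sequence).
toy_a = "MKT-LLVA"
toy_b = "MKTALIVA"
print(round(percent_identity(toy_a, toy_b), 1))  # 6 of 7 ungapped columns match
```

Applying such a function to every pair of rows in a multiple alignment gives a symmetric matrix of the kind reported in Table 1.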

Electron microscopy
Carbon-coated EM grids were glow-discharged and sCFTR was fixed on the grid as previously described [51]. The grids were analysed using a Tecnai G2 Polara electron microscope (FEI, Hillsboro, OR, USA) with a 200 kV field emission gun. Images were recorded on a USC 4000 CCD camera (Gatan UK, Abingdon, UK) at 3.725 Å/pixel with spherical aberration of 2.0. A 48 pixel box was used to select 19500 particles, followed by calculation of a 3D model refined with C1 symmetry using the EMAN-2 data processing suite [55]. The resolution of the map was determined by an even-odd test (FSC = 0.5).
The map was displayed in Chimera 1.9 (UCSF, USA) at a threshold that corresponded with the molecular weight of a monomer of CFTR, based on an assumption of 1.21 Å³/Da [56]. The map was fitted with the Sav1866 inward-facing electron density map (PDB: 2HYD; [57]) using the automated "Fit in Map" routine.
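The relationship between molecular weight and displayed map volume can be sketched as follows. This is a minimal Python illustration, not the procedure implemented in Chimera: the function names are hypothetical, and the mass of 214.3 kDa (the predicted mass of the tagged construct) is used only as an example input, since the mass actually used to set the threshold is not stated here.

```python
def enclosed_volume_A3(mass_da: float, volume_per_da: float = 1.21) -> float:
    """Volume (in cubic angstroms) occupied by a protein of the given mass,
    assuming a constant partial specific volume of 1.21 A^3/Da."""
    return mass_da * volume_per_da


def voxels_for_mass(mass_da: float, pixel_size_A: float,
                    volume_per_da: float = 1.21) -> float:
    """Number of map voxels that should lie above the display threshold."""
    voxel_volume = pixel_size_A ** 3  # cubic voxel, edge = pixel size
    return enclosed_volume_A3(mass_da, volume_per_da) / voxel_volume


# Example input (assumption): 214.3 kDa construct, 3.725 A/pixel sampling.
mass_da = 214_300.0
print(enclosed_volume_A3(mass_da))       # target enclosed volume in A^3
print(voxels_for_mass(mass_da, 3.725))   # corresponding voxel count
```

In practice the threshold is lowered or raised until the number of voxels above it matches this count.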

Sample preparation for SAXS
Prior to SAXS analysis, samples were dialysed for 30 minutes against 10 mM phosphate and 154 mM NaCl at pH 7.4 to minimize background detergent signal. Protein concentration was determined after dialysis by UV/Vis spectrophotometry using the extinction coefficient calculated from the amino acid sequence. Dithiothreitol (5 mM) was added as a free radical scavenger to minimize radiation damage to the sample.

SAXS data collection and processing
Small-angle X-ray scattering spectra of 0.7 mg/mL to 1.25 mg/mL of the purified sCFTR construct were collected at the BL-11 beam line of the ALBA Synchrotron Light Facility (Barcelona, Spain) connected to an in-vacuum undulator with a period of 5.5 mm. Scattered radiation was recorded on a two-dimensional CCD detector. The sample-detector distance of 2.39 m covered the range of momentum transfer 0.03 < q < 4.5 nm⁻¹ (q = 4π sin(θ)/λ, where 2θ is the scattering angle and λ = 0.128 nm is the X-ray wavelength; the optical path of the X-ray through the sample is about 3 mm). Data were collected from the sample kept at 7 °C. For each sample, 40 spectra of 10 s each were recorded, corresponding to a total of 6.7 min of data acquisition. CCD camera images were radially integrated using Fit2D [58]. The comparison of the successive exposures of an acquisition experiment indicated no changes in the scattering patterns (i.e. no measurable radiation damage to the protein samples). Data were normalized to the intensity of the transmitted beam, and the scattering data from the background buffer (identical to that of the sample, but lacking protein), measured before and after each corresponding sample measurement, were averaged and used to subtract the background.
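The definition of the momentum transfer given above can be written as a short Python sketch that converts between the scattering angle 2θ and q at the stated wavelength. The function names are chosen for this example only.

```python
import math


def q_from_two_theta(two_theta_deg: float, wavelength_nm: float = 0.128) -> float:
    """Momentum transfer q (nm^-1) from the full scattering angle 2-theta (degrees),
    using q = 4*pi*sin(theta)/lambda."""
    theta = math.radians(two_theta_deg) / 2.0
    return 4.0 * math.pi * math.sin(theta) / wavelength_nm


def two_theta_from_q(q_inv_nm: float, wavelength_nm: float = 0.128) -> float:
    """Inverse relation: full scattering angle 2-theta (degrees) for a given q."""
    return 2.0 * math.degrees(math.asin(q_inv_nm * wavelength_nm / (4.0 * math.pi)))


# Scattering angles corresponding to the limits of the reported q range.
for q in (0.03, 4.5):
    print(q, two_theta_from_q(q))
```

The same relation underlies the radial integration step, in which each detector pixel is assigned a q value from its angular position.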
The low-angle region of the SAXS spectra (q < 0.10 nm⁻¹) showed a rapid increase in intensity as q decreased (data not shown), probably due to some aggregation of the protein in the sample. These data were removed prior to analysis. The minimal accepted q was identified by fitting the initial part of the spectra (0.06 nm⁻¹ ≤ q < 0.1 nm⁻¹) with I(q) = a exp(-q²) + B, an empirical function that accounts for the scattering produced by high molecular mass material [59], where B was fixed to the parameter of the fit of the spectra in the high-angle region (q ≥ 2.6 nm⁻¹), where I(q) adopts the asymptotic form q⁴I(q) → A + Bq⁴ [59,60]. We consider that the contribution of this initial part of the spectra becomes irrelevant when it is approximately constant (i.e. for q < 0.26 nm⁻¹). Acquired spectra of sCFTR solutions were reproducible across the range of concentrations measured. Consequently we averaged all data, normalised to the highest concentration (1.25 mg/mL).
We processed the data using standard procedures for ATSAS programs [61]. The forward scattering I(0) and the radius of gyration Rg were computed using the Guinier approximation for qRg < 1.3 [62,63]. The distance distribution function P(r) was calculated using the indirect Fourier transform method implemented in the program GNOM [64]. It was limited to q ≤ 2.6 nm⁻¹, to avoid the high noise found for larger q. P(r) represents the probability of finding a point within the observed particle at a distance, r, from a defined point of reference. Sample molecular mass was estimated by comparing the extrapolated forward scattering I(0) to that of a reference solution of bovine serum albumin.
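The Guinier analysis and the mass estimate relative to a reference protein can be illustrated with a minimal Python sketch. It is not the ATSAS implementation: it fits a straight line to ln I against the squared momentum transfer (q, written s in the ATSAS convention), using ln I(q) = ln I(0) - (Rg² q²)/3, on synthetic data. The function names, the synthetic values and the concentration-normalised mass formula are assumptions for illustration.

```python
import math


def guinier_fit(q, intensity):
    """Least-squares line through (q^2, ln I); returns (I0, Rg).

    Valid only for points inside the Guinier region (q * Rg < 1.3).
    """
    x = [qi * qi for qi in q]
    y = [math.log(ii) for ii in intensity]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx                      # equals -Rg^2 / 3
    intercept = mean_y - slope * mean_x    # equals ln I(0)
    if slope >= 0:
        raise ValueError("non-negative slope: data are not Guinier-like")
    return math.exp(intercept), math.sqrt(-3.0 * slope)


def mass_from_forward_scattering(i0, conc, i0_ref, conc_ref, mass_ref_kda):
    """Mass estimate from concentration-normalised I(0) relative to a reference
    protein (assumes equal contrast and partial specific volume)."""
    return mass_ref_kda * (i0 / conc) / (i0_ref / conc_ref)


# Synthetic, noise-free data (hypothetical values, not the measured spectra).
rg_true, i0_true = 3.0, 10.0                 # nm, arbitrary units
q = [0.27 + 0.01 * k for k in range(15)]     # nm^-1; largest q * Rg = 1.23
intensity = [i0_true * math.exp(-(rg_true * qi) ** 2 / 3.0) for qi in q]

i0, rg = guinier_fit(q, intensity)
print(i0, rg)                  # recovers the input I(0) and Rg
print(max(q) * rg < 1.3)       # check that the fit stayed in the Guinier region
```

With real data the fitting window is adjusted until the upper limit satisfies the qRg < 1.3 criterion for the fitted Rg.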

Purity and yield of sCFTR
The fusion protein of polyHis-SUMO-sCFTR_I-GFP solubilized in 1% LPG-14 bound well to the Ni-affinity column, with little appearing in the flow-through and early wash fractions (Figure 1). The sCFTR was eluted from the Ni-affinity column with buffer including 250 mM imidazole in four 2 mL fractions. The fractions were analysed by SDS-PAGE and Coomassie staining followed by image analysis to estimate the yield (Table 2) and purity (Figure 1) after Ni-affinity chromatography. From its amino acid sequence, the predicted molecular weight of the polyHis-SUMO-sCFTR_I-GFP protein was 214.3 kDa, though the fluorescent band corresponding to the GFP-CFTR fusion protein migrated close to the 250 kDa marker. Purification in LPG-14 yielded approximately 150 μg sCFTR per litre of S. cerevisiae culture at 50-80% purity.
The fractions with the highest fluorescence from the GFP tag (Figure 1, lanes 7 and 8) were pooled and concentrated to 0.5 mg/mL for preparative SEC (Figure 2). The UV absorbance and fluorescence traces of the preparative SEC experiment show that the CFTR isolated by Ni-affinity purification was not monodisperse (Figure 2A). Preparative SEC removed low molecular weight contaminants, higher molecular weight aggregates and proteolytic fragments of CFTR. At least 50% of the CFTR in the sample eluted in the void volume of the column, suggesting that it consisted of high molecular weight aggregates of the protein. This reduced the yield of pure sCFTR to approximately 50 μg/L culture (Figure 2B, Table 2). Even after SEC purification, some SDS-resistant aggregates of CFTR remained (Figure 2B, lower panel). However, SEC was sufficient to remove most of the contaminating proteins, leaving the CFTR at > 90% purity (Figure 2B, upper panel).

Structure and oligomeric state of sCFTR
Several methods were used to estimate the oligomeric state of CFTR. Cryo-electron microscopy allowed direct visualization of the protein particles (Figure 3A), which were evenly dispersed across the grids and showed no sign of polydispersity. The electron density maps generated from cryo-EM gave a resolution of 32 Å, and were fitted with a dimer of Sav1866, a bacterial half-ABC transporter, whose crystal structure has been determined at high resolution (PDB: 2HYD; [65]) (Figure 6). Since CFTR is a full-length ABC protein, this suggested that sCFTR purified in LPG-14 was present in a monomeric state.
These data were confirmed by SAXS analysis. The spectrum in Figure 7A, showing the scattering intensity in reciprocal space (I(q) versus q), corresponds to the average of spectra of sCFTR. As evident in the figure, the SAXS plot was of high quality with good experimental statistics up to q ≈ 2.6 nm⁻¹. It showed a marked minimum at q < 1 nm⁻¹. The extrapolation of the sCFTR scattered intensity to q = 0, I(0) (see inset in Figure 7A), estimated from the Guinier plot of the data, yielded an effective molecular mass of 205.7 kDa. This is compatible with the molecular mass value inferred from the primary amino acid sequence of the protein (214.3 kDa). This confirmed that CFTR remained monomeric in solution under the experimental conditions used for carrying out the SAXS experiments, and that most of the detergent was removed by the dialysis step before the experiments. The radius of gyration (Rg) of sCFTR calculated from the Guinier plot was 3.12 ± 0.12 nm. This was supported by the estimate of Rg = 2.85 nm obtained from the distance distribution function, P(r), shown in Figure 7B. The shape of the P(r), with a maximum distance Dmax = 7.41 nm, is characterised by two maxima, consistent with a bilobular-elongated body [66].
By contrast, initial SEC-MALLS analysis of salmon CFTR suggested that two oligomeric states of the protein were present in the Ni-affinity and SEC-purified material. The molecular weight estimate for the material eluting from the Superose 6 column at 10 mL was 550 kDa, which would correspond to two molecules of CFTR and an associated detergent micelle (Figure 3B, blue lines). A second peak with a retention volume of 12 mL (Figure 2A) had an estimated molecular weight of 250 kDa, which would correspond to a monomer of CFTR with associated detergent. However, reinjection of the fractions eluting at 10-11 mL from the Superose 6 column resulted in a different molecular weight estimate (290 kDa) for material eluting at the same retention volume (10-11 mL) after the second SEC experiment (Figure 2B, red lines). In both SEC experiments, the micelle molecular weight of LPG-14 was measured as 25 kDa (Figure 3, black trace). Thus the SEC-MALLS estimate of 290 kDa for the CFTR-LPG-14 protein-detergent complex is still a little higher than expected.
From this we conclude that the incomplete resolution of monomeric CFTR from the void volume of the column led to an overestimate of its molecular mass. The UV absorbance trace indicated that in the final SEC-MALLS experiment (Figure 2B, blue) very little material eluted in the void volume, but due to its large size, it gave a large signal in the light scattering trace (Figure 3B, blue dashed trace). Considering the cryo-EM, SAXS and SEC-MALLS data together, it seems likely that CFTR in LPG-14 was monomeric.
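The mass comparison underlying this interpretation can be made explicit with a short Python sketch. It assumes, for illustration only, that the protein-detergent complex mass is the simple sum of the predicted construct mass (214.3 kDa) and one measured LPG-14 micelle (25 kDa); the function name is hypothetical.

```python
def expected_complex_mass_kda(protein_kda: float, micelle_kda: float,
                              copies: int = 1) -> float:
    """Expected protein-detergent complex mass, assuming additive masses
    and a single detergent micelle per complex."""
    return copies * protein_kda + micelle_kda


protein_kda = 214.3    # predicted from the amino acid sequence
micelle_kda = 25.0     # LPG-14 micelle measured by SEC-MALLS
measured_kda = 290.0   # SEC-MALLS estimate after re-injection

expected = expected_complex_mass_kda(protein_kda, micelle_kda)
print(round(expected, 1))                  # close to 240 kDa for a monomer
print(round(measured_kda - expected, 1))   # excess attributed to residual aggregate
```

Under these assumptions the measured value lies much closer to the monomeric expectation than to that of a dimer, which would require two copies of the protein in the complex.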

Comparison of salmon CFTR with other orthologues
The primary amino acid sequences of the two salmon isoforms were aligned, and shown to have 95.3% identity (Table 1). Salmon CFTR I was also aligned to the CFTR sequences of three evolutionarily diverse species: human, platypus and chicken (Table 1). These sequences shared between 59.9 and 83.6% identity.
The salmon orthologue had several insertions between positions 700 and 950 that were not shared by the other orthologues (Figure 4). Two of these insertions (positions 691-711, and 741-747) are in the regulatory (R) region of the protein (Figure 4, R-region indicated by blue line). Both isoforms of sCFTR have the same number of amino acids in the R-region, with a local sequence identity of 88.7% (compared to 95.3% identity overall), which indicates that the R-region is one area in which the two sCFTR sequences have diverged (Table 1, italicized brackets). The sequence identity in the R-region between salmon and human CFTR was no greater than 45.5%, compared to their overall sequence identity of at least 60%. The additional insertion in salmon CFTR is in the predicted transmembrane helix 7-8 linker (approximately at positions 922-931, though residue numbers are difficult to define as this region has low identity across the four orthologues).
Fluorescence-SEC profiles revealed that samples of sCFTR typically contained a larger proportion of aggregated material (Figure 5) in the first SEC experiment than the purified human, chicken and platypus orthologues of CFTR. This observation was reproduced across at least 3 separate purifications.

Discussion
This description of the purification and characterization of salmon CFTR I is part of a longer-term goal to produce CFTR for 3D crystallization and high-resolution structure determination. The introduction of the yeast expression system for this protein has been an important step towards this goal due to the high yield of CFTR from the S. cerevisiae cells. Previous studies have indicated that non-human orthologues of many proteins may be more amenable to purification and structural analysis, yet have high enough homology to provide useful information on the structure and function of their human counterparts (e.g. P-glycoprotein structures). Therefore we have chosen to study a range of vertebrate CFTR orthologues with the aim of identifying a CFTR molecule that can be purified to monodispersity with sufficient yield for extensive 3D crystallization screening, as well as other structural and biophysical analyses.
Salmon CFTR expressed in S. cerevisiae and solubilized in LPG-14 was insufficiently pure for structural analysis after Ni-affinity chromatography. Indeed, samples of sCFTR were less pure than chicken, human and platypus orthologues of CFTR isolated by the same method ([51], and unpublished data). Considering that the proteins were handled in the same way in each case (i.e. expression and solubilization conditions, purification tags and protocol), this was a surprising result, and indicates that lower purity may be a fundamental property of the salmon orthologue.
The first size-exclusion chromatography (SEC) experiment removed protein contaminants from samples of salmon CFTR (Figure 2B, upper panel) and improved the purity of the sample, but did not fully resolve monomeric CFTR from its larger aggregates (Figure 2A and B). Once again, in a comparison between salmon CFTR and other orthologues expressed and purified in the same way, sCFTR appeared to be more polydisperse (Figure 5). As a result, a lower proportion of the sCFTR was recovered after SEC, compared to samples of human, chicken and platypus CFTR.
SAXS analysis undertaken after Ni-purification and one SEC purification revealed a fraction of aggregated protein that produced a large increase of scattering intensity at very low q. After removal of this part of the spectrum, SAXS data were compatible with a monodisperse monomeric protein.
Micrographs from cryo-EM of sCFTR purified in the same way were not dominated by aggregated material, and were used to reconstruct a 3D map of the protein (Figure 6). Analysis of the distance distribution of CFTR from SAXS data indicated that the molecule consisted of two lobules connected by a well-folded linker of approximately 30 Å (Figure 7B). This corresponded well with the overall shape and size observed in the cryo-EM-based electron density map. Therefore, the sCFTR used for structural analysis was not fully monodisperse, but the samples were nonetheless useful for low-resolution structural analysis, and produced consistent data both by cryo-electron microscopy and by SAXS.
Further SEC analysis with light scattering confirmed that small quantities of aggregated material remained in the sample during two further SEC purifications. The molecular weight of the detergent micelle was estimated at 25 kDa (Figure 3B, black trace), which was slightly lower than previous reports of 30-44 kDa [65,66]. Hence we would expect a molecular weight for the protein-detergent complex (PDC) of 240 kDa. The difference between the expected (240 kDa) and the measured molecular weight (290 kDa: Figure 3B, red trace) may be explained by the persistence of aggregated material in the sample, even after three SEC experiments. Nonetheless, the molecular weight estimate from the final monodisperse sample confirmed that the majority of sCFTR purified in LPG-14 was monomeric. It is notable that after purification in dodecylmaltoside (DDM), CFTR dimers have been observed by EM [51], suggesting that LPG-14 and DDM interact with CFTR in different ways. Though LPG-14 has enabled the purification of CFTR in this and other studies [24,51], it cannot be overlooked that it is a relatively harsh detergent, and may not be optimal for protein stability. Chicken CFTR purified in LPG-14 has lower ATPase activity both before and after reconstitution into phospholipids than the same protein purified using DDM [51]. There is also evidence that even at low concentrations (i.e. 0.5% w/v) LPG-14 destabilises the isolated NBD1 of human CFTR purified from a mammalian cell line [69]. However, purification of human and chicken CFTR in DDM does not produce pure protein, even after SEC. Thus the use of LPG-14 for purification of full-length CFTR is currently necessary in order to produce monodisperse samples, and we have tried to mitigate its destabilizing effect by minimising the detergent concentration at each step of the purification. To better overcome these difficulties, it may be necessary to cast the net wider to identify useful detergents, to add lipids, or to adopt new methods for protein extraction such as the use of non-detergent alternatives [70].
Overall, the purification and subsequent SEC experiments did yield highly pure, monodisperse sCFTR, which was used for structural characterisation. However, the high proportion of aggregated CFTR in the samples after Ni-affinity chromatography indicated that salmon CFTR may not be an orthologue that is suited to higher resolution structural studies due to the dwindling yield of protein when it is purified to monodispersity.
Despite its unsuitability for high-resolution structural studies, salmon CFTR provides an interesting source of comparison to human CFTR and chicken CFTR that have previously been characterized in this expression system ([51], and unpublished data). Salmon are poikilothermic (they do not self-regulate their temperature) and control their temperature by moving to warmer or colder water [71]. Thus it is possible that sCFTR is functional at a range of temperatures, in contrast with mammalian orthologues of CFTR that have evolved to exist at a narrower range of temperatures [72]. Indeed it has been demonstrated that osmoregulation, a process that involves CFTR, is efficient in salmon between 4.4 and 14.4 ℃ [73], more than 20 ℃ lower than the internal temperatures of humans or chickens.
Several specific proline residues (equivalent to positions 422, 434, 492 in the hCFTR sequence) have been identified in dynamic positions in the thermally stable chicken orthologue of CFTR. When engineered into the hCFTR-F508del sequence, along with the suppressor mutation I539T, these mutations partially correct the F508del defect [45]. The F508del mutation leads to folding defects in CFTR, and hCFTR-F508del has lower thermal stability than the wild-type protein. Thus, it can be concluded that the I539T polymorphism, along with the proline residues at 422, 434 and 492 in the sequence of CFTR, increases the intrinsic stability of the protein. The sequence alignment of chicken, human and salmon CFTR showed that sCFTR has proline residues in 2 of the 3 positions mentioned, as well as the threonine at position 539 (Figure 4). This suggests that the stabilizing effects of these polymorphisms are limited, since SEC analysis of sCFTR I indicates that it has lower intrinsic stability than the human CFTR, which lacks these threonine and proline residues, having serine and isoleucine instead.
Notwithstanding its unsuitability for high-resolution structural studies, salmon CFTR is an interesting subject of study, not least because commercial farming of salmon (and related species) is economically important worldwide. CFTR is expressed in the peripheral organs of fish (i.e. shark rectal gland, salmon gills) and is known to play a role in chloride secretion and osmoregulation [9,74,75,76]. A noteworthy feature of salmon is the existence of two isoforms of CFTR in its genome [11]. Analysis of mRNA transcripts has indicated that the amount and proportion of the two isoforms vary through the life cycle of the salmon, with particular reference to its adaptation from a freshwater to a marine environment.
The sequence alignments of the two sCFTR protein isoforms with human CFTR revealed that they had 95.3% identity to each other and 60.0% and 60.8% identity to hCFTR. The sequences of these three proteins diverged in the R-region: in both sCFTR proteins this region was 30 residues longer than the equivalent region in hCFTR (and in chicken and platypus CFTR). The R-region of CFTR is typically disordered, and in sCFTR it is longer than in the other sequences considered here. It is possible that a larger disordered region could contribute to the greater tendency of salmon CFTR to self-aggregate when compared to the other orthologues (Figure 4). A detailed exploration of the thermal stability of sCFTR could be used to test the hypothesis concerning its lower intrinsic stability.
In addition, the sequence identity between the two salmon CFTR isoforms in the R-region is lower than the sequence identity over the full length of the CFTR molecules. Phosphorylation of serine and threonine residues in the R-region regulates channel opening [77], so divergence in this region could hint at a unique mechanism for the regulation of each salmon CFTR isoform. Characterization of the channel activity and regulation of each isoform might provide insight into the role of each sCFTR isoform in osmoregulation. For example, phosphorylation of the R-region of purified mouse CFTR has been detected by mass spectrometry [76]. Similarly, purified sCFTR I could be a valuable tool for direct assessment of the function and regulation of sCFTR.
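The percent-identity comparisons discussed above (full length versus R-region) can be illustrated with a minimal Python sketch. It assumes two sequences that have already been aligned (for example, rows taken from a ClustalOmega alignment) and counts identical residues over the columns where neither sequence has a gap. This is only one of several conventions for the denominator, and it is not necessarily the one used to obtain the values reported here. The function name `percent_identity` and the toy sequences are hypothetical and are not taken from the CFTR alignment.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumption: both inputs are rows of the same alignment, so they have
    equal length and column i of one corresponds to column i of the other.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns without a gap in either sequence
    matches = 0   # compared columns with identical residues
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned sequences (made up for illustration only).
seq_a = "MKT-LLPAS"
seq_b = "MRTALL-AS"

full_length = percent_identity(seq_a, seq_b)
# Identity restricted to a sub-region: slice the same alignment columns
# from both rows (here columns 1 to 5, an arbitrary choice).
sub_region = percent_identity(seq_a[1:6], seq_b[1:6])
print(f"full length: {full_length:.1f}%  sub-region: {sub_region:.1f}%")
```

Slicing the same alignment columns from both rows, as in the last lines, is one simple way to compare identity within a region such as the R-region against identity over the full length.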

Conclusions
From the data presented here we conclude that salmon CFTR performs less well in purification and likely has lower intrinsic stability than other orthologues of the protein. It is possible to isolate moderate quantities of salmon CFTR I with low polydispersity, which has enabled investigation of its structure at low resolution. However, due to the difficulty in isolating large quantities of monodisperse protein, it is unlikely to be a useful candidate for high-resolution structural studies as a proxy for human CFTR. Nonetheless, salmon CFTR I has several features that hint at a unique mechanism for regulation of its activity. Since salmon cannot regulate their body temperature, and fluctuations in ambient temperature are known to affect the ability of salmon to adapt to saltwater, developing a better understanding of the function and regulation of CFTR in fish may be useful in the context of rising ocean temperatures and intensive farming of salmon. The protocol presented here provides a useful tool for further detailed investigation of the function and regulation of salmon CFTR.

Figure 1. Purification of salmon CFTR. Yeast microsomes containing salmon CFTR (isoform I) were solubilised with 1% LPG-14 and purified by nickel-affinity chromatography in CFTR buffer with 0.1% LPG-14. Lanes are as follows: (1) microsomes, (2) soluble fraction, (3) insoluble fraction, (4) flow-through, (5) 20 mM imidazole wash, (6-9) 250 mM imidazole elution fractions. The upper panel shows the gel after Coomassie staining and the lower panel shows the gel after detection of the GFP fluorescence. The letters A and M indicate aggregated and monomeric protein respectively.

Figure 2. Size-exclusion chromatography of salmon CFTR. (A) Ni-affinity purified CFTR underwent size-exclusion chromatography with detection by UV absorbance at 280 nm (solid blue trace) and 215 nm (dashed blue trace), and by fluorescence at 485/512 nm (green trace). The retention volumes of aggregated and monomeric CFTR are indicated with arrows (A and M). (B) Fractions were collected in 0.5 mL aliquots and analysed by SDS-PAGE (20 μL sample per lane) followed by Coomassie staining (upper panel) and fluorescence detection of the GFP-CFTR fusion protein (lower panel).

Figure 3. Oligomeric state of salmon CFTR assessed by (A) cryo-electron microscopy and (B) SEC-MALLS. (A) A micrograph of pure sCFTR. Protein particles are black against a grey background, as indicated by red arrows. (B) sCFTR purified by Ni-affinity and preparative SEC underwent two consecutive SEC purifications (blue and red respectively), with UV (solid lines) and light scattering (dashed lines) detectors used to analyse the eluent. The molecular mass of material eluting from the column was estimated from the refractive index detector and is shown as a broken dotted line, with blue and red showing molecular mass estimates for the consecutive analyses of CFTR and black indicating the mass of the detergent micelles. The retention volumes of aggregated and monomeric CFTR are indicated with arrows (A and M).

Figure 4. Sequence alignment of CFTR orthologues. The amino acid sequences of chicken, salmon I, platypus, and human CFTR were aligned using ClustalOmega. This excerpt of 300 amino acids is shown to highlight the conservation of three proline residues and one threonine residue (boxed in red) across the species, and the unique sequences in the R-region (indicated with blue arrows) of salmon CFTR.

Figure 5. Fluorescence SEC of CFTR in LPG-14. Human (dotted), platypus (dashed) and salmon (solid line) CFTR were purified according to the same protocol and analysed by F-SEC. Fluorescence signal (RFU) was normalised to 100 for each trace.

Figure 6. Structure of salmon CFTR from cryo-electron microscopy. An electron density map for sCFTR at a resolution of 32 Å was generated by cryo-EM and image processing with EMAN-2. The map was displayed in Chimera 1.9 at a threshold corresponding to a molecular weight of 215 kDa. The map was fitted with the high-resolution structure of a bacterial ABC transporter (Sav1866, PDB accession code: 2HYD). Three views of the map are shown, and black arrows indicate how each view is related to the last.

Figure 7. SAXS data from salmon CFTR in solution. (A) Data represent the scattering intensity (I) as a function of the momentum transfer, q. The solid line represents the reciprocal-space fit of the scattering computed from the P(r) function shown in (B). The inset is the Guinier plot [log I(q) versus q²], where the solid line is the extrapolation of data to q = 0, for qRg < 1.3, by the Guinier approximation. (B) The distance distribution function, P(r), calculated from the SAXS experimental data.
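
The Guinier analysis referred to in the caption of Figure 7 rests on the approximation ln I(q) ≈ ln I(0) − (Rg² q²)/3, which holds at low angles (here, qRg < 1.3). A minimal Python sketch of such a fit is given below. It illustrates the general technique only and is not the processing pipeline used for the data in this work. The function names (`linear_fit`, `guinier_fit`), the starting window of 10 points and the synthetic test curve are assumptions introduced for illustration.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_fit(q, intensity, qrg_max=1.3, n_start=10, max_iter=50):
    """Estimate Rg and I(0) from ln I(q) = ln I(0) - (Rg^2 / 3) * q^2.

    Assumptions: q is sorted in ascending order, intensities are positive,
    and the fit window (the first n points) is adjusted iteratively until
    it is consistent with q * Rg < qrg_max or max_iter is reached.
    """
    n = min(n_start, len(q))
    for _ in range(max_iter):
        x = [qi ** 2 for qi in q[:n]]
        y = [math.log(ii) for ii in intensity[:n]]
        slope, intercept = linear_fit(x, y)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; Rg is undefined")
        rg = math.sqrt(-3.0 * slope)
        # Keep at least 3 points so the line fit stays meaningful.
        n_new = max(3, sum(1 for qi in q if qi * rg < qrg_max))
        if n_new == n:
            break
        n = n_new
    return rg, math.exp(intercept), n


# Synthetic, noise-free Guinier curve (illustrative values, not measured data).
true_rg, true_i0 = 45.0, 1000.0
q_values = [0.002 + 0.0005 * i for i in range(200)]
i_values = [true_i0 * math.exp(-(qi * true_rg) ** 2 / 3.0) for qi in q_values]

rg, i0, n_points = guinier_fit(q_values, i_values)
print(f"Rg = {rg:.2f}, I(0) = {i0:.1f}, points used = {n_points}")
```

With experimental data, curvature of the Guinier plot at the lowest angles would indicate aggregation, so the fit window should be inspected rather than accepted automatically.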