Atomistic Modeling of Scattering Curves for Human IgG1/4 Reveals New Structure-Function Insights

Small angle x-ray and neutron scattering are techniques that give solution structures for large macromolecules. The creation of physically realistic atomistic models from known high-resolution structures to determine joint x-ray and neutron scattering best-fit structures offers what is, to our knowledge, a new method that significantly enhances the utility of scattering. To validate this approach, we determined scattering curves for two human antibody subclasses, immunoglobulin G (IgG) 1 and IgG4, on five different x-ray and neutron instruments to show that these were reproducible, and then modeled these by Monte Carlo simulations. The two antibodies have different hinge lengths that connect their antigen-binding Fab and effector-binding Fc regions. Starting from 231,492 and 190,437 acceptable conformations for IgG1 and IgG4, respectively, joint x-ray and neutron scattering curve fits gave low goodness-of-fit R factors for 28 IgG1 and 2748 IgG4 structures that satisfied the disulphide connectivity in their hinges. These joint best-fit structures showed that the best-fit IgG1 models had a greater separation between the centers of their Fab regions than those for IgG4, in agreement with their hinge lengths of 15 and 12 residues, respectively. The resulting asymmetric IgG1 solution structures resembled its crystal structure. Both symmetric and asymmetric solution structures were determined for IgG4. Docking simulations with our best-fit IgG4 structures showed greater steric clashes with its receptor, which explains its weaker FcγRI receptor binding compared to our best-fit IgG1 structures, which showed fewer clashes and stronger receptor binding. Compared to earlier approaches for fitting molecular antibody structures by solution scattering, we conclude that this joint fit approach based on x-ray and neutron scattering data, combined with Monte Carlo simulations, significantly improved our understanding of antibody solution structures. The atomistic nature of the output extended our understanding of known functional differences in Fc receptor binding between IgG1 and IgG4.


INTRODUCTION
Antibodies are glycoproteins that protect the host by identifying and neutralizing pathogens. They mediate highly specific antigen binding to a specific epitope through their two Fab regions, followed by effector binding to other components of the immune system through their Fc region (Fig. 1). Immunoglobulin G (IgG) is the most abundant of the five human antibody classes. Of the four IgG subclasses IgG1-IgG4, IgG1 is the most prevalent in serum, and IgG4 is the least. In IgG, the heavy and light chains are paired to form two Fab regions that are joined by two polypeptide hinges linked by interchain disulphide bonds to the Fc region that is formed from two heavy chains (Fig. 1). The variable domains (VH and VL) mediate antigen binding, whereas the Fc constant domains (CH2 and CH3) perform effector functions (1,2). The Fc region also possesses an N-linked glycosylation site at Asn297. IgG1-IgG4 exhibit over 90% sequence identity, differing primarily in their hinges and upper CH2 domains (3). The IgG1 and IgG4 hinges are of lengths 15 and 12 residues, respectively, whereas that of IgG2 is similar and that of IgG3 is much longer. The IgG1 and IgG4 hinges contain two interchain disulphide bonds, although these in IgG4 can interconnect differently to form hinge isomers (4). Consequently, IgG1 and IgG4 exhibit different effector functions in terms of receptor and complement binding, arising from conformational variations in their hinge region (Fig. 1 A). Flexibility in the hinge is also relevant for function, as exemplified by the stochastic walking of antibodies on repetitive antigenic epitopes such as on viral surfaces (5).
Antibodies are increasingly important in therapeutic applications (6,7). Over 294 monoclonal-antibody-based drugs have been approved or are in review with worldwide revenues of over $50 billion (8,9). Knowledge of the relationships between antibody sequence, structure, and function in physiological and manufacturing conditions is required for antibody engineering and therapeutics. This knowledge is limited by the very few crystal structures that are known for full-length antibodies, let alone their complexes with receptors; thus, there is great interest in determining their molecular structures, especially in solution. Three-dimensional antibody solution structures can be investigated using small angle x-ray scattering (SAXS) (10-12), small angle neutron scattering (SANS) (13), or both SAXS and SANS in conjunction with analytical ultracentrifugation (14-16). In SAXS or SANS, the scattering intensities result from the density contrast difference between the antibody and the solvent (17). SAXS and SANS monitor different aspects of the solution structure. For example, the tightly bound first hydration shell surrounding the protein is visible in SAXS but much less so in SANS if the protein is studied in heavy water buffer (18-20). Traditionally, SAXS and SANS studies are low-resolution structural methods that are unable to derive reliable unique structures. The realization that molecular structures can be used to fit scattering curves and that comparatively few molecular structures result in good scattering fits gives rise to a powerful approach for determining molecular structures. In this study, atomistic modeling fits of the scattering curves based on constraints from known crystal structures and protein sequences lead to the determination of molecular structures for the antibody solution structures.
This is advantageous compared to the use of small beads (or spheres) or surface envelopes to fit the scattering curves because the resulting atomistic structures clarify the molecular basis of antibody function in antigen or receptor binding. In our earlier antibody modeling, the SCT/SCTPL modeling package utilized separate crystal structures for the Fab and Fc regions that were combined with randomized hinge structures to create intact antibody structures that were fitted against SAXS-SANS data. The SCT/SCTPL package resulted in best-fit molecular structures, but it was limited by the creation of redundant models, sometimes with sterically overlapping regions, and was computationally expensive. SCT/SCTPL resulted in 14 full antibody solution structures available in Protein Data Bank (PDB) format (21), including human IgG1 and IgG4 (15,16) and rabbit IgG (14).
In the CCP-SAS project, SCT/SCTPL has been superseded by the SASSIE workflow on high-performance computing hardware to rapidly create larger numbers of protein conformations that are now physically correct (22,23). SCT/SCTPL and SASSIE represent two examples of the growth of atomistic modeling approaches to interpret scattering data (reviewed in (23)). First, in SASSIE, a full energy-minimized starting model was created. Then, Monte Carlo variations of the polypeptide main chain φ and ψ dihedral angles rapidly generated further models, with models showing steric overlaps being discarded at the point of generation. These were fitted against scattering data. SASSIE was used to determine a solution structure for human monoclonal IgG2 from SANS data (13). Here, to validate this method in greater detail, SASSIE was used to remodel our previous joint SAXS and SANS data sets for human monoclonal IgG1 and IgG4 (15,16) alongside new joint IgG1 and IgG4 scattering data from three other instruments to test their experimental reproducibility. Initial structures for full-length IgG1 and IgG4 were thus created using molecular dynamics. Monte Carlo simulations rapidly generated ~700,000 IgG1 and IgG4 trial models to explore the physically allowed conformational space around the hinge region. Joint filters based on disulphide bridge constraints and consistency with joint SAXS and SANS fits resulted in a limited number of new atomistic solution structures that reflected the different hinge lengths of IgG1 and IgG4. The resulting models confirmed in greater detail and rigor our previous SCT/SCTPL analyses of IgG1 and IgG4 (15,16). Docking studies of these best-fit full-length IgG structures with their receptors provided, to our knowledge, new insight into the different functional roles of IgG1 and IgG4; therefore, the use of SASSIE is able to enhance our understanding of antibody-receptor binding function.

MATERIALS AND METHODS

Source of IgG1 and IgG4 antibodies
The monoclonal IgG1 6a and IgG1 19a antibodies were used here as previously described (16). The monoclonal IgG4 B72.3 antibody in its Ser222 and Pro222 forms was also used as previously described, with the Ser222 form corresponding to the wild-type hinge and the Pro222 form having a stabilized IgG4 hinge structure that prevents Fab-arm exchange (15). For new data collection, both IgG1 and IgG4 antibodies were purified by size-exclusion chromatography using a Superose 6 10/300 column (GE Healthcare, Chicago, IL) to remove nonspecific aggregates immediately before the SAXS and SANS measurements. For x-ray work, all four antibodies were measured in phosphate-buffered saline (PBS)-137 (137 mM NaCl, 8.1 mM Na2HPO4, 2.7 mM KCl, 1.5 mM KH2PO4 (pH 7.4)), and in buffers termed PBS-50 and PBS-250 in which the 137 mM NaCl was replaced with 50 mM NaCl or 250 mM NaCl, respectively. For neutron work, the light water in PBS-137 was replaced by 100% heavy water by extensive dialysis into 100% heavy water immediately before SANS experiments. By this means, the H atoms in the hydration shell are replaced by D atoms, together with 90% of the protein-exchangeable H-atom content (18). Additional data collection utilized Hepes-137 buffer (10 mM Hepes, 137 mM NaCl, and 2 mM CaCl2 (pH 7.4)).

SAXS and SANS data for IgG1 and IgG4
Our previous SAXS data for IgG1 6a and 19a and IgG4 B72.3 in PBS-50, PBS-137, and PBS-250 buffers (15,16) were recorded in 16-bunch mode on instrument ID02 at the European Synchrotron Radiation Facility, Grenoble, France (24). This used a detector with a resolution of 512 × 512 pixels in sets of 10 time frames with exposure times of 0.1 or 0.2 s each. New SAXS data (experimental ID: MX1801) were acquired using the BioSAXS robot on instrument BM29 at the European Synchrotron Radiation Facility (25,26). Data collection utilized a CMOS hybrid pixel Pilatus 1M detector (Dectris, Baden, Switzerland) with a resolution of 981 × 1043 pixels (pixel size of 172 × 172 μm). Overall, IgG1 6a and IgG1 19a were studied at eight different concentrations between 0.04 and 0.33 mg/mL and 0.12-0.96 mg/mL, respectively. Sample volumes of 50 μL were used in the BioSAXS automatic sample changer. Each sample was moved continuously in the capillary during beam exposure to reduce radiation damage. Ten time frames, each of duration 0.1 s, were acquired, alongside online checks that confirmed the absence of radiation damage during data acquisition. After this, the 10 frames were averaged. The BsxCUBE graphical interface was used for control of the automatic sample changer, and the sample settings were loaded from the ISPyB interface (25,27,28).
Our previous SANS data for IgG1 6a and IgG4 B72.3 were obtained in PBS-137 in 100% 2H2O at 6, 20, and 37°C on instrument SANS2d at the ISIS pulsed neutron source, Rutherford Appleton Laboratory, Didcot, UK (29). New SANS data for IgG1 6a, IgG1 19a, and IgG4 B72.3 in the same buffer were obtained on instruments D11 (DOI: 10.5291/ILL-DATA.8-03-846) and D22 (DOI: 10.5291/ILL-DATA.8-03-832) at the Institut Laue-Langevin, Grenoble, France (30). For D11, samples were measured at sample-to-detector distances of 1.2 and 8 m, with 5.5 and 8 m collimation and a wavelength λ of 0.60 nm. For D22, the sample-to-detector and collimation distances were 5.6 m with λ of 0.60 nm. All samples were measured in rectangular Hellma cells with 2 mm thickness in a thermostatted sample rack set at 20°C. Data for IgG1 6a were collected at four concentrations of 0. 5

Scattering curve analyses of IgG
For macromolecules measured in high solute-solvent contrasts, the radius of gyration Rg is a measure of structural elongation if it is assumed that the internal inhomogeneity of scattering densities has no effect. This is well-approximated by x-ray measurements in physiological salt buffers or by neutron measurements in 100% 2H2O. Guinier analyses of the scattering curve I(Q) at low scattering vectors, Q (where Q = 4π sin θ/λ; 2θ is the scattering angle), give the Rg and the forward scattering at zero angle I(0) (31):

$$\ln I(Q) = \ln I(0) - \frac{R_g^2 Q^2}{3} \qquad (1)$$

This expression is valid in a Q·Rg range up to 1.5. If the structure is elongated, and represents shapes similar to long rods, the mean radius of gyration of the cross-sectional structure Rxs is obtained from the convolution of the scattering curve with a cross-sectional factor represented by Q (31,32):

$$\ln [I(Q)\,Q] = \ln [I(Q)\,Q]_{Q \to 0} - \frac{R_{xs}^2 Q^2}{2} \qquad (2)$$

The radius of gyration of the cross section is a monitor of the mean width of an elongated structure. The cross-sectional plot for antibodies exhibits two distinct linear regions, a steeper innermost one and a flatter outermost one (32). The two analyses are denoted as Rxs1 and Rxs2, respectively. The Rxs1 parameter monitors the mean width of the full antibody structure, whereas the Rxs2 parameter monitors the mean width of each of the individual Fab and Fc structures (Fig. 1). The Rg and Rxs analyses were performed using the SCT package (21). For IgG1, the Q ranges for Rg, Rxs1, and Rxs2 that gave linear fits were 0.15-0.28, 0.31-0.47, and 0.65-1.04 nm⁻¹, respectively (16). For IgG4, the same Q ranges were used for Rg and Rxs2, whereas Rxs1 was calculated from an adjusted Q range of 0.31-0.51 nm⁻¹ because of its slightly different shape (15).
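The Guinier analysis above amounts to a straight-line fit of ln I(Q) against Q² over a restricted Q range. The following is a minimal sketch of that fit, not the SCT implementation; the function name `guinier_fit` and the synthetic test curve are illustrative assumptions, and only the Q range of 0.15-0.28 nm⁻¹ is taken from the text.

```python
import math


def guinier_fit(q, intensity, q_min, q_max):
    """Least-squares line through ln I(Q) versus Q^2 within [q_min, q_max].

    Returns (rg, i0). The slope of the line is -Rg^2/3 and the intercept
    is ln I(0). This is a hypothetical helper, not part of SCT.
    """
    pts = [(qi * qi, math.log(ii)) for qi, ii in zip(q, intensity)
           if q_min <= qi <= q_max and ii > 0]
    if len(pts) < 2:
        raise ValueError("need at least two points in the fit range")
    n = len(pts)
    mean_x = sum(x for x, _ in pts) / n
    mean_y = sum(y for _, y in pts) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pts)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pts)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("non-negative Guinier slope; no Rg can be derived")
    return math.sqrt(-3.0 * slope), math.exp(intercept)


# Synthetic check (assumed values): an ideal Guinier curve with Rg = 5.15 nm.
q = [0.10 + 0.01 * k for k in range(30)]                  # nm^-1
i_q = [math.exp(-(5.15 ** 2) * qi ** 2 / 3.0) for qi in q]
rg, i0 = guinier_fit(q, i_q, 0.15, 0.28)
print(f"Rg = {rg:.2f} nm, I(0) = {i0:.3f}, max Q.Rg = {0.28 * rg:.2f}")
```

The same routine would serve for the cross-sectional analyses if ln[I(Q)Q] were fitted instead and the slope were interpreted as -Rxs²/2.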
Indirect Fourier transformation of the scattering data I(Q) in reciprocal space into the distance distribution function P(r) in real space was carried out using the program ScÅtter (http://www.bioisis.net/users/sign_in):

$$P(r) = \frac{1}{2\pi^2} \int_0^\infty I(Q)\, Qr \sin(Qr)\, dQ \qquad (3)$$

P(r) corresponds to the distribution of distances r between all volume elements. This yields the maximum dimension of the macromolecule L and its most commonly occurring distance vector M in real space, as well as an alternative calculation of the Rg value.

FIGURE 1
(A) Each heavy chain of IgG1 and IgG4 is formed from the variable and constant VH, CH1, CH2, and CH3 domains. Each light chain is formed from the VL and CL domains. The hinge sequences in the inset are shown in EU numbering. Two interchain disulphide bridges connect the two heavy chains. Two conserved N-glycosylation sites in the Fc region are at Asn297 (C). (B) A cartoon of the Fab and Fc regions. The distance between the centers of mass of the two Fab regions was denoted as d1. Those between the two Fab and Fc regions were denoted as d2 and d3. The antibody is shown arbitrarily as a two-fold symmetric structure with d2 = d3. In general, d2 and d3 are unequal. In the text, the smaller of the two values is denoted as min(d2, d3), and the larger of the two is denoted as max(d2, d3).

Generation of initial IgG structural models
Atomistic scattering modeling compares theoretical scattering curves calculated from protein crystal structural models with the experimental scattering curves. For this, the antibody amino-acid residues were numbered using standard EU numbering for IgG (33,34).
The previous SCT/SCTPL modeling of IgG1 (16) utilized the crystal structure of full-length human IgG1 b12 (PDB: 1HZH) (35) to create 20,000 symmetric and asymmetric randomized full-length models of IgG1. For these, either the seven-residue upper hinge 220-CDKTHTC-226 with Cys220 and Cys226 acting as tethers was randomized to make asymmetric IgG1 structures or the 19-residue upper, middle, and lower hinge 220-CDKTHTCPPCPAPELLGGP-238 was randomized to make both symmetric and asymmetric IgG1 structures. Sequence differences between IgG1 b12, IgG1 6a, and IgG1 19a were disregarded. For the previous SCT/SCTPL modeling of IgG4 (15), the starting model was constructed from crystal structures for Fab B72.3 and human IgG1 b12 (PDB: 1BBJ and 1HZH) (35,36). The asymmetric IgG4 models considered only the upper hinge 212-VESKYGPPC-220 with Val212 and Cys220 acting as tethers to create 10,000 randomized IgG4 models. Symmetric IgG4 models considered the upper, middle, and lower hinges as a 21-residue peptide 212-VESKYGPPCPSCPAPEFLGGP-232 to create another 10,000 IgG4 models.
For our SASSIE modeling, the starting IgG1 model also employed the crystal structure of full-length human IgG1 b12 (PDB: 1HZH) (35). Thirteen residues were not present in this 1HZH structure, specifically 132-SKSTSGG-138 in one Fab CH1 domain, 223-THT-225 in its associated core hinge, and 445-PGL-447 at one of the two C-termini in the Fc CH3 domain. The missing Fab region and hinge residues were reconstructed by duplicating the coordinates of residues 1-299 with those from the other complete heavy chain, superimposed on residue 229 of the starting heavy chain. The missing three C-terminal residues were modeled with backbone φ and ψ angles of 10° using the PyMOL build_seq script in the PyMOL Script Repository, Queen's University, Ontario, Canada (Schrödinger). All disulphide bonds were retained. For this complete IgG1 starting structure, force field parameterizations were generated and hydrogen atoms added using the glycan reader component of CHARMM-GUI (37,38) and the CHARMM36 force field (39-42). This IgG1 structure was energy-minimized for 1000 steps using the conjugate gradient method implemented in NAMD2 (43). Of the two disulphide bonds at Cys226 and Cys229 (Fig. 1 A), only that at Cys226-Cys226 was present in the IgG1 b12 crystal structure. The Cys226-Cys226 conformation was retained in the initial model, with the force field parameterizations being varied to incorporate or exclude it as required below. Sequence differences between IgG1 b12, IgG1 6a, and IgG1 19a were again disregarded. Two biantennary Gal2.GlcNAc2.Man3.GlcNAc2 glycans at Asn297 were retained (Fig. 1 A). The x-ray and neutron scattering length density of the glycans is only slightly higher than that of the protein; thus, the glycan and protein components were effectively indistinguishable by scattering (18).
For our SASSIE modeling, the starting IgG4 model was constructed from crystal structures for Fab B72.3 (PDB: 1BBJ) (36) and serum-derived Fc IgG4 (PDB: 4C55) (44). These have the same sequence as the IgG4 under study. The glycans at Asn297 (Fig. 1 A) were retained. The composition of the first glycan was Gal1.GlcNAc2.Man3.GlcNAc2; that for the second was a Man3.GlcNAc2.Fuc core with GlcNAc and Gal.GlcNAc branches. The 20-residue hinge 216-ESKYGPPCPSCPAPEFLGGP-238 and the missing C-terminal Fc residues 442-SLGK-445 were modeled using the PyMOL build_seq script. All the disulphide bonds within the crystal structures were retained. To complete this structure, force field parameterizations were generated and hydrogen atoms added, as for IgG1 above. Because neither the Cys226-Cys226 nor the Cys229-Cys229 hinge residues were positioned to form disulphide bonds in the initial structure, these bonds were not included in the force-field-parameterized model of IgG4. The initial model was energy-minimized as for IgG1.

Configurational sampling of IgG
The hinge conformations of the new starting IgG1 and IgG4 models were rapidly sampled using dihedral Monte Carlo simulations within SASSIE (22) while holding the above energy-minimized Fab and Fc regions fixed. The assigned variable hinge regions of the IgG1 and IgG4 models are listed below for each simulation, and the backbone dihedral angles in these regions were varied. A Metropolis sampling methodology was used to sample the energetically allowed dihedral angles, using only the dihedral component of the CHARMM potential (39) to determine the energy of each configuration. Sterically overlapping IgG structures were automatically discarded in SASSIE.
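The Metropolis sampling of backbone dihedral angles described above can be illustrated with a toy example. This is a sketch under stated assumptions, not SASSIE code: the energy uses a single CHARMM-style cosine term with placeholder values for `k`, `n`, and `delta_deg` (not CHARMM36 parameters), `kt` is an assumed thermal energy in the same arbitrary units, and the steric overlap check that SASSIE applies to each accepted structure is omitted.

```python
import math
import random


def dihedral_energy(angles_deg, k=1.0, n=3, delta_deg=0.0):
    """Sum of k * (1 + cos(n*phi - delta)) over all sampled dihedral angles.

    k, n and delta_deg are placeholder values chosen for illustration.
    """
    return sum(k * (1.0 + math.cos(math.radians(n * a - delta_deg)))
               for a in angles_deg)


def metropolis_dihedral(angles_deg, n_steps, max_step_deg=30.0, kt=0.6, seed=0):
    """Return the list of conformations accepted by the Metropolis criterion."""
    rng = random.Random(seed)
    current = list(angles_deg)
    e_cur = dihedral_energy(current)
    accepted = []
    for _ in range(n_steps):
        trial = list(current)
        i = rng.randrange(len(trial))            # pick one dihedral to rotate
        step = rng.uniform(-max_step_deg, max_step_deg)
        trial[i] = (trial[i] + step + 180.0) % 360.0 - 180.0  # wrap to [-180, 180)
        e_trial = dihedral_energy(trial)
        d_e = e_trial - e_cur
        # Downhill moves are always accepted; uphill moves with Boltzmann probability.
        if d_e <= 0.0 or rng.random() < math.exp(-d_e / kt):
            current, e_cur = trial, e_trial
            accepted.append(list(current))
    return accepted


# Ten hypothetical hinge dihedral angles as a starting conformation.
hinge = [-60.0, -45.0] * 5
confs = metropolis_dihedral(hinge, n_steps=2000)
print(f"accepted {len(confs)} of 2000 trial moves")
```

In the workflow described in the text, the Fab and Fc regions would be held fixed while only the assigned hinge residues are varied, and each accepted conformation would additionally be tested for atomic overlaps before being kept.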
For IgG1, three Monte Carlo simulations were performed to maximize the sampling of possible conformers with plausible hinge disulphide bonding. Of a total of 704,000 randomized conformations that were generated, 231,492 structural models were sterically acceptable (i.e., no atomic overlaps). The three simulations gave 27,158 models that contained both canonical disulphide bonds as follows:
VEPKSCDKTHTCPPCPAPELLGGP-238 was varied. This simulation sampled conformations with and without symmetry but resulted in a low sampling rate for conformers that showed viable disulphide bonding conformations.
Simulation 3: 404,000 models were generated, of which 26,836 (6.6%) were sterically acceptable. The full hinge was again varied, but with the additional constraint that the Cys226-Cys226 and Cys229-Cys229 α-carbon atoms had to be within 0.75 nm of each other to permit interchain disulphide bonding. This additional constraint removed many models.
For IgG4, two Monte Carlo simulations generated 700,000 IgG4 randomized conformations, from which 190,437 structural models were sterically acceptable. Simulation 1 for IgG1 was not performed for IgG4 because the Fab and Fc crystal structures used for the starting model did not contain a disulphide-bridged hinge structure. The simulations resulted in 46,979 models that contained both canonical disulphide bonds:
Simulation 2: As for IgG1, 300,000 trial models were generated by varying the full hinge 215-VESKYGPPCPSCPAPEFLGGP-238, of which 143,568 models (47.9%) were sterically acceptable. Only 110 models contained both canonical disulphide bonds.
Simulation 3: As for IgG1, 400,000 trial models were generated in which the Cys226-Cys226 and Cys229-Cys229 pairs involved in interchain disulphide bonding in IgG4 had to be within 0.75 nm of one another. From this, 46,869 models (11.7%) were acceptable. As for IgG1, this disulphide bridge constraint removed many models.
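The disulphide constraint used in Simulation 3 reduces to a distance test on two pairs of α-carbon atoms. The sketch below shows one way such a filter could be written; the dictionary layout, the chain labels "H1" and "H2", and the example coordinates are hypothetical, and only the 0.75 nm cutoff and the residue numbers 226 and 229 come from the text.

```python
import math

CA_CUTOFF_NM = 0.75          # maximum C-alpha separation, from the text
HINGE_CYSTEINES = (226, 229)  # EU numbering, from the text


def both_disulphides_possible(ca_coords, cutoff_nm=CA_CUTOFF_NM):
    """True if both Cys226-Cys226 and Cys229-Cys229 C-alpha pairs are in range.

    ca_coords maps (chain, residue) to (x, y, z) in nm. The chain labels
    "H1" and "H2" for the two heavy chains are an assumed convention.
    """
    for res in HINGE_CYSTEINES:
        d = math.dist(ca_coords[("H1", res)], ca_coords[("H2", res)])
        if d > cutoff_nm:
            return False
    return True


# Two made-up models: the first satisfies both constraints, the second does not.
model_a = {("H1", 226): (0.0, 0.0, 0.0), ("H2", 226): (0.5, 0.0, 0.0),
           ("H1", 229): (0.0, 1.0, 0.0), ("H2", 229): (0.6, 1.0, 0.3)}
model_b = {("H1", 226): (0.0, 0.0, 0.0), ("H2", 226): (0.5, 0.0, 0.0),
           ("H1", 229): (0.0, 1.0, 0.0), ("H2", 229): (1.2, 1.0, 0.0)}
kept = [m for m in (model_a, model_b) if both_disulphides_possible(m)]
print(f"{len(kept)} of 2 models pass the disulphide filter")

# Acceptance fractions quoted in the text for the constrained simulations.
print(f"IgG1 Simulation 3: {100 * 26836 / 404000:.1f}%")
print(f"IgG4 Simulation 3: {100 * 46869 / 400000:.1f}%")
```

Applying the test during generation (as in Simulation 3) or afterwards as a filter (as for Simulation 2) uses the same criterion; the difference lies only in how many trial structures are spent on conformers that fail it.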

Scattering curve calculations and analyses
Scattering curves for the acceptable IgG1 and IgG4 models were calculated using SCT (21). SCT is a coarse-grained method that converts the atomistic models into small sphere models for the Debye calculation of the theoretical scattering curves I(Q) (45). For comparison with the SANS data in heavy water buffer, the sphere models were left unhydrated. For comparison with SAXS data, hydration spheres were added to create a monolayer hydration shell corresponding to 0.3 g of water per gram of protein (18,20). The coordinate conversion to spheres used a grid with cube side lengths of 0.5329 nm for IgG1 and 0.5335 nm for IgG4, plus a cutoff of four atoms. SCT optimized these parameters to reproduce the unhydrated protein volume. For comparison with the scattering curves, each experimental I(Q) value was matched to the theoretical I(Q) value with the closest Q value. For the x-ray curves with up to 365 data points, the Q spacing is close enough for this procedure to have little effect, whereas the neutron curves have up to 45 data points and the quality of the matches is reduced (16). After this, the R factor was computed by analogy with crystallography in which the lower R factors represent better fits:

$$R\text{-factor} = \frac{\sum \left| I_{Expt}(Q) - \eta\, I_{Theor}(Q) \right|}{\sum \left| I_{Expt}(Q) \right|} \times 100 \qquad (4)$$

η is a scaling factor used to match the theoretical curve to the experimental I(0) value. An iterative search to minimize the R factor was used to determine η. All steps were performed in the SASSIE-web workflow (version 0.8) (https://sassie-web.chem.utk.edu/sassie2). Structures with the lowest R factors were accepted as valid models of the antibody solution structure. We note that χ² values are often used elsewhere as a monitor of best fits to scattering curves; these, however, require errors for the experimental intensities that were not always available.
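The R factor calculation described above has three steps: match each experimental Q value to the closest theoretical Q value, find the scale factor η, and evaluate Eq. 4. The sketch below follows those steps under stated assumptions. The text does not specify the iterative search, so a ternary search is swapped in here, which is valid because the R factor is a convex function of η; the function names and the synthetic Guinier-type curves are illustrative only.

```python
import math


def match_closest_q(q_expt, q_theor, i_theor):
    """For each experimental Q, return the theoretical I at the closest Q."""
    matched = []
    for qe in q_expt:
        j = min(range(len(q_theor)), key=lambda k: abs(q_theor[k] - qe))
        matched.append(i_theor[j])
    return matched


def r_factor(i_expt, i_theor, eta):
    """Eq. 4: 100 * sum|I_expt - eta * I_theor| / sum|I_expt|."""
    num = sum(abs(e - eta * t) for e, t in zip(i_expt, i_theor))
    den = sum(abs(e) for e in i_expt)
    return 100.0 * num / den


def best_scale(i_expt, i_theor, n_iter=100):
    """Ternary search for the eta that minimizes the R factor (assumed method)."""
    ratios = [e / t for e, t in zip(i_expt, i_theor) if t != 0]
    lo, hi = min(ratios), max(ratios)   # the minimizer lies between these
    for _ in range(n_iter):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if r_factor(i_expt, i_theor, m1) <= r_factor(i_expt, i_theor, m2):
            hi = m2
        else:
            lo = m1
    eta = 0.5 * (lo + hi)
    return eta, r_factor(i_expt, i_theor, eta)


def toy_curve(q, rg):
    """Guinier-type curve used only to generate synthetic test data."""
    return [math.exp(-(rg ** 2) * qi ** 2 / 3.0) for qi in q]


q_theor = [0.01 * k for k in range(1, 151)]      # fine theoretical grid, nm^-1
q_expt = [0.05 * k for k in range(1, 23)]        # coarser experimental grid
i_expt = [2.5 * v for v in toy_curve(q_expt, 5.15)]

good = match_closest_q(q_expt, q_theor, toy_curve(q_theor, 5.15))
poor = match_closest_q(q_expt, q_theor, toy_curve(q_theor, 4.00))
eta_good, r_good = best_scale(i_expt, good)
eta_poor, r_poor = best_scale(i_expt, poor)
print(f"matching model: eta = {eta_good:.3f}, R = {r_good:.3f}%")
print(f"compact model:  eta = {eta_poor:.3f}, R = {r_poor:.3f}%")
```

With sparse neutron data, the closest-Q matching step becomes the main source of mismatch, which is the effect noted in the text for curves with up to 45 points.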
The final antibody structures were analyzed using the distance between the centers of mass of the two Fab regions Fab1 and Fab2 (d1) and those between each Fab region and the Fc region (d2, d3) (Fig. 1 B). The twofold symmetry of the antibody primary structure meant that the differentiation between Fab1 and Fab2, and consequently d2 and d3, is made only for descriptive clarity. These parameters were used previously for IgG4 (46) and other antibodies (14,47,48). Antibody asymmetry is monitored by the absolute difference between the two Fab-Fc distances, abs(d2 − d3), which is close to zero for symmetric structures.
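The d1, d2, and d3 parameters follow directly from the centers of mass of the three regions. A minimal sketch is given below; the representation of each region as a list of (mass, (x, y, z)) tuples and the single-pseudo-atom example regions are assumptions made for illustration.

```python
import math


def center_of_mass(atoms):
    """atoms is a list of (mass, (x, y, z)) tuples; returns the mass-weighted mean."""
    total = sum(m for m, _ in atoms)
    return tuple(sum(m * xyz[i] for m, xyz in atoms) / total for i in range(3))


def fab_fc_distances(fab1, fab2, fc):
    """Return d1, min(d2, d3), max(d2, d3) and the asymmetry abs(d2 - d3)."""
    c1, c2, cf = center_of_mass(fab1), center_of_mass(fab2), center_of_mass(fc)
    d1 = math.dist(c1, c2)   # Fab1 to Fab2
    d2 = math.dist(c1, cf)   # Fab1 to Fc
    d3 = math.dist(c2, cf)   # Fab2 to Fc
    return {"d1": d1, "min": min(d2, d3), "max": max(d2, d3),
            "asym": abs(d2 - d3)}


# Toy regions (coordinates in nm), each reduced to one pseudo-atom.
fc = [(1.0, (0.0, 0.0, 0.0))]
fab1 = [(1.0, (-3.0, 4.0, 0.0))]
symmetric = fab_fc_distances(fab1, [(1.0, (3.0, 4.0, 0.0))], fc)
asymmetric = fab_fc_distances(fab1, [(1.0, (6.0, 8.0, 0.0))], fc)
print(symmetric)
print(asymmetric)
```

Reporting min(d2, d3) and max(d2, d3) rather than d2 and d3 themselves avoids any dependence on which Fab region happens to be labeled Fab1.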

Docking analyses for C1q heads and the FcγR receptors
Crystal structures for C1q and the FcγRI receptor were docked onto the best-fit structures for the IgG1 β and the IgG4 α and β clusters using the web server algorithm PatchDock as described previously (15,16). For this, the crystal structure of the C1q head and its predicted contacts was used (49,50), together with the crystal structure for the FcγRI receptor-Fc complex (PDB: 4X4M (50)). Crystal structures for the FcγRIII receptor-Fc complex were also used (PDB: 1E4K and 1T89 (51,52)).

RESULTS

Experimental scattering curves for IgG1 and IgG4
To determine the atomistic solution structures of human IgG1 and IgG4, reliable scattering curves were required. Data sets from five scattering instruments were obtained for each of the two monoclonal IgG1 6a and IgG4 B72.3 antibodies in both H2O and 2H2O buffers (Table 1). This tested the reproducibility of the scattering curves to be used for modeling. Two x-ray and neutron data sets for each of IgG1 6a and IgG4 B72.3 on instruments ID02 and SANS2d were reused from our previous study (15,16). For IgG1 6a, two new neutron data sets on instruments D11 and D22 were obtained. For IgG4 B72.3, one new x-ray data set from instrument BM29 and one new neutron data set from instrument D22 were obtained. For IgG1 19a, an x-ray data set from instrument ID02 was reused (16), together with two new neutron data sets from instruments D11 and D22. Analytical ultracentrifugation showed that IgG1 and IgG4 were unaffected by protein aggregation but showed minor reversible dimerization (15,16). The minor dimerization observed only for IgG4 in 2H2O buffer meant that the neutron curves from SANS2d were extrapolated to zero concentration before modeling (Table 1).
The IgG solution structures were parameterized using linear Guinier fits to determine the Rg, Rxs1, and Rxs2 values using the Q ranges specified in Materials and Methods. The Q ranges for the Rg and Rxs fits were the same as in our previous studies (15,16) to permit the direct comparison of the Rxs1 and Rxs2 values. The previous values were similar to the mean values from new concentration series for IgG1 and IgG4 from instruments BM29, D11, and D22 (Table 1). For IgG1 6a, the Rg values were 5.10-5.20 nm (

TABLE 1 footnotes: The Rg, Rxs1, and Rxs2 fits were performed using the Q ranges specified in Materials and Methods. a Fig. S2 A. b The reported Guinier fit corresponds to the neutron curve in Fig. S2

For comparison, the full scattering curves out to Q = 1.5 nm⁻¹ were superimposed on each other based on the forward scattering at zero angle I(0) (Fig. 2). The scattering curves for each of IgG1 and IgG4 showed good agreement up to 1.1 nm⁻¹. For IgG1, a minor difference was noticed between the SANS2d and D11 neutron curves in the Rxs2 fit range of 0.7-1.1 nm⁻¹ when these were referenced to the ID02 x-ray curve as baseline (Fig. S1). This difference was attributed to the different effect of the hydration shell on the x-ray and neutron data. Beyond Q of 1.1 nm⁻¹, the curves generally showed weaker signal/noise ratios; thus, the atomistic scattering modeling fits in this study were only made to a maximal Q value of 1.1 nm⁻¹ when calculating the goodness-of-fit R factors.

Monte Carlo atomistic modeling of human IgG1
The atomistic solution structural modeling of IgG1 was initiated using the crystal structure of full-length IgG1 b12 (35). Missing amino-acid residues were rebuilt, and the intact IgG1 structure with glycans was energy-minimized by molecular dynamics (Materials and Methods). Next, Monte Carlo simulations were performed with this starting structure, based on three types of conformational variations of the hinges between the Fab and Fc regions (Materials and Methods). Of the generated 704,000 structures, those showing steric overlap were rejected to leave 231,492 physically realistic trial structures for human IgG1. Theoretical scattering curves were then calculated from each model for comparison with experimental data.
A goodness-of-fit R factor analysis was used to identify the errors in the experimental data sets and the R factor filter required to select best-fit models. This R factor monitored the agreement between the theoretical and experimental curves to select modeled solution structures that were consistent with the experimental curves. To achieve this selection, it was necessary to determine a cutoff R factor below which models were assigned as best fits depending on the experimental scattering curve, its signal/noise ratio, and its Q range. To determine this cutoff, two experimental curves for the same protein from the same instrument were used to calculate two R factors for each of the 231,492 modeled IgG1 curves. The correlation between the two R factors was assessed using both the Pearson r and Spearman rs coefficients (53). By gradually excluding the models with higher R factors, this identified the point at which the ranking of the fits was no longer consistently determined for the two experimental curves. The cutoff was chosen as the point where both the r and rs coefficients decreased below 0.5. It should be noted that this approach is only valid when the best R factors are more densely sampled than the poor fits, as seen in Fig. 3. Because no detectable difference was seen between the ID02 SAXS curves for IgG1 6a and IgG1 19a (Fig. S2 A; Table 1), these two data sets were used to determine the R factor cutoff. The same procedure was followed for each of the D11 and D22 SANS curves for IgG1 6a and IgG1 19a (Table 1). Because no SANS2d data were available for IgG1 19a, the SANS2d data for IgG1 6a at 4.0 and 3.0 mg/mL (Table 1) were compared (Fig. S2 B). The final R factor cutoffs for IgG1 were determined to be 3.00, 2.00, 3.15, and 3.10% for instruments ID02, SANS2d, D11, and D22, respectively.
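The cutoff procedure above can be sketched as a scan over candidate R factor thresholds, recomputing the Pearson and Spearman coefficients on the models that remain. The sketch below rests on several assumptions that the text does not specify: models are retained when both of their R factors fall at or below the threshold, the scan uses a fixed list of thresholds, and a minimum number of retained models (`min_models`) is required. The synthetic R factors are generated only to exercise the code.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            r[order[k]] = 0.5 * (i + j) + 1.0
        i = j + 1
    return r


def spearman(x, y):
    """Spearman coefficient computed as the Pearson coefficient of the ranks."""
    return pearson(ranks(x), ranks(y))


def find_cutoff(r1, r2, thresholds, limit=0.5, min_models=10):
    """Highest threshold at which both coefficients fall below `limit`, else None."""
    for t in sorted(thresholds, reverse=True):
        kept = [(a, b) for a, b in zip(r1, r2) if a <= t and b <= t]
        if len(kept) < min_models:
            break
        xs, ys = zip(*kept)
        if pearson(xs, ys) < limit and spearman(xs, ys) < limit:
            return t
    return None


# Synthetic data: a shared "true" misfit per model plus independent noise
# standing in for the experimental error of each of the two curves.
rng = random.Random(1)
true_r = [rng.uniform(2.0, 12.0) for _ in range(3000)]
r_curve1 = [t + rng.gauss(0.0, 0.3) for t in true_r]
r_curve2 = [t + rng.gauss(0.0, 0.3) for t in true_r]
cutoff = find_cutoff(r_curve1, r_curve2, [2.0 + 0.1 * k for k in range(1, 101)])
print(f"R factor cutoff on synthetic data: {cutoff}")
```

The idea being illustrated is that, once only the best-fitting models remain, differences between their R factors are comparable to the experimental noise, so the two curves no longer rank them consistently and the correlation collapses.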
First, the IgG1 6a SAXS curve fits based on 231,492 hydrated IgG1 models were performed ( Table 2). The three Monte Carlo simulations produced a broad range of conformations with R g values between 3.75 and 6.47 nm (black, Fig. 3). As desired, most of the models occurred near the R factor minimum and within error of the experimental R g value, except for Simulation 3, which incorporated the hinge disulphide constraints. Simulation 1, which utilized the FIGURE 2 Experimental scattering curves used for the modeling fits. The horizontal bars indicate the Q ranges used to calculate the Guinier R g , R xs1 , and R xs2 values. The resulting linear Guinier plots are available for inspection in Fig. 6 of (16) for IgG1 and in Fig. S4 of (15) for IgG4. (A) For IgG1 6a, the x-ray scattering curve from ID02 (black crosses) is compared with the neutron scattering curves from SANS2d (blue dots), D11 (cyan dots), and D22 (red dots). Flat baseline corrections of 2.09 and 0.29% of I(0) were subtracted from the I(Q) curves to allow for different incoherent scattering contributions in the SANS2d and D11 data, respectively, to achieve correspondence with the x-ray data at high Q. No correction for incoherent scattering was made for the D22 data. (B) For IgG4, the x-ray scattering curves from ID02 (black crosses) and BM29 (purple crosses) are compared with neutron data from SANS2d (blue dots) and D22 (red dots). A flat baseline correction of 1.06% of I(0) was subtracted from the I(Q) curves for the SANS2d data to allow for incoherent scattering. No correction was made for the D22 data. To see this figure in color, go online. lower hinge conformation from the crystal structure with only one disulphide bond, showed a noticeable sampling bias toward lower R g values (green, Fig. 3). Using the above R factor cutoff of 3.00% as the filter (inset, Fig. 3), 37,412 (16%) of the 231,492 IgG1 models were deemed consistent with the SAXS curve ( Table 2). 
The 37,412 models were made up of 21,462, 11,291, and 4659 models from Simulations 1, 2, and 3, respectively. Application of a second filter of 0.75 nm for the α-carbon separation in the Cys226-Cys226 and Cys229-Cys229 disulphides reduced this total to 4728 (Table 2), distributed as 0, 69, and 4659 models for Simulations 1, 2, and 3, respectively (magenta, Fig. 3). The double constraint of R factors and disulphide bridges demonstrated the utility of atomistic representations for determining scattering fits. Thus, the requirement of disulphide bridge formation in the models removed many potential structures with low R factors.
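The two filters applied above (R factor cutoff, then disulphide separation) amount to a simple conjunction of conditions per model. The following Python sketch shows this under stated assumptions: the `Model` record and its field names are hypothetical, the separations are taken to be precomputed α-carbon distances in nm, and models exactly at a cutoff are accepted.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """One trial structure (hypothetical record; field names are assumed)."""
    model_id: int
    simulation: int        # 1, 2, or 3
    r_factor: float        # goodness-of-fit R factor, in percent
    ca_sep_226: float      # Cys226-Cys226 alpha-carbon separation, nm
    ca_sep_229: float      # Cys229-Cys229 alpha-carbon separation, nm


def passes_filters(m, r_cutoff=3.00, max_sep_nm=0.75):
    """True if the model fits the curve and both Cys pairs can form a bond."""
    return (m.r_factor <= r_cutoff
            and m.ca_sep_226 <= max_sep_nm
            and m.ca_sep_229 <= max_sep_nm)


def select_best_fits(models, r_cutoff=3.00, max_sep_nm=0.75):
    """Apply the R factor and disulphide filters to a list of models."""
    return [m for m in models if passes_filters(m, r_cutoff, max_sep_nm)]


# Toy data: only model 3 satisfies both filters.
trial = [
    Model(1, 1, 2.4, 0.60, 1.90),   # good fit, Cys229 pair too far apart
    Model(2, 2, 3.6, 0.55, 0.58),   # both Cys pairs close, but poor fit
    Model(3, 3, 2.8, 0.57, 0.61),   # passes both filters
]
accepted = select_best_fits(trial)
```

Models such as the first one in the toy list illustrate the point made in the text: a low R factor alone does not guarantee a physically connected hinge.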
The structural outcome of the IgG1 6a SAXS curve fits was monitored by the d1, d2, and d3 distances between the centers of mass of the Fab and Fc regions (Fig. 1 B; (14-16,46)). Asymmetry was monitored using the absolute difference between the two Fab-Fc distances, abs(d2 − d3). In the 231,492 IgG1 models, the Fab separation d1 ranged between 3 and 16.5 nm (Fig. 4 A). Filtering for the above-determined R factor for instrument ID02 of 3.00% to give 37,412 models reduced d1 to 4.5-12 nm (purple, Fig. 4 A). The disulphide distance constraint separated the resulting 4728 models into two clusters (magenta, Fig. 4 A). The 389 models with d1 of ~6 nm and below 7 nm were denoted as Cluster a. These were Y-shaped symmetric structures because abs(d2 − d3) was low at ~1 nm. The 4339 models with d1 of ~9.5 nm and above 7 nm were denoted as Cluster b. Those Cluster b structures with high asymmetry and a large abs(d2 − d3) were labeled as b1, whereas those Cluster b structures showing symmetry with a low abs(d2 − d3) were labeled as b2 (cartoons, Fig. 4 A). The a and b clusters were visible only through the abs(d2 − d3) difference, not otherwise (Fig. S3, A and B), showing the utility of the abs(d2 − d3) values for evaluating the filtered models.

FIGURE 3 ... (Table 1). Black dots denote all of the 231,492 models. The green dots denote the 68,914 models that retained the Cys226-Cys226 disulphide bridge (Simulation 1). The magenta dots denote the 27,158 models from the simulations in which both the Cys226-Cys226 and Cys229-Cys229 residue pairs were within 0.75 nm of one another to enable disulphide bond formation (26,836 from Simulation 3 and 322 from Simulation 2). The insets show expanded views of the fits for which the R factors were below 3.0%. To see this figure in color, go online.

FIGURE 4 Relationship between the inter-Fab distance (d1) and the absolute difference in Fab to Fc distances, abs(d2 − d3), in the sterically acceptable IgG1 6a x-ray and neutron models. The gray dots represent the 231,492 unfiltered models that represent all the sampled IgG1 conformations (Table 2). (A) X-ray modeling of ID02 data. Purple dots represent the 37,412 models with x-ray R factors below 3.0% (top, Table 2). Of these, the magenta dots represent the better 4728 models in which the Cys226-Cys226 and Cys229-Cys229 residue pairs were both within 0.75 nm of one another (top, Table 2). Two clusters of structures a and b were observed. The a cluster at d1 = 6 nm and low abs(d2 − d3) contains 389 symmetric Fab to Fc distances indicated in the cartoon (middle, Table 2). The b cluster at d1 = 9.5 nm shows 4339 Fab to Fc distances with either asymmetry b1 or symmetry b2 (bottom, Table 2). (B) Neutron modeling. The 10,121, 3121, and 10,836 structures with neutron R factors below 2.0% and with both Cys pairs within 0.75 nm of one another are shown (blue dots, SANS2d; red dots, D11; cyan dots, D22) (not shown in Table 2). The a and b clusters are arrowed as in (A). To see this figure in color, go online.
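The d1, d2, d3 descriptors and the cluster labels used above can be sketched in a few lines of Python. This is an illustrative reconstruction only: the function names are hypothetical, centers of mass are taken as mass-weighted coordinate averages (unweighted when no masses are given), and the `asym_split` value separating b1 from b2 is an assumed placeholder because no numerical boundary is stated in the text. The 7 nm split between Clusters a and b is taken from the text.

```python
from math import dist


def center_of_mass(coords, masses=None):
    """Mass-weighted mean of (x, y, z) coordinates in nm.

    If masses is None, all atoms are weighted equally (an assumption).
    """
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    return tuple(
        sum(m * c[axis] for m, c in zip(masses, coords)) / total
        for axis in range(3)
    )


def descriptors(fab1, fab2, fc):
    """Return (d1, d2, d3, asymmetry) for one model.

    d1: Fab1-Fab2 separation; d2, d3: Fab1-Fc and Fab2-Fc separations;
    asymmetry: abs(d2 - d3). Inputs are lists of atomic coordinates.
    """
    c1, c2, c3 = center_of_mass(fab1), center_of_mass(fab2), center_of_mass(fc)
    d1, d2, d3 = dist(c1, c2), dist(c1, c3), dist(c2, c3)
    return d1, d2, d3, abs(d2 - d3)


def assign_cluster(d1, asymmetry, d1_split=7.0, asym_split=2.0):
    """Label a filtered model as 'a', 'b1', or 'b2'.

    d1_split (nm) follows the 7 nm boundary in the text; asym_split (nm)
    is a hypothetical threshold chosen only for this illustration.
    """
    if d1 < d1_split:
        return "a"
    return "b1" if asymmetry >= asym_split else "b2"


# Toy model with one pseudo-atom per region (coordinates in nm).
d1, d2, d3, asym = descriptors([(0, 0, 0)], [(6, 0, 0)], [(3, 4, 0)])
label = assign_cluster(d1, asym)
```

For the toy coordinates, the two Fab pseudo-atoms are 6 nm apart and equidistant (5 nm) from the Fc pseudo-atom, which corresponds to a symmetric Cluster a arrangement.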
The curve fits of the experimental IgG1 SAXS data to the modeled I(Q) curves calculated from representative Cluster a, b1, and b2 best-fit structures with the lowest R factors revealed very good visual fits (Fig. 5). These I(Q) fits were corroborated by very good visual fits with the distance distribution function P(r) for all three Clusters a, b1, and b2, especially the relative intensities of the M1 and M2 peaks (insets, Fig. 5). The final sets of modeled Guinier parameters R_g, R_xs1, and R_xs2 of 4.97 ± 0.08, 2.68 ± 0.06, and 1.36 ± 0.05 nm (Cluster a; Table 2) ... (Table 1). Hence, Cluster b was more representative of the IgG1 solution structure than Cluster a.
For the SANS fits for IgG1 6a, the theoretical curves from the unhydrated 231,492 models were compared with the experimental SANS curves from instruments SANS2d, D11, and D22. The three neutron R factor versus R_g graph results (Fig. S4) were similar to those for the SAXS graphs (Fig. 3). In particular, the positions of the R factor minima were close to the experimental R_g values. Models with R factor values below the cutoffs of 2.00, 3.15, and 3.10%, respectively, were accepted (see above). The models that passed both the R factor and disulphide distance filters showed a Fab-Fab separation d1 between 6 and 11 nm (Fig. 4 B). The D11 and D22 fits (red, cyan, Fig. 4 B) showed slightly reduced d1 values compared to the SANS2d fits (blue, Fig. 4 B). Unlike the SAXS modeling, no distinct a and b clusters were identified. Visual inspection of the curve fits of the theoretical scattering curves with the SANS experimental data for representative structures confirmed that both clusters provide plausible models (Fig. S5). Because the most frequently occurring best SANS-fitted models were consistent with Cluster b and showed better agreement with the Guinier parameters R_g, R_xs1, and R_xs2 (Tables 1 and 2), Cluster b was concluded to be more representative of the IgG1 solution structure determined by SANS.

FIGURE 5 Representative x-ray scattering curve fits for the three families of best-fit IgG1 6a models. The three curve fits correspond to the a, b1, and b2 clusters (magenta dots in Fig. 4 A). The experimental curve from instrument ID02 is shown in black (Table 1), and the best-fit theoretical curve is shown in blue. The experimental and theoretical distance distribution functions P(r) are shown at the top right of each panel. For each cluster, the best-fit conformer is shown in two views related by an axial rotation of 90° as indicated, following the colors of Fig. 1 B. (A) Cluster a with a symmetric structure (small abs(d2 − d3)) and a small d1; (B) Cluster b1 with asymmetric Fab-Fc distances (large abs(d2 − d3)) and a large d1; and (C) Cluster b2 with symmetric Fab-Fc distances (small abs(d2 − d3)) and a large d1. To see this figure in color, go online.

X-ray R factor ≤3.00%, two disulphides, and neutron R factor ≤2.00%: 0 models (all remaining entries n.a.).

The atomistic modeling of IgG4 was based on two crystal structures for the separate Fab and Fc regions (36,44). To create the starting IgG4 structure, the hinge peptide joining the Fab and Fc regions and four C-terminal residues were modeled, then the intact IgG4 structure with glycans was energy-minimized using molecular dynamics (Materials and Methods). After this, two Monte Carlo simulations were performed. Either the two disulphide bridges in the hinge were disregarded (Simulation 2) or they were present (Simulation 3). Simulation 1 (see above) was not performed because there was no analog of the corresponding IgG1 simulation. A total of 190,437 physically realistic trial structures for human IgG4 were accepted after rejecting the models that showed steric overlap in 700,000 Monte Carlo-generated conformations. As for IgG1, theoretical scattering curves were then calculated from each IgG4 model for comparison with experimental data.
The 190,437 hydrated IgG4 model structures were analyzed for their fits to the SAXS curves. These gave theoretical curve R_g values that ranged from 3.81 to 6.02 nm (black, Fig. 6 A). This range was smaller than that of 3.75-6.47 nm sampled for IgG1 (black, Fig. 3), and concurred with the lower experimental R_g value of 4.99 ± 0.02 nm for IgG4 (Table 3) compared to that of 5.20 ± 0.06 nm for IgG1 (Table 2). The slightly more compact structure for IgG4 than IgG1 was explained in molecular terms by the IgG4 hinge being three residues shorter than the IgG1 hinge. The majority of the IgG4 sampling was concentrated close to its experimental R_g (Fig. 6). Even though many curves with acceptable R factors showed calculated R_g values over 5% lower than the experimental R_g values (magenta, Fig. 6), the majority of the good-fit structures, including conformations with disulphide separations within 0.75 nm at the Cys226 and Cys229 pairs, had R_g values closer to experiment. Using the two experimental curves (Table 1), an R factor cutoff of 3.00% (see above) was determined to give 28,084 models (an acceptance rate of 15%; Table 3). This total comprised 13,189 models (9% accepted) from the unconstrained Simulation 2 and 14,895 models (32% accepted) from the disulphide-constrained models of Simulation 3. Filtering those models on the basis of both R factors and disulphide separations reduced the 28,084 models to 14,927 models (8% accepted), with 32 (0.02% accepted) and 14,895 (32% accepted) for Simulations 2 and 3, respectively (Table 3). The correct disulphide connectivity thus improved the fits for the IgG4 solution structure.
The structural outcome of the IgG4 SAXS curve fits was monitored by the distances between the centers of mass of the Fab and Fc regions, as for IgG1 above. The 190,437 models showed a Fab separation d1 that extended to 14 nm (gray, Fig. 7 A). Filtering for R factors below 3.00% limited d1 to a range between 5 and 12 nm (purple, Fig. 7 A). Interestingly, the maximum d1 value of 12 nm was similar to that for IgG1 (purple, Fig. 4 A). Application of the hinge disulphide constraint resulted in the observation of two clusters, a and b, that corresponded to d1 being below or above 7 nm, respectively (magenta, Fig. 7 A). Cluster b in IgG4 showed a smaller average d1 of 8.09 nm compared to 9.34 nm in IgG1 (Tables 2 and 3). Unlike IgG1, the models in Cluster b were predominantly asymmetric, with abs(d2 − d3) values mostly found at 2.5 nm and labeled b1. Cluster b2 was much less populated for IgG4 compared to IgG1. When the IgG4 structures were filtered using curve fits based on the SAXS instrument BM29 with an R factor cutoff of 2.40%, very similar Clusters a, b1, and b2 were again visible (dark magenta, Fig. 7 B), thus confirming the reproducibility of the modeling fits from two different SAXS data sets.

FIGURE 6 ... (Table 3), each being represented by a single dot. The vertical blue shaded band indicates R_g values within 5% of the experimental R_g value of 4.99 nm for IgG4 B72.3 (Table 3). Black denotes all 190,437 models (top, Table 3). The magenta overlay denotes the 172,823 models from Simulations 2 and 3 in which the Cys226-Cys226 and Cys229-Cys229 residue pairs are each within 0.75 nm of one another to enable disulphide bond formation. The insets show expanded views of the fits for which the R factors were below 3.0%. To see this figure in color, go online.

Very good visual experimental SAXS fits to the modeled scattering curves I(Q) were seen for representative best-fit IgG4 structures for Clusters a, b1, and b2 that had the lowest R factors (Fig. 8). These were corroborated by very good visual fits with the distance distribution function P(r), including the relative intensities of the M1 and M2 peaks in the P(r) curves (insets, Fig. 8). The final modeled Guinier parameters R_g, R_xs1, and R_xs2 of 4.82 ± 0.08, 2.56 ± 0.07, and 1.35 ... (Tables 1 and 3). Similar R_g, R_xs1, and R_xs2 values were obtained with or without the disulphide distance constraint, showing the importance of the atomistic modeling approach to narrow the number of allowed structures. The mean R_g values of both clusters were similar to the experimental value of 4.99 nm, showing that these clusters could not be distinguished.
For the SANS fits, the theoretical curves for the unhydrated 190,437 IgG4 models were compared with each of the SANS IgG4 curves from instruments SANS2d and D22 (Table 1). Using the two experimental neutron curves (Table 1), the R factor cutoff for acceptable models was determined to be 2.85% for both data sets. After filtering for the hinge disulphide separations, the Fab-to-Fab separation d1 for acceptable models was between 5 and 9.5 nm (Fig. 7 C; Table 3). No separate a and b clusters were resolved for IgG4 in the neutron fits, unlike the x-ray fits, although the same range of d1 values was seen. Unlike the SANS modeling for IgG1, which showed fewer models in Cluster a, many of the good-fit SANS models for IgG4 occurred in both the a and b clusters. Visual inspection of curve fits between theoretical scattering curves and the SANS experimental data for representative structures confirmed that both clusters provide plausible models (Fig. S6). This means that IgG4 models from both the a and b clusters were consistent with both the SAXS and SANS data. This outcome was attributed to the shorter hinge in IgG4, which permitted alternative arrangements of the two Fab and one Fc regions in the intact antibody.

Joint fits of SAXS and SANS curves for IgG1 and IgG4
The SAXS atomistic modeling analyses for IgG1 and IgG4 resulted in 4728 and 14,927 conformational models, respectively (Tables 2 and 3), that satisfied the R factor cutoff and disulphide distance constraint (Figs. 4 and 7). These x-ray-modeled structures were based on hydrated proteins with a surface monolayer of water. The SANS-modeled structures differed in that unhydrated proteins, in which the surface monolayer of water molecules was mostly invisible (20), were visualized. The combination of the different views from the x-ray and neutron modeling should narrow the ranges of accepted structures. By using the above-determined R factor cutoffs of 3.00% (IgG1, IgG4, ID02), 2.00% (IgG1, SANS2d), and 2.85% (IgG4, SANS2d), far fewer modeled structures satisfied these double x-ray and neutron filters.

FIGURE 7 Relationship between the inter-Fab distance, d1, and the absolute difference in Fab to Fc distances, abs(d2 − d3), in the IgG4 x-ray and neutron models. In all subpanels, gray dots represent the 190,437 unfiltered models that represent all the sampled IgG4 conformations. (A) X-ray modeling of ID02 data. Purple dots represent the 28,084 models in which the R factor is less than 3.0% (top, Table 3). Magenta dots represent the better 14,927 models with R factors below 3.0% in which the Cys226-Cys226 and Cys229-Cys229 residue pairs were both within 0.75 nm of one another. As for IgG1 (Fig. 4 A), two groups of structures were observed. The a cluster at d1 = 5.5 nm and low abs(d2 − d3) values contains 8663 symmetric Fab-to-Fc distances, indicated in the cartoon (middle, Table 3). The b cluster at d1 between 8 and 8.5 nm shows 6264 distances with high asymmetry (large abs(d2 − d3) values) and low asymmetry (small abs(d2 − d3) values) labeled b1 and b2, respectively, indicated in the cartoons (bottom, Table 3). (B) X-ray modeling of BM29 data. The same filtered modeling results from the BM29 data in a darker hue (3647 models) are compared with the ID02 data in magenta (14,927 models; (A)). The a and b clusters are arrowed as in (A). (C) Neutron modeling. The 45,975 and 66,092 structures that provided acceptable R factors below 2.85% for the neutron fits are shown (blue dots, SANS2d; red dots, D22). In these structures, both Cys pairs were within 0.75 nm of one another. The a and b clusters are arrowed as in (A). To see this figure in color, go online.

1) For IgG1, the weak evidence above for Cluster a conformations was confirmed by the exclusion of Cluster a in the double x-ray and neutron fits (red, Fig. 9 A). There, Cluster b was centered at d1 = 9.68 ± 0.45 nm (Table 4). This outcome was attributed to a more extended IgG1 hinge conformation that does not permit the formation of the more compact conformations that correspond to Cluster a. This outcome was reproducible when the SANS2d neutron data were replaced by neutron data from D11 (red, Fig. 9 C) and D22 (red, Fig. 9 D). There, Cluster b was centered at similar d1 values of 9.06 ± 1.06 and 9.08 ± 1.06 nm (Table 4).

2) For IgG4, both the Cluster a and b models passed the double x-ray and neutron filter. Both clusters showed similar populations when tested against the joint ID02-SANS2d data sets (Table 4). An explanation for why IgG4 existed in two alternative good-fit conformations was provided from Fig. 9, A and B (in blue). In Clusters a and b, two of the three separations d1, d2, and d3 in IgG4 were 6.8-7.0 nm, and the third was 8.6 nm (Table 4), unlike IgG1, for which the three separations were distinct at 9.1, 7.9, and 6.5 nm (Table 4). Given that the Fab and Fc regions were similar in size, Clusters a and b in IgG4 were, in fact, indistinguishable. The same separations for IgG4 were also seen in the joint BM29-D22 data sets (blue, Fig. 9, C and D); thus, this outcome was reproducible.

3) Crystal structures for full-length human IgG1 and IgG4 provided an independent assessment of the atomistic modeling, although each one only provided a single snapshot of one structure. The IgG1 crystal structure (PDB: 1HZH) gave separations d1 = 9.0 nm and d2 and d3 = 9.0 and 6.1 nm that resembled Cluster b (35), indicating that the crystal and solution structures were similar (Fig. 9). The IgG4-based pembrolizumab crystal structure (PDB: 5DK3) gave separations d1 = 7.0 nm and d2 and d3 = 7.6 and 5.9 nm that were intermediate between Clusters a and b (Fig. 9) (47). Consequently, although similar to the solution structure, the IgG4 crystal structure did not distinguish between the two clusters. Although further MD simulations could potentially suggest a preference for either Cluster a or b conformations, further insight from these was considered unlikely.

FIGURE 8 Representative x-ray scattering curve fits for the three families of best-fit IgG4 B72.3 models. The three curve fits correspond to the a, b1, and b2 clusters of Fig. 7 A with R factors below 3.0% and Cys pairs within 0.75 nm of one another. The experimental curve from instrument ID02 is shown in black (Table 1; (15)), and the best-fit theoretical curve is shown in blue. The experimental and theoretical distance distribution functions P(r) are shown at the top right of each panel. The best-fit conformer is shown in two views related by an axial rotation of 90°, following the colors of Fig. 1 B. (A) Cluster a with a symmetric structure (small abs(d2 − d3)) and a small d1; (B) Cluster b1 with asymmetric Fab-Fc distances (large abs(d2 − d3)) and a large d1; (C) Cluster b2 with symmetric Fab-Fc distances (small abs(d2 − d3)) and a large d1. To see this figure in color, go online.
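The joint x-ray and neutron filtering described in this section is, in effect, an intersection of the sets of models that pass each individual filter. A minimal Python sketch follows, assuming that each data set is represented as a dictionary mapping a model identifier to its R factor; this data layout, the function names, and the toy values are hypothetical, whereas the cutoffs of 3.00% (ID02) and 2.00% (SANS2d) are those quoted for IgG1.

```python
def passing_ids(r_factors, cutoff):
    """Identifiers of models whose R factor (%) is at or below the cutoff."""
    return {model_id for model_id, r in r_factors.items() if r <= cutoff}


def joint_best_fits(datasets, disulphide_ok):
    """Models that pass every R factor filter and the disulphide filter.

    datasets: iterable of (r_factors, cutoff) pairs, one per instrument.
    disulphide_ok: identifiers of models with both Cys pairs within 0.75 nm.
    """
    accepted = set(disulphide_ok)
    for r_factors, cutoff in datasets:
        accepted &= passing_ids(r_factors, cutoff)
    return accepted


# Toy R factors for four models against one SAXS and one SANS curve.
saxs_id02 = {1: 2.5, 2: 2.9, 3: 3.4, 4: 2.0}
sans_sans2d = {1: 1.8, 2: 2.4, 3: 1.5, 4: 1.9}
hinge_bonded = {1, 2, 3}

joint = joint_best_fits(
    [(saxs_id02, 3.00), (sans_sans2d, 2.00)],
    hinge_bonded,
)
```

In the toy data, model 2 fits only the x-ray curve, model 3 fits only the neutron curve, and model 4 fits both but lacks the hinge disulphides, so only model 1 survives all three constraints. Because set intersection is commutative, the order in which the filters are applied does not affect the result.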

DISCUSSION
We have described in detail a, to our knowledge, new atomistic method to determine solution structures of full-length human IgG1 and IgG4 antibodies by joint SAXS and SANS studies. After data collection, we constructed a full-sized energy-minimized molecular model for each of IgG1 and IgG4, then submitted these to Monte Carlo simulations at their hinges to generate a large and broad range of structures to enable best-fit structures to be determined. This method opens new avenues for future structural studies of full-length antibodies of all types. The steps in this process are outlined. First, abundant x-ray and neutron scattering data for monoclonal human IgG1 and IgG4 in light and heavy water buffers from five different instruments were used. The joint data sets established their experimental reproducibility, defined the appropriate R factors for filtering based on experimental curve comparisons, and provided two different views of hydrated and unhydrated IgG structures. To model these data sets, molecular dynamics first ensured that the starting antibody protein structures for IgG1 and IgG4 based on crystal structures were physically realistic and complete. Each starting structure was then submitted to the SASSIE-web modeling workflow, in which Monte Carlo randomization of the antibody hinge structure was performed (23). The Monte Carlo approach offered a computationally rapid means of generating 700,000 and 704,000 trial structures for full-length IgG1 and IgG4, respectively. After sterically overlapping IgG structures were removed, 231,492 and 190,437 acceptable structures, respectively, were identified (Tables 2 and 3). Filters based on x-ray R factors of 2.00-3.15% and a hinge disulphide separation of 0.75 nm reduced these totals to 4728 and 14,927, respectively (Tables 2 and 3). As a third filter, by accepting only those structures that jointly fitted the x-ray and neutron data sets, these structures were further reduced to final totals of 28 and 2748, respectively (Tables 2 and 3). The comparison of these filtered best-fit models with the starting acceptable structures (Figs. 3 and 6) showed that enough structures had been sampled and that convergence to best-fit structures was encompassed within these starting structures. The 28 accepted IgG1 solution structures were asymmetric; these fell into a Cluster b group of structures with separations d1 of 9.7 nm and d2 and d3 of 8.3 and 6.5 nm (Fig. 1 B; Table 4). For IgG4, two final totals were identified, one being 1645 symmetric structures (Cluster a) with separations d1 of 6.0 nm and d2 and d3 of 8.8 and 7.7 nm and the other being 1103 asymmetric structures (Cluster b) with separations d1 of 8.0 nm and d2 and d3 of 8.4 and 6.0 nm (Table 4). Other searches based on different joint x-ray and neutron data sets (Table 4) reported similar outcomes. These final 28 and 2748 best-fitted IgG1 and IgG4 structures are downloadable as Data S1.

New, to our knowledge, biological insights were obtained from these atomistic IgG1 and IgG4 solution structures. First, by considering the hinge peptides as molecular structures in the fits, we have an atomistic explanation for the more elongated solution structure of IgG1 compared to that of IgG4. The atomistic modeling also showed that the IgG1 solution structure is asymmetric, this asymmetry being similar to that of the IgG1 crystal structure (Fig. 9 A). Second, these molecular structures provided new functional insights on the molecular basis for IgG1 and IgG4 complex formation with two major ligands, i.e., the globular heads of complement C1q and the high-affinity FcγRI receptor (Fig. 10). Here, atomistic modeling now based on molecular dynamics and Monte Carlo simulations, compared to our less detailed 2014 modeling (15,16), provided a clearer molecular explanation of the relative reactivities of IgG1 and IgG4.

FIGURE 9 Summary of the joint best-fit x-ray and neutron models and connected hinge disulphide bonds. Their x-ray and neutron R factors are below 3.0%, and the two Cys pairs are within 0.75 nm of one another. The red and blue crosses indicate the ensembles of the best-fit IgG1 and IgG4 models, respectively. In (A) and (B), the red and blue shaded areas represent the libraries of 231,492 and 190,437 structures for IgG1 and IgG4, respectively, in the Monte Carlo simulations. In (A)-(D), separate symbols denote the IgG1 and IgG4 crystal structures, respectively (PDB: 1HZH and 5DK3). (A) A comparison of the 28 and 2748 d1 and abs(d2 − d3) distances after the theoretical IgG1 and IgG4 models were filtered, using both the ID02 x-ray and SANS2d neutron data together (Table 4). The cartoons illustrate representative a structures and two distinct b structures for IgG4 (left) and IgG1 (right) in this plot, in which the numbers denote the separations in nm. Substantial overlap is seen in the starting Monte Carlo conformations for IgG1 and IgG4 before filtering. (B) Distribution of the smaller Fab-Fc (min(d2, d3)) and larger Fab-Fc distances (max(d2, d3)) for each of the 28 and 2748 IgG1 and IgG4 models filtered in (A). (C and D) A comparison with (A) of the d1 and abs(d2 − d3) distributions derived from repeat experimental data sets. Using the ID02 x-ray data jointly with the (C) D11 neutron data resulted in 24 IgG1 models and with the (D) D22 neutron data resulted in 26 IgG1 models, shown in red crosses (Table 4). The 2845 IgG4 models (blue crosses) were filtered using the BM29 x-ray data and D22 neutron data in both (C) and (D) (Table 4). To see this figure in color, go online.
1) For IgG1, the 28 best-fit IgG1 solution structures (Cluster b) were combined with a docking model for the interaction between human IgG1 Fc and the crystal structure of the C1q globular head, this being taken from our 2015 study (16,49,50). Overlap was defined as a docked C1q structure at either of the two Fc sites showing >50 heavy atoms within 0.2 nm for either Fab region. Only three (10%) of the 28 IgG1 best-fit structures showed overlap.
In the case of the crystal structure of the human Fc-FcγRI receptor, considered in the same way (16,54), only four (14%) of the 28 IgG1 structures showed overlap. The low degrees of overlap indicated that the C1q-IgG1 and FcγRI-IgG1 interactions were permitted by the IgG1 solution structures without too much displacement of the Fab regions (Fig. 10 A). Similar results were obtained for crystal structures for the Fc-FcγRIII complexes, showing that the IgG1-FcγRIII interactions were permitted (51,52).

2) For Cluster a in IgG4, the combination of the 1645 best-fit IgG4 structures with the C1q head and the FcγRI receptor showed 109 (7%) and 1340 (82%) steric overlap of the two ligands with the Fab regions, respectively (Fig. 10 B). For Cluster b for IgG4, the combination of the 1103 best-fit IgG4 structures with the C1q head and the FcγRI receptor showed 203 (18%) and 1009 (95%) steric overlap of the two ligands with the Fab regions, respectively (Fig. 10 C). The low overlap for the C1q-IgG4 interaction indicated that this interaction was permitted, provided that the shorter IgG4 hinge showed enough flexibility to enable the Fab regions to be displaced for complex formation. The high overlap for the FcγRI receptor indicated that IgG4 would show reduced reactivity for binding to its receptor, making this interaction unlikely. High overlap was seen also for the FcγRIII receptor, making this interaction unlikely, too.
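The overlap criterion used in the docking analyses above (more than 50 heavy atoms within 0.2 nm) can be illustrated with a short Python sketch. Several points are assumptions rather than details given in the text: the atoms counted are taken to be ligand heavy atoms lying within the cutoff of any Fab heavy atom, coordinates are plain (x, y, z) tuples in nm, and a brute-force pairwise search is used for clarity rather than speed. All function names are hypothetical.

```python
from math import dist


def count_close_atoms(ligand_atoms, fab_atoms, cutoff_nm=0.2):
    """Number of ligand heavy atoms within cutoff_nm of any Fab heavy atom.

    Brute-force O(N*M) search; a spatial grid or k-d tree would be
    preferable for full-size structures.
    """
    return sum(
        1
        for lig in ligand_atoms
        if any(dist(lig, fab) <= cutoff_nm for fab in fab_atoms)
    )


def shows_overlap(ligand_sites, fab_regions, min_atoms=50, cutoff_nm=0.2):
    """True if the ligand docked at either Fc site clashes with either Fab.

    ligand_sites: ligand coordinates for each of the two Fc docking sites.
    fab_regions: coordinates of the two Fab regions of one model.
    A clash requires more than min_atoms close heavy atoms.
    """
    return any(
        count_close_atoms(lig, fab, cutoff_nm) > min_atoms
        for lig in ligand_sites
        for fab in fab_regions
    )


def percent_overlapping(flags):
    """Percentage of models flagged as overlapping, rounded to an integer."""
    return round(100 * sum(flags) / len(flags))


# Toy example: one of three ligand atoms lies 0.1 nm from a Fab atom.
ligand = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
fab = [(0.1, 0.0, 0.0), (5.0, 0.0, 0.0)]
n_close = count_close_atoms(ligand, fab)
```

Applied to every best-fit structure, the resulting flags give percentages of the kind reported above; for example, four overlapping models out of 28 corresponds to 14%.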
Our IgG1 and IgG4 studies complement other recent investigations on antibody solution structures. A neutron spin echo study (55) was conducted with a heterogeneous mixture of polyclonal monomeric and dimeric human IgG from plasma, comprised of the four IgG1, IgG2, IgG3, and IgG4 subclasses with different hinge structures between the subclasses. Similar to earlier studies of antibody flexibility involving the Fab and Fc regions, flexibility was detected as contributions to translational and rotational diffusion motions between the Fab and Fc regions. Interestingly, this flexibility does not influence our averaged structure determinations for the IgG1 and IgG4 subclasses, given that these were able to account for their different receptor-binding functions. Other related studies have examined different aspects of antibodies, for which our approach is relevant. For example, antibodies are important as pharmaceutical proteins, and the study of their aggregation propensities is important to control these for their effectiveness. A multidisciplinary biophysical approach was used to study the unfolding, interactions, and aggregation pathways for a set of four human IgG1 monoclonal antibodies (56). Although no atomistic modeling was performed in that study, there is much scope for modeling to be applied, as illustrated by our recent atomistic study of the unfolding of a Fab region at low pH by a combination of SAXS, molecular dynamics, and FRET analyses (57). A polydisperse mixture of monomeric and dimeric bovine IgG, IgA, and IgM has also been studied up to very high concentrations, with examples of bead modeling and fits to the human IgG1 b12 crystal structure that provided some insight into the scattering data (58). Finally, other molecular modeling studies have been reported for antibodies, although less extensive than this study. One study related to the modeling of glycan chains in four small proteins using ab initio bead methods, including the antibody Fc region (59). Another related to the construction of a full human IgG1 structure from its Fab and Fc regions by molecular dynamics. Despite the approximations made in this modeling, this approach successfully accounted for diffusion coefficients measured using dynamic light scattering (60). SASSIE has already been used to study antibodies. A structure for human monoclonal IgG2 was determined from SANS data, although the models were based on a mouse IgG2a crystal structure and not a human IgG2 crystal structure (13). The use of this mouse structure may have affected the outcome of this analysis because the hinges are different between human and mouse IgG2. In a subsequent SAXS and SANS study of human myeloma IgG2, as many as 400,000 trial models were generated for IgG2 over a longer hinge region with the correct hinge sequence (61). This enabled IgG2 function to be assessed. A more ambitious SASSIE modeling analysis was undertaken for human IgG2 binding to a tetrameric streptavidin antigen (62). Experimentally, multiple complexes were formed based on monodentate and bidentate complexes. Nonetheless, the SASSIE modeling showed that compact models for the bidentate antibody-antigen complexes fitted well with the SAXS data.

The best-fit structures were filtered on the basis of their simultaneous agreement with the paired x-ray and neutron data sets indicated in each section. All the accepted IgG1 and IgG4 models have the disulphide-bridged Cys226-Cys226 and Cys229-Cys229 residues within 0.75 nm of one another, as required for bonding (Fig. 1 A). Columns 5-7 specify the modeled inter-Fab and Fc distances defined in Fig. 1 B. SD, standard deviation.
a The D11 filter retained five Cluster a structures, and the D22 filter retained a single Cluster a structure (not tabulated).
The atomistic modeling of antibody solution structures from scattering data has evolved significantly since its original inception in 1995. Our first approach used systematic translations and rotations of separate Fab and Fc crystal structures in bovine IgG to fit SANS data, followed by the molecular modeling of the two hinge peptides into the best-fit Fab-Fc arrangement (63). Our second strategy utilized molecular dynamics to generate conformationally randomized hinge peptides to connect the Fab and Fc regions in human IgA1 antibody, thereby creating intact randomized structures for scattering fits (64). This strategy gave 14 different antibody best-fit solution scattering structures that were deposited in the PDB (21). Drawbacks included the lengthy processing of trial structures and an inefficient exclusion of sterically overlapping Fab and Fc regions. Our current (and third) strategy in SASSIE started from an energy-minimized structure, used rapid Monte Carlo sampling methods on a high-performance computing platform, rejected structures with poor stereochemistry at the point of generation, and employed an integrated modeling workflow for scattering fits (22). In this way, human IgG2 antibody was modeled from SANS data, and human IgA1 antibody likewise from SAXS and SANS data sets (13,65). This third approach will enable more ambitious modeling of antibody solution structures in the future, whether this will be to determine new types of structures or to monitor conformational changes.
FIGURE 10 Receptor binding to the best-fit IgG1 (A) and IgG4 (B and C) models. The density plots show the Fc structure in a gray surface representation and the sterically accepted Fab structures in a gray semitransparent volume. Representative best-fit Fab orientations for the IgG1 b cluster and the IgG4 a and b clusters are shown in blue and yellow as labeled. The upper row shows IgG1 or IgG4 bound to a docked C1q head (green envelope), and the bottom row shows IgG1 or IgG4 bound to the FcγRI receptor (purple). In all views, the Fc structure is viewed face-on. To see this figure in color, go online.
These IgG1 and IgG4 atomistic structural fits yield significantly more information from scattering experiments than previously. For this, molecular dynamics and Monte Carlo methods proved indispensable. Even though the conformational difference between IgG1 and IgG4 was evident from the x-ray and neutron scattering data (Table 1), the atomistic modeling that directly fitted these scattering data enabled a molecular structural interpretation of this difference (Fig. 10). Interestingly, the 28 asymmetric best-fit structures for IgG1 6a resembled the IgG1 crystal structure, thus showing reproducibility and consistency with other structural methods (Fig. 9). Also interestingly, the crystal structure for the IgG4-based therapeutic antibody pembrolizumab showed consistency with the 2748 symmetric and asymmetric best-fit structures, although the crystal structure was intermediate between the Cluster a and b solution structures. Alongside these scattering fits, the availability of high-quality scattering curves was indispensable. Here, experimental errors in these curves were considered by comparing side-by-side two independent experimental scattering curves of the same protein, then computing the R factor to show the extent of agreement between these. Our curve comparison of data from two different x-ray and three different neutron instruments showed that the lowest R factors ranged between 2.00 and 3.15%. The quality of the modeled fits surpassed this experimental limit for both IgG1 and IgG4, with R factors below 3% in the full scattering curve Q range out to 1.5 nm⁻¹ (Tables 2 and 3), indicating that the generation of 700,000 trial structures was sufficient for accurate atomistic modeling. The modeling resulted in an ensemble of related structures (Fig. 9), not one definitive solution structure. The application of three constraints (low R factors, disulphide bridge connection, joint x-ray and neutron fits) much reduced the number of allowed structures in the ensemble. In the future, the advent of size-exclusion chromatography in SAXS and SANS data collection will improve the quality of the scattering curves by the removal of trace aggregates, leading to improved modeling outcomes.
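The side-by-side curve comparison described above can be illustrated with a minimal sketch. The R factor formula is not stated in this section, so the crystallography-style definition used here, the optional least-squares scale factor, the function name `r_factor`, and the example intensities are all assumptions made for illustration only.

```python
def r_factor(i_ref, i_other, scale=None):
    """Return a percentage R factor between two curves on the same Q grid.

    Assumed definition (not taken from the text):
        R = 100 * sum(| |I_ref| - scale * |I_other| |) / sum(|I_ref|)
    """
    if len(i_ref) != len(i_other) or len(i_ref) == 0:
        raise ValueError("curves must be non-empty and share one Q grid")
    if scale is None:
        # Least-squares scale factor mapping i_other onto i_ref.
        scale = (sum(a * b for a, b in zip(i_ref, i_other))
                 / sum(b * b for b in i_other))
    numerator = sum(abs(abs(a) - scale * abs(b))
                    for a, b in zip(i_ref, i_other))
    denominator = sum(abs(a) for a in i_ref)
    return 100.0 * numerator / denominator


# Hypothetical intensities from two instruments at matching Q values.
curve_a = [100.0, 50.0, 10.0]
curve_b = [110.0, 55.0, 11.0]
print(r_factor(curve_a, curve_b, scale=1.0))  # no rescaling applied
print(r_factor(curve_a, curve_b))             # fitted scale factor
```

In practice the two curves would first need to be interpolated onto a common Q grid. The same function could compare an experimental curve with a modeled one.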
Atomistic scattering modeling systematically evaluates all physically allowed conformations once enough models are sampled. This modeling procedure raises the issue of how to interpret the resulting conformations. Previously, principal component analyses were used to identify four distinct clusters of structures that defined the asymmetric solution structure of human IgA1 (65). Here, the alternative representation of the fitted solution structures based on plots of |d2 − d3| vs. d1 resulted in two different conformations, termed Clusters a and b, that gave indistinguishable scattering fits. Cluster a featured the two Fab regions ~6 nm apart and both separated by ~8.0-8.5 nm from the Fc region in mostly symmetric structures (Table 4). Cluster b showed the separation between the two Fab regions and one of the Fab-Fc separations at ~8.0-8.5 nm in mostly asymmetric structures, with the other Fab-Fc separation at ~6 nm (Fig. 9 A). The fit ambiguity resulted from the similar sizes and shapes of the Fab and Fc regions, meaning that these could be interchanged and still give good fits. For IgG1, this ambiguity was resolved by the disulphide constraint and the use of joint x-ray and neutron fits; this eliminated Cluster a to leave Cluster b as the best-fit solution structure. For IgG4, its shorter hinge meant that both Clusters a and b offered good fits, with no clear preference for either conformation. It is possible that the two clusters represent degeneracies in the modeling and not two distinct structural outcomes. Nonetheless, this result still enabled the analysis of IgG4 binding to its C1q and FcγRI ligands by docking. This outcome emphasizes the importance of combining multiple experimental data sets to understand protein solution structures, especially when combining solution data with crystal and NMR structures (66).
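The distance-based representation of the fitted structures can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: d1 is taken to be the Fab-Fab separation and d2 and d3 the two Fab-Fc separations, each measured between region centers, and the function name and coordinates are hypothetical.

```python
import math


def separation_descriptors(fab1, fab2, fc):
    """Return (d1, |d2 - d3|) in the units of the input coordinates.

    Assumed meaning of the distances (inferred from the text):
        d1 = Fab1-Fab2, d2 = Fab1-Fc, d3 = Fab2-Fc center separations.
    """
    d1 = math.dist(fab1, fab2)
    d2 = math.dist(fab1, fc)
    d3 = math.dist(fab2, fc)
    return d1, abs(d2 - d3)


# Hypothetical region centers in nm, chosen only to mimic the two clusters.
# Symmetric arrangement: Fabs 6 nm apart, both 8.25 nm from the Fc.
symmetric = separation_descriptors(
    (-3.0, 0.0, 0.0), (3.0, 0.0, 0.0),
    (0.0, -math.sqrt(8.25**2 - 3.0**2), 0.0))

# Asymmetric arrangement: Fabs 8 nm apart, Fc 8 nm from one Fab, 6 nm from the other.
asymmetric = separation_descriptors(
    (0.0, 0.0, 0.0), (8.0, 0.0, 0.0),
    (5.75, math.sqrt(64.0 - 5.75**2), 0.0))

print(symmetric)
print(asymmetric)
```

Plotting the second descriptor against the first for every best-fit model would give the kind of two-dimensional map in which symmetric and asymmetric arrangements separate.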
In recent years, several algorithms have been developed to construct solution structure ensembles for experimental SAXS and SANS data (reviewed in (23)). A protein in solution is often considered to exist in multiple structural states with a population distribution, and a trajectory from a molecular dynamics simulation is an example of one such ensemble. Given that the experimental scattering curve is a time-averaged observation of protein structural states in solution, the question arises whether any of the determined protein structures give a calculated scattering curve that is significantly different from the experimental one. This is unlikely for IgG1 and IgG4 because, from the algorithm in use, all the generated structures were stereochemically valid without steric clashes. The Monte Carlo simulations were sufficiently broad that they were able to generate multiple, quite different structural states. Nonetheless, each of the IgG1 and IgG4 fits resulted in a single distribution with one clear minimum, implying that there is a single structural type for both proteins (Figs. 3 and 6; Fig. S4). Given that there were no indications of alternative structures, our computational method here and in our earlier studies (15,16) determines the average structure that fits the experimental data. Thus, the molecular dynamics and Monte Carlo modeling approach used here becomes a method that fits one structural ensemble to the experimental data.
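The idea that a measured curve is an average over structural states can be made concrete with a short sketch. It assumes that the observed intensity is the population-weighted mean of the intensities calculated for each state on a shared Q grid. The function name, weights, and intensities are hypothetical and are not taken from this study.

```python
def ensemble_average(curves, weights):
    """Return the population-weighted mean of several I(Q) curves.

    Each curve is a list of intensities on the same Q grid; weights are
    relative populations and are normalized internally.
    """
    if len(curves) == 0 or len(curves) != len(weights):
        raise ValueError("need one weight per curve")
    n_points = len(curves[0])
    if any(len(c) != n_points for c in curves):
        raise ValueError("all curves must share one Q grid")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [sum(w * c[i] for w, c in zip(weights, curves)) / total
            for i in range(n_points)]


# Two hypothetical states with calculated intensities at two Q values.
state_1 = [1.0, 2.0]
state_2 = [3.0, 4.0]
print(ensemble_average([state_1, state_2], [1.0, 1.0]))
```

Comparing such an averaged curve with the curve of a single best-fit model is one way to judge whether an ensemble description adds anything beyond a single structural type.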

CONCLUSIONS
We have reported here in detail a, to our knowledge, new modeling method to determine solution structures of full-length human IgG1 and IgG4 antibodies by joint SAXS and SANS studies. This opens new avenues for future structural studies of full-length antibodies. New biological insights were obtained on the IgG1 and IgG4 solution structures. The IgG1 and IgG4 atomistic structural fits have yielded significantly more information from scattering experiments than previously.

AUTHOR CONTRIBUTIONS
D.W.W. and E.L.K.E. designed and performed the modeling analyses and wrote the manuscript. G.K.H. obtained SAXS and SANS data and analyzed these. S.J.P. conceived and coordinated the study and wrote the manuscript.