Validation of electron-microscopy maps using solution small-angle X-ray scattering

A novel validation tool for transmission electron-microscopy maps that uses independent small-angle X-ray scattering measurements is introduced and implemented. The power of the technique is demonstrated by testing it on simulated data and applying it to real experimental data from online repositories.


Table S1
Benchmark times. The subscripts 'st' and 'mt' are short for 'single-threaded' and 'multi-threaded', respectively. Each value represents the mean and standard deviation, in milliseconds, of between 10 and a few hundred runs, depending on the runtime of each program. A (†) indicates a SASBDB id.

Here we present a short example illustrating how the program can be used. We will fit the EMD25044 map to the SASDME5 SAXS dataset. There are two supported modes of operation: directly using the terminal, and using the simple graphical user interface. Both executables are freely available on the project GitHub page https://github.com/AUSAXS/AUSAXS.

Using the terminal
From a terminal, the program can be launched through the em fit executable. The basic syntax is em fit <map file> <SAXS file>, where a number of options can be appended. For a full list, use the em fit -h option. A more complete description of each option and what it controls is available in the online documentation.
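For scripted use, the call described above can be wrapped in a few lines of Python. This is only a minimal sketch and not part of the program itself: the helper names `build_command` and `run_fit` are hypothetical, and the executable is passed in by the caller because its exact on-disk name and location depend on the installation. No options other than those supplied by the caller are assumed.

```python
import subprocess
from pathlib import Path


def build_command(executable, map_file, saxs_file, extra_options=()):
    """Assemble the argument list: <executable> <map file> <SAXS file> [options]."""
    return [str(executable), str(Path(map_file)), str(Path(saxs_file)), *extra_options]


def run_fit(executable, map_file, saxs_file, extra_options=()):
    """Run the fit and return the exit code of the process.

    Output is not captured, so the progress information printed by the
    program still appears in the terminal.
    """
    for path in (map_file, saxs_file):
        if not Path(path).is_file():
            raise FileNotFoundError(f"input file not found: {path}")
    command = build_command(executable, map_file, saxs_file, extra_options)
    return subprocess.run(command, check=False).returncode
```

A call would then look like `run_fit(path_to_executable, "map.mrc", "saxs.dat")`, where both file names are placeholders for the user's own data.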
After entering the path to the map, the SAXS file, and any required options, hit enter to start the fit. After a short setup time, the fit itself will begin. During this phase, the current threshold cutoff value and its associated χ² are shown in the terminal and updated for every iteration. Since the scan starts with the smaller dummy structures and slowly works its way towards larger ones, the iteration rate will steadily slow down throughout this phase.
After the majority of the threshold range has been scanned with the requested number of steps, the scan is allowed to terminate and the fit enters its second phase: estimating the uncertainty of the minimum. In this phase it focuses on the area surrounding the lowest minimum encountered in the first phase. This is usually quite fast, but may add significant computing time depending on the size of the dummy structure at the minimum.
At the end of this phase, the absolute minimum is printed to the terminal along with some additional information about the optimized structure. A number of .plot files will also be saved to the output directory. To convert these files to actual plots, first navigate to the folder, and then run the plot.py Python script without any arguments.
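The two-phase structure of the fit can be illustrated with a toy one-dimensional search. The sketch below is not the program's actual algorithm. `toy_chi2` is a made-up stand-in for the real χ² evaluation, and the function names, step counts, and the dense resampling used in the second phase are all assumptions chosen for illustration. The sketch also assumes that a higher cutoff corresponds to a smaller dummy structure, and it does not estimate an uncertainty.

```python
def toy_chi2(threshold):
    """Made-up chi-square landscape with a single minimum at threshold = 3.0."""
    return (threshold - 3.0) ** 2 + 1.2


def coarse_scan(chi2, low, high, steps):
    """Phase 1: scan from the highest cutoff down to the lowest.

    Assumption: a higher cutoff gives a smaller dummy structure, so the
    scan visits small structures first and large ones last.
    """
    step = (high - low) / (steps - 1)
    thresholds = [high - i * step for i in range(steps)]
    return [(t, chi2(t)) for t in thresholds]


def refine(chi2, scan, steps):
    """Phase 2: resample densely between the neighbours of the lowest point."""
    ordered = sorted(scan)  # sort by threshold
    best = min(range(len(ordered)), key=lambda i: ordered[i][1])
    left = ordered[max(best - 1, 0)][0]
    right = ordered[min(best + 1, len(ordered) - 1)][0]
    step = (right - left) / (steps - 1)
    local = [(left + i * step, chi2(left + i * step)) for i in range(steps)]
    return min(local, key=lambda point: point[1])


scan = coarse_scan(toy_chi2, low=0.0, high=10.0, steps=21)
threshold, chi2_min = refine(toy_chi2, scan, steps=41)
print(f"minimum at threshold {threshold:.3f} with chi2 {chi2_min:.3f}")
```

In the real program each evaluation becomes more expensive as the dummy structure grows, which is why the coarse phase slows down towards its end.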

Using the graphical user interface
The interface, built with the Elements (de Guzman, 2019) C++ package, can be started by executing (double-clicking) the em fit gui executable. This will load up a window similar to figure 2a. Here the most important options can be changed, though the more advanced ones are only available through the terminal interface. Start by hitting either of the folder icons next to the map path and saxs path fields. Navigate to the data file and open it. Doing this with the map path will automatically pull in the SAXS data if a file with the correct extension is also present in the same folder.
After both datasets have been loaded, the output path will automatically change to a unique path for this combination of files, preventing accidental overwrites; see figure 2b. The output path can also be changed manually if necessary. The remaining options can now be changed if required, though the defaults are typically sufficient. When you are ready, hit the central start button to initiate the fitting, after which a progress bar will be shown, as seen in figure 2c. When the fit is complete, the resulting plots will be shown directly in the interface, as in figure 2d. You can now cycle through them using the small menu.
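How the interface derives the output path and locates a matching SAXS file is not specified here. The sketch below shows one hypothetical way to get the same behaviour, for readers who want to reproduce it in their own scripts. The function names, the `output` root folder, and the `.dat` extension are all assumptions, not the program's actual conventions.

```python
from pathlib import Path


def default_output_path(map_path, saxs_path, root="output"):
    """Combine both file stems so that each map/SAXS pair gets its own folder."""
    return Path(root) / Path(map_path).stem / Path(saxs_path).stem


def find_companion_saxs(map_path, extensions=(".dat",)):
    """List candidate SAXS files in the folder of the map.

    The extension list is an assumption; adjust it to the data at hand.
    """
    folder = Path(map_path).parent
    if not folder.is_dir():
        return []
    return sorted(p for p in folder.iterdir() if p.suffix.lower() in extensions)
```

Because the folder name depends on both inputs, refitting the same map against a different SAXS dataset writes to a different location and leaves earlier results untouched.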

Understanding the output
There are two main plot types: the fit to the scattering profile, and the χ²ᵣ landscape plots. The former comes with both linear and logarithmic x-axis variants, while the latter has five different plots with various zoom levels and x-axes. When analyzing the output, it is important to check whether the minimum is well-defined in the χ²ᵣ landscape. Sometimes multiple minima may be present, in which case the folder models will be created, containing fit information and dummy structures for each such minimum.
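Checking a landscape for several minima can also be done numerically. The following is a minimal sketch assuming the landscape is available as two equally long lists of masses and χ²ᵣ values. The function name `local_minima` and the sample numbers are invented for illustration, and the neighbour comparison is a simple stand-in for whatever criterion the program applies internally.

```python
def local_minima(masses, chi2r):
    """Return (mass, chi2r) for every interior point lower than both neighbours,
    sorted from the lowest chi2r upwards."""
    if len(masses) != len(chi2r):
        raise ValueError("masses and chi2r must have the same length")
    minima = []
    for i in range(1, len(chi2r) - 1):
        if chi2r[i] < chi2r[i - 1] and chi2r[i] < chi2r[i + 1]:
            minima.append((masses[i], chi2r[i]))
    return sorted(minima, key=lambda point: point[1])


# Made-up landscape sampled at seven masses (kDa), for illustration only.
masses = [40, 50, 60, 70, 80, 90, 100]
chi2r = [9.0, 4.1, 1.3, 3.8, 2.9, 2.2, 6.5]
for mass, value in local_minima(masses, chi2r):
    print(mass, value)
```

For this invented landscape the function reports two minima, at 60 and 90 kDa. Noisy landscapes would need smoothing or a minimum-depth criterion before such a test is reliable.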
a) The interface after starting.
b) The interface after inputting the data. Note the green coloring indicating that the files were successfully opened.
c) The interface while running the fit itself. Note that the progress bar will likely not progress linearly.
d) The interface after completing the fit. All generated figures can be viewed directly inside the program itself.
Projections of the test maps and a high-resolution atomic structure. For the first three proteins (denoted with an EMDB id), the structure associated with the EM map is shown. In the last, the structure associated with the experimental SAXS data is shown instead. In all cases they were manually aligned, both in space and with the threshold cutoff value.
Projections of the test maps and a high-resolution atomic structure. For the first seven proteins (denoted with an EMDB id), the structure associated with the EM map is shown, while for the remaining five (EMDB id in parentheses) the structure associated with experimental SAXS data of the same structure is shown. In all cases they were manually aligned, both in space and with the threshold cutoff value.

Figure 1
(left): The scattering profile with double-logarithmic axes. The upper half shows the data points with error bars in black, with the red curve denoting the best fit. The legend shows the goodness-of-fit through the χ²ᵣ value. (right): The χ²ᵣ landscape with an intermediate zoom level and with the mass as the x-axis.

Figure 2
Example use of the graphical user interface. Note that this is an early version and may thus be subject to change in the future.

Table S2
Benchmark of serial evaluation of the scattering curves of the 24889 cryo-EM map. One hundred SAXS curves are calculated within each threshold range five times. This table can be compared with the benchmarks for single structures in table S1.
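Timings reported as a mean and standard deviation over repeated runs can be gathered along the following lines. This is a generic sketch rather than the benchmarking code used for the tables. `benchmark_ms` and `toy_workload` are hypothetical names, the workload is a placeholder for a scattering-curve evaluation, and the sample standard deviation is an assumed choice.

```python
import statistics
import time


def benchmark_ms(func, runs=10):
    """Time `func` `runs` times; return (mean, sample standard deviation) in ms."""
    if runs < 2:
        raise ValueError("at least two runs are needed for a standard deviation")
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1e3)
    return statistics.mean(samples), statistics.stdev(samples)


def toy_workload():
    """Placeholder workload standing in for one scattering-curve evaluation."""
    return sum(i * i for i in range(10_000))


mean_ms, std_ms = benchmark_ms(toy_workload, runs=20)
print(f"{mean_ms:.3f} +/- {std_ms:.3f} ms")
```

Fast programs can afford a few hundred repetitions while slow ones may only allow around ten, which matches the range of run counts quoted for the benchmark tables.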