Brownian dynamics simulations of biomolecular diffusional association processes

Brownian dynamics (BD) is a computational method to simulate molecular diffusion processes. Although the BD method has been developed over several decades and is well established, new methodological developments are improving its accuracy, widening its scope, and increasing its application. In biological applications, BD is used to investigate the diffusive behavior of molecules subject to forces due to intermolecular interactions or interactions with material surfaces. BD can be used to compute rate constants for diffusional association, generate structures of encounter complexes for molecular binding partners, and examine the transport properties of geometrically complex molecules. Often, a series of simulations is performed, for example, for different protein mutants or environmental conditions, so that the effects of the changes on diffusional properties can be estimated. While biomolecules are commonly described at atomic resolution and internal molecular motions are typically neglected, coarse‐graining and the treatment of conformational flexibility are increasingly employed. Software packages for BD simulations of biomolecules are growing in capabilities, with several new packages providing novel features that expand the range of questions that can be addressed. These advances, when used in concert with experiment or other simulation methods, such as molecular dynamics, open new opportunities for application to biochemical and biological systems. Here, we review some of the latest developments in the theory, methods, software, and applications of BD simulations to study biomolecular diffusional association processes and provide a perspective on their future use and application to outstanding challenges in biology, bioengineering, and biomedicine.


| INTRODUCTION
Simulations of interactions between biomolecules can be used to compute the thermodynamic, kinetic, and structural properties underlying biomolecular processes. A range of simulation approaches exists, spanning varying levels of accuracy, and computational expense. The choice of simulation method depends on the properties of the molecular system under study, which can vary in size from a few thousand to billions of atoms, and can involve molecular association processes that span timescales from nanoseconds to days. One of these simulation approaches is Brownian dynamics (BD), which can be used to simulate molecular diffusion and to compute diffusional binding kinetics. Compared to methods such as molecular dynamics (MD), BD simulations are computationally efficient, since the molecules are often simulated as rigid bodies, an implicit solvent treatment is used, and the timesteps are on the order of picoseconds or longer, while the timestep used in classical MD is on the order of femtoseconds. These features of BD allow the simulation of times that can extend to many milliseconds of the molecular diffusion of biomacromolecular systems in a few hours of computing time, which is useful for studying how solutes subject to Brownian motion diffuse over long distances and interact with one another. Brownian motion describes the random walk of solutes diffusing and drifting in a continuum solvent, a phenomenon named after Robert Brown, who first observed pollen particles moving randomly in water. 1 Some years later, Albert Einstein and Marian von Smoluchowski developed a theory to describe this diffusive movement and mathematically formulated a description for the average displacement of solute particles over time, which is proportional to the square root of time. One theoretical formulation of BD arises from Langevin dynamics under the condition that friction forces are overdamped, which prevents any particles (for our purposes, molecules or other diffusing bodies) from accelerating into any sort of ballistic motion. 1 In addition, random "kicks" from the solvent are included to ensure that particles remain in motion. Additional interparticle forces are considered in most BD simulations, such as those due to electrostatic, desolvation, hydrophobic, and steric interactions.
The BD simulation approach is applicable if molecular diffusion plays a role in the process under consideration. The implicit treatment of the solvent as a dielectric continuum (with a Boltzmann distribution of ions) that exerts frictional and stochastic forces on diffusing solutes means that some of its features that can affect association processes are not accounted for. These include structural water molecules that may be present at macromolecular interfaces, the ability of water molecules to engage in directional hydrogen-bonding and hydrogen-bonding networks, and the entropic penalty due to water displacement when molecules diffuse into close proximity. Moreover, the rigid-body approximation that is often used in BD for diffusing solutes may not be appropriate when the conformational dynamics of the solutes influence the association processes studied. In such cases, it is necessary to introduce the important degrees of freedom into the solute models for BD or to couple BD simulations with MD simulations that can more completely account for the molecular flexibility. Despite the simplifying approximations inherent to the BD simulation method, it is applicable to studying a wide range of biomolecular processes.
The aim of this article is to summarize the most recent advances and applications of BD to investigate biomolecular association processes by using detailed structure-based molecular models. We consider the period from approximately 2015 to the present (2022). For prior developments and applications of BD, we refer the reader to previous reviews. 2 -11 In Section 2, we give an overview of the physical models employed in BD simulations, including forces, hydrodynamics, as well as BD algorithms. In Section 3, we describe computational methods that employ BD simulations. In Section 4, we describe the features of software packages that are currently widely used for BD simulations of biomolecules. In Section 5, we explore recent applications of BD to investigate association processes in a variety of biomolecular systems. Finally, we conclude with an outlook on the development and application of BD simulation for biomolecular applications.

| ALGORITHMS FOR PERFORMING BD SIMULATIONS
All molecular simulations require a set of algorithms and equations that are used to compute solute particle motions over time. The foundation of BD relies on Equation (1), which relates the variance of particle position in one dimension (Δx) to an interval in time Δt ð Þ in Brownian motion: where D T is the translational diffusion coefficient defined by: where k B is the Boltzmann constant, T is the temperature, and γ is the friction coefficient, which depends on solute shape and size, and also on solvent viscosity. For a spherical particle translating in three dimensions, D T is given by the Stokes-Einstein equation.
where η is the viscosity of the solvent and a is the hydrodynamic radius of the solute. The corresponding rotational diffusion constant of a sphere (D R ) is: In addition to Brownian motion, BD also includes intermolecular forces to compute the drift of the particles. Therefore, an extension of Equation (1) is used to define the position change Δr of a particle within the timestep Δt, in which external and solute-solute forces are accounted for, as well as random "kicks" to model the stochastic collisions between the solute and the solvent particles, represented by a random displacement vector. Another important effect to consider is the hydrodynamic interactions (HI) between solute particles, which result from hydrodynamic movements of the solvent produced by the displacement of one of the solute particles that in turn influence the movement of the same solute particle or others nearby. Ermak and McCammon developed an algorithm to simulate Brownian motion while also accounting for HI, intermolecular forces, and random displacements at each timestep 12 : where Δr i and Δw i are the displacement and rotation of solute particle i, respectively, D T,ij is the component of the translational diffusion tensor and D R,ij is the component of the rotational diffusion tensor applying between particles i and j, F j and τ j are the forces and torques on particle j, and R i and W i are translational and rotational Gaussiandistributed random vectors, respectively, describing the effects of the stochastic collisions between the solute i and the solvent particles, defined by: These equations define the translational and rotational displacements of molecules at each timestep in a BD simulation. In the following sections, we outline how the forces are computed, as well as algorithms to compute diffusion coefficients and tensors.

| Forces
The movements of the solutes are influenced by several types of forces, including electrostatic, van der Waals, and desolvation, of which the electrostatic forces are often the most important forces in BD simulations due to their long-range nature. In molecules, atoms assigned partial charges are attracted or repelled by the electrostatic potential created by the other charges in the system, to a degree that is affected by the heterogeneous dielectric medium, defined by a (position-dependent) dielectric constant ϵ(r), and the presence of electrolytes. The electrostatic potential of a solute Φ is typically calculated by numerical solution of the Poisson-Boltzmann equation: where c i and z i are the concentration and charge of solvent ion i, respectively. Forces can be computed from the electrostatic potential. In the case of a two-solute system, where F 1 ! is the electrostatic force felt by Solute 1 produced by the sum over every test charge q i within Solute 1 exposed to an electrostatic potential gradient rΦ 2 cast by Solute 2.
The steric repulsion between solutes can be accounted for by rejecting timesteps in which a collision is detected in two-solute systems, 13 but is otherwise modeled by a van der Waals force, typically represented by a soft-core (8-6) Lennard-Jones (LJ) potential. 14 Desolvation forces, which arise when two solute particles approach close to each other due to solvent effects, can be described by terms representing polar 15 and nonpolar contributions. 16 Coarse-grained (CG) models can be used to reduce the number of pairwise interactions. In these models, atoms are grouped into beads, typically possessing a spherical shape of the suitable radius with a point charge, and empirical terms are used to describe the interactions between the bead particles. 17 The advantage of this approach is that flexibility can be considered while the computational costs are reduced.
However, computational costs can be further reduced by representing the solutes as rigid bodies or collections of rigid conformations. Instead of calculating pairwise electrostatic interactions between atoms or beads, or solving the PB equation during the simulation, electrostatic potential grids can be precomputed for the individual solute structures (e.g., with the software APBS 18 ) to drastically reduce the cost of force evaluations. With these grids, the electrostatic force on one solute is computed considering the interactions of its charges with the electrostatic potential grid of another solute (Equation (7)). To further reduce the computational resources, the solutes can be assigned effective charges at the positions of charged (or polar) functional groups to approximately account for the low dielectric of both solutes compared to the aqueous solution. 19 With the effective charge approximation, PB forces are approximated as the sum of electrostatic interaction and electrostatic desolvation forces. The electrostatic interaction terms are computed using Equation (7) with q i being effective charges whereas electrostatic desolvation terms are computed to describe the effect of the dielectric medium of a solute as it interacts with the effective charges. Molecular interaction grids can be precomputed to represent these electrostatic terms as well as short-range interactions due to nonpolar hydrophobic desolvation effects, van der Waals interactions (e.g., LJ), and excluded volume effects of rigid body solutes. For details, refer to Martinez et al. 20 A problem when computing forces from grid-point/atom interactions are that for large systems such as a viral capsid, they become computationally demanding due to the large memory required to store the potential grids for the BD simulations. Dynamic tubular grid (DT-grid) is an algorithm that can help to solve this problem and is implemented in SDA 7. With this algorithm, only the information from the interaction grids close to the molecular surface is stored rather than the complete grid. 21 This approach can be applied to the LJ and excluded volume terms because the corresponding interaction potential decays quickly with distance for such short-range interactions. It uses a set of recursive DT-Grid structures that store the grid data and how to compress and access it, thereby enabling efficient and fast random access to the grid points. For example, the use of DT-Grid reduced the memory requirements to store the LJ interaction grid (with 1600 Â 1600 Â 1600 points and a spacing of 0.2 Å) for computing the interaction of a cowpea mosaic virus capsid with a gold Au(111) surface from 64 to 8 GB. 21 Browndye employs another strategy to consider excluded volume interactions: it utilizes a divide-and-conquer collision detection method that avoids needing an exclusion grid altogether by partitioning the molecule into a tree-like structure where portions of the molecule are enclosed by nested spheres with gradually reducing radii. By examining intermolecular sphere overlaps within this tree structure, collisions between molecules can be efficiently detected. 22 For the longer-range electrostatic potential grid, DTgrid is not applicable. In this case, the Debye-Hückel (DH) approximation can be used to model electrostatic interactions extending beyond the electrostatic potential grid when the solutes are sufficiently far apart that they can be considered as monopoles. 23,24 A cutoff distance is applied, beyond which the DH approximation is used instead of an electrostatic potential grid, thus reducing the need for very large grids with high memory requirements. Recently, the treatment of the transition between grid and DH models to ensure force continuity was improved, and the DH approach was recently extended to the long-range electrostatic forces between proteins and flat surfaces of materials. 24

| Diffusion coefficients and hydrodynamics
When the solutes are far apart, the HI are usually negligible and a diffusion tensor D that includes only simple selfdiffusion terms, that is, constant diffusion coefficients, is used. The diffusion coefficients can be computed under a simple spherical approximation or by accounting for the nonspherical shape of molecules, for example, using HYDROPRO 2 or the fast convex hull method, HullRad. 25 As the solutes come into close proximity, the effect of HI becomes more pronounced. The original approach for treating HI described by Ermak and McCammon 12 is very computationally expensive as D is a large matrix (Equation 4) requiring factorization at every timestep. Therefore, alternative approaches have been developed that avoid the factorization of D. Researchers have performed multipolar expansions of the Stokes equations and linked them by a mobility tensor M, of which the Rotne-Prager-Yamaka (RPY) tensor is the most frequently used. The RPY tensor treats the HI between nonoverlapping spherical beads that represent molecular shape at different levels of accuracy (from atomistic detail to one sphere per solute). This tensor neglects the multi-body effects and the higher order terms in two particle interactions (the movement of one particle by HI impacts the movement of the initial particle) by making use of Faxen's laws. 26 Therefore, M is reduced significantly compared to the matrix of Equation (4) and its factorization is much less computationally expensive. However, the number of components in M increases exponentially with the number of particles.
One approach to reduce the computational complexity is to sample particle motion in two stages. 27 First, a wavespace (long-ranged) contribution is calculated with periodic HI by applying Fast Fourier Transforms (FFTs) to the RPY tensor, and second, a real-space (short-ranged) correction is calculated with a mathematical subspace (Krylov subspace) to account for inaccuracies in the RPY tensor at very close interparticle distances. It has been shown that the algorithmic complexity of these two stages scales linearly (O(N)) with the number of particles. Another approach, which is employed in the RPYFMM software package, 28 is to decompose the far-field RPY tensor into a linear combination of scalar Laplace potentials and their derivatives and then evaluate each of these using an adaptive fast multipole method (FMM). With a parallel implementation, this was shown to scale efficiently to calculation of the RPY tensor for millions of beads. To further account for inaccuracies in the RPY tensor at close distances, a bead-shell molecular model can be used to compute the HI at close molecular proximities. 29 One difficulty in modeling HI is the representation of complex molecular shapes. Biological macromolecules have irregular shapes (for example, antibodies with a Y-shaped structure). Thus, an effective sphere model of a macromolecule may be inadequate. For example, Roosen-Runge et al. 30 showed for rod-like molecules how treating them as a linear assembly of hydrodynamically interacting spherical beads can overcome the problems of an effective spherical bead model for studying short-time self-diffusion. In the Generalized Rotne-Prager-Yamaka (GRPY) approximation, 31,32 overlapping differently sized beads are used to properly model the transport properties of molecules with irregular shapes, and this approach has been extended to HI between solutes and surfaces, as well as to motion under shear flow in Lee-Edwards periodic boundary conditions. BD without HI cannot be reasonably applied under crowded conditions due to the close proximity of the solute molecules. A mean-field hydrodynamic model was developed by Mereghetti and Wade 23 for concentrated protein solutions and was extended by Reinhardt et al. 24 to treat the HI between flat surfaces and proteins. In this model, the solute diffusion coefficients are scaled by a term dependent on the local occupied volume, which is obtained from the sum of the volumes of the surrounding solutes, which are treated as hydrodynamic spheres. The approximation requires only marginally more computation than calculations that neglect HI and assume a constant solute diffusion coefficient.

| Sampling conformational ensembles
Whereas a single rigid conformation for each solute is often used, solute flexibility can be represented using one of two main strategies in BD simulations: (1) Treating the solute as a flexible chain of CG beads representing, for example, amino acid residues or protein domains, 17 or (2) by on-the-fly sampling of conformations from a precomputed ensemble of rigid structures of a solute. 20 A simple example of the first strategy is given by the use of a "fluctuating-dumbbell" model of an enzyme. This was used in BD simulations to examine why the enzyme's diffusion is enhanced in the presence of a substrate. 33 The results indicated that the diffusion enhancement is most likely due to the reduction in enzyme size rather than any suppression of the hydrodynamically coupled conformational fluctuations by substrate binding. However, the lack of quantitative agreement with experimental data could not be explained. A more detailed model is provided by COFFDROP (COarse-grained Force Field for Dynamic Representations Of Proteins), a force field that uses CG to sample different conformations of proteins according to a potential energy function that can allow dynamics and flexibility within a particular molecule. 17 As suggested by the name, COFFDROP is a coarse-grained model that abstracts amino acids as point objects that interact by bonded and nonbonded forces described by a set of distribution functions; therefore, the computational efficiency of the COFFDROP approach is limited by the O(N 2 ) pairwise interactions of N amino acids. The potential energy function is derived from the thermodynamics and geometry of intra-and intermolecular interaction data from all-atom MD simulations. In an extension of COFFDROP, bonded backbone interactions were added in addition to nonbonded interactions, 34 enabling hydrodynamic radii determined experimentally for intrinsically disordered proteins to be reproduced.
The SDA7 software, in contrast, uses the second strategy, and samples a user-provided ensemble of rigid structures to assign the conformations of each solute. The conformations are used to compute the intermolecular forces while the solutes move as hydrodynamic spheres. Molecular conformations change on-the-fly during the BD simulations based on a Monte-Carlo approach. A Metropolis-Hastings algorithm, based on Markov Chain Monte Carlo sampling, is applied to accept or reject a proposed change of conformation based on the energy of the system. This approach is not limited to proteins but can be extended to any type of molecule. 20

| Simulation of many solutes
While BD studies of association processes often involve the simulation of only two solutes, the simulation of systems with much larger numbers of solutes can be used to study diffusional behavior in crowded environments (Figure 1), aggregation phenomena, or the adsorption of molecules to surfaces. The diffusion constants of molecules can vary between different crowded scenarios, and therefore care must be taken in their definition. 23,[35][36][37][38][39] If the dilute diffusion coefficients are used to simulate a crowded environment, the actual diffusion rates of the molecules can be computed from the BD simulations when the crowded environment is simulated explicitly. These simulations have, for example, shown how the translational diffusion coefficients of the solutes are reduced by crowding due to excluded volume effects and the interactions of the solutes with protein crowders 23 or with confining surfaces. 24 As the simulation of hundreds of diffusing crowder molecules increases the computational cost of simulations of crowded environments, it is desirable to model crowders implicitly. Examples include the algorithm proposed by Smith et al. which implicitly models the effects of hard sphere crowders. 40 However, such an approach does not account for the effects of crowder shape and interactions that have been found to be important for protein 23,38 and small molecule diffusion. 39 Moreover, the diffusion of small molecules has been shown to be influenced by heterogeneity in the levels of macromolecular crowding, such as can be present between subcellular regions, by modeling with an advectiondiffusion equation and by hard-sphere BD simulations. 36

| Enhanced sampling
High free energy barriers encountered during simulation can make the observation of rare events difficult. Enhanced sampling in BD aims to increase the likelihood that these high barriers are crossed as the potential energy landscape is explored. The weighted-ensemble (WE) method developed by Huber and Kim 41 is an algorithm to focus on rare paths by dividing simulation space into a series of bins, and allowing a swarm of "walkers," which each carry a reaction probability, to move between bins. Walkers in under-sampled bins are split, while walkers in oversampled bins are merged, ensuring that rare events are adequately sampled. However, this method can be very computationally expensive when used to explore the free energy of a high-dimensional parameter space, due to the large number of bins that may be necessary. The strategy called WExplore aims to address this problem by classifying bins hierarchically. 42 This approach divides the energy landscape into large spaces that are subsequently divided into subspaces, focusing this division on the regions of interest. In this way, the algorithm seeks to reduce the range of landscape spaces and subspaces to explore, thereby reducing the computational time. WExplore is useful for tasks such as the multiscale mapping of ligand binding landscapes. 43 In addition, simulations may be made more efficient by the use of variable timesteps, whose length is adjusted on-the-fly based on intermolecular proximity and the magnitudes of forces and torques 44 3 | METHODS WHICH USE BD

| Computing bimolecular association rate constants and binding affinities
While bimolecular association rate constants can be computed from simply counting encounter events in a simulation of a period box of the relevant solutes, this is often computationally inefficient. Northrup et al. proposed a theoretical and algorithmic framework for using BD to estimate bimolecular association rate constants, which is known as the Northrup Allison McCammon (NAM) algorithm and requires the simulation of only the two associating solutes. 45 Two intermolecular distances are defined at the "b-surface" and the larger "q-surface," at which the forces between the molecules are centrosymmetric ( Figure 2). The rate constant at which the two molecules at a defined bulk concentration approach to the intermolecular distance defined by the b-surface is computed using the Smoluchowski equation.
This expression can then be used to compute the bimolecular association rate constant: where E(r) is the potential of mean force at an intermolecular separation distance of r, β is the observed probability from the BD simulations that the molecules starting at the b-surface will continue on to react, rather than separate to the q-surface. BD trajectories are generated starting with one molecule centred at the origin and one on the b-surface, both randomly oriented, and then the molecules move according to Equation (4). If the molecules approach close enough to interact according to predefined reaction criteria, the trajectory is counted as a "reaction" trajectory and halted. In contrast, if the trajectory results in intermolecular distances that exceed the q-surface, the probability of escape from the qsurface versus reappearing at the b-surface, is computed analytically. The simulation may then be either restarted at the b-surface or marked as an "escape" event and halted according to this probability. From the initial rate constant to reach the b-surface, as well as the various probabilities to "react" or "escape," one can compute the overall association rate constant.
Due to the implicit solvent and rigid body assumptions typically used in BD, the possibility of using multiscale approaches, for example by combining BD and MD, can increase model accuracy. [48][49][50] Multiscale approaches have produced promising results by making use of a Markov State Model (MSM), 51 and this approach has recently been used to assess gating of drug binding to HIV-1 protease 52 (Figure 3).
Other work has focused on sampling conformational ensembles, 53 driving the development of more complex and sophisticated software. 46 Milestoning is a theoretical framework that allows one to partition a long-timescale molecular process into a series of smaller sections separated by surfaces called "milestones." This partitioning of a long-timescale process allows one to limit simulations spent on well-sampled sections of the reaction to observe rare but important events, and allows each section to be simulated in parallel. 54 The SEEKR approach and software were developed to use milestoning theory to combine MD and BD simulation results and is capable of obtaining kinetic and thermodynamic quantities such as k on , k off , and ΔG bind . 46 Hydrogen mass repartitioning has been successfully used in MD simulations in the SEEKR framework to improve the speed of the methodology. 47,55,56 SEEKR has been used to compute binding kinetics and thermodynamics for proteins such as trypsin and janus kinase, to their respective inhibitors. 46,47 Focusing on the trypsin-benzamidine system, SEEKR was used to obtain a k on value of (2.4 ± 0.2) Â 10 7 M À1 s À1 , comparing well with the experimental k on value of 2.9 Â 10 7 M À1 s À1 . 47 This result was obtained with about 1 CPU hour for BD simulations totaling 180 ms of simulation time and 440 GPU hours for MD simulations totaling 5.5 μs of simulation time. Although the MD simulations far exceed the computational effort of the BD simulations, they also enable the calculation of k off and k D values and employ reaction criteria based on the bound state rather than the less well-defined encounter complex. SEEKR has also been used to rank-order a series of small molecule "guests" to a β-cyclodextrin F I G U R E 2 Computation of receptor-ligand diffusional association rate constant using the NAM algorithm 45 augmented with the SEEKR method. 46,47 Simulations are started with the ligand on the b-surface (dashed inner blue circle) and continue until the ligand either reaches the q-surface (dashed outer black circle) or the reaction surface (dashed red curve). The SEEKR method, if employed, further partitions the regions below the BD reaction surface into milestones to further characterize the binding process using MD simulations. See text for details "host." 55-57 SEEKR and a WE MD method were found to give kinetic parameters with similar accuracy and computational efficiency. 57

| Docking
In the context of BD simulations, docking refers to the technique of running rigid-body BD simulations where ligand and receptor molecules, guided by intermolecular forces, particularly long-range electrostatic forces, are allowed to form diffusional encounter complexes. This approach contrasts with conventional "docking," 58-61 which produces structures of fully bound complexes by systematically generating ligand positional and orientational arrangements, usually focused on the vicinity of the binding pocket, without regard for the binding pathway. For BD docking, a similar procedure is used as outlined in Section 3.1. When reaction criteria are satisfied, the resulting complex can be saved as a "docked" structure. The reaction criteria can include user-defined constraints from experiments or analysis of protein sequences. Many docked encounter complexes can be generated from running swarms of BD trajectories. These can be clustered, and their probability assessed from the corresponding cluster size, as well as the interaction energies computed from the BD force field. The broad applicability of this procedure is shown by its use to generate docked structures of diffusional encounter complexes for diverse protein-protein complexes, 62 protein-small molecule complexes, 52 and protein-nucleic acid complexes, such as linker histone-nucleosome complexes. 63,64 The encounter complexes obtained with rigid-body BD can be refined with additional MD simulations and/or normal mode analyses to allow for a better fit and to generate fully docked complexes. 62

| Simulation of adsorption to surfaces
Often, surfaces are modeled in BD simulations to account for the presence of a cellular membrane or other wall-like structure. [65][66][67] Surfaces may also approximate the exterior of an artificial object, such as a medical implant, experimental apparatus, or nanoparticle. [68][69][70] Adsorption of molecules to surfaces, which can be modeled in atomic detail, can be investigated using the BD docking method described in the previous section to study the orientation and position of the solute on the surface as well as the contacts between them. 71 In addition, adsorption processes can be studied by simulating the diffusion of many solutes, as described in Section 2.4, in the presence of a surface.
The presence of a flat surface, for example, glass (silica) or a membrane, can significantly alter the motion of Brownian particles, in comparison to free-floating particles in a bulk substance far from any stationary bodies. For instance, one method has been developed to adjust the protonation states of residue side chains based on proximity to a F I G U R E 3 Scheme illustrating how conformational gating and induced fit can be accounted for by combining BD simulations of ligand-receptor diffusional association with MD simulations to compute k on . The approach entails three steps and can be applied when the timescales of these three steps can be sufficiently separated. The approach starts by using MD simulation with Markov state modeling (MSM) to identify the ligand-accessible conformations of the unbound receptor that are then used in rigid body BD, to compute the diffusional k on and to generate encounter complexes. 52 These encounter complexes provide starting points for MD simulations and are clustered to simulate with MD and thereby compute the probability that the bound state is reached from the encounter complexes. The diffusional k on computed in BD is then multiplied by factors obtained from the MD simulations to obtain the final k on , see text for details. charged surface. 72 To correctly account for surface-solute forces at larger intermolecular distances, a DH model can be used to model the forces between charges on the solute and the surface, while simultaneously allowing for solvent dielectric and electrolyte screening. 24 At closer distances, an electrostatic potential grid is typically computed with a Poisson-Boltzmann equation solver, such as APBS, 18 and the grid is imposed on the solute molecule(s) during the BD simulation. Customized force fields such as ProMetCS, 73 which was parameterized for protein interactions with atomically detailed gold surfaces, may be necessary to adequately model short-range interactions. In addition, the presence of a surface will also affect hydrodynamic interactions, generally having the effect of lowering the diffusion rates of nearby solutes, although the diffusion coefficient in the direction perpendicular to the surface will be different from that of the direction parallel to the surface. It can be modeled in a computationally efficient manner by using the mean-field approach described above. 24 The effect of a surface on rotational diffusion coefficients is often ignored due to its relatively small magnitude. 24

| SOFTWARE
Many software packages have been developed to prepare, simulate, and analyze molecular systems by BD simulation. This section summarizes the major, freely available BD software packages, which differ in their capabilities, efficiencies, and intended use cases. BD simulation software and their features of interest are listed in Table 1.
Incidentally, many simulation engines known primarily as MD simulators can run variations of BD. Gromacs, NAMD, and OpenMM have the option to use BD integrators; and, while they represent the limit of high friction in Langevin dynamics, they do not natively support the NAM algorithm, grid-based forces, or hydrodynamics. Thus, these programs are limited to pairwise forces (although approaches such as particle mesh Ewald can mitigate their high algorithmic cost).
The UHBD (University of Houston Brownian Dynamics) program was first published in 1991 74 and is still actively used. 79 Many of the programs in this section were inspired by the UHBD code.
The program SDA (Simulation of Diffusional Association), which has been under continuous development since the mid-1990s, is a widely used simulator of BD for biomolecular systems. 20 SDA originally started as an implementation of the NAM algorithm to perform protein-protein association studies, but later generations expanded SDA to allow such features as protein diffusion in the presence of a surface, protein-protein docking, and the simulation of large numbers of diffusing particles. SDA is capable of computing electron transfer rates-a feature not commonly available in other packages. The latest version, SDA7, introduced the representation of individual molecules by multiple conformations, as mentioned in Section 2.3, expanded the ability to analyze systems and compute physical quantities such as the potential of mean force (PMF), and more efficient parallelism. In addition, the webSDA web server allows researchers to perform many of SDAs simulation capabilities for free through a web interface and offers easier automation of the preparation and analysis of BD simulations. 75 Browndye, 22 developed since 2010, has recently been upgraded to its second major version. Browndye, which uses either atomically detailed or CG molecular models, is capable of simulating either rigid-body or flexible dynamics.
The software packages are listed in chronological order according to first date of publication.
Browndye has been primarily developed and optimized for the fast and efficient computation of the NAM algorithm and is limited to the spherical boundary condition geometry. Browndye is able to use shared memory parallelism using pthreads and supports grid-based electrostatic and desolvation forces (for speed), hydrodynamics, and complex reaction sequences. Browndye can also run with the weighted ensemble enhanced sampling algorithm to increase the speed and likelihood of observing binding events. BD_BOX, first published in 2011, is a BD software package that performs single-component or multi-component simulations using highly parallel hardware through either shared-memory or distributed-memory CPU systems, or by using GPUs. 76 BD_BOX models molecules as multiple series of beads connected by flexible bonds, which typically constitutes a CG model. The motions of beads may be further constricted by angle and dihedral restraints. Forces between beads include nonbonded terms, such as electrostatics and LJ forces. BD_BOX can also be used to simulate multiple molecules as CG collections of beads. Hydrodynamics are included as mobility tensors between the beads, and homogenous flows and external electric fields may be included with a variety of available boundary conditions. ReaDDy, first presented in 2013, is designed to perform interacting particle reaction dynamics (iPRD). 77 In iPRD, large numbers of interacting particles can interact, pop into existence, disappear out of existence, merge with other particles, divide into multiple particles, and diffuse around a crowded or sparse environment in a simulation of a realistic biological system. Now in its second iteration, ReaDDy 2 has improved performance, and a new Python interface for API capabilities. Flexible molecules with complex polymer geometries (such as proteins) can be modeled in ReaDDy with a variety of boundary conditions and may be run on shared memory multiprocessor platforms and on GPUs. However, ReaDDY is limited to screened electrostatics and LJ interactions and is not capable of including hydrodynamic interactions.
GeomBD, which was first presented in 2015, boasts the ability to simulate more than two molecules at a time, specifically for the purpose of examining intersite and intermolecular transfer of substrates. 78 The program may run either all-atom or CG simulations and force fields for describing molecular flexibility may be loaded into GeomBD. In addition, GeomBD can simulate a variety of boundary condition geometries, enabling binding between a diffusing molecule and an infinite surface or an infinite tube. Precomputed grids enable the fast computation of forces.

| APPLICATIONS
Here, we give examples of some recent applications of BD simulations to study biomolecular association processes.

| Molecular signaling cascades
Cyclic adenosine monophosphate (cAMP) is an important intracellular mediator of extracellular signals that is synthesized by adenylyl cyclase, an enzyme that is regulated by G proteins which in turn receive signals from G protein-coupled receptors (GPCRs). To investigate the ability of an adenylyl cyclase-based signaling cascade to perform coincidence detection of neurotransmitters that bind to two different GPCRs, Bruce et al. combined MD and BD with kinetic modeling using ODEs. 80 MD was used to investigate formation of binary and ternary complexes of adenylyl cyclase with Gɑ olf and Gɑ i proteins. Structures from the MD simulations were used to compute association rate constants for the binding of G proteins to adenylyl cyclase by BD. Kinetic models were built with the topology and parameters being informed by the results of the MD and BD simulations, as well as experimental data. The association rate constants computed by BD were consistent with the ability of the adenylyl cyclase-based network to detect simultaneous changes in neuromodulatory signals, which is important for learning and memory formation. This study demonstrates the potential of multiscale modeling approaches incorporating subcellular and molecular level methods, including BD, for studying complex biochemical networks. 80,81 Further downstream in the cAMP signaling network, the protein kinase A heterotetramer was simulated with the aim of resolving discordant experimental structural data on its regulatory subunit by using MD to generate initial protein structures and then BD to assess the binding of cAMP to them. 82 The simulations revealed a stable R-subunit "Flipback" structure and showed a likely binding mechanism between this structure and cAMP. A visual examination of the cyclic nucleotide-binding domain showed that the Flipback structure displays a more electrostatically positive region than the initial holo-structure, increasing the rates of cAMP association. The combined use of MD and BD allowed the reconciliation of apparently discrepant experimental data by revealing and validating a novel Flipback conformation.
BD docking can also be used to study signaling cascades. One example is the downstream signaling of the T cell receptor carried out by tyrosine kinases. Two kinases, Lck and ZAP-70 were found to have a selection mechanism for their substrates through repulsive and attractive electrostatic interactions. BD simulations were used to demonstrate the high interaction propensity between ZAP-70 and its substrates, specific tyrosine residues in short peptides from the sequence of the scaffold protein LAT. This was due to long-range electrostatic steering which, in contrast, was not observed in BD simulations of the Lck kinase and substrates. 83

| Protein disaggregation mechanisms
The J-protein (HSP40) network in cells assists the HSP70 protein in rescuing aggregated proteins. The interplay between HSP70 and the J-protein network is related to stress recovery and differences exist between prokaryotic and eukaryotic cellular processes. According to experimental studies, in eukaryotic cells, the class A and class B J-proteins associate to enhance their activity, but in prokaryotic cells, this association does not take place. BD docking simulations were performed to generate encounter complexes for the different possible combinations of J-protein domains. Consistent with experiment, clustering and energetic scoring of the encounter complexes indicated greater specificity of the complexes with class A than class B J-protein C-terminal domains for proteins from eukaryotes but not from prokaryotes. 84,85 The evolution of J-protein classes and interaction properties could be understood by comparative analysis of the protein electrostatic potentials along with phylogenetic and coevolutionary analysis. 85

| Microtubule elongation and myofilament dynamics
BD simulation has shown itself to be a useful approach to studying protein-protein interactions involved in filament formation, dynamics, and function. An example is the finding of a mechanism for microtubule elongation by the addition of GTP-tubulin dimers-represented by spheres-to the tips of curving protofilaments by using BD to characterize lateral tubulin-tubulin interactions and the bending motion of microtubule protofilaments-showing BD's utility for testing novel biological hypotheses. 86,87 McCulloch, Regnier, and colleagues have employed BD simulations to study actin-protein interactions and develop multiscale models of cardiac myofilament activation and sarcomere contraction. [88][89][90] dATP is known to increase contractile force in hearts. Powers et al. 88 investigated the mechanism and found that the deoxy hydrolysis product, 2-deoxy-adenosine 5 0 -diphosphate (dADP), restructures the prepowerstroke myosin. By BD simulation of the association of myosin with an actin dimer (representing F-actin) in ADP and dADP bound states, they computed higher association rates for the dADP-bound myosin. This was explained by an allosteric effect of dADP binding, resulting in enhanced electrostatic steering due to a conformational change of myosin. Subsequently, these rates were incorporated into a Markov Chain Monte Carlo model of cooperative sarcomere contraction, 89 providing details about the mechanism and effects of dATP on the actin/myosin powerstroke compared to the native ATP.
Another approach to multiscale simulation was taken by Aboelkassem et al. 90 who used BD and the NAM method to investigate how the association rates of a tropomyosin coiled-coil and an actin filament depend on the relative orientation in which they bind. The results of the atomistic BD simulations were then employed to build a Langevin dynamics model at the filament scale of activation of the thin filament during sarcomere contraction. The BD simulations in these studies were limited by the difficulties of capturing the key features of long filaments by a few protein molecules and the very simple models of their hydrodynamics that were used. Nevertheless, the BD simulations provided important mechanistic insights and semi-quantitative results for multiscale modeling.

| Redox protein interactions in the thylakoid lumen
BD has been used to simulate the association of redox proteins and compute the electron transfer rates in diffusional encounter complexes. In a recent study, the diffusion of plastocyanin (Pc) within the lumenal space of a thylakoid was simulated to elucidate the timescales and mechanisms of Pc association with photosystem I (PSI) in the presence of cytochrome bf (Cyt bf). 91 Both PSI and Cyt bf are embedded in the membrane, and they were considered as fixed objects in two fixed thylakoid membranes separated by a lumenal space of 10 nm thickness. The association between atomically detailed PSI or Cyt bf and Pc molecules undergoing translational and rotational diffusion subject to electrostatic forces was modeled using typical distance-based reaction criteria. However, the complex formation and subsequent electron transfer and complex dissociation were simulated by introducing probabilities for each event based on an assumed electron transfer rate, with the Pc molecules being converted from the reduced to the oxidized state on-thefly in the simulations. From the simulations, it was possible to compute docking times and lifetimes for the individual steps in the overall reaction mechanism that depended on protein concentrations and differed according to the protein distributions in the stromal and granal areas of the thylakoid membranes. Comparison with experiment showed that good agreement could be obtained with the BD model, reproducing the fast and slow phases of PSI reduction by Pc with lifetimes of about 30 and 300 μs, respectively.

| Protein-DNA and DNA-DNA association
BD simulations of binding between proteins and DNA double-helical strands provide a particular challenge, both due to the long, cylindrical and flexible structure of DNA, as well as DNA's dense concentration of negative charges. Under these circumstances, effects caused by high electrolyte concentrations can be particularly strong, and the hydrodynamics of long DNA strands must be correctly accounted for.
BD simulations were used to compute the k on , k off , and K D values for the association of a linker histone globular domain with a nucleosome (consisting of DNA wrapped around the histone octamer) 92 and to generate encounter complexes between linker histones and nucleosomes to study their binding poses and how these are affected by protein mutations, posttranslational modifications or differences in DNA sequence and structure. 63 The linker histonenucleosome encounter complexes, whose formation is guided by electrostatic interactions, resembled the complexes observed in experiments (Figure 4). The positioning of the linker-histone on nucleosomes is important for chromatin compaction and for regulating transcription and the simulations indicate that it may be affected by subtle differences in sequence or posttranslational modification.
BD has also been used to investigate the correlation between the self-association of proteins and their ability to shape the DNA helical coil when bridging two DNA segments. 94 In this study, a custom CG forcefield was used to implement system flexibility to reproduce the shaping of DNA when interacting with proteins. From the simulations, one can infer that without the protein self-association process, the monomeric proteins would not be sufficient to properly shape the DNA coil.
DNA binding to the von Willebrand Factor, an adhesive blood protein, was studied recently using BD, atomistic MD, and CG MD to characterize key binding sites. The importance of electrostatics for the interactions between DNA and von Willebrand Factor was shown by increasing ionic strength and observing a subsequent attenuation of binding in association BD simulations. 95 Chang and colleagues carried out BD simulations of the binding of target ssDNA to probe ssDNA tethered to the surface of self-assembling monolayers (SAMs; Figure 5). As is typical in BD simulations, all ssDNA strands were treated as rigid bodies, and fairly loose reaction criteria were needed to simulate an interaction which might result in a hybrid. Despite this approximation, the observed rates of hybridization for different surface conditions may aid in the design of SAMs. 96 5.6 | Protein-drug association kinetics BD simulations of protein-drug association may assist in the drug design process by providing association mechanisms and rates. Photosensor molecules have been identified and simulated that have an antiviral effect against SARS-COV2, SARS-COV, and MERS-COV. 97 Using BD to suggest likely binding sites, it was deduced that methylene blue (MB) is likely to have an enhanced antiviral effect against SARS-COV-2 compared with alternative photosensors examined in the study because MB binds to the receptor binding domain region where the spike protein interacts with its native substrate, the ACE2 receptor, revealing why MB does not need an irradiation activation to present antiviral activity.
An MD-derived MSM for apo HIV-1 protease was used to provide structures for BD simulations of the association of inhibitory drugs to examine the effect of gating by the flexible loops of HIV-1 protease on drug association rates. 52 BD simulations of eight inhibitors of HIV-1 protease were performed to identify accessible states and to compute association rate constants for them. These simulations are illustrated schematically in the graphical abstract. The MSM showed slow gating between the inaccessible and accessible states and enabled a gating factor to be computed for the HIV-1 protease flaps. In contrast to previous studies of the binding of larger peptides to HIV-1 protease, the results indicated that a postdiffusional step, rather than conformational gating, may be the rate-limiting process for the drugs studied ( Figure 3).
Another BD study of HIV-1 protease, focused on the association of its inhibitor, xk263 in a surface plasmon resonance or continuous flow biosynthesis experimental environment which was mimicked by a model self-assembled monolayer surface and a simulated current flux of xk263. 98 The BD simulations allowed the effects of different fluxes and different protein orientations and interactions on the protease-inhibitor association times to be investigated, showing how such simulations could guide experimental design.
BD has also been used to compute the association rates of small molecules to multiple sites on the surface of an entire influenza viral envelope model. Results indicated different binding rates to the viral surface proteins hemagglutinin and neuraminidase for various viral mutants. 51,99 Furthermore, the unknown role of the neuraminidase secondary site, as well as the role of glycans ( Figure 6) in the binding process, 51  F I G U R E 5 A self-assembled monolayer (SAM) with attached ssDNA probes (backbone colored in cyan). Target DNA sequences (backbone colored in orange) were allowed to diffuse freely in rigid-body BD simulations. Binding was considered to have occurred if centerof-mass or Watson-Crick base pair distances between any probe and any target fell within 7 Å, or if any end-to-end overlap of at least 5 base pairs was observed. Reprinted with permission from Cholko and Chang. 96 Copyright© 2021, American Chemical Society

| Enzyme substrate channeling
Substrate channeling in bifunctional enzymes has been the subject of a number of BD simulation studies. Recently, the diffusion of oxaloacetate (OAA) in a channel between the proteins malate dehydrogenase and citrate synthase-a process integral to the Krebs cycle-was studied by BD simulation. While the transition time of OAA between the active sites could not be directly obtained from the BD calculations, a relative transition time for wild-type and recombinant receptors was obtained from the simulations. 100

| Mechanism of how nanoparticles and artificial surfaces affect amyloid fibril formation
Diffusion to a surface, followed by diffusion along the surface to a target (two-dimensional diffusion), can be faster than diffusion within a bulk solution to that target (three-dimensional diffusion). Moreover, surface diffusion can affect the rate of binding at active sites on flat or curved surfaces: an analytical model for such surface-assisted trapping was recently derived and validated by BD simulation. 101 The effects of cell membranes or artificial surfaces on the aggregation of amyloid proteins have been studied by BD simulation.
In a study of amyloid-β aggregation using all-atom representations of the peptides, it was found by BD simulations that, in the presence of SAM surfaces, Aβ42 monomers dimerized in a surface-mediated manner with significant two-dimensional diffusion 102 (Figure 7). Notably, the association and fibril formation significantly decreased when monomers of amyloid-β Aβ42 bound with high affinity to a SAM surface, or the surface had a heterogeneous charge distribution. These simulation results could guide the design of nanoparticles with surface properties that might hinder amyloid-β plaque formation.
Gold nanoparticles are often coated with citrate, and this affects binding of proteins to such nanoparticles. 103,104 Through BD simulation, as well as MD simulation and NMR, these gold nanoparticle/citrate complexes, were observed to disrupt the formation of medically harmful aggregates of different mutants of amyloidogenic protein β2-microglobulin 105 (Figure 8). The BD simulations of the association of β2-microglobulin to a gold nanoparticle surface were used to suggest the most preferential interaction sites between monomeric and dimeric protein aggregates, which could then be used for more detailed docking and stability studies.

| Multiple molecule diffusion
To simulate realistic biological situations, such as the interior of a cell, it is necessary to consider large numbers of molecules in a BD simulation. 35,[106][107][108][109] In one recent study, diffusing molecules within a crowded spherical cell displayed increased mobility and correlated motion in the region close to the cell membrane, as opposed to the interior of the cell cytoplasm, indicating a possible mechanism of signal transduction. While the insights gained from the results are interesting and informative, the approach was highly simplified compared to a real cell environment, using a spherical cell membrane and identically sized spherical solute particles. 110 F I G U R E 7 Illustration of two-dimensional and three-dimensional diffusional association between Aβ42 structures to form fibrils while adsorbed to a self-assembling monolayer (SAM) surface. The three structures shown in van der Waals representation are identical Aβ42 peptides. Encounters between them were considered to occur when PHE19 residues (circled in black) come within a 40 Å distance of one another (as depicted by the dashed white curve). The peptide with cyan carbons is undergoing two-dimensional diffusion along the surface toward the peptide with yellow carbons whereas the peptide with magenta carbons is diffusing to the yellow peptide directly from the solvent by three-dimensional diffusion. Figure reprinted with permission from Cholko et al. 102 Copyright 2020 American Chemical Society. In another study, the dramatic reduction in the translational diffusion of tracer molecules in the presence of varying volume fractions of double-stranded DNA-a cylindrical molecule that produced a more pronounced effect than higher volume fractions of a spherical molecule, demonstrating the importance of molecular shape when considering crowded environments. While atomic detail was used for double-stranded DNA in these simulations, other crowding molecules were modeled using a simple spherical shape-a mix of levels of detail, but it does confirm that molecular shape can affect the dynamics within crowded environments. 38,111 6 | CONCLUSIONS AND OUTLOOK In this review, we have highlighted the multiple ways in which the BD simulation approach is being used and tailored to study biomolecular systems. Recent methodological developments, along with advances in computational power and experimental techniques, are extending the scope for application of BD simulations. Looking ahead, we can identify many opportunities for BD simulation. For one, due to its relatively high computational efficiency compared to other atomically detailed simulation methods, such as MD, BD has been, and will continue to be, used to simulate large molecular entities, such as biomacromolecular complexes with extensive quaternary structure or entire viruses, and applications can be expected to increasingly extend to cellular compartments and cells. Therefore, we expect the BD technique to develop in tandem with experimental advances for investigating multimolecular and subcellular systems and processes. An example of the integration of BD and experiments is given by the recent use of BD simulations to validate the Line-FRAP method for measuring protein diffusion rates in vitro and in vivo. 112 Multiparticle BD simulations starting from different particle distributions were performed to mimic the experimental setup and fluorescence recovery curves were computed. From the BD simulation results, it was found that two exponents were necessary to fit experimental recovery curves to derive protein diffusion coefficients due to the diffusive boundary between the bleached area and its surroundings. BD is also a valuable tool to provide detailed molecular interpretations of experiments. For example, the Line-FRAP method was applied to investigate how protein crowders and surfaces influence the translational diffusion of small molecule drugs. 39 BD simulations of the drug molecules diffusing in the presence of protein crowders and a glass surface, in conjunction with molecular docking of the drugs and protein crowders, helped to decipher the different factors contributing to the different effects on drug diffusion observed for different molecules and protein crowders. Thus, the detailed molecular mechanistic insights from BD simulations can help to design better experiments and to interpret experimental results.
The future directions for the development of the BD simulation methodology for biomolecular simulations are expected to focus on the development of algorithms and software that incorporate increasing detail without drastically compromising performance. Advances in algorithms to model conformational flexibility, particularly those that precisely capture the degrees of freedom that most affect diffusional properties, will continue to emerge to solve problems that demand the treatment of internal molecular motions. These should be accompanied by appropriate force fields that provide more accurate inter-particle forces with commensurate computational effort. While some recent work has focused on capturing electrostatic interactions efficiently, for example, by combining grid-based Poisson-Boltzmann and DH sphere models, and new CG models for intramolecular particle interactions have been developed, there is still a need for more accurate force fields for BD, particularly for treating shorter-range and intramolecular interactions. Advances in machine learning of force fields may provide a means to improve the accuracy of force calculations for BD simulations with sufficient computational efficiency.
The development of implicit models for crowded environments is especially promising as regards its computational efficiency and its usefulness for understanding the processes that take place inside a cell. However, it clearly presents challenges as regards the treatment of HI. More generally, while there have recently been significant advances in the modeling of HI of molecules and material surfaces, both in terms of accuracy of the models and in terms of their computational scaling efficiency, there is still a need for models that better account for the heterogeneous shapes of biomolecules as well as better tensors that correctly compute the HI when the solutes interact in close proximity.
In this article, we have summarized the main aspects of BD theory and methods, both old and new, as well as the existing, free-to-use software that is available for simulations. We have also highlighted some of the ways that BD has recently been applied, and we outline the opportunities and future directions for the BD developer and user community to consider. BD as a field of focus and as a tool for research continues to thrive and grow, providing a valuable means to tackle a wide range of problems in biology, bioengineering and biomedicine.