The UWHAM and SWHAM Software Package

We introduce the UWHAM (binless weighted histogram analysis method) and SWHAM (stochastic UWHAM) software package, which can be used to estimate the density of states and free energy differences from the data generated by multi-state simulations. The programs that solve the UWHAM equations are written in C++ and are run from the command line. In this paper, we first review the theoretical basis of UWHAM and of its stochastic solvers, RE-SWHAM (replica exchange-like SWHAM) and ST-SWHAM (serial tempering-like SWHAM). We then provide a tutorial with examples that explains how to apply the UWHAM program package to analyze the data generated by different types of multi-state simulations: umbrella sampling, replica exchange, free energy perturbation simulations, etc. The tutorial examples also show that the UWHAM equations can be solved stochastically by applying the RE-SWHAM and ST-SWHAM programs when the data ensemble is large. If the simulations at some states are far from equilibrium, the Stratified RE-SWHAM program can be applied to obtain the equilibrium distribution of the state of interest. All the source code and the tutorial examples are available from our group’s web page: https://ronlevygroup.cst.temple.edu/software/UWHAM_and_SWHAM_webpage/index.html.


Stratified UWHAM
The Stratified-UWHAM algorithm is based on coarse-graining. First, the whole phase space is coarse-grained into macrostates, which correspond to energy basins separated by free energy barriers. Then all the λ-states in the system are divided into two groups, $S_1$ and $S_2$.
For the λ-states in the first group $S_1$, the simulations are approximately equilibrated among all the macrostates; that is, all the macrostates are well connected at each λ-state in group $S_1$. For the λ-states in the second group $S_2$, the simulations are only locally equilibrated.
That is, the coarse-grained set of macrostates forms a disconnected network of macrostate clusters at each λ-state in group $S_2$ [1]. We assume that the set of observations $\{X_{\alpha i} : i = 1, 2, \cdots, N_\alpha\}$ observed at a λ-state in the $S_1$ group is independently drawn from the distribution

$$P_\alpha(\{x\}) = \frac{q_\alpha(\{x\})}{Z_\alpha}, \tag{1}$$

and that the set of observations $\{X_{\alpha i} : X_{\alpha i} \in R_k,\ i = 1, 2, \cdots, N_\alpha\}$ observed at a λ-state in the $S_2$ group is independently drawn from the distribution

$$P_{\alpha k}(\{x\}) = \frac{q_{\alpha k}(\{x\})}{Z_{\alpha k}}, \tag{2}$$

where $q_{\alpha k}(\{x\}_{\alpha i}) = q_\alpha(\{x\}_{\alpha i})\,\delta(\{x\}_{\alpha i} \in R_k)$; $\delta(\{x\}_{\alpha i} \in R_k)$ is the indicator function for a macrostate cluster $R_k$; $Z_\alpha$ is the partition function of the αth λ-state; and $Z_{\alpha k}$ is the partition function of the kth macrostate cluster at the αth λ-state. The likelihood function of the simulation data for this model is proportional to

$$\prod_{\alpha \in S_1} \prod_{i=1}^{N_\alpha} \frac{q_\alpha(u_{\alpha i})}{Z_\alpha} \times \prod_{\alpha \in S_2} \prod_{k=1}^{K_\alpha} \prod_{i:\, X_{\alpha i} \in R_k} \frac{q_{\alpha k}(u_{\alpha i})}{Z_{\alpha k}}, \tag{3}$$

where $u_{\alpha i}$ is the reduced coordinate of the microstate $X_{\alpha i}$ [1]. In Ref. [1], we showed that the estimating equations from the maximization of Eq. (3) can be solved in the form of UWHAM equations with an expanded set of λ-states. In that case the total number of λ-states of the system increases from $M = M_1 + M_2$ to $M = M_1 + \sum_{\alpha \in S_2} K_\alpha$, where $M_1$ and $M_2$ are the total numbers of λ-states in the $S_1$ and $S_2$ groups, respectively, and $K_\alpha$ is the number of disconnected macrostate clusters of the αth λ-state in the $S_2$ group [1]. As mentioned in the main text, the memory and computational time required to run UWHAM are proportional to the square of the total number of λ-states $M$. It is therefore more computationally expensive to solve the Stratified-UWHAM equations with an expanded set of λ-states.
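As a rough illustration of this cost, the short Python sketch below counts the λ-states before and after stratification, assuming each λ-state in the $S_2$ group is replaced by one λ-state per disconnected macrostate cluster. The function name and the example numbers are hypothetical and are not part of the UWHAM package.

```python
def expanded_state_count(m1, clusters_per_s2_state):
    """Number of λ-states after stratification: M1 plus the sum of K_alpha."""
    return m1 + sum(clusters_per_s2_state)


# Hypothetical system: 8 λ-states in S1 and 4 λ-states in S2 whose
# macrostates split into 2, 2, 3 and 2 disconnected clusters.
m1 = 8
clusters = [2, 2, 3, 2]

m_original = m1 + len(clusters)                  # M1 + M2 = 12
m_expanded = expanded_state_count(m1, clusters)  # M1 + sum(K_alpha) = 17

# Memory and time scale with M**2, so the relative cost is (17 / 12)**2.
cost_ratio = (m_expanded / m_original) ** 2
print(m_original, m_expanded, cost_ratio)
```

In this made-up case, going from 12 to 17 λ-states roughly doubles the memory and time needed to solve the UWHAM equations.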
In Ref. [1], we proposed a stochastic solver for the Stratified-UWHAM equations called Stratified RE-SWHAM, which is illustrated in Fig. 1. Stratified RE-SWHAM is a variant of RE-SWHAM, and it differs from the unstratified version in two ways. First, when the observations observed at each λ-state are collected as the database for that λ-state, each observation is tagged with the macrostate cluster that it belongs to [1]. Second, the two versions differ in how an observation is chosen from the database of each λ-state to associate with the replica at that λ-state during the "move" procedure of each replica exchange cycle. In Stratified RE-SWHAM, if the λ-state is in the $S_1$ group, we choose one observation from the database of that λ-state with equal probability, as in the original RE-SWHAM. For a λ-state in the $S_2$ group, however, we choose with equal probability one observation from the database of that λ-state that is in the same connected macrostate cluster as the previous observation associated with the replica [1]. In Ref. [1], we proved that the output of Stratified RE-SWHAM at every λ-state is an estimate of the equilibrium distribution of that λ-state.

Figure 1: An illustration of the Stratified RE-SWHAM algorithm. This drawing shows two λ-states, colored gray and cyan. Each λ-state has two macrostates, A and B. The simulations at the gray λ-state are only locally equilibrated, while the simulations at the cyan λ-state are approximately equilibrated among the macrostates. The white gap between the macrostates at the gray λ-state represents an uncrossable barrier for the "move" procedure during the Stratified RE-SWHAM analysis. Beforehand, we construct for each λ-state a database that contains all the observations obtained from that λ-state, and each observation is tagged with the macrostate that it belongs to. As shown in the picture, the observations are separated into two subgroups, A and B.
Stratified RE-SWHAM is then run in cycles, each of which consists of a "move" procedure and an "exchange" procedure. In the move procedure, Stratified RE-SWHAM chooses an observation to associate with the replica at each λ-state. At the cyan λ-state, the next observation is chosen with equal probability from the whole database of that λ-state. At the gray λ-state, however, the next observation is chosen with equal probability from the subgroup that the previous observation belongs to. In the exchange procedure, if the exchange attempt is accepted, the replicas are swapped and the observations associated with the replicas are also moved to the database of the other λ-state. At the end of each cycle, the observation associated with each replica is recorded as the output of Stratified RE-SWHAM. Reprinted (adapted) with permission from Ref. [1]. Copyright (2017) American Chemical Society.
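The move and exchange procedures described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the C++ implementation distributed with the package: the function name, the `reduced_energy` callback, the choice of one random neighbouring pair of λ-states per cycle, and the standard Metropolis replica exchange acceptance rule are all assumptions made for the example.

```python
import math
import random


def stratified_re_swham(databases, locally_equilibrated, reduced_energy,
                        n_cycles, seed=0):
    """Toy Stratified RE-SWHAM loop (illustrative sketch only).

    databases[a]            -- list of (observation, cluster_tag) pairs at λ-state a
    locally_equilibrated[a] -- True if λ-state a belongs to group S2
    reduced_energy(a, x)    -- hypothetical callback: reduced energy of x at λ-state a
    Returns output[a], the observations recorded at λ-state a (one per cycle).
    """
    rng = random.Random(seed)
    n_states = len(databases)                      # must be at least 2
    db = [list(entries) for entries in databases]  # working copies
    current = [rng.randrange(len(d)) for d in db]  # observation held by each replica
    output = [[] for _ in range(n_states)]

    for _ in range(n_cycles):
        # "Move": pick the next observation for the replica at each λ-state.
        for a in range(n_states):
            if locally_equilibrated[a]:
                # S2: stay inside the cluster of the previous observation.
                tag = db[a][current[a]][1]
                candidates = [i for i, (_, t) in enumerate(db[a]) if t == tag]
            else:
                # S1: any observation in the database, with equal probability.
                candidates = range(len(db[a]))
            current[a] = rng.choice(candidates)

        # "Exchange": one attempt between a random neighbouring pair (assumed).
        a = rng.randrange(n_states - 1)
        b = a + 1
        xa, xb = db[a][current[a]][0], db[b][current[b]][0]
        delta = (reduced_energy(a, xb) + reduced_energy(b, xa)
                 - reduced_energy(a, xa) - reduced_energy(b, xb))
        if delta <= 0 or rng.random() < math.exp(-delta):
            # Accepted: the observations, with their tags, change databases.
            db[a][current[a]], db[b][current[b]] = (
                db[b][current[b]], db[a][current[a]])

        # Record the observation associated with each replica.
        for s in range(n_states):
            output[s].append(db[s][current[s]][0])
    return output
```

Because a swapped observation carries its cluster tag into the other database, a λ-state in the $S_2$ group can reach a different macrostate cluster only through an accepted exchange, never through a move.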