CVs Based Enhanced Sampling Methods for Free Energy Calculation

14 minute read

Published:

This blog summarized some CV-based sampling methods for free enery calculation, including umbrella sampling, adaptive biasing force and metadynamics.

The samplies in the NVT ensemble follow the Boltzmann dirstribution \(\begin{aligned} p(\mathbf x) = \frac{\exp[-\beta E(\mathbf x)]}{\mathcal{Z}}\end{aligned}\) where \(\mathcal{Z}\) is the partition function \(\begin{aligned} \mathcal{Z} = \int \exp[-\beta E(\mathbf x)] \mathrm{d}\mathbf x\end{aligned}\) And the Helmholtz free energy is defined as: \(\begin{aligned} A = -\frac{1}{\beta} \ln \mathcal{Z}\end{aligned}\) If we consider the CV space, the probability of \(p(\mathbf s)\) can be derived by \(\begin{aligned} p(\mathbf s) = & \int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) p(\mathbf x) \mathrm{d}\mathbf x \\ = & \int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \frac{\exp[-\beta E(x)]}{\mathcal{Z}} \mathrm{d}\mathbf x\\\end{aligned}\) And the free energy at \(\mathbf s\) is defined as \(\begin{aligned} F(\mathbf s) = & -\frac{1}{\beta}\ln p(\mathbf s) + \text{const} \\ = & -\frac{1}{\beta}\ln \left[\int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta E(\mathbf x)] \mathrm{d}\mathbf x \right]+ \text{const.}\end{aligned}\) If the const. is set as 0, the Helmholtz free energy can be computed by \(\begin{aligned} A = -\frac{1}{\beta} \ln \left[ \int \exp[-F(s)] \mathrm{d}\mathbf s \right]\end{aligned}\)

Umbrella Sampling Method

: This part refes to the following resources.

Theory

The simplest way to obtain \(F(\mathbf s)\) from an umbiased MD simulation is to evaluate the HISTOGRAM of the visited configurations in the CVs space \({N}(s)\),so that the estimated free energy \(\overline{F}(\mathbf s)\) reads \(\overline{F}(\mathbf s) = -\frac{1}{\beta} \ln {N}(\mathbf s)\)

The Umbrella samppling methods perfomes the MD simulation in a window with a bias potential \(V_b(s)\) and the umbiased free energy is recovered as \(\begin{aligned} \overline{F}(\mathbf s) = -\frac{1}{\beta} \ln N(s) - V_b(\mathbf s)\end{aligned}\)

The hard thing is that how to extract \(\overline{F}(s)\) from the trajectories of multiple windows. In the following induction, we use \(p_b(\mathbf s)\) to represent the probability distribution with the biased potential \(V_i(\mathbf x)\) at window \(i\) and \(p_u(\mathbf s)\) to represent the unbiased potential. The energy function \(E_u\) after the biased potential becomes \(\begin{aligned} E_b(\mathbf x) = E(\mathbf x) + V_i(\tilde{\mathbf s}(\mathbf x)) \end{aligned}\) The \(p_b(\mathbf s)\)has the expression \(\begin{aligned} p_b(\mathbf s) = \frac{\int \delta(\tilde{s}(\mathbf x) - s) \exp\left[-\beta E_b(\mathbf x)\right] \mathrm{d}\mathbf x}{\int \exp\left[-\beta E_b(\mathbf x)\right]\mathrm{d}\mathbf x}\end{aligned}\)

The above equation can be forther written as \(\begin{aligned} p_b(s) = & \exp[-\beta V_i(\mathbf s)]\frac{\int \delta(\tilde{s}(\mathbf x) - s) \exp\left[-\beta E(\mathbf x)\right] \mathrm{d}\mathbf x}{\int \exp\left[-\beta E_b(\mathbf x)\right]\mathrm{d}\mathbf x} \\ = & \exp[-\beta V_i(\mathbf s)] p_u(\mathbf s)\frac{\int \exp[-\beta E(\mathbf x)] \mathrm{d}\mathbf x}{\int \exp\left[-\beta E_b(\mathbf x)\right]\mathrm{d}\mathbf x}\\ = & p_u(\mathbf s) \exp[-\beta V_i(\mathbf s)] \end{aligned}\) And thus we have \(\begin{aligned} p_u(\mathbf s) = & p_b(\mathbf s) \exp[\beta V_i(\mathbf s)] \frac{\int \exp\left[-\beta E_b(\mathbf x)\right]\mathrm{d}\mathbf x}{\int \exp[-\beta E(\mathbf x)] \mathrm{d}\mathbf x} \\ = & p_b(\mathbf s) \exp[\beta V_i(\mathbf s)] \langle \exp[-\beta V_i(\mathbf s)] \rangle \end{aligned}\)

The unbiased free energy is \(\begin{aligned} F_u^i(\mathbf s) = -\frac{1}{\beta} p_i^b(\mathbf s) -V_i(s) - \frac{1}{\beta} \ln \langle \exp[-\beta V_i(\mathbf s)] \rangle \end{aligned}\) Here, we writhe the expectation above as \(\begin{aligned} -\frac{1}{\beta} \ln \langle \exp[-\beta V_i(\mathbf s)] \rangle = b_i,\end{aligned}\) which is a constant in a window.

We can just run simulations for multiple windows to explore the CV space. Then combine all the windows to recover the free energy landscape.

WHAM

Binned WHAM

The global $p_u(\mathbf s)$ is written as \(p_u(\mathbf s) = \sum_{i} \omega_i p_u^i(\mathbf s)\) where \(\omega_i\) is weight of the dirstribution \(p_i^u\) from the window \(i\). We choose the weights to minimize the variance of the glocal unbiaed dirstribution with the normalization constraint \(\begin{aligned} \max_{\mathbf \omega}:& \sigma^2(p_u(\mathbf s)) \\ \text{s.t.}: & \sum_i \omega_i = 1\end{aligned}\) The solution is \(\begin{aligned} p_i = \frac{n_i \exp[-\beta V_i(\mathbf s) + \beta b_i]}{\sum_i n_i \exp[-\beta V_i(\mathbf s) + \beta b_i]}\end{aligned}\) However, the \(b_i\) is unknown. But remember that the \(b_i\) is dependent on \(p_u\) via \(\begin{aligned} \exp[-\beta F_i] = \int p_u(\mathbf s) \exp[-\beta V_i(\mathbf s)] \mathrm{d}\mathbf s\end{aligned}\) Thus, the SCF is utlized to solve the above nonlineae equation. This method requires to know which window the sample is from.

Binless WHAM

The PLUMED uses this binless WHAM. For each sample, this method needed to know which the windows the sample is from. See: There is thus no need to record which replica generated each of the frames. One can thus simply gather the trajectories from all the replicas together at the outset. This observation is important as it is the basis of the binless formulation of WHAM that is implemented within PLUMED.

This method modles the problem as an MLE problem. Suppose that you have run multiple \(K\) trajectories each of which was computed by integrating a different biased Hamiltonian. We can calculate the probability of observing the set of configurations during the \(K\) trajectories that we ran using the product of multinomial distributions shown below: \(p(\mathbf T) = \prod_{j=1}^M \prod_{k=1}^K (c_k \exp[-\beta V_b^k(\mathbf s_j)]p_j)^{t_{kj}}\)

In this expression the second product runs over the biases that were used when calculating the \(K\) trajectories. The first product then runs over the \(M\) bins in our histogram. The \(p_j\) variable that is inside this product is the quantity we wish to extract; namely, the unbiased probability of having a set of CV values that lie within the range for the jth bin. Here \(c_k\) is a free parameter that ensures that, for each k, the biased probability is normalized. \(\begin{aligned} \sum_j c_k \exp[-\beta V_b^k(\mathbf s_j)]p_j = 1\end{aligned}\) The above optimization probelm is equivalent to solving the following two equations \(\begin{aligned} p_j = &\frac{\sum_{k=1}^K t_{kj}}{\sum_k c_k \exp[-\beta V_b^k(\mathbf s_j)]}\\ c_k = &\frac{1}{\sum_{j=1}^M \exp[-\beta V_b^k(\mathbf s_j)]p_j}\end{aligned}\) The scf method then is employed to solve above two equations.

Adaptive Biasing force(ABF)

ABF is another enhanced sampling method, quiet similar as the metadynamics. This method based on computing the mean force on the CVs and then removing this force. When $t$ is sufficiently large, the dynamics corresponding to a random walk with zero mean force.

The adaptive biasing force method relies on modifying the potential \(V\) in such a way that the energy landscape is flattened along a given transition coordinate. The potential \(V\) is changed to \(V(\mathbf x) - A_t(\mathbf s)\) and \(A_t\) is umpdated in such a way that it converges to the free energy \(A(\mathbf s)\), defined by \(\begin{aligned} A(\mathbf s) = -\frac{1}{\beta} \ln \int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta H(\mathbf x)] \mathrm{d}\mathbf x \end{aligned}\) The main ingredient that we now need is an update rule for \(A_t\) such that \(\lim_{t \rightarrow \infty} A_t = A\). This is baed on the following formula, which defines the mean force, namely the negative of the derivative of the free energy. \(\begin{aligned} \frac{\partial A(\mathbf s)}{\partial s_i} = \frac{\int \textcolor{red}{\frac{\partial H}{\partial \mathbf s_i}}\delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta H(\mathbf x)] \mathrm{d}\mathbf x }{\int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta H(\mathbf x)] \mathrm{d}\mathbf x } = \left\langle \frac{\partial H}{\partial s_i} \right\rangle_{\mathbf s}\end{aligned}\) See appendix C of Reference for details.

An important observation is that the above equation remains true of the potential is changed to \(V - A_t\) for any function. \(\frac{\partial A(\mathbf s)}{\partial s_i} = \frac{\int \textcolor{red}{\frac{\partial H}{\partial s_i}}\delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta (E(\mathbf x) - A_t(\mathbf s))] \mathrm{d}\mathbf x }{\int \delta(\tilde{\mathbf s}(\mathbf x) - \mathbf s) \exp[-\beta (E(\mathbf x) - A_t(\mathbf s))] \mathrm{d}\mathbf x } = \left\langle \frac{\partial H}{\partial s_i} \right\rangle_{\mathbf s}\) In other words, a biasing potential \(A_t\) depending solely on \(s\) leaves the average conditioned by \(s\) unchanged, This is the intuition behind the adaptive biasing force dynamics, which can be written as Thus we would like to simulation \(\begin{aligned} \begin{cases} d \mathbf x_t = \mathbf M^{-1}\mathbf p_t \mathrm{d}t \\ d \mathbf p_t = -\nabla(V - A_t \circ \tilde{\mathbf s})(\mathbf x_t) \mathrm{d}t - \gamma \mathbf M^{-1}\mathbf p_t \mathrm{d}t + \sqrt{2\gamma{\beta}^{-1}\mathrm{d}W_t} \\ \frac{\partial A(\mathbf s)}{\partial s_i} = \mathbb{E}\left[ \frac{\partial H}{\partial s_i}\right] \end{cases}\end{aligned}\)

Now, we derive the form of derivation of free energy with respect to CVs. We first introduce a lemman. The proof of this lemman can be found in the reference paper or Page 128 of reference book.

Let \([J(\mathbf s)]_{ij} = \frac{\partial x_i}{\partial x_j}\). Then we have \(\begin{aligned} \nabla_{\mathbf s} A(\mathbf s) = \left\langle \mathbf W^T \nabla U - (\nabla \cdot \mathbf W)^T |\mathbf s \right\rangle, \end{aligned}\) where $W$ is a thin matrix with $N_{\mathbf s}$ columns such that \(\begin{aligned} \mathbf J(\mathbf s) \mathbf W = \mathbf I \end{aligned}\) Another important form is \(\begin{aligned} \nabla_{\mathbf s} A(\mathbf s) = \left \langle \frac{\mathrm{d}}{\mathrm{d}t} (\mathbf W^T \mathbf p) \right \rangle \end{aligned}\) For special 1d case, the above equations can be writte as \(\begin{aligned} \frac{\mathrm{d}A}{\mathrm{d}s} = &\langle \nabla U \cdot \mathbf w - \nabla \cdot \omega |s \rangle \\ \text{s.t.:} \quad &\mathbf w \cdot \nabla s =1 \end{aligned}\) and \(\begin{aligned} \frac{\mathrm{d}A}{\mathrm{d}s} = \left \langle \frac{\mathrm{d}}{\mathrm{d}t} (\mathbf w \cdot \mathbf p) \right \rangle \end{aligned}\)

By choosing \(\omega\) propoerly, you can get different forms equations. For instance, if you choose \(\mathbf \omega = \frac{\partial x}{\partial s}\), you can get \(\frac{\mathrm{d}A(s)}{\mathrm{d}s} =\left \langle \frac{\partial U}{\partial s} + \frac{\partial \ln |J|}{\partial s}\right \rangle\) The term |J| is the determinant of the Jacobian matrix upon changing from Cartesian to generalized coordinates. It measures the change in volume element This is one of the well-known mean froce form. the difficulty in this formulation is that generalized coordinates appear explicitly in the form of the Jacobian \(|J|\), which may be difficult to calculate in many cases. It is therefore desirable to find an expression in which the explicit knowledge of the generalized coordinates is not required.

If we choose \(\mathbf w= m_{s}M^{-1}\nabla s\) with \(m_{s}^{-1}= \sum_{k}m_k^{-1}\left(\frac{\partial s}{\partial x_k}\right)^2\), then we get \(\begin{aligned} \frac{\mathrm{d}A}{\mathrm{d}s} = -\left\langle \frac{\mathrm{d}}{\mathrm{d}t}\left( m_{s}\frac{\mathrm{d}s}{\mathrm{d}t}\right) \right\rangle\end{aligned}\)

Consider a vector transition coordinate \(\mathbf s\) in the presence of a set of constraints of the form \(\sigma_k(\mathbf x)=0\).For each coordinate \(\mathbf s_i\), let \(\mathbf v_i\) be a vector field \(\mathbb{R}^{3N} \rightarrow \mathbb{R}^{3N}\) satisfying for al \(j\) and \(k\): \(\begin{aligned} \begin{cases} \mathbf v_i \cdot \nabla s_j = & \delta_{ij}\\ \mathbf v_i \cdot \nabla \sigma_k = &0 . \end{cases}\end{aligned}\)

The $i$-th partial derivative of the free energy can then be calculated as the conditional average of the follwing instance force. \(\begin{aligned} \frac{\partial H}{\partial s_i} = -\nabla U \cdot \mathbf v_i + \beta^{-1} \text{div}(\mathbf v_i)\end{aligned}\) A special case is \(\mathbf v_i = \nabla s_i/ |\nabla s_i^2\). Note that this estimater is rqeuired the calculation of second derivatives, in the form of all the divergence of the vector filed \(\mathbf v_i\), although the relative freedom in chossing \(v_i\) can be taken advantage of to make the divergence calculation practical.

Metadynamics

For the metadynamics and well-tempered metadynamics, I read the following papers

  1. Metadynamics 2011 review

  2. Well-tempered MetaDynamics PRL 2008

In metadynamics, an external history-dependent pias potetial which is a function of the CVs is added to the Hamiltonian of the system. This function can be written as a sum of Gaussians deposited along the system trajectory in CVs sapce to fill the free energy minima, thereby driving the system toward a uniform distribution.

At time \(t\), the metadynamics bias potential can be written as \(\begin{aligned} V_G(\mathbf s, t) = \int_{0}^{t} \mathrm{d}t' \omega \exp[-\sum_{i=1}^d \frac{(s_i(\mathbf x) - s(\mathbf x(t'))^2)}{2\sigma_i^2}]\end{aligned}\) where the summaation above is over all the d CVs., \(\omega\) is an energy rate and \(\sigma_i\) is the width of the Gaussian for \(i\)-th CVs. The energy rate is constant and usually expressed in terms of a Gaussian height \(W\) and a deposition stride \(\tau_G\): \(\begin{aligned} \omega = \frac{W}{\tau_G}\end{aligned}\) And we have \(\begin{aligned} \lim_{t \rightarrow \infty}V_G(\mathbf s, t) = -F(\mathbf s) +\text{const.}\end{aligned}\) And the error in the Free Energy Surface (FES) has been proven both emperically and theoretically to be \(\begin{aligned} \epsilon \propto \sqrt{\frac{\omega}{\beta D}}\end{aligned}\)

where $D$ is the intrinsic system diffusion coefficient in the CVs space. However, metadynamics has one problem. n a single run, VG does not converge moduloa constant to the free energy, but oscillatesaround it. This fact has two consequences.(a) The bias potential overfills the underly-ing FES and pushes the system toward high-energy regions of the CVs space. (b) It is nottrivial to decide when to stop a simulation.However, in this respect, one can say, as ageneral rule, that, if metadynamics is usedto find the closest saddle point, it should bestopped as soon as the system exits from theminimum. Otherwise, if one is interested in reconstructing an FES, it should be stoppedwhen the motion of the CVs becomes diffusive in the region of interest. The solution to this problem is the well-tempered metadynamics. In well-tempered metadynamics, we bias the dynamics by adding the history-dependent potential

\(V(\mathbf s, t) = k_B \Delta t \ln \left(1+\frac{\omega N(\mathbf s, t)}{k_B \Delta T} \right)\) Recall that here \(N(\mathbf s, t)\) is the histogram of the CVs, \(N(\mathbf s, t) = \int_{0}^{t}\delta(\mathbf s - \tilde{\mathbf s}(t))\)

And the time derivative of \(V(\mathbf s, t)\) is \(\begin{aligned} V'(\mathbf s, t) = \frac{\omega \Delta T \delta(\mathbf s - \tilde{\mathbf s}(t))}{\Delta T+ \omega N(\mathbf s, t)} = \omega \exp[-V(\mathbf s, t)/\Delta T]\delta(\mathbf s - \tilde{\mathbf s}(t))\end{aligned}\) The connection with metadynamics is evident if we replace \(\delta(\mathbf s - \tilde{\mathbf s}(t))\) with a finite Gaussian.

For large times, \(V(\mathbf s, t)\) varies so slowly that one can assume the system reachs equilibruim, and the \(p(\mathbf s, t) \mathrm{d}\mathbf s \propto \exp[(-F(\mathbf s)-V(\mathbf s, t))/T]\mathrm{d}\mathbf s\), and one has \(\begin{aligned} V'(\mathbf s, t) = & \omega \exp\left[-(V(\mathbf s, t))/\Delta T\right] p(\mathbf s, t)\\ = & \omega \exp\left[-(V(\mathbf s, t))/\Delta T\right] \frac{\exp\left[-(F(\mathbf s)+V(\mathbf s, t))/T\right]}{\int \mathrm{d}\mathbf s\exp\left[-(F(\mathbf s)+V(\mathbf s, t))/T\right]}\end{aligned}\)

This implies that \(\textcolor{red}{ \lim_{t \rightarrow \infty} V(\mathbf s, t) = -\frac{\Delta T}{\Delta T + T}F(\mathbf s).}\) We can get the following two important results.

First, \(\begin{aligned} \lim_{t \rightarrow \infty} p(\mathbf s, t)\mathrm{d}\mathbf s \propto \lim_{t \rightarrow \infty} \exp[(-F(\mathbf s)-V(\mathbf s, t))/T]\mathrm{d}\mathbf s = \exp[-(F(\mathbf s))/(T+\Delta T)] \mathrm{d}\mathbf s.\end{aligned}\) Therefore, for \(\Delta T \rightarrow 0\), ordinary MD is recovered while as the \(\Delta T \rightarrow \infty\), the standard metadynamics is recovered.

Second, \(\begin{aligned} \overline{F}(\mathbf s, t) = -\frac{T+\Delta T}{\Delta T} V(\mathbf s, t)\end{aligned}\) Thus the free energy can be obtained from the above equation.