2.2 Monte Carlo
2.2.3 Kinetic Monte Carlo Method
With MD we can reproduce the dynamics of a system only up to several microseconds, so slow kinetic processes cannot be modeled. Metropolis MC samples configurational space and generates configurations according to the desired statistical-mechanical distribution, but it cannot be used to study the dynamics of the system. The kinetic Monte Carlo (KMC) method is an alternative computational technique that can be used to study the kinetics of slow processes. In the Metropolis MC method, we decide whether to accept a move by considering the potential energy difference between two states, whereas in the KMC method we use kinetic rates that depend on the energy barrier between the states. The basic idea behind KMC is to advance the simulation in time increments that are determined by the transition rates of all processes and are formulated in such a way that they relate to the microscopic kinetics of the system.
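The contrast between the two methods can be written compactly. The expressions below are a sketch, not part of the original derivation: the Metropolis acceptance rule is the standard one, while the Arrhenius-type form for the KMC rate, the prefactor \nu_{ij}, and the barrier symbol E^{\ddagger}_{ij} are assumptions introduced here for illustration.

```latex
\begin{align}
  % Metropolis MC: acceptance depends on the energy difference U_j - U_i
  P_{\mathrm{acc}}(i \to j) &= \min\!\left[1,\;
      \exp\!\left(-\frac{U_j - U_i}{k_B T}\right)\right] \\
  % KMC: the rate depends on the barrier between the states
  % (Arrhenius-type form assumed)
  k_{ij} &= \nu_{ij}\,
      \exp\!\left(-\frac{E^{\ddagger}_{ij}}{k_B T}\right)
\end{align}
```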
The main trick involved in the KMC simulation is described in the following.
Because the transition between two states of a Markov process depends only on the kinetic rate (or free energy barrier) between them, a stochastic process can be designed to propagate the system correctly from state to state. In that case, the probability of observing a given sequence of states and escape times in a KMC simulation is the same as that obtained from MD data. Hence, the resulting KMC trajectory is statistically indistinguishable from a long MD trajectory. The steps of the KMC algorithm are:
(i) Suppose the system is in initial state i.
(ii) For each escape pathway from state i to a state j, draw an escape time t_j from an exponential distribution. The actual process takes place along one of these pathways.
(iii) Select the escape pathway j_min for which t_j is minimum, that is, the fastest transition.
(iv) Advance the overall clock by this minimum escape time and discard the other times drawn from the exponential distributions.
(v) Move the system to the new state j_min.
(vi) Begin a new step from state j_min and repeat the process.
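The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function name `kmc_first_reaction`, the dictionary layout of the rate catalogue, and the toy rate constants are all hypothetical, and the rates are taken as given rather than computed from energy barriers.

```python
import random


def kmc_first_reaction(rates, start, n_steps, seed=0):
    """Propagate a KMC trajectory by drawing one escape time per pathway.

    rates: dict mapping state -> {neighbour state: rate constant > 0}
           (hypothetical layout chosen for this sketch).
    Returns a list of (clock time, state) pairs.
    """
    rng = random.Random(seed)
    state, clock = start, 0.0               # step (i): initial state
    trajectory = [(clock, state)]
    for _ in range(n_steps):
        pathways = rates.get(state, {})
        if not pathways:                    # no escape pathway: stop
            break
        # step (ii): exponential escape time for every pathway
        times = {j: rng.expovariate(k) for j, k in pathways.items()}
        # step (iii): pathway with the smallest escape time
        j_min = min(times, key=times.get)
        # step (iv): advance the clock; the other times are discarded
        clock += times[j_min]
        # step (v): move to the new state, then repeat (step vi)
        state = j_min
        trajectory.append((clock, state))
    return trajectory


# Toy three-state catalogue with made-up rate constants (arbitrary units).
toy_rates = {
    "A": {"B": 1.0, "C": 0.01},
    "B": {"A": 5.0},
    "C": {"A": 0.5},
}
traj = kmc_first_reaction(toy_rates, "A", n_steps=1000)
```

Selecting the minimum of independent exponential escape times is equivalent to choosing pathway j with probability proportional to its rate and advancing the clock by an exponential time governed by the total escape rate, so either formulation can be used.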
In this way, the KMC algorithm requires cataloguing all possible kinetic events and calculating the escape rates associated with these processes.
Therefore, coupling the KMC method with the MD method can serve as an accelerated MD approach for studying the long-timescale dynamics of large biomolecular systems.
Chapter 3
Validity Time of a Markov State Model
3.1 Introduction
Markov state models (MSMs)61–65 and other related kinetic network models are frequently used to study the long-timescale dynamical behavior of biomolecular and material systems. MSMs are detailed kinetic network models wherein the configurational space of a biomolecule under study is partitioned into states. The dynamical evolution of the system is approximated in terms of state-to-state transitions. The number of states can range from tens to thousands depending on the complexity and level of coarse-graining. Each node in the network denotes a metastable state of the system, while the connections between the nodes provide rates of interconversion between the states. MSMs have become useful tools for probing the dynamics of nucleic acids and proteins131–134, for example, folding and unfolding events at long time scales. Though we restrict ourselves to biomolecular systems, MSMs are closely related to kinetic Monte Carlo (KMC) models135–137 used in the materials and reactions areas for studying catalysis138, crystal growth139, material processing140, and adsorption phenomena141, to name a few. Both approaches solve a master equation and have benefited from the exchange of ideas between the respective communities. For instance, knowledge of the network structure can be exploited to accelerate the KMC dynamics by eliminating fast degrees of freedom142–145. Despite their widespread usage, some aspects of MSM construction are still poorly understood.
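For reference, the master equation shared by both approaches can be written as follows. This is a generic sketch; the symbols P_i(t) (probability of state i at time t), k_{ij} (rate constant from state i to state j), the lag time \tau, and the row-vector convention for the transition matrix T(\tau) are notation assumed here and are not taken from the text.

```latex
\begin{align}
  % Continuous-time master equation (the form sampled by KMC)
  \frac{dP_i(t)}{dt} &= \sum_{j \neq i}
      \left[\, k_{ji}\, P_j(t) - k_{ij}\, P_i(t) \,\right] \\
  % Discrete-time propagation commonly used for MSMs, with p(t) the
  % row vector of state probabilities and T(\tau) row-stochastic
  \mathbf{p}(t + \tau) &= \mathbf{p}(t)\, \mathbf{T}(\tau)
\end{align}
```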
A key step in the MSM construction entails determining states and kinetic pathways to be included in the model. The availability of a large number of parallel processors has enabled rapid construction of high-fidelity MSMs using brute-force
molecular dynamics (MD) calculations146–148. Herein states and kinetic pathways are identified via coarse-graining several independent MD trajectories. The MD trajectories can be seeded from different starting configurations, which allows for better sampling of the configurational space. Other simulation techniques offering resolution greater than the MSMs can also be employed149–151. Enhanced thermodynamic-sampling techniques that can sample rare events with large energy barriers can aid in the efficient construction of the model152–156. However, an additional challenge associated with building MSMs (and indeed KMC models as well157) is often overlooked: the entire configurational space cannot be sampled by a finite number of MD trajectories, that is, an MSM is never complete. Even when the MD trajectories used for network-building collectively exceed microsecond time scales, there are bound to be rare states and pathways missing from the MD data. When relevant states and pathways are missing from the constructed MSM, the thermodynamic/kinetic quantities being sought can be inaccurate. The main purpose of this chapter is to highlight the danger arising from missing relevant states and pathways in a network, develop a strategy to quantify the completeness of a kinetic network model, and identify regions of configuration space relevant to the dynamical evolution, which can guide further network construction.
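The effect of finite sampling can be illustrated with a small sketch. The code below is an assumption-laden toy, not the construction procedure used in this chapter: it takes trajectories that have already been coarse-grained into integer state labels, counts transitions at a fixed lag, and row-normalizes the counts. The function name `estimate_transition_matrix` and the example trajectories are hypothetical.

```python
from collections import defaultdict


def estimate_transition_matrix(dtrajs, lag=1):
    """Row-normalized transition counts from discretized trajectories.

    dtrajs: list of sequences of state labels (already coarse-grained).
    lag:    lag time in frames; must be >= 1.
    Returns a nested dict T[a][b] = estimated probability of a -> b.
    """
    counts = defaultdict(lambda: defaultdict(int))
    for traj in dtrajs:
        # pair every frame with the frame `lag` steps later
        for a, b in zip(traj[:-lag], traj[lag:]):
            counts[a][b] += 1
    T = {}
    for a, row in counts.items():
        total = sum(row.values())
        T[a] = {b: c / total for b, c in row.items()}
    return T


# Two short, made-up trajectories over states 0, 1 and 2.
dtrajs = [
    [0, 0, 1, 1, 0, 0, 1],
    [1, 1, 0, 1, 1, 2, 2],
]
T = estimate_transition_matrix(dtrajs)

# Only observed transitions appear in the model. State 2 is entered once
# and never left in this data, so the estimate treats it as absorbing,
# whether or not an escape pathway exists in the real system.
print(T[2])  # {2: 1.0}
```

Any transition that was never observed is simply absent from `T`, and any state that was never visited does not appear at all, which is the sense in which a model built from finite data is incomplete.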
Many network-building procedures entail pruning/lumping of states and kinetic pathways to enforce detailed balance and avoid absorbing states. Although the length of the MD trajectory used to build a network is generally reported, it is not enough to establish the maximum duration for which the dynamics is faithfully predicted by the network model. In the worst case, missing states can be important to the ensemble-averaged quantities calculated from an “incomplete” network model.
Given that network models are nowadays generated by seeding the MD calculations starting from different states while using a variety of computational tricks, comparing the dynamics from models for the same system is subject to error/uncertainty resulting from the missing kinetic information. A conceptual framework that accounts for missing states/pathways will bolster endeavors to generate reliable network models.
Estimators for missing rates from a state, first developed in Refs. 158 and 159, have been applied to different material systems157,160–162. However, more than the missing rates, conceptually, it is the time scale where the missing pathways become relevant to the dynamics that is of interest. The largest time scale where the network model continues to yield the correct dynamics, termed the validity time for the model, is introduced here. The validity time allows one to systematically compare the behavior
3.2 MSM Methodology: Construction, Validation and Error