There is a wide variety of approaches to kinetics using Monte Carlo techniques. We focus here on those which use a grid of some form, since we want to include spatial competition in our model. We specify only a set of species which can exist on the surface, and provide reaction constants (forward and reverse) to convert between them. This includes not only reactions, but also diffusion, adsorption, and desorption processes. Thus, given any state of the surface, we can count all possible “actions” that could happen. The actions we include in our model allow a molecule to diffuse to an adjacent empty site, an empty site to receive a molecule from the gas phase, or two adjacent molecules to react, possibly forming gas products. For each of these actions, we have a rate calculated from the energy barrier, which is directly proportional to the probability $P_a$ that the action will happen in a given iteration. There are a couple of ways of handling this. For example, the Metropolis method would pick a random particle and an associated action, and act on it with probability $P_a$. The best way of handling this is described in Algorithm 6.3.
Algorithm 3 KMC algorithm
loop
    Identify all possible actions, and their associated rates $r_i$
    $R \leftarrow \sum_{i=1}^{N} r_i$
    $u \sim [0, R]$
    Find $j$ such that: $\sum_{i=1}^{j} r_i < u < \sum_{i=1}^{j+1} r_i$
    Perform action ‘$j$’
    $t = t - \ln([0,1]) / R$
end loop
This efficiently produces a Poisson distribution, meaning that all the events are independent of each other. This allows the same molecule to be involved in successive actions, if it has sufficient propensity to do so. The computationally expensive steps are the first one listed here, collecting all possible actions, and the fourth, searching the list.
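The loop above can be sketched in a few lines of Python. This is a minimal illustration with a linear-cost search, not the implementation described in this work; the function name `kmc_step`, the use of a flat list of rates, and zero-based action indices are assumptions made for the example.

```python
import math
import random
from bisect import bisect_right
from itertools import accumulate

def kmc_step(rates, t, rng=random):
    """One iteration of the KMC loop using a linear-cost cumulative sum.

    rates: list of non-negative rates r_i, one per possible action
           (hypothetical flat representation).
    Returns (zero-based index of the chosen action, updated time).
    """
    cumulative = list(accumulate(rates))   # partial sums of the r_i
    total = cumulative[-1]                 # R
    u = rng.random() * total               # u drawn uniformly on [0, R)
    # First index whose partial sum exceeds u; zero-rate actions are skipped.
    j = min(bisect_right(cumulative, u), len(rates) - 1)
    # Waiting time: -ln(uniform draw on (0, 1]) / R, so its mean is 1/R.
    t += -math.log(1.0 - rng.random()) / total
    return j, t
```

Rebuilding `cumulative` on every call is exactly the linear cost that the following subsections aim to avoid.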
6.3.1 The General Solution
At first glance, this is a linearly scaling process, since one would need to first build an array of actions and then search it for a cumulative probability. But this array of actions does not change much from step to step, especially if there is some degree of locality to each change.
It is well known that any KMC algorithm can scale as $O(\log N)$ per event through the use of binary trees [97] to perform the array updates and searching. However, the realization of this is often application dependent [98]. Although we have found several theoretical analyses [97, 98] of KMC, we have found very few discussions of actual implementations and of how to design the binary trees.
6.3.2 Our Solution
To initialize a calculation, we start by evaluating the rates of all the forward and reverse reactions that were specified in an input file. We then separate all processes into two categories: those which involve only one site, and those which involve two sites. For the one-site processes, we create a static array indexed by each species in the system (which includes the “empty” species), summing the rates of all processes that species might perform. For example, an empty site might receive a non-dissociating gas molecule, or might receive an atom percolating up from below the surface. In these instances, the reaction looks like the conversion of an empty site into an occupied site. Two-site processes are handled in an analogous fashion by allocating a matrix with each dimension indexed by the species in the system. If two species can interact in any process, we sum all the corresponding rates into that matrix element. By construction, the matrix is symmetric.
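As a rough sketch of this initialization, the per-species array and the symmetric two-site matrix could be built as follows. The function name `build_rate_tables`, the tuple formats, and the species names in the usage lines are hypothetical; the actual input file format is not specified here.

```python
def build_rate_tables(species, one_site, two_site):
    """Sum rate constants into a per-species array and a symmetric matrix.

    species:  list of species names, including the "empty" species
    one_site: iterable of (species, rate) pairs            (assumed format)
    two_site: iterable of (species_a, species_b, rate)     (assumed format)
    """
    index = {name: i for i, name in enumerate(species)}
    n = len(species)
    single = [0.0] * n                       # one-site totals per species
    pair = [[0.0] * n for _ in range(n)]     # two-site totals per species pair
    for s, rate in one_site:
        single[index[s]] += rate
    for a, b, rate in two_site:
        i, j = index[a], index[b]
        pair[i][j] += rate
        if i != j:                           # mirror to keep the matrix symmetric
            pair[j][i] += rate
    return index, single, pair

# Hypothetical species and rates, for illustration only.
index, single, pair = build_rate_tables(
    ["empty", "CO", "O"],
    [("empty", 2.0), ("empty", 0.5), ("CO", 0.1)],
    [("CO", "empty", 1.0), ("CO", "O", 0.3)],
)
```

After this step, looking up the total rate for a site or for a pair of neighboring sites is a single array or matrix access, independent of how many reactions were listed.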
To model reactions on a rectangular 2D surface, we design a binary tree such that each node contains the sum of all the reaction constants for a $1/2^L$ fraction of the surface, where $L$ is the level of that node in the binary tree. Thus the root node covers the whole surface with $L = 0$ and stores $R$. Its left and right children are defined as partial sums such that $R = R_l + R_r$, and upper and lower distinctions are specified at the grandchildren level so that $R = R_{ul} + R_{ur} + R_{ll} + R_{lr}$, and so on. This means that summing the reaction constants across all the nodes of any given level produces $R$. The leaves of this binary tree are the grid sites themselves, each of which stores the reaction constants for everything that can happen at that site individually, as well as for half of its neighbors (so that we do not double count). Of course, this could be adjusted for the topology of any 2D surface. Furthermore, this approach is general enough to allow some 3D systems, either by modeling them as connected 2D structures or by adding a dimensional index to the labels of the species in the input file.
To find a particular reaction starting at the root, we check whether $U$, our uniformly drawn random number on $[0, R]$, is lower than the sum $R_l$ stored in the left child. If it is, then we proceed down the left branch, passing along the same value for $U$. If it is higher, then we proceed down the right branch using $U - R_l$ as our new uniformly distributed random number. Either way, $U$ is uniformly distributed between 0 and the cumulative value $R$ of the lower node, so we repeat this comparison moving down the tree until we reach a grid site.
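The descent just described can be illustrated with a sum tree stored as a flat list. The heap-style layout (node `k` has children `2k` and `2k+1`, leaves in the second half of the list), the power-of-two site count, and the names `build_tree` and `find_site` are assumptions of this sketch; in particular, mapping the left/right and upper/lower splits of a 2D surface onto the leaf order is left out.

```python
def build_tree(leaf_rates):
    """Build a heap-layout sum tree; tree[1] holds the total rate R."""
    n = len(leaf_rates)
    assert n > 0 and n & (n - 1) == 0, "sketch assumes a power-of-two site count"
    tree = [0.0] * (2 * n)
    tree[n:] = leaf_rates                      # leaves are the grid sites
    for k in range(n - 1, 0, -1):              # fill parents bottom-up
        tree[k] = tree[2 * k] + tree[2 * k + 1]
    return tree

def find_site(tree, u):
    """Descend from the root with u on [0, R).

    Returns (site index, residual u within that site's total rate).
    """
    n = len(tree) // 2
    k = 1
    while k < n:                               # stop once k is a leaf
        left = 2 * k
        if u < tree[left]:
            k = left                           # go left, u unchanged
        else:
            u -= tree[left]                    # go right with U - R_l
            k = left + 1
    return k - n, u

tree = build_tree([1.0, 2.0, 3.0, 4.0])
site, residual = find_site(tree, 5.5)          # lands in the third site
```

Each call to `find_site` performs one comparison per level, so the search cost grows with the depth of the tree rather than with the number of sites.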
Once we reach a grid site, we note that the residual value of $U$ is uniformly distributed on $[0, r]$, where $r$ is the sum of the 4 $r_i$ representing everything that can happen at the site itself (1 term) and with half of its neighbors (3 terms). The cost of this search is constant, since the $r_i$ were precomputed, resulting in a maximum of 3 comparisons and 3 subtractions.
Once we have found the $r_i$ corresponding to $U$, we scan through only the precomputed list of the few ways that the two species can interact. After performing the selected process, we update the $r_i$ which changed and then update $r$. Then follows one update per level, giving an $O(\log N)$ overall update of the binary tree.
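A minimal sketch of the per-level update follows, assuming the sum tree is stored as a flat list in heap layout (node `k` has children `2k` and `2k+1`, with the grid sites in the second half). That layout and the name `update_site` are assumptions for illustration. An action that changes two neighboring sites would call this once per affected leaf.

```python
def update_site(tree, site, new_rate):
    """Set one leaf's total rate r and refresh each ancestor up to the root.

    Returns the number of internal nodes that were rewritten.
    """
    n = len(tree) // 2
    k = site + n
    tree[k] = new_rate
    touched = 0
    k //= 2
    while k >= 1:
        # Recompute the parent from its two children (R = R_l + R_r).
        tree[k] = tree[2 * k] + tree[2 * k + 1]
        touched += 1
        k //= 2
    return touched

# Tree for four sites with rates 1, 2, 3, 4; index 0 is unused padding.
tree = [0.0, 10.0, 3.0, 7.0, 1.0, 2.0, 3.0, 4.0]
levels = update_site(tree, 2, 0.5)
```

Recomputing each parent from its children, rather than adding a difference to it, is a design choice in this sketch that avoids accumulating floating-point drift over many updates.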
This implementation has several merits. First, in contrast with the general techniques described in [98], we never have to update any lists of nearest neighbors or change the structure of our binary tree. Each node in the binary tree remains responsible for exactly the same grid sites throughout the calculation, and an update only involves propagating terms like $R = R_l + R_r$ up the tree, which requires very little arithmetic. Second, our method scales in constant time with respect to the number of species in the system, since the sum of all the ways they can interact is precomputed and stored in a matrix in random access memory (RAM). Third, our method scales in constant time with respect to the number of reactions included in the input file. The reason is that all the ways two species can interact are known at compile time, so instead of writing code which searches through the list of reactions to find all the ways two species can relate, we use a script to generate sparse search code based on the input file, which automatically skips any meaningless comparisons. This stores the equivalent of a matrix in the executable itself. Thus our RAM requirement scales as the square of the number of species, and our hard disk requirement scales as the square of the number of reactions. In our studies, neither of these requirements has been high. With this implementation, we believe that we have an original $O(\log N)$ implementation of the KMC method that is more efficient than any other.
It is worth pointing out that because the simulated time increment scales as $1/R$, the number of iterations required to reach a desired simulation time scales as $O(N)$, for an overall scaling of $O(N \log N)$. However, this scaling also depends on other parameters such as temperature and pressure, which affect the individual rates contributing to $R$. The biggest concern, however, comes from large differences in the energy barriers. If an input file specifies reactions which all have comparable energy barriers, then they will also have comparable rates, and fewer iterations should be necessary to produce interesting results. However, the rates scale exponentially with the energy barrier, so if fast processes (like diffusion) are included along with slow but interesting processes, then the probability per iteration that an interesting reaction will occur drops exponentially. These factors underscore the vast variability in the number of iterations that might be required to complete a calculation.
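To give a feel for this exponential sensitivity, the following sketch assumes an Arrhenius-type rate, $k = \nu \exp(-E / k_B T)$, with a prefactor of $10^{13}$ per second. Both the functional form and the prefactor are assumptions made for illustration, as are the function names; the text above states only that rates depend exponentially on the barrier.

```python
import math

K_B = 8.617333262e-5  # Boltzmann constant in eV/K

def arrhenius_rate(barrier_ev, temperature_k, prefactor=1e13):
    """Assumed Arrhenius form; the prefactor value is illustrative only."""
    return prefactor * math.exp(-barrier_ev / (K_B * temperature_k))

def slow_event_probability(fast_barrier, slow_barrier, temperature_k):
    """Chance per iteration of picking the slow process when exactly one
    fast and one slow process are available."""
    fast = arrhenius_rate(fast_barrier, temperature_k)
    slow = arrhenius_rate(slow_barrier, temperature_k)
    return slow / (fast + slow)

# Hypothetical barriers in eV at 500 K: widening the gap between the fast
# and slow barriers makes the slow event exponentially rarer per iteration.
for gap in (0.0, 0.3, 0.6):
    print(gap, slow_event_probability(0.5, 0.5 + gap, 500.0))
```

With equal barriers the two processes are picked equally often. Each additional increment in the gap multiplies the odds of the slow event by the same factor $\exp(-\Delta E / k_B T)$.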