In particular, Generalized Valence Bond (GVB) wavefunctions can eliminate enough of the fixed-node energy, for both bond breaking and electronic excitation processes, that we can easily obtain accuracy on the order of a few tenths of a kcal/mol. All that QMC requires of the wavefunction is that it is easily differentiable so that we can apply the Hamiltonian, a much simpler criterion to satisfy.
Wavefunctions
Antisymmetry
It is a violation of the antisymmetry constraint to apply to an electron pair an antisymmetric space function and an antisymmetric spin function, since the product of the two antisymmetric functions is symmetric. The small red sphere below and to the left of Oxygen is the initial position of our test particle, and the darker red spheres are the other electrons of the same spin.
Quantum Chemistry
Static correlation is an error resulting from the optimization of an imperfect functional form for the wave function during the SCF process and is usually resolved by increasing the complexity of the wave function by adding more orbitals and basis functions to the SCF optimization. Unfortunately, the convergence of energy in the limit of adding multiple determinants is very slow, so this approach is impossible in practice.
Variational Monte Carlo
Error Margins
However, we can do better than this because the computational considerations required to evaluate the local energy, discussed in detail in Appendix B, are quite different from those of other quantum chemistry algorithms. Specifically, we can now add interelectron coordinate functions to our wavefunction, a significant improvement over electrons that can only see the mean field of other electrons.
Cusp Conditions
These functions, which we will call Jastrows, can now help electrons avoid each other, beyond the repulsion imposed by the antisymmetry principle.
Diffusion Monte Carlo
To speed up convergence, we must use our best guess of the significance sampling wave function, and the specific choices for this type of "trial function" will be the subject of later chapters. If this weight becomes large, a typical DMC algorithm may duplicate the pedestrian, giving each of the pedestrian children half the weight of the parent.
Practicum
With good CVS software, the exact evolution of any piece of code can be observed. For the GVB wave function, I usually include all determinants as they are not expensive for reasons documented in Appendix D.
Introduction
In the present paper, scientific results as well as underlying formalism have been simplified for purposes of presentation and to focus on the essential computational aspects. We acknowledge that it is unclear how single precision results can be useful, especially for an algorithm designed to produce highly accurate results.
Introduction to Graphical Processing Units
However, currently, the time to double performance on GPUs is significantly shorter than for CPUs, leading to increased performance advantages for the GPU if a computation is mapped well enough to the GPU. Before discussing how these concerns manifest themselves in our environment, we provide a brief high-level introduction to Quantum Monte Carlo calculations to understand the necessary computational components that we seek to map to GPUs.
Introduction to Quantum Monte Carlo
For each basis function, Rj, kj, lj, mj, nj, anj and bnj are parameters given as input to the QMC program. The rationale for these additional terms is the improved convergence if the wave function is a better approximation of the eigenfunction of ˆH to begin with.
Implementation on the GPU
- Walker Batch Scheme
- Basis Function Evaluation
- Kernel 1: Data Generation
- Kernel 2: Layout Conversion
- Matrix Multiplication
- Jastrow Functions
This can be amortized by processing as many fragments as possible simultaneously on the GPU. For the GPU implementation, we focused on transferring the electron-electron terms (electron-nuclear terms are significantly less).
GPU Floating Point Error
Underflow Corrections
The effect of this type of error depends greatly on the distribution of parameters, which is very specific to our application. We conclude that switching helps, but the lack of de-normals on the GPU was not found to be a significant source of error.
Kahan Method
However, as Figure 3.5 shows, the improvement is enough to beat even single precision on the CPU for long enough inner products. This distribution was evaluated either on the GPU or on the CPU and then sent to the GPU for multiplication.
Results
With increasing capabilities on the GPU, more of the code will be available for porting considerations. Of course, this assumes that these benefits are not offset by precision errors arising from other parts of the code.
Conclusion
Some elements of the calculation, such as Monte Carlo statistical manipulations, may not be performed with a single precision. We present a comprehensive technique for using Quantum Monte Carlo (QMC) to obtain high-quality energy differences.
Introduction
Second, and even better, providing a guess for the wave function nodes, we can include all the dynamical correlation energy through the diffusion Monte Carlo (DMC) algorithm. In this paper, a balanced solution is investigated where we use Generalized Valence Bond (GVB) wave functions [48] to correct for the fixed-node error.
Method
- Generalized Valence Bond Wavefunctions
- Length Scaled Jastrows
- Wavefunction Optimization
- Walker Reconfiguration
- Further Details
Although there are 43 terms of the form xaixbjxcij in the polynomial for 3 particle Jastrows, several are needed. Furthermore, although QMC is insensitive to the normalization of the wave function, we do not take the opportunity to eliminate some degree of freedom in the CI coefficients.
Results
Methylene
With 'B' we indicate our basis, with 'O' we indicate where it matters whether our GVB pairs are localized or delocalized, and with. This is not the case for our aug-tz basis set, which changes by at least 0.1 kcal/mol when adding 3 particle Jastrows.
Ethylene
Our best result, shown in Table 4.7, is only 0.2 kcal/mol higher than the best literature value. Below the dashed line in Table 4.8 we show our single determinant results using RHF and also using orbitals from an M06-2X DFT calculation.
Conclusion
Furthermore, the analog GVB-4/tc 3-particle Jastrow calculation would have taken 4% more time than if we ignored the 3-particle Jastrows. In contrast to our conclusions for methylene, the 3-particle Jastrows are not worth the effort, even if we had full confidence in their optimization.
Jastrows
The main overhead of these functions, however, turns out to be the process of converting derivatives with respect to all parameters into derivatives with respect to independent and optimizable parameters. The functional form in Equation 5.5 is still a good idea because the ability to truncate Jastrows seems to be important.
More Calculations
- Ne
- Be 2 → 2Be
- O 3 1 A g → O 3 3 B 2
- SiH 2 1 A 1 → SiH 2 3 B 1
- Survey of G1 Atomization Energies
The other difference is that they use C=2 instead of our C=3 as the cutoff prefactor exponent. As we can see from these results, it seems that the GVB wavefunctions are sufficient to study most of the molecules in the table.
A Crazy New Idea
We assume this is because we only add perfect pairs to the CC bonds, and not to the CH bonds. We present a distribution of the value p for all electron motions in Figure 5.2 using GVB/tz wave functions, where we can see that about half of the distribution is above 1.
The Preferred Number of Processors
From this data we can see that the energies are relatively independent of the number of processors, with no deviations of more than 0.15 kcal/mol from the reference, and as can be seen in Table 5.7 the errors are also relatively constant. A more thorough test would quadruple the number of iterations for each calculation, so that they all take at least 1.6 million iterations.
Pseudopotentials
Part of the problem, we suspect, lies in the total number of pedestrians in the calculation, and that some of the error occurred is due to limited population bias. There are a lot of ideas in the literature and a lot of time to waste implementing them all.
Introduction
We present an O(logN) grid-based implementation of kinetic Monte Carlo (KMC), where N is the number of grid sites. We also show how we achieve constant time scaling with respect to the number of species or reactions in our system.
What is Kinetic Monte Carlo?
The General Solution
At first glance, this is a linear scaling process, since one would first have to make a matrix and then find the cumulative probability in the matrix. It is well known that any KMC algorithm can scale asO(logN) per event using binary trees [97] to perform matrix update and search.
Our Solution
Third, our method scales as a time constant with respect to the number of reactions contained in the input file. These factors underline the large variability in the number of iterations that may be required to complete a calculation.
Our First Application
Thus, our RAM requirement is estimated as the square of the number of species, and our hard disk requirement is estimated as the square of the number of reactions. With this implementation, we believe we have an original O(logN) implementation of the KMC method that is more efficient than any other.
Preliminary Results
We only include reactions that directly connect one species to an adjacent species, meaning some numbers do not match as there may be additional sources/sinks outside the pyramid. For some species, however, all associated reactions are included, so adding all sources and sinks together will arrive at the number of species shown.
Conclusion
A Good Set of Parameters
If this script receives an output file, it will generate a graph showing the Jastrows for each optimization step. In this picture, we can see that the Jastrows did not change significantly after the first two steps.
Script: optimized.pl
First, we have a summary.pl script that intelligently scans through directories looking for DMC results, and can figure out which calculations are similar. Here we chose to exclude all files with Ntw in the title (the twisted geometry), to exclude Ta (adiabatic triplet) and V (vertical singlet).
Script: summary.pl
In the output shown here, each line starts with two indices which are defined in the output that we have not shown. VMC results are usually stored as comments at the beginning of the input file based on optimization iterations.
Convergence by Example
Script: plotter.pl
Script: utilities.pl
- Methylene excitations
- Optimizing different parts of the wavefunction
- Single determinant calculations
- Vertical ethylene results
- Vertical ethylene results with a poor geometry
- Adiabatic ethylene results
- Ethylene twist results
- Cycloaddition results
- Neon optimization results
- Beryllium dimer results
- Ozone excitation results
- Silylene results
- Calculations from the G1 test set
- New acceptance probability strategy
- Varying the number of processors
- Nodal plane in H 2 O
- An extrapolation to zero time step
- The cost of correcting for the summation error for square matrices
- The cost of correcting for the summation error for rectangular matrices
- Helium single precision error
- Ethane single precision error
- Kahan Summation Formula, uniform distribution
- Kahan Summation Formula, “QMC-Distributed” data
- QMC performance on a GPU
- GPU summary
- Typical Jastrow functions
- Time step error cancelation
- Silylene convergence
- Extended distribution of p
- Truncated distribution of p
- Varying the number of processors
- Surface sites illustration
- Sample distribution drawn on product pyramid
Stable exact correlation basis sets for molecular core-valence correlation effects: second-row al-ar atoms, and first-row b-ne atoms revisited. Journal of Chemical Physics. Manager-worker based model for quantum monte carlo parallelization in heterogeneous and homogeneous networks.
Script generated Jastrow optimization plot
Script generated convergence graphic