training conditions, it confirms the reliability of our model, qualitatively. Regarding timing, training the three ML models took ∼47 h in total, for EPPS (∼17 h), EOSS (∼19 h) and ∆SA (∼11 h), on a single GeForce RTX 2080 Ti graphics processing unit. A single-point calculation with the resulting ML models, including the diagonalization (∼0.1 s), was about 1000 times faster than an SSR(2,2) calculation (∼105 s).
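The timings quoted above can be combined into a rough cost estimate. The sketch below only does arithmetic on the approximate numbers in the text; the break-even count is an illustrative quantity introduced here, and it ignores the cost of generating the SSR(2,2) reference data used for training.

```python
import math

# Approximate training times (hours) quoted in the text.
training_hours = {"EPPS": 17, "EOSS": 19, "DeltaSA": 11}
total_hours = sum(training_hours.values())

# Approximate single-point timings (seconds) quoted in the text.
ml_seconds = 0.1
ssr_seconds = 105.0

speedup = ssr_seconds / ml_seconds

# Illustrative break-even point: number of single-point calculations after
# which the time saved per point outweighs the training time.
# Assumption: the reference-data generation cost is not included.
break_even_points = math.ceil(total_hours * 3600 / (ssr_seconds - ml_seconds))

print(total_hours, round(speedup), break_even_points)
```

Under these assumptions the training time is recovered after roughly 1600 single-point evaluations, far fewer than a typical set of trajectories requires.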
Structure of conical intersection
CIs are points in nuclear configuration space where two BO potential energies are degenerate and the NACVs diverge, i.e., the coupling between electronic and nuclear motion becomes significant and the BO approximation fails. Because of these features, CIs play a central role in molecular processes involving multiple electronic states. In particular, a minimum-energy CI (MECI) provides useful information about the character of non-adiabatic chemical reactions. At a CI, the degeneracy is lifted to first order only by displacements along the branching plane vectors g and h (equations 82 and 83). Therefore, it is important to predict MECIs and branching plane vectors correctly for ESMD simulations. Here we calculate a MECI and the corresponding branching plane vectors of PSB3 from our ML models with a penalty function algorithm [130], and we compare them with those from the SSR(2,2) method. The reference dynamics study of PSB3 suggests that the major relaxation channel is via MECIcen, related to a rotation about the central C=C bond [100]; therefore, we focus on the structure of MECIcen and the corresponding branching plane vectors.
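As a rough illustration of how a penalty-function search can locate a MECI, the sketch below minimizes the mean of two adiabatic energies plus a smoothed gap penalty, SIGMA * gap^2 / (gap + ALPHA), on a hypothetical three-coordinate diabatic model. The model surfaces, the values of SIGMA and ALPHA, the finite-difference gradient and the fixed-step descent are all assumptions made for illustration. The penalty form is one common choice and is not necessarily the one used in Ref. [130].

```python
import math

SIGMA, ALPHA = 5.0, 0.1  # hypothetical penalty parameters (arbitrary units)

def diabatic(q):
    """Hypothetical diabatic elements: two shifted harmonic wells, linear coupling."""
    x, y, z = q
    h11 = 0.5 * ((x - 1.0) ** 2 + y ** 2 + z ** 2)
    h22 = 0.5 * ((x + 1.0) ** 2 + y ** 2 + z ** 2) + 1.0
    h12 = 0.5 * y
    return h11, h22, h12

def adiabatic(q):
    """Eigenvalues of the 2x2 diabatic matrix (lower, upper)."""
    h11, h22, h12 = diabatic(q)
    mean = 0.5 * (h11 + h22)
    half_gap = math.sqrt((0.5 * (h11 - h22)) ** 2 + h12 ** 2)
    return mean - half_gap, mean + half_gap

def penalty_objective(q):
    """Mean adiabatic energy plus a smoothed penalty on the energy gap."""
    e_low, e_up = adiabatic(q)
    gap = e_up - e_low
    return 0.5 * (e_low + e_up) + SIGMA * gap ** 2 / (gap + ALPHA)

def numerical_gradient(f, q, h=1e-6):
    """Central finite-difference gradient (assumption: no analytic gradient)."""
    grad = []
    for i in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += h
        q_minus[i] -= h
        grad.append((f(q_plus) - f(q_minus)) / (2.0 * h))
    return grad

def optimize_meci(q0, step=0.002, n_steps=10000):
    """Plain fixed-step gradient descent on the penalty objective."""
    q = list(q0)
    for _ in range(n_steps):
        g = numerical_gradient(penalty_objective, q)
        q = [qi - step * gi for qi, gi in zip(q, g)]
    return q

q_meci = optimize_meci([0.3, 0.4, 0.8])
e_low, e_up = adiabatic(q_meci)
print(q_meci, e_up - e_low)
```

On this model the intersection seam is the line x = -0.5, y = 0, and its lowest point is at z = 0. The penalty minimum lies close to that point, with a small residual gap that shrinks as SIGMA / ALPHA grows.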
Fig. 23 shows the g- and h-vectors at the optimized MECIcen for each level of theory. The MECIcen structures from ML and SSR(2,2) are superimposed in Fig. 23; however, they are hardly distinguishable since the predicted structures are almost identical. The branching plane vectors are also well reproduced with the correct characters, i.e., the g-vector and h-vector at MECIcen represent the bond length alternation motion and the torsional motion around the central C=C bond, respectively. Next, we explore the PESs around the CI in the branching space to check whether the ML models can capture the features of CIs. In Fig. 24, the diabatic and BO PESs around the optimized MECIcen point in the branching plane are displayed. For the sake of clarity, we shift the PESs so that the energy at MECIcen is zero. The ML-predicted diabatic and BO PESs capture the characters of the SSR(2,2) PESs. EPPS and EOSS cross linearly at zero g-vector displacement along the h-vector (Figure 24c), and the coupling ∆SA also varies linearly along the h-vector (Figure 24e). This behavior represents the character of diabatic PESs near a CI, and thus the resulting BO PESs show a clear cusp at the CI (Fig. 24a). Previous studies have reported limitations in describing the correct topology of PESs, such as a smoothed CI seam [110], the need for a large training set [109], or the replacement of the ML calculation with a quantum chemical calculation near degeneracy [107]. In our case, there are no such limitations because the diabatic PESs are trained, as explained in the method section. To emphasize the ability to reproduce the singularity, phase-aligned eigenvectors for the lower BO state in the branching space are shown in Fig. 25.
The a00 and a01 in equation 77 evolve clockwise as cos(θ/2) and sin(θ/2), respectively. As a
result, a clear singularity appears at the CI and along the negative g-axis, indicating the Berry phase [?].
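The sign change behind this singularity can be checked on a minimal linear two-state model. The sketch below assumes a hypothetical model matrix [[g, h], [h, -g]], which is not the trained Hamiltonian or equation 77 itself. Its lower eigenvector depends on half of the polar angle in the (g, h) plane, so it follows the eigenvector around a closed loop, aligns its sign with the previous step, and compares the final vector with the initial one. The sign and ordering convention of the eigenvector components may differ from the one used for a00 and a01 in the text.

```python
import math

def lower_state(g, h):
    """Lower eigenpair of the hypothetical model matrix [[g, h], [h, -g]]."""
    radius = math.hypot(g, h)
    # Half of the polar angle; atan2 places the branch cut on the negative g-axis.
    phi = 0.5 * math.atan2(h, g)
    return -radius, (-math.sin(phi), math.cos(phi))

def transported_overlap(center_g, radius=1.0, n_steps=360):
    """Follow the lower eigenvector around a circle in the (g, h) plane.

    Each new vector is sign-aligned with the previous one. The return value
    is the overlap between the final and the initial eigenvector.
    """
    _, first = lower_state(center_g + radius, 0.0)
    previous = first
    for k in range(1, n_steps + 1):
        theta = 2.0 * math.pi * k / n_steps
        _, vec = lower_state(center_g + radius * math.cos(theta),
                             radius * math.sin(theta))
        if vec[0] * previous[0] + vec[1] * previous[1] < 0.0:
            vec = (-vec[0], -vec[1])
        previous = vec
    return previous[0] * first[0] + previous[1] * first[1]

around_ci = transported_overlap(center_g=0.0)     # loop encloses the CI at the origin
away_from_ci = transported_overlap(center_g=2.0)  # loop does not enclose the CI

print(around_ci, away_from_ci)
```

In this model the eigenvector returns with the opposite sign only when the loop encloses the degeneracy. The lower eigenvalue equals minus the distance from the origin, which is the conical cusp.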
The ability to reproduce the correct topology of PESs near CIs suggests that our ML models can replace the time-consuming quantum chemical calculations for ESMD. In addition, the result also indicates that, with a proper transformation of the nuclear coordinate system, we can apply our model to the efficient construction of diabatic model Hamiltonians involving CIs [131,132] along some important normal modes. This enables us to investigate the effect of specific modes on excited-state dynamics with a higher-level dynamics simulation method such as multi-configuration time-dependent Hartree or other wave packet dynamics [133, 134].
Non-adiabatic molecular dynamics
Finally, we conduct ESMD simulations for PSB3 with the SHXF approach based on the ML models (SHXF/ML). We sample a total of 200 initial nuclear configurations and velocities from the Wigner distribution at 300 K [135] around the trans-PSB3 S0 minimum-energy structure at the SSR(2,2)-ωPBEh/6-31G* level of theory. Starting from the S1 electronic state, we propagate nuclear trajectories for 300 fs with a time step of 0.24 fs. The evolution of the central C-C=C-C dihedral angle for 50 randomly chosen trajectories is shown in comparison with the reference dynamics (Figure 26). The dihedral angle is defined as the angle between the normal vectors of the planes spanned by the C-(C=)C-H atoms on each side of the central C=C bond [100]. The profile is similar to the SHXF/SSR(2,2) result, showing good separation between the two isomers (cis and trans). The ratio of cis to trans isomers at the end of the dynamics is 58.5 : 41.5, which falls between the ab initio multiple spawning with CASSCF (54 : 46) [129] and SHXF/SSR(2,2) (63 : 37) [100] results. Since the SSR(2,2) method describes the dynamic correlation through DFT and the static correlation through the FONs, the resulting ML models recover both types of correlation from the SSR(2,2) method.
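A dihedral angle of this kind can be evaluated from the normals of two planes that share the central bond. The sketch below uses the standard signed four-point dihedral. The choice of the four points, the 90 degree threshold for labeling cis and trans, and the helper names are assumptions for illustration and may differ in detail from the definition of Ref. [100].

```python
import math

def sub(a, b):
    return tuple(ai - bi for ai, bi in zip(a, b))

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Signed dihedral angle about the p1-p2 bond, from the two plane normals."""
    b1, b2, b3 = sub(p1, p0), sub(p2, p1), sub(p3, p2)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    b2_length = math.sqrt(dot(b2, b2))
    y = dot(cross(n1, n2), b2) / b2_length
    x = dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def label_isomer(angle_degrees):
    """Hypothetical labeling rule: |angle| below 90 degrees counts as cis."""
    return "cis" if abs(angle_degrees) < 90.0 else "trans"

# Planar test geometries (arbitrary units) around a bond along the x-axis.
cis_angle = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0),
                             (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans_angle = dihedral_degrees((0.0, 1.0, 0.0), (0.0, 0.0, 0.0),
                               (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(cis_angle, label_isomer(cis_angle), trans_angle, label_isomer(trans_angle))
```

Counting the labels over the final frames of all trajectories would give a cis to trans ratio of the kind reported above.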
We analyze the averaged electronic population of the S1 state during the dynamics (Fig. 27).
The SHXF/ML result shows a decay of the S1 population similar to that of SHXF/SSR(2,2), and it also shows internal consistency between ⟨ρ11⟩ and p1, which is a benefit of the SHXF algorithm [22].
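The two population measures can be compared as in the sketch below. It assumes that ⟨ρ11⟩ is the trajectory average of the S1 diagonal element of the electronic density matrix and that p1 is the fraction of trajectories whose running state is S1. This is the usual meaning in surface hopping, but it is an assumption here. The input arrays are made-up toy data, not simulation results.

```python
def average_populations(rho11, running_state):
    """Return (<rho11>(t), p1(t)) for a set of trajectories.

    rho11[i][t]         : S1 density-matrix population of trajectory i at step t
    running_state[i][t] : index of the running state (1 means S1) at step t
    """
    n_traj = len(rho11)
    n_steps = len(rho11[0])
    mean_rho11 = [sum(traj[t] for traj in rho11) / n_traj
                  for t in range(n_steps)]
    p1 = [sum(1 for traj in running_state if traj[t] == 1) / n_traj
          for t in range(n_steps)]
    return mean_rho11, p1

# Toy data for four hypothetical trajectories and three time steps.
rho11 = [[1.0, 0.9, 0.1],
         [1.0, 0.8, 0.9],
         [1.0, 0.7, 0.0],
         [1.0, 0.6, 0.2]]
running_state = [[1, 1, 0],
                 [1, 1, 1],
                 [1, 1, 0],
                 [1, 0, 0]]

mean_rho11, p1 = average_populations(rho11, running_state)
print(mean_rho11, p1)
```

Internal consistency means that the two curves stay close to each other over the whole simulation.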
Extended training
Since the SchNet architecture employs atomic embedding [48], SchNet models can be trained with a training set comprising multiple molecules. Based on the SchNet architecture, various properties such as potential energy, enthalpy and frontier molecular orbital energies have been successfully trained on the QM9 data set, which contains a large number of small organic molecules [48].
Also, formation energies for a variety of bulk crystals have been modeled to show that atomic embeddings for elements can capture the chemical similarity of elements in the same group without any explicit input [48]. Recently, the transferability of the SchNet model for excited state energies and transition dipole moments, used to predict ultraviolet absorption spectra, was confirmed for isoelectronic molecules with similar structures [136].
In this section, we train our ML models with an extended data set composed of two small protonated Schiff bases, PSB3 and CH2NH2+, to test the flexibility of the ML models across chemical space. The CH2NH2+ data, obtained from SSR(2,2) calculations along trajectories of SHXF/MRCI/SA-4-CASSCF(6,4)/6-31G* dynamics [22], were added to the original data set from the previous section.
The resulting data set contains 48750 and 14068 points of PSB3 and CH2NH2+ data, respectively.
The data set is split into training, validation, and test sets with a ratio of approximately 3:1:1. We train the ML models on the extended training set of PSB3 and CH2NH2+ with the same training parameters used in the previous section. We evaluate the ML models on the PSB3 test subset, the CH2NH2+ test subset, and the entire test set. In other words, we test our ML models, trained on both PSB3 and CH2NH2+, on (i) the PSB3 and CH2NH2+ subsets separately, and (ii) the entire set.
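A split of this kind can be sketched as follows. The shuffling seed, the integer rounding and the choice to shuffle the pooled data (rather than splitting each molecule separately) are assumptions. The text only states the point counts and the approximate 3:1:1 ratio.

```python
import random

def split_indices(n_points, ratios=(3, 1, 1), seed=0):
    """Shuffle point indices and split them into train/validation/test lists."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    indices = list(range(n_points))
    rng.shuffle(indices)
    total = sum(ratios)
    n_train = n_points * ratios[0] // total
    n_val = n_points * ratios[1] // total
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test

n_psb3, n_ch2nh2 = 48750, 14068  # point counts quoted in the text
train, val, test = split_indices(n_psb3 + n_ch2nh2)
print(len(train), len(val), len(test))
```

With the pooled 62818 points this gives 37690, 12563 and 12565 indices. A per-molecule (stratified) split would keep the PSB3 to CH2NH2+ proportion identical in all three sets.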
Test set  PSB3 test subset            CH2NH2+ test subset         Total test set
Quantity  Energy        Gradient      Energy        Gradient      Energy        Gradient
Metric    MAE    RMSE   MAE    RMSE   MAE    RMSE   MAE    RMSE   MAE    RMSE   MAE    RMSE
EPPS      0.21   0.28   0.71   0.98   0.27   0.38   1.29   3.42   0.22   0.30   0.77   1.46
EOSS      0.23   0.31   0.79   1.11   0.66   0.68   1.46   3.38   0.33   0.43   0.86   1.55
∆SA       0.11   0.36   0.41   1.28   0.31   0.47   0.69   1.92   0.15   0.39   0.44   1.36
Total     0.18   0.32   0.64   1.13   0.41   0.53   1.15   2.99   0.23   0.37   0.69   1.46
Table 3: MAE and RMSE of the SSR(2,2) diabatic elements (kcal/mol) and their gradients (kcal/mol/Å) for the ML models trained with the combined training set, evaluated on the PSB3 and CH2NH2+ test subsets and on the total test set.
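For reference, the two error measures can be written out as below. The last part checks one reading of the "Total" row: that it pools the three quantities with equal weight, i.e. the mean of the MAEs and the root mean square of the RMSEs. That reading is inferred from the printed numbers and is not stated in the text. The function names are illustrative.

```python
import math

def mae(predicted, reference):
    """Mean absolute error."""
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(reference)

def rmse(predicted, reference):
    """Root mean square error."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference))
                     / len(reference))

# Energy errors on the PSB3 test subset for EPPS, EOSS and DeltaSA (Table 3).
psb3_energy_mae = [0.21, 0.23, 0.11]
psb3_energy_rmse = [0.28, 0.31, 0.36]

# Assumed pooling with equal weights for the three quantities.
pooled_mae = sum(psb3_energy_mae) / len(psb3_energy_mae)
pooled_rmse = math.sqrt(sum(e ** 2 for e in psb3_energy_rmse)
                        / len(psb3_energy_rmse))

print(round(pooled_mae, 2), round(pooled_rmse, 2))
```

The pooled values agree with the tabulated PSB3 energy totals of 0.18 and 0.32 kcal/mol to within the rounding of the inputs.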
In Table 3, the overall errors are about 3 times larger than the errors in Table 2. In particular, the errors on the CH2NH2+ subset are larger than those on the PSB3 subset, with MAE (RMSE) values of 0.41 (0.53) kcal/mol and 1.15 (2.99) kcal/mol/Å for energies and gradients, respectively.
The difference in error magnitudes between the subsets indicates a biased fitting of the ML models toward PSB3, which originates from the relatively small size of the CH2NH2+ subset. Nevertheless, the errors are still very small, suggesting that our ML strategy is flexible enough to train on multiple molecules. The model could be improved with a data set covering a larger chemical and nuclear configuration space and by adjusting the parameters. In particular, the successful training of the coupling implies that training of the phase across chemical compounds may be possible.