5.3 Analytical Methods
5.3.1 The Support Vector Machine
Table 5.1: Incident ion LET and range in Si for the 10 MeV/u ions used for the measurements discussed in this chapter. Produced using the TRIM component of the SRIM software tools.
Ion    LET (MeV-cm²/mg)    Range (µm)
Ar     9.7                 112.0
Cu     21.2                92.9
Kr     30.2                87.7
Xe     58.8                78.3
[Figure 5.2 panels: (a) Hits and Misses; (b) Only Hits]
Figure 5.2: A series of box and whisker plots of the peak current of recorded transients for the small device for each tested ion species. (a) shows both direct hits to the junction and misses which still triggered the measurement setup. (b) shows the same data, only with the misses removed from the data set using the methods described in this section. Once misses are removed from the data set, a noticeable trend with increasing LET can be seen.
ion strikes directly to and near the device junction. Doing so would allow us to determine the typical peak transient currents due to strikes near the junction versus those due to strikes directly to the junction. While this is one option, the threshold separating a miss transient from a hit transient would have to be determined for each incident ion species and each tested device, which would require comprehensive simulations for every device/ion species combination. This could rapidly become intractable if more than a few ion species and devices were used. It would also require a high level of confidence that the TCAD models and simulations produce results that agree with experimental data. In other words, such an approach could require extensive device response calibration in addition to a large number of individual simulations.
A better approach would be to develop an analytical method for quickly determining which transients were the result of hits and which were the result of misses, and then to use a much smaller number of device-level simulations to confirm that the method was performing as intended. Support Vector Machines (SVMs) can provide one such approach.
An SVM is a machine learning technique used to divide datasets into distinct groups [72]. While not typically found in the radiation effects literature, SVMs are powerful, and commonly used, tools in the world of machine learning and pattern recognition [73, 74].
SVMs have been used for facial recognition in images, automated handwriting analysis, modeling financial markets, and various other applications [75]. Given the proper inputs, they can accurately identify the group to which a particular measurement belongs, especially in two-group problems such as separating hit transients from miss transients. SVMs are also particularly valuable in that they do not require an assumption about the underlying statistical distribution of the dataset to be analyzed [72].
The purpose of this section is not to provide a detailed theoretical background for SVMs, but to describe how the use of an SVM applies to this particular dataset. There are numerous works that deal with the theoretical development of SVMs and their application [72, 73, 75, 76]. References [75], [76], and [73] are particularly good introductions to the topic.
SVMs operate on training data and test data. The goal is to develop a model using well-classified training data to predict the group to which any particular observation in the test data belongs. In practice, the training dataset is commonly a well-understood subset of the test data; once the SVM has been trained on that subset, it can be applied to all of the test data.
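The train-then-apply workflow described above can be sketched in a few lines of Python. This is a minimal illustration only, not the implementation used for the measurements in this chapter: it assumes a linear soft-margin SVM trained by sub-gradient descent on the regularized hinge loss, and the function names (`train_linear_svm`, `classify`), the parameter values, and the two-dimensional data points are all hypothetical.

```python
# Minimal linear soft-margin SVM trained by full-batch sub-gradient descent
# on the regularized hinge loss. All names and data here are hypothetical.

def train_linear_svm(points, labels, lam=0.01, lr=0.1, epochs=500):
    """Return (w, b) for the boundary w.x + b = 0; labels must be +1 or -1."""
    dim = len(points[0])
    n = len(points)
    w = [0.0] * dim
    b = 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of (lam / 2) * ||w||^2
        grad_b = 0.0
        for x, y in zip(points, labels):
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            if margin < 1.0:  # point lies inside the margin or is misclassified
                for j in range(dim):
                    grad_w[j] -= y * x[j] / n
                grad_b -= y / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def classify(w, b, x):
    """Assign +1 or -1 according to the side of the boundary on which x falls."""
    score = sum(wj * xj for wj, xj in zip(w, x)) + b
    return 1 if score >= 0.0 else -1

# Training data: a small, well-understood subset with known group labels.
train_points = [(2.0, 2.0), (1.0, 2.0), (2.0, 1.0),
                (-2.0, -2.0), (-1.0, -2.0), (-2.0, -1.0)]
train_labels = [1, 1, 1, -1, -1, -1]

w, b = train_linear_svm(train_points, train_labels)

# Test data: group membership is unknown until the trained SVM is applied.
test_points = [(1.5, 1.0), (-1.0, -1.5), (2.5, 0.5), (-0.5, -2.5)]
predicted = [classify(w, b, x) for x in test_points]
print(predicted)  # prints [1, -1, 1, -1]
```

A constant step size is used here for brevity, so the weights settle into a small oscillation around the solution instead of converging exactly. For real datasets, an established library implementation (for example, the `sklearn.svm.SVC` class in scikit-learn) would normally be preferred over a hand-written optimizer.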
Figure 5.3 is a graphical illustration of implementing an SVM on an arbitrary dataset.
In the left figure, the training data are shown as filled symbols. The training data comprise two groups, denoted by circles and triangles. The goal of training the SVM (in graphical terms) is to find the boundary between the two groups that has the largest margin on either side; the boundary with the largest margin is the best boundary for classification [72]. The name support vector machine comes from the data points that are the closest to (but not inside) the solid lines in the figure on the left. These points are referred to as “support vectors” in the machine learning literature.
Figure 5.3: An illustrative example of applying SVM techniques to an arbitrary dataset. Filled symbols represent training data (with two groups denoted by triangles and circles), while open symbols represent test data. In the left figure, training data are used to train the SVM, which has been applied to new test data in the right figure. The dashed line represents the SVM-predicted boundary between the two groups and the solid lines are the largest margins. After [76].
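In symbols, the search for the largest-margin boundary is commonly written as the optimization problem below. This is the standard hard-margin formulation for linearly separable data, given here only as a compact restatement of the graphical description; the notation (training observations $\mathbf{x}_i$ with group labels $y_i \in \{-1, +1\}$, weight vector $\mathbf{w}$, and offset $b$) is assumed for illustration and is not taken from the measurements in this chapter.

```latex
\begin{aligned}
\min_{\mathbf{w},\, b} \quad & \tfrac{1}{2}\,\lVert \mathbf{w} \rVert^{2} \\
\text{subject to} \quad & y_i \left( \mathbf{w}^{\mathsf{T}} \mathbf{x}_i + b \right) \ge 1,
\qquad i = 1, \dots, n .
\end{aligned}
```

In this notation, the boundary is the set of points satisfying $\mathbf{w}^{\mathsf{T}} \mathbf{x} + b = 0$, the two margin lines satisfy $\mathbf{w}^{\mathsf{T}} \mathbf{x} + b = \pm 1$, and the distance between the margin lines is $2 / \lVert \mathbf{w} \rVert$, so minimizing $\lVert \mathbf{w} \rVert$ maximizes the margin. The support vectors are the training observations for which the constraint holds with equality.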
In the figure on the right, the optimal boundary found using the training data plotted in the left figure is applied to new test data. It is important to note that the group to which any particular observation in the test data belongs is unknown a priori (hence the need for the SVM). It is not until the trained SVM from the left figure is applied to the test data that they can be divided into groups (circles and triangles) in the right figure.
However, the user must know (or assume with a high level of confidence) to which group the observations in the training data belong. This is necessary for training the SVM.
The plots in Figure 5.3 show a relatively simple case of a linear function being able to adequately divide the two groups in the dataset. The functions which are used as a basis for dividing the two groups are referred to as kernel functions, and many exist beyond the simple linear function shown here [75]. In practice, it is common to work in higher dimensions (which requires finding the optimal hyperplane, as opposed to the optimal line), and with functions that are not linear. However, for the level of understanding required here, Figure 5.3 provides sufficient insight into the motivations and goals of developing an SVM.
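As a rough illustration of what a kernel function looks like in practice, the sketch below implements three kernels that are widely used in the machine learning literature: linear, polynomial, and Gaussian radial basis function (RBF). Each takes two observations and returns a similarity value that an SVM uses in place of a plain inner product. The function names and default parameter values (`degree`, `coef0`, `gamma`) are assumptions made for this example, and nothing here indicates which kernel was applied to the dataset in this chapter.

```python
import math

def dot(x, z):
    """Inner product of two equal-length vectors."""
    return sum(xi * zi for xi, zi in zip(x, z))

def linear_kernel(x, z):
    """Linear kernel: yields a straight-line (or flat hyperplane) boundary."""
    return dot(x, z)

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    """Polynomial kernel: allows curved boundaries of the given degree."""
    return (dot(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    """Gaussian RBF kernel: similarity decays with squared distance."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)

# Example evaluation on two hypothetical two-dimensional observations.
x, z = (1.0, 2.0), (3.0, 4.0)
print(linear_kernel(x, z))      # prints 11.0
print(polynomial_kernel(x, z))  # prints 144.0
print(rbf_kernel(x, x))         # prints 1.0
```

Swapping the kernel changes the shape of boundary the SVM can represent without changing the training procedure itself, which is why the choice of kernel is usually treated as a modeling decision to be checked against well-understood training data.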