Chapter 4: Correspondence Analysis
4.4 Application
MCA locates all the categories in Euclidean space and aims to produce a solution in which objects within the same category are plotted close together and objects in different categories are plotted far apart. Plotting the variables is useful for detecting the clustering of attributes. MCA represents and models datasets as “clouds” of points in a multidimensional Euclidean space; it is distinctive in describing patterns geometrically by locating each variable or unit of analysis as a point in a low-dimensional space (Costa et al., 2013). Each object is placed as close as possible to the category points of the categories that apply to it, so the categories divide the objects into homogeneous subgroups. If a variable discriminates well, the objects will therefore lie close to the categories to which they belong.
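The geometric idea can be illustrated with a minimal sketch, assuming MCA is computed as a correspondence analysis of the indicator matrix via a singular value decomposition. The function name `mca_indicator`, the tolerance and the toy data (six objects, two variables with two categories each) are hypothetical and are not taken from the survey analysed in this chapter.

```python
import numpy as np

def mca_indicator(Z, tol=1e-10):
    """Correspondence analysis of an indicator matrix Z (objects x categories).

    Returns principal coordinates of objects and categories and the
    principal inertias of the non-trivial dimensions.
    """
    Z = np.asarray(Z, dtype=float)
    P = Z / Z.sum()                      # correspondence matrix
    r = P.sum(axis=1)                    # row (object) masses
    c = P.sum(axis=0)                    # column (category) masses
    expected = np.outer(r, c)
    S = (P - expected) / np.sqrt(expected)   # standardised residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    keep = sv > tol                      # drop numerically null dimensions
    U, sv, V = U[:, keep], sv[keep], Vt[keep].T
    objects = U / np.sqrt(r)[:, None] * sv       # principal coordinates of objects
    categories = V / np.sqrt(c)[:, None] * sv    # principal coordinates of categories
    return objects, categories, sv ** 2

# Hypothetical toy data: columns are a1, a2, b1, b2
Z = np.array([
    [1, 0, 1, 0],
    [1, 0, 1, 0],
    [1, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 1, 1, 0],
])
objects, categories, inertias = mca_indicator(Z)
print(np.round(inertias, 4))
```

For an indicator matrix with Q variables and J categories in total, the principal inertias sum to (J - Q)/Q, which equals 1 for this toy example.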
The map in Figure 4.1 was generated by calculating the χ² distances between points represented in the form of two-way contingency tables; the first two dimensions are plotted and used to examine the associations among the categories.
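As an illustration of the distance underlying such a map, the following sketch computes chi-square distances between the row profiles of a two-way contingency table. The function name `chi_square_distances` and the small table are hypothetical; the code is only meant to show the standard definition, in which squared differences between profiles are weighted by the inverse of the column masses.

```python
import numpy as np

def chi_square_distances(N):
    """Pairwise chi-square distances between row profiles of a contingency table N."""
    N = np.asarray(N, dtype=float)
    P = N / N.sum()                  # correspondence matrix
    r = P.sum(axis=1)                # row masses
    c = P.sum(axis=0)                # column masses
    profiles = P / r[:, None]        # each row divided by its total
    diff = profiles[:, None, :] - profiles[None, :, :]
    return np.sqrt((diff ** 2 / c).sum(axis=2))

# Hypothetical table: rows 0 and 2 have identical profiles
N = np.array([
    [20, 10],
    [10, 20],
    [20, 10],
])
D = chi_square_distances(N)
print(np.round(D, 4))
```

Rows with the same profile are at distance zero, which is why categories with similar response patterns are plotted close together.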
The variables appear to be clustered together, making it difficult to differentiate between the points. Variables situated about the origin are not well represented in the map and do not add to the interpretation of the display. The first dimension alone explains only 13.13 percent of the total inertia (Table 4.1), which is relatively low. Together, the first two dimensions account for 21.07 percent of the total association, so 78.93 percent of the variability in the data is not represented in the two-dimensional display.
The inertia and Chi-square decomposition for the MCA are presented in Table 4.1. The total inertia indicates the accuracy of the display. The total Chi-square statistic, which measures the association between the rows and columns in the full dimension of the table, is 14573.6 with 1681 degrees of freedom. From Table 4.1 it can be seen that the percentages of inertia accounted for by the first ten dimensions are 13.13 percent, 7.95 percent, 6.91 percent, 6.06 percent, 4.97 percent, 4.75 percent, 4.08 percent, 4.05 percent, 4.02 percent and 3.70 percent, respectively. Together, the first ten dimensions account for 59.2 percent of the total variation.
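The arithmetic behind a decomposition table of this kind can be sketched as follows, assuming the usual relation that each dimension's Chi-square contribution is the sample size multiplied by its principal inertia. The function name `inertia_decomposition`, the inertias and the sample size are hypothetical illustration values, not the entries of Table 4.1.

```python
import numpy as np

def inertia_decomposition(principal_inertias, n):
    """Return chi-square contributions, percentages and cumulative percentages."""
    lam = np.asarray(principal_inertias, dtype=float)
    chi_square = n * lam                 # chi-square contribution per dimension
    percent = 100.0 * lam / lam.sum()    # share of the total inertia
    cumulative = np.cumsum(percent)      # running total across dimensions
    return chi_square, percent, cumulative

# Hypothetical values chosen only to make the arithmetic easy to follow
chi_square, percent, cumulative = inertia_decomposition([0.4, 0.3, 0.2, 0.1], n=100)
for k, (x2, p, cp) in enumerate(zip(chi_square, percent, cumulative), start=1):
    print(f"Dimension {k}: chi-square={x2:.1f}, percent={p:.1f}, cumulative={cp:.1f}")
```

The cumulative column is what is read off when stating how much of the total variation a given number of dimensions accounts for.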
MCA was also carried out using the adjustment of inertias proposed by Greenacre and Blasius (2006), described in Section 3.4. This adjustment changes the scale of each dimension of the map to best approximate the two-way tables of association between pairs of variables; everything else in the solution remains intact.
From Table 4.2 it can be seen that when the principal inertias are adjusted, the percentage explained by the first two dimensions is 63.56 percent, which is much higher than the 21.07 percent accounted for in the MCA map without the adjustment to inertias. In addition, more than 70 percent of the total variation is accounted for by the first three dimensions. The adjustment led to estimates of the explained inertias that are much closer to the true values than the values obtained in the unadjusted MCA.
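The effect of the adjustment can be sketched in code, assuming the commonly cited form of Greenacre's adjustment; Section 3.4 is not reproduced here, so its notation may differ. In this sketch, `indicator_inertias` are all the non-trivial principal inertias of the indicator matrix, `Q` is the number of variables and `J` the total number of categories. The function name and the toy values are hypothetical.

```python
import numpy as np

def greenacre_adjustment(indicator_inertias, Q, J):
    """Adjusted principal inertias and percentages (sketch of Greenacre's adjustment).

    Assumes indicator_inertias holds all J - Q non-trivial principal
    inertias of the indicator matrix.
    """
    lam = np.asarray(indicator_inertias, dtype=float)
    burt_total = np.sum(lam ** 2)        # total inertia of the Burt matrix
    keep = lam > 1.0 / Q                 # only dimensions above the 1/Q threshold
    adjusted = (Q / (Q - 1.0)) ** 2 * (lam[keep] - 1.0 / Q) ** 2
    # Average inertia of the off-diagonal tables of the Burt matrix
    adjusted_total = (Q / (Q - 1.0)) * (burt_total - (J - Q) / Q ** 2)
    percent = 100.0 * adjusted / adjusted_total
    return adjusted, percent

# Hypothetical toy case: Q = 2 variables, J = 4 categories
adjusted, percent = greenacre_adjustment([2.0 / 3.0, 1.0 / 3.0], Q=2, J=4)
print(np.round(adjusted, 4), np.round(percent, 2))
```

With only two variables the adjusted solution recovers the inertia of the simple correspondence analysis of the two-way table, so the single retained dimension explains all of the adjusted inertia. With more variables the adjusted percentages are typically far higher than the unadjusted ones, as in the comparison of Tables 4.1 and 4.2.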
Figure 4.1: Two-dimensional MCA MAP
Table 4.1: Inertia and Chi-Square Decomposition
Table 4.2: Greenacre Adjustment to Inertia Decomposition
Figures 4.2 to 4.4 describe the MCA maps with adjustment of principal inertias along the different dimensions using Greenacre's (1994) method. In Figure 4.2, the variables are clustered together, making it difficult to differentiate between the points. A similar conclusion can be drawn from Figure 4.1, which represents the two-dimensional MCA map without the adjustment to inertias. The variables situated about the origin are not well represented in the map and do not add to the interpretation of the display.
Figure 4.3 presents the MCA map with adjustment of principal inertias along Dimension 1 and Dimension 3. There appears to be some clustering of variables about the origin. However, the categories corresponding to a positive response for the variables BEmerg, Btrt and BMal can be found at the bottom left of the MCA map. These variables were used to assess whether the individual had knowledge of how donated blood is used.
In addition, the categories for lower educational level (i.e., none and primary school), no knowledge of the different types of blood groups (KDBGNo), not having seen or heard, or not remembering having seen or heard, messages about blood donation (HSMBDNo, 3), and holding a strong opinion that the appropriate way to give blood should not be voluntary non-remunerated blood donation (VNRBDNo) are situated toward the top left of the map.
A similar pattern is displayed in Figure 4.4, which presents the MCA map with adjustment of principal inertias along Dimension 2 and Dimension 3. There appears to be a clustering of variables about the origin, making it difficult to differentiate between the points on the map.
However, the response categories described above for the lower and upper left regions of the MCA map in Figure 4.3 are again evident in the map in Figure 4.4.
Figure 4.2: MCA MAP with adjustment of principal inertias along Dimension 1 and Dimension 2
Figure 4.3: MCA MAP with adjustment of principal inertias along Dimension 1 and Dimension 3
Figure 4.4: MCA MAP with adjustment of principal inertias along Dimension 2 and Dimension 3