Everitt (1998, p. 93) regards data reduction as a procedure of summarising large amounts of data to a minimum size but yet, yields the same quality of analytical results. Objectives 1 and 3 of the current study seek to find out motives and constraints of operators who engage in informal activities in slums respectively.
As discussed in Section 4.5 in the literature review chapter, many motives influence individuals to engage in for informal activities and various constraints hold back their growth and development.
93
Some of these factors may overlap or correlate with each other. Hence, data reduction techniques are employed to achieve a reduced representation of the data. The data reduction techniques employed in the current study are the Principal Components Analysis (PCA) and the Factor Analysis (FA).
Factor and Principal Component analyses are similar in many ways and are often used interchangeably by researchers as both approaches tend to produce similar results (Pallant, 2005).
Hence, a five step protocol for performing Factor Analysis, as proposed by Williams, Brown and Onsman, (2012, p. 4) is used by the current study to perform Factor and Principal component analysis, based on the aforementioned reason.
The five steps depicted in Figure 5.2 are explained below:
Step 1 – the current research ensures the data is adequate for Factor and Principal component analysis. Gaur and Gaur (2006) and Hair et al., (2010) recommend that a sample size of 100 and above is adequate for conducting PCA and FA. On the other hand, Tabachnick and Fidell (2013) suggest that, a sample size of above 300 is appropriate. With a sample size of 344, the current study meets the criteria of the above mentioned authors.
Furthermore, the suitability of the data for Factor and Principal component analysis, is tested using the Kaiser-Meyer-Olkin (KMO) Measure of Sampling Adequacy, and Bartlett's Test of Sphericity.
The KMO measures the shared variance in the variables and has an index ranging from 0 to 1. An index greater that 0.50 is acceptable for Factor and Principal component analysis to be conducted (Hair et al., 2010). Bartlett’s test, on the other hand, tests if there is a linear combination between the variables being studied, checking if the observed correlation matrix is an identity matrix or not.
Hence, when the Bartlett’s test is significant (p<0.05), we can conclude that the correlation matrix is not an identity matrix, therefore, Principal and Factor analysis can be undertaken (Williams, Brown and Onsman, 2012, p. 5; Beavers et al., 2013).
Step 2 – Extraction is the method of obtaining the underlying components of the data set (Gaur and Gaur, 2006). There are several types of extraction methods, namely, the principal components analysis (PCA), principal axis factoring (PAF), maximum likelihood (ML), uweighted least
94
squares, generalised least squares, alpha factoring and image factoring methods. In the current study, the principal component extraction is used for the principal component analysis.
According to De Winter and Dodou (2012), if the goal of the Factor analysis is to produce as many factors as possible, then the PAF method is appropriate (De Winter and Dodou 2012, p. 708).
Hence, the PAF extraction is used in the current study since the goal is to produce as many factors as possible, which motivate operators to engage in informal activities in slums.
Figure 5.2 Five step procedure for Factor and Principal component analysis
Source: Williams, Brown and Onsman, (2012, p. 4) 1. Is the data suitable for
PCA/EFA?
2. How will the
factors/components be extracted?
3. What criteria will assist in determining factor/components
extraction?
4. Selection of rotational method
5. Interpretation and labelling
95
Step 3 – In deciding how many components to retain, the eigenvalues are assessed. Eigenvalues of a particular component is a measurement of the amount of total variance explained by that component (Tabachnick and Fidell 2013, p. 933-966). According to Kaiser (1961), only components with eigenvalues greater than 1 should be retained, and this guideline was followed in the current study.
Step 4 – According to Gaur and Gaur, (2006, p. 133) rotation helps in coming up with the basic patterns of factor loadings, as it highlights loadings with high correlations, and minimizes those with low correlations. There are two types of rotation methods, orthogonal and oblique rotations.
Orthogonal rotation assumes no correlation among the underlying factors of the data set. In social science, however, factors are expected to have some level of correlation amongst themselves.
Hence, the oblique rotation which assumes some correlation among the factors, renders a more accurate result (Costello and Osborne, 2005). Accordingly, the Direct Oblimin, which is a type of Oblique rotation is used in this study.
Step 5 – The final step entails labelling and interpreting the results. This will be done in Chapter Seven.
i. Factor Analysis
Factor analysis enables researchers to analyse the interrelationships among several variables and to explain the variables in terms of their common underlying dimensions (Hair et al., 2010, p. 94).
Factor analysis helps in exploring whether a large number of variables of interest are linearly related to a smaller number of common factors that may be grouped as a set.
There are two types of factor analysis; Exploratory factor analysis (EFA) and Confirmatory factor analysis (CFA). Confirmatory factor analysis shows how each item loads to the relevant factor while also isolating what is explained by other items in a subset. In CFA, researchers use theory and empirically based knowledge to propose relationship between certain variables a priori, then test this hypothesis statistically (Suhr, 2006).
96
On the other hand, EFA helps in studying the interrelationships among variables in an effort to obtain new sets of variables, which express shared characteristics among the original variables (Tabachnick and Fidell, 2013). The core purpose of EFA, according to Johnson and Wichern, (2002, p. 477-478), is to describe as much as possible the correlation relationship in set of variables on the basis of a few underlying factors, hence this method is used in the current study.
EFA is to be employed by the current study to determine the factors that motivate operators to engage in slum activities. Literature survey (Gerxhani, 2004; Maloney, 2004; Williams, 2007;
Chingono, 2014) enabled the researcher to enumerate various motives for engaging in informal sector activities, and the following 15 variables are identified:
• Taking advantage of business opportunities
• Desire for greater independence
• Quest for higher income
• Avoid tax
• Avoid government regulation
• High cost of informal business
• Flexibility of informal work schedule
• Greater satisfaction
• Increase leisure time
• Greater use of one’s expertise
• Redundancy
• Survival
• Supplementing income
• Networks
• Wealth generation.
The current study is interested in finding out the underlying communalities between these motives for informal activities and reduce them, so the variables can be represented by fewer factors.
Benzing, Chu, and Kara, (2009) similarly used FA to evaluate motives, success factors and problems pertaining to small and medium scale entreprises in Turkey.
97 ii. Principal Component Analysis (PCA)
Principal Component Analysis sets out to reduce a large data into smaller sets. In doing this, PCA preserves as much as possible, the variations within the data set (Härdle and Simar 2003, p. 234).
Abdi and Williams (2010, p. 3) enumerated the following as the four main goals of PCA;
a) To extract the most salient information from a large data set
b) To reduce as much as possible the size of a data set while retaining the most relevant information.
c) To simplify the explanation of a data set, and, d) To critically analyse the variables.
In achieving these goals, the transformation process generates principal components, and each of these components is uncorrelated with each other. The components in a descending order (components with highest eigenvalue comes first), explain the variations present in the data set (Jolliffe 2002, p. 1). The firstly extracted component contains the highest possible variance, hence explains the most variation in the data.
In the literature survey (Canagarajah and Sethuraman, 2001; Skinner, 2005; Olawale and Garwe, 2010; Okpara, 2011), various factors were identified as constraints to informal sector development.
In this study, the influence of the following factors as major constraints would be examined by using PCA.
• Lack of land tenure security.
• Lack of production inputs.
• Lack of production tools.
• Difficulty in accessing bank loans.
• Harassment by the municipal authority.
• Lack of expertise in financial management and planning.
• Lack of experience and training in business management.
• Lack of training in bookkeeping/accounting.
• Poor communication systems.
• Erratic electricity supply.
• Lack of networking.
• Poor state of roads.
98
• Lack of collateral security.
• Other infrastructural challenges.
The current study uses PCA to determine the conditions that constrain the growth of informal sector businesses in slums. Hence, in line with other studies (Olawale and Garwe, 2010; Mahadea, 2013), PCA is used for data reduction purposes.