5.2.1 Checking Missing Values
Missing data are the absence of valid values on one or more variables. This issue poses a challenge for researchers and might affect the generalisability of the findings. Missing value analysis and the choice of an imputation method help to understand and accommodate missing data (Hair et al. 2014). For this study, the survey settings did not allow a response to be completed or registered unless all required questions were answered. Therefore, missing data analysis and imputation were not conducted, since the research data are complete and contain no missing values.
5.2.2 Checking for Outliers
Outliers are observations with unique and identifiable characteristics that differ distinctly from the other observations. The presence of outliers might be problematic and can affect the findings of the research. Different methods are available to detect and handle outliers (Hair et al. 2014). For this study, the researcher used a univariate detection method, which identifies extreme and unique observations on each variable individually. The data values are converted to standard scores; for samples of 80 or fewer observations, standard scores of 2.5 or higher are defined as outliers, whereas for larger samples the threshold increases up to 4 (Hair et al. 2014). In this research, the standard scores of all variables fell below the threshold of 4. This result indicates the absence of outliers, and all observations were retained for further analysis.
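The univariate screening rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `univariate_outliers`, the use of the sample standard deviation, and the example data are hypothetical and are not taken from the study, which relied on standard scores produced by its statistical package.

```python
import numpy as np

def univariate_outliers(values, small_cutoff=2.5, large_cutoff=4.0):
    """Return the indices of cases whose absolute standard score reaches
    the cutoff, together with the cutoff that was applied.

    Assumption: the cutoff is 2.5 for samples of 80 or fewer cases and
    4.0 otherwise, following the guideline quoted in the text.
    """
    x = np.asarray(values, dtype=float)
    cutoff = small_cutoff if x.size <= 80 else large_cutoff
    z = (x - x.mean()) / x.std(ddof=1)  # standard scores (sample SD)
    return np.flatnonzero(np.abs(z) >= cutoff), cutoff

# Hypothetical data: 143 evenly spread scores with one extreme value planted.
scores = np.linspace(3.0, 7.0, 143)
scores[10] = 15.0
outlier_idx, applied_cutoff = univariate_outliers(scores)
print(outlier_idx, applied_cutoff)
```

With 143 cases the function applies the larger cutoff of 4, so only observations that lie very far from the mean are flagged.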
5.2.3 Assessing Data Normality
The test for normality concerns the shape of the data distribution for an individual variable and its similarity to the normal distribution. This assessment is a fundamental step for multivariate analysis, since a considerable departure from the normal distribution invalidates the resulting statistical tests (Hair et al. 2014).
There are different statistical tests to assess normality. The researcher might use the z values of skewness and kurtosis; if these values exceed the critical value of 2.58 at the .01 significance level or 1.96 at the .05 level, then the distribution is non-normal. Other statistical tests for normality are available as well, such as the Shapiro-Wilk and Kolmogorov-Smirnov tests (Hair et al. 2014). Despite the importance of these tests for understanding how the distribution differs from normality, in large samples they can flag even minor deviations from normality as significant. Thus, the visual shape of the distribution and the interpretation of the skewness and kurtosis statistics are critical measures of the degree of normality (Field 2018). In addition, the study's large sample size of 143 respondents reduces the detrimental effects of non-normality (Hair et al. 2014).
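As an illustration of the z-value approach mentioned above, the sketch below uses the widely cited large-sample approximations in which skewness is divided by the square root of 6/N and kurtosis by the square root of 24/N. Treating these as the intended formulas is an assumption, and the function names and the example skewness value are hypothetical rather than results from this study.

```python
import math

# Critical values quoted in the text for the two significance levels.
CRITICAL_VALUES = {0.05: 1.96, 0.01: 2.58}

def z_skewness(skewness, n):
    """z value of skewness using the approximation SE = sqrt(6 / n)."""
    return skewness / math.sqrt(6.0 / n)

def z_kurtosis(excess_kurtosis, n):
    """z value of (excess) kurtosis using the approximation SE = sqrt(24 / n)."""
    return excess_kurtosis / math.sqrt(24.0 / n)

def exceeds_critical(z, alpha=0.05):
    """True when |z| is larger than the critical value for the given alpha."""
    return abs(z) > CRITICAL_VALUES[alpha]

# Hypothetical example: a skewness of 0.30 in a sample of 143 cases.
z_example = z_skewness(0.30, 143)
print(round(z_example, 3), exceeds_critical(z_example))
```

Because the standard errors shrink as N grows, the same skewness value yields a larger z in a larger sample, which is why these tests become sensitive to minor deviations.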
This research adopts the normal probability plot approach to assess the normality of the distribution. In addition, the absolute values of skewness and kurtosis were compared with the accepted limits of less than 2 and 7, respectively, suggested by Curran, West and Finch (1996). The skewness and kurtosis values are -1.495 and 1.674 for FCI, -1.365 and 1.772 for SMT, -0.192 and -0.496 for SMC, -0.397 and 0.047 for DMC, -0.935 and 0.778 for AMC, and -0.774 and 0.402 for MP. In absolute terms, all of these values fall within the suggested limits. Figure 13 shows that normality is not an issue for this research. The histograms are approximately bell-shaped, and the plotted data for each variable fall reasonably close to the diagonal line.
Figure 13: Distributions scores and normal Q-Q plots of research variables
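The comparison of the reported statistics with the limits of Curran, West and Finch (1996) can be reproduced with a short check. The skewness and kurtosis values below are the ones reported in the text; the dictionary layout and the helper name `within_limits` are illustrative assumptions.

```python
# Reported (skewness, kurtosis) pairs for each research variable.
reported = {
    "FCI": (-1.495, 1.674),
    "SMT": (-1.365, 1.772),
    "SMC": (-0.192, -0.496),
    "DMC": (-0.397, 0.047),
    "AMC": (-0.935, 0.778),
    "MP": (-0.774, 0.402),
}

SKEWNESS_LIMIT = 2.0  # absolute skewness should stay below 2
KURTOSIS_LIMIT = 7.0  # absolute kurtosis should stay below 7

def within_limits(skewness, kurtosis):
    """True when both absolute values are below the suggested limits."""
    return abs(skewness) < SKEWNESS_LIMIT and abs(kurtosis) < KURTOSIS_LIMIT

results = {name: within_limits(*stats) for name, stats in reported.items()}
for name, ok in results.items():
    print(f"{name}: {'within limits' if ok else 'outside limits'}")
```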
5.2.4 Assessing Data Linearity and Homoscedasticity
Linearity and homoscedasticity are essential assumptions to be tested in multivariate techniques based on correlations, such as structural equation modeling (Hair et al. 2014). These two assumptions can be checked with a single graph since both relate to the residuals. The predicted values and errors are converted to z-scores, and their plotted values should not show a systematic relationship (Field 2018). The scatterplot (figure 14) of the standardised residuals against the standardised predicted values shows that the assumptions of linearity and homoscedasticity are met. The points do not funnel out, and there is no curved trend in the residuals (Field 2018). Further, the normal P-P plot of the standardised errors suggests that these residuals are normally distributed.
Figure 14: Scatterplot and normal P-P plot of standardised residuals
The scatterplot in figure 14 also highlights two cases with standardised residuals greater than 3. These observations might introduce bias into the model and affect the multivariate analysis. The researcher conducted casewise diagnostics, which flagged cases 73 and 79 as potential sources of bias. However, the Cook's distance statistics (see appendix 5.1) did not cross the threshold of 1. Thus, these cases did not appear to affect the model negatively.
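The residual diagnostics discussed in this section can be sketched with NumPy as follows. This is a minimal illustration, not the procedure the study ran in its statistical package: the function name `regression_diagnostics`, the definition of the standardised residual as the raw residual divided by the root mean squared error, and the example data are all assumptions.

```python
import numpy as np

def regression_diagnostics(X, y):
    """Fit ordinary least squares with an intercept and return the
    standardised predicted values, standardised residuals, and
    Cook's distance for every case."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = y.size
    design = np.column_stack([np.ones(n), X])  # add intercept column
    p = design.shape[1]                        # number of estimated parameters
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    predicted = design @ beta
    residuals = y - predicted
    mse = residuals @ residuals / (n - p)
    # Leverage values: diagonal of the hat matrix X (X'X)^-1 X'.
    xtx_inv = np.linalg.pinv(design.T @ design)
    leverage = np.einsum("ij,jk,ik->i", design, xtx_inv, design)
    z_resid = residuals / np.sqrt(mse)
    z_pred = (predicted - predicted.mean()) / predicted.std(ddof=1)
    cooks = (residuals**2 / (p * mse)) * leverage / (1.0 - leverage) ** 2
    return z_pred, z_resid, cooks

# Hypothetical data: a linear trend with mild noise and one perturbed case.
x = np.arange(30.0)
y = 1.0 + 2.0 * x + np.sin(x)
y[5] += 8.0
z_pred, z_resid, cooks = regression_diagnostics(x, y)

large_residual_cases = np.flatnonzero(np.abs(z_resid) > 3)  # casewise check
influential_cases = np.flatnonzero(cooks > 1)               # Cook's distance check
print(large_residual_cases, influential_cases)
```

Plotting `z_resid` against `z_pred` gives the kind of scatterplot used to judge linearity and homoscedasticity, while the two index arrays mirror the casewise and Cook's distance checks.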
5.2.5 Assessing Multicollinearity
Multicollinearity refers to strong correlations between two or more predictors. Strong collinearity poses several problems for the model. For example, highly correlated predictors inflate the standard errors of the b coefficients, which leads to unstable equations across samples. In addition, multicollinearity between predictors limits the amount of variance in the outcome that the model can explain and reduces the ability to assess the importance of each predictor (Field 2018). SPSS Statistics can identify multicollinearity by computing the variance inflation factor and the tolerance statistic. For this research, the results suggest that multicollinearity is not a problem (see appendix 5.2), since the variance inflation factors were below 10 and the tolerance statistics were above 0.1, in line with the guideline criteria.
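For illustration, the variance inflation factor and tolerance can be computed directly from the correlation matrix of the predictors, since the VIF of each predictor equals the corresponding diagonal element of the inverse correlation matrix. The sketch below is an assumption-laden example: the function name and the two toy predictors are hypothetical, and the study itself obtained these statistics from SPSS.

```python
import numpy as np

def vif_and_tolerance(predictors):
    """Return the variance inflation factor and tolerance of each predictor.

    `predictors` is an (n_cases, n_predictors) array. The VIF of predictor j
    is 1 / (1 - R_j^2), which equals the j-th diagonal element of the
    inverse of the predictors' correlation matrix; tolerance is 1 / VIF.
    """
    X = np.asarray(predictors, dtype=float)
    corr = np.corrcoef(X, rowvar=False)
    vif = np.diag(np.linalg.inv(corr))
    tolerance = 1.0 / vif
    return vif, tolerance

# Hypothetical predictors with a correlation of 0.8 between them.
toy = np.array([[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]], dtype=float)
vif, tolerance = vif_and_tolerance(toy)

# Guideline quoted in the text: VIF below 10 and tolerance above 0.1.
acceptable = (vif < 10) & (tolerance > 0.1)
print(vif, tolerance, acceptable)
```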
5.2.6 Demographic Data
In total, 143 of 434 questionnaires were completed (33% response rate). The study sample includes different types of industries. The industry type frequency table and pie chart (table 5.1 and figure 15) show that, among the responses obtained, there were 13 retail firms (9.1%), 7 bank and finance (4.9%), 12 media and communication (8.4%), 9 transportation and logistics (6.3%), 20 food and beverage products (14%), 1 oil and gas (0.7%), 8 manufacturing (5.6%), 5 construction (3.5%), 12 technology (8.4%), 7 education (4.9%), 7 insurance (4.9%), 10 hospitality (7.0%), 19 healthcare (13.3%), and 13 other (9.1%).
Table 5.1: Firm industry type demographic data
Industry Type Frequency Percentage
Retail 13 9.1
Bank & Finance 7 4.9
Media & communication 12 8.4
Transportation & logistics 9 6.3
Food & beverage products 20 14.0
Oil & gas 1 0.7
Manufacturing 8 5.6
Construction 5 3.5
Technology 12 8.4
Education 7 4.9
Insurance 7 4.9
Hospitality 10 7.0
Healthcare 19 13.3
Other 13 9.1
Total 143 100
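The percentages in table 5.1 and the response rate follow from simple arithmetic on the reported counts, as the short sketch below illustrates. The counts and the figure of 434 questionnaires come from the text; the variable names are illustrative only.

```python
# Frequencies reported in table 5.1.
industry_counts = {
    "Retail": 13,
    "Bank & Finance": 7,
    "Media & communication": 12,
    "Transportation & logistics": 9,
    "Food & beverage products": 20,
    "Oil & gas": 1,
    "Manufacturing": 8,
    "Construction": 5,
    "Technology": 12,
    "Education": 7,
    "Insurance": 7,
    "Hospitality": 10,
    "Healthcare": 19,
    "Other": 13,
}

total_responses = sum(industry_counts.values())
percentages = {
    name: round(100.0 * count / total_responses, 1)
    for name, count in industry_counts.items()
}
response_rate = 100.0 * total_responses / 434  # 434 questionnaires in total

for name, pct in percentages.items():
    print(f"{name}: {industry_counts[name]} ({pct}%)")
print(f"Response rate: {response_rate:.0f}%")
```

Because each percentage is rounded to one decimal place, the rounded values need not sum to exactly 100.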
Figure 15: Pie chart of firm industry type
The firm size frequency table and pie chart (table 5.2 and figure 16) show that, among the responses obtained, 25 firms (17.5%) have fewer than 100 employees, 48 (33.6%) have between 100 and 499 employees, 29 (20.3%) have between 500 and 999 employees, and 41 (28.7%) have 1,000 or more employees.
Table 5.2: Firm size demographic data
Firm size in the region Frequency Percentage
Less than 100 employees 25 17.5
Between 100 & 499 employees 48 33.6
Between 500 & 999 employees 29 20.3
1,000 or more employees 41 28.7
Total 143 100
Figure 16: Pie chart of firm size in the region
The firm experience frequency table and pie chart (table 5.3 and figure 17) show that, among the responses obtained, 10 firms (7.0%) have less than three years of experience in the region, 14 (9.8%) have between 3 and 5 years, and 119 (83.2%) have over five years of experience.
Table 5.3: Firm experience demographic data
Firm experience in the region Frequency Percentage
Less than 3 years 10 7.0
Between 3 & 5 years 14 9.8
Over 5 years 119 83.2
Total 143 100
Figure 17: Pie chart of firm experience in the region
The analysis of the firms' demographic data reveals essential insights to consider when analysing and interpreting the research results. First, the presence of different types of industries indicates a broadly representative sample. Second, the majority of the firms (93%) have three or more years of experience in the region, which mitigates a weakness of the cross-sectional survey, namely that it measures marketing capabilities at a single point in time. This high percentage of experienced firms means that path dependency in capabilities development might not be a major issue for this research, and that its effect on performance is minimal (Morgan, Feng & Whitler 2018).
The preliminary analysis covers the initial procedures for understanding the sample demographics and the data obtained. This analysis reveals that the assumptions for multivariate techniques are met, and the study can proceed with exploratory factor analysis.