scales or summed ratings of product satisfaction). The objective is to avoid the use of only a single variable to represent a concept and instead to use several variables as indicators, all representing differing facets of the concept, to obtain a more well-rounded perspective. The use of multiple indicators enables the researcher to more precisely specify the desired responses. It does not place total reliance on a single response, but instead on the "average" or typical response to a set of related responses.
For example, in measuring satisfaction, one could ask a single question, "How satisfied are you?" and base the analysis on the single response. Or a summated scale could be developed that combined several responses of satisfaction (e.g., finding the average score among three measures: overall satisfaction, the likelihood to recommend, and the probability of purchasing again). The different measures may be in different response formats or in differing areas of interest assumed to comprise overall satisfaction.
The guiding premise is that multiple responses reflect the "true" response more accurately than does a single response. The researcher should assess reliability and incorporate scales into the analysis. For a more detailed introduction to multiple measurement models and scale construction, see further discussion in Chapter 3 (Exploratory Factor Analysis) and Chapter 9 (Structural Equations Modeling Overview) or additional resources [48]. In addition, compilations of scales that can provide the researcher a "ready-to-go" scale with demonstrated reliability have been published in recent years [2, 9].
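To make the mechanics concrete, here is a minimal sketch in Python, assuming three hypothetical satisfaction items rated on 1–10 scales; it forms the summated scale as the average of the items and checks internal consistency with Cronbach's alpha (the item names and data are illustrative only):

```python
import numpy as np
import pandas as pd

# Hypothetical responses to three satisfaction indicators (1-10 ratings)
items = pd.DataFrame({
    "overall_satisfaction":   [8, 6, 9, 7, 5, 8],
    "likelihood_recommend":   [7, 6, 9, 8, 4, 9],
    "repurchase_probability": [8, 5, 8, 7, 5, 7],
})

# Summated scale: the "average" response across the related items
satisfaction_scale = items.mean(axis=1)

def cronbach_alpha(df: pd.DataFrame) -> float:
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of total)."""
    k = df.shape[1]
    item_vars = df.var(axis=0, ddof=1)
    total_var = df.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(cronbach_alpha(items))
```

A scale whose alpha falls below the commonly cited .70 threshold would warrant reconsidering the items before using the composite in further analyses.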
The Impact of Measurement Error The impact of measurement error and poor reliability cannot be directly seen because they are embedded in the observed variables. The researcher must therefore always work to increase reliability and validity, which in turn will result in a more accurate portrayal of the variables of interest. Poor results are not always due to measurement error, but the presence of measurement error is guaranteed to distort the observed relationships and make multivariate techniques less powerful. Reducing measurement error, although it takes effort, time, and additional resources, may improve weak or marginal results and strengthen proven results as well.
other independent variables cannot be attributed to any specific variable. As a result, as multicollinearity increases, the impact of individual variables is underestimated by the parameters as their explanatory variance shifts from unique explanatory variance to shared explanatory variance. While we do want to understand the unique effects of each variable, multicollinearity tends to diminish the effects attributed to variables based on their correlation with other predictor variables. This results in the principal tradeoff always facing the analyst: including more variables as predictors to increase overall predictive power versus the multicollinearity introduced by more variables, which makes it more difficult to attribute the explanatory effect to specific variables.
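One common way to see this shared variance in practice is the variance inflation factor (VIF). The sketch below is a minimal illustration, assuming simulated data and the statsmodels library; it shows how a predictor that is highly correlated with another inflates its VIF:

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(42)
n = 200
x1 = rng.normal(size=n)
x2 = 0.9 * x1 + rng.normal(scale=0.3, size=n)  # highly correlated with x1
x3 = rng.normal(size=n)                        # essentially independent

X = sm.add_constant(pd.DataFrame({"x1": x1, "x2": x2, "x3": x3}))

# VIF for predictor j equals 1 / (1 - R^2) from regressing x_j on all others;
# large values (a common rule of thumb is above 10) flag shared variance
for j, name in enumerate(X.columns):
    if name != "const":
        print(name, round(variance_inflation_factor(X.values, j), 2))
```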
The decisions by the researcher in managing the variate fall in two primary areas: specifying the independent variables to be included in the analysis and then variable selection during model estimation. Figure 1.2 depicts these two decisions and the options generally available to researchers in each.
Specifying the Variate Variables The primary decision here is whether to use the individual variables or to perform some form of dimensional reduction, such as exploratory factor analysis. Using the original variables may seem like the obvious choice as it preserves the characteristics of the variables and may make the results more interpretable and credible. But there are also pitfalls to this approach. First and foremost is the effect of including hundreds and even thousands of variables and then trying to interpret the impact of each variable so as to identify the most impactful variables from a large set. Moreover, as the number of variables increases so does the opportunity for multicollinearity that makes distinguishing the impact of individual variables even more difficult.
The alternative is to perform some form of dimensional reduction: finding combinations of the individual variables that capture the multicollinearity among a set of variables and allow for a single composite value representing the set of variables. This is the objective of exploratory factor analysis, discussed in Chapter 3, which finds these groups of multicollinear variables and forms composites; the researcher then uses the composites in further analyses rather than the original variables. It may seem like dimensional reduction is the option to choose, and many times it is.
But the researcher must also recognize that now the “variables” in the analysis are composites, and the impacts for a composite represent the shared effect of those variables, not the individual variables themselves.
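As a hedged sketch of this idea, the example below simulates six indicators driven by two underlying constructs and uses scikit-learn's PCA to replace them with two composite scores (the construct names and data are invented for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
n = 300
# Two underlying constructs, each measured by three noisy indicators
service = rng.normal(size=n)
price = rng.normal(size=n)
X = np.column_stack([
    service + rng.normal(scale=0.4, size=n),
    service + rng.normal(scale=0.4, size=n),
    service + rng.normal(scale=0.4, size=n),
    price + rng.normal(scale=0.4, size=n),
    price + rng.normal(scale=0.4, size=n),
    price + rng.normal(scale=0.4, size=n),
])

# Reduce the six multicollinear indicators to two composite scores
pca = PCA(n_components=2)
composites = pca.fit_transform(X)
print(pca.explained_variance_ratio_)  # share of variance each composite captures
# `composites` (n x 2) can now replace the six original variables in later models
```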
As shown in Figure 1.2, dimensional reduction can be performed (a) under the control of the researcher through principal components or common factor analysis, or (b) defined by the software program in techniques like principal components regression (see Chapter 5). We cannot stress enough the importance of considering the impact of multicollinearity. As you will see in our discussion of each technique, multicollinearity can have a profound effect and markedly shape the interpretation of the results, if not the results themselves.
Figure 1.2
Managing the Variate

Variable Specification
  Use Original Variables
  Dimensional Reduction
    User Controlled: Principal Components Analysis; Common Factor Analysis
    Software Controlled: PCR (Principal Components Regression)

Variable Selection
  User Controlled: Confirmatory or Simultaneous; Combinatorial (all subsets)
  Software Controlled: Sequential (Stepwise; Backward and Forward); Constrained (Ridge; LASSO)
Many researchers are tempted to ignore this aspect of managing the variate and instead rely solely on the various variable selection techniques described next. But this has its own pitfalls, and in most instances the researcher is ceding control of this critical element of the analysis to the software.
Variable Selection The second decision to make regarding the variate is whether the researcher wants to control the specific variables to be included in the analysis or let the software determine the "best" set of variables to constitute the variate. As with the prior decision, it fundamentally revolves around the level of researcher control. With simultaneous (all variables entered simultaneously) or confirmatory (only a set or sequential sets of variables tested) approaches, the researcher can control the exact variables in the model. The combinatorial approach is a variant of the confirmatory approach where all possible combinations of the set of independent variables are estimated and then compared on various model fit criteria. With software control, the software employs an algorithm to decide which variables are to be included. The most widely used method is the sequential approach, where variables are entered (typically most impactful first) until no other impactful variables can be found. The constrained approach identifies the most impactful variables and constrains all the lesser variables to smaller or even zero estimated parameters. More details on the options in this area are discussed in Chapter 5.
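As one illustrative sketch of the constrained approach, the example below, assuming simulated data and scikit-learn, fits a LASSO model whose L1 penalty shrinks the parameters of weak predictors to exactly zero, effectively selecting the variables:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Simulated data: 20 candidate predictors, only 5 with real effects
X, y = make_regression(n_samples=200, n_features=20, n_informative=5,
                       noise=10.0, random_state=1)

# Constrained (software-controlled) selection: the L1 penalty shrinks
# weak predictors' coefficients to zero, dropping them from the variate
lasso = LassoCV(cv=5).fit(X, y)
selected = np.flatnonzero(lasso.coef_)
print("variables retained:", selected)
```

scikit-learn's SequentialFeatureSelector provides an analogous software-controlled version of the sequential approach.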
Reconciling the Two Decisions These two decisions in managing the variate give the researcher a range of control: complete researcher control over the variables and the model specification, versus total software control in determining the variables input into the analysis and the final set of variables included in the model. While the researcher may be tempted to let the software have complete control based on the notion that "the software knows best," allowing total software control has potential pitfalls that can seriously impact the results. We therefore encourage researchers to carefully review Chapter 3 (Exploratory Factor Analysis), where dimensional reduction techniques are discussed, and Chapter 5 (Multiple Regression), where the options for variable selection are extensively discussed.
There is not a "right" approach for all analyses. Instead, several guiding principles can be identified:
Specification of the Variate Is Critical Many times researchers focus only on how various methods operate in terms of estimating the model of interest. And while technique selection is an important issue, many times the "success or failure" of a research project is dictated by the approach the researcher takes to specification of the variate. Including hundreds or even thousands of variables in an attempt to completely cover all the possible impacts may actually hinder the ability of the model to recover more generalized effects and thus the efficacy of the results. Thus, the more control the researcher retains over which variables are inputs to the model, the more specificity there can be in how the model answers the specific research question.
Variable Selection Is Necessary Even with researcher control in specifying the variate, empirical models provide flexibility in testing alternative or competing models which vary in the number of variables included. Any estimated model can have a number of "competing or alternative" models that provide equal or even greater predictive accuracy by including a different set of variables. This is disconcerting to beginning researchers when they find there are many models that work just as well as their selected model, but have different variables and effects. Moreover, the possibility of several alternative models increases as the number of variables grows larger. So researchers should try a number of alternative models in specifying their research, perhaps formulating several that vary based on whether the researcher controls the process or the software. Chapter 5 explores this consideration by estimating a range of model forms.
Researcher Control Is Preferred A common theme in both of these decision areas is that while options exist for software control, researcher control allows the analysis to test many different model specifications (both in variables used and variables selected in the model estimation) and provides a better outcome. Just letting the software make both of these decisions to some extent makes the researcher irrelevant and lets the software dictate the final model. Hopefully the researcher can bring more domain knowledge, coupled with the knowledge of how to "control" the methods to obtain the best outcome, not just let it be dictated by some software algorithm.
Managing the Dependence Model
While we discuss both interdependence and dependence models in this text and provide a means for selecting between the alternative methods (see Figure 1.6), we also feel that we should focus in more detail on the range of dependence models that are available to researchers. In Figure 1.3 we distinguish between dependence models based on two factors: single equation versus multi-equation models and, among the single equation models, whether to use the general linear model or the generalized linear model. We discuss the single versus multiple equation distinction first and then examine the two basic estimation methods for single equation models.
Figure 1.3
Managing Dependence Models

Single Equation Models (Linear Models)
  Classical General Linear Model (GLM): Regression; ANOVA/MANOVA; Discriminant Analysis
  Generalized Linear Models (GLZ or GLIM; non-normal response variables): Binary/logit models; Counts/Poisson models; Other (gamma, negative binomial, etc.)

Systems of Equations
  Structural Equation Models (SEM and PLS)
  Specialized Models (e.g., hierarchical models)
Single Versus Multiple Equation The most common applications of multivariate models are the single equation forms: multiple regression, discriminant analysis, ANOVA/MANOVA, and logistic regression. All of these model forms provide an approach for specifying a single variate's relationship with an outcome variable. But there are also multiple equation models that enable the researcher to relate different equations to one another, even where they are interrelated (i.e., the outcome measure in one equation becomes a predictor variable in another equation).
Most often typified by the structural equation "family" of models (covariance-based structural equation modeling and variance-based partial least squares modeling), researchers are able to look at sequential models (e.g., X → Y and then Y → Z) in a single analysis. Chapters 9 through 13 discuss this family of structural equation modeling approaches in much more detail.
General Linear Model Versus Generalized Linear Model The foundation for almost all of the single equation techniques discussed in this book is the general linear model (GLM), which can estimate canonical correlation, multiple regression, ANOVA, MANOVA, and discriminant analysis, as well as all of the univariate group comparisons, such as the t test [12, 40, 49]. There is perhaps no single model form more fundamental to inferential statistics than the general linear model. But one limiting characteristic of the GLM is its assumption of an error distribution following the normal distribution. As such, many times we must transform the dependent variable when we know it does not follow a normal distribution (e.g., counts, binary variables, proportions or probabilities).
A second class of linear models is also available that accommodates non-normal outcome variables, thus eliminating the need to transform the dependent variable. Logistic regression, discussed in Chapter 8, is one such model form, where the dependent variable is binary and the logit transformation "links" the dependent and independent variables. Known as the generalized linear model (GLZ or GLIM), this model provides the researcher with an alternative to the general linear model, which assumes a dependent variable exhibiting a normal distribution. While the GLM requires a transformation of a non-normal dependent variable as discussed above, the GLZ can model such variables directly without transformation. The GLZ model uses maximum likelihood estimation and thus has a different set of model fit measures, including Wald and likelihood ratio tests and deviance. These fit measures will be discussed in Chapter 8 as a means of assessing a logistic regression model. The GLZ is sometimes also referred to as a GLM, causing confusion with the general linear model, so we maintain the distinction for purposes of clarification.
While the general linear model has been a staple of inferential statistics, the generalized linear model extends the linear model to a wider range of outcome variables. Outside of the different measures of model fit, both model types are estimated and evaluated in the same manner. Researchers encountering situations in which the dependent variables have a non-normal distribution are encouraged to consider using GLZ models as an alternative to transforming the dependent measures to achieve normality. A more thorough discussion of the GLZ procedure and its many variations is available in several texts [1, 23, 29, 35, 41].
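A minimal sketch of the GLZ in practice, assuming simulated count data and the statsmodels library, fits a Poisson model to a count outcome directly rather than transforming it first:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 500
x = rng.normal(size=n)
# Count outcome generated from a log-linear (Poisson) process
y = rng.poisson(lam=np.exp(0.3 + 0.8 * x))

X = sm.add_constant(x)

# GLZ: model the count outcome directly with a Poisson family and log link,
# rather than transforming y to approximate normality for the GLM
glz = sm.GLM(y, X, family=sm.families.Poisson()).fit()
print(glz.summary())   # maximum likelihood fit: coefficients, Wald tests
print(glz.deviance)    # deviance serves as a model fit measure here
```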
Summary The dependence model is the key element of analytics, specifying and then testing the relationships between a variate of independent variables and one or more outcome measures. It has many forms based on the measurement qualities of the variables involved, which we discuss in later chapters. But the analyst must also consider the multiple types of dependence models available to best address the research question. Many times a single equation form is the appropriate model, but there are situations in which the research question actually involves several different relationships that interact together. Estimating each relationship separately will not identify the interrelationships among them that may be key to understanding. So in these situations a multi-equation approach is best suited and some form of structural equation modeling (CB-SEM or PLS-SEM) can be employed.
Within the single-equation form there are two types of models, differentiated on the distribution of the dependent measure. The general linear model (GLM) is the model underlying most of the widely used statistical techniques, but it is limited to dependent variables with a normal distribution. For non-normal dependent variables, we can either transform them in hopes that they conform to the normal distribution or use the generalized linear model (GLZ or GLIM), which explicitly allows the researcher to specify the error term distribution and thus avoid transformation of the dependent measure. While the use of maximum likelihood estimation requires a different set of model fit measures than the GLM, they are directly comparable and easily used in the same manner as their GLM counterparts (see Chapter 8 for an example). We encourage researchers to consider the GLZ model when faced with research questions that are not directly estimable by the GLM.
Statistical Significance Versus Statistical Power
All the multivariate techniques, except for cluster analysis and perceptual mapping, are based on the statistical inference of a population's values or relationships among variables from a randomly drawn sample of that population. A census of the entire population makes statistical inference unnecessary, because any difference or relationship, however small, is true and does exist. Researchers very seldom use a census. Therefore, researchers are often interested in drawing inferences from a sample.
Types of Statistical Error and Statistical Power Interpreting statistical inferences requires the researcher to specify the acceptable levels of statistical error that result from using a sample (known as sampling error). The most common approach is to specify the level of Type I error, also known as alpha (α). Type I error is the probability of rejecting the null hypothesis when it is actually true, generally referred to as a false positive. By specifying an alpha level, the researcher sets the acceptable limits for error and indicates the probability of concluding that significance exists when it really does not.

When specifying the level of Type I error, the researcher also determines an associated error, termed Type II error, or beta (β). Type II error is the probability of not rejecting the null hypothesis when it is actually false. An extension of Type II error is 1 − β, referred to as the power of the statistical inference test. Power is the probability of correctly rejecting the null hypothesis when it should be rejected. Thus, power is the probability that statistical significance will be indicated if it is present. The relationship of the different error probabilities in testing for the difference in two means is shown in Figure 1.4.

Figure 1.4
Relationship of Error Probabilities in Statistical Inference

Statistical Decision    Reality: No Difference    Reality: Difference
H0: No Difference       1 − α                     β (Type II error)
Ha: Difference          α (Type I error)          1 − β (Power)

Although specifying alpha establishes the level of acceptable statistical significance, it is the level of power that dictates the probability of success in finding the differences if they actually exist. Why not set both alpha and beta at acceptable levels? Because the Type I and Type II errors are inversely related: as alpha becomes more restrictive (moves closer to zero), the probability of a Type II error increases. That is, reducing Type I errors reduces the power of the statistical test. Thus, researchers must strike a balance between the level of alpha and the resulting power.
Impacts on Statistical Power But why can't high levels of power always be achieved? Power is not solely a function of alpha. Power is determined by three factors: effect size, significance level (α), and sample size.
Effect Size The probability of achieving statistical significance is based not only on statistical considerations, but also on the actual size of the effect. Thus, the effect size helps researchers determine whether the observed relationship (difference or correlation) is meaningful. For example, the effect size could be a difference in the means between two groups or the correlation between variables. If a weight loss firm claims its program leads to an average weight loss of 25 pounds, the 25 pounds is the effect size. Similarly, if a university claims its MBA graduates get a starting salary that is 50 percent higher than the average, the percentage is the effect size attributed to earning the degree.
When examining effect sizes, a larger effect is more likely to be detected than a smaller effect, and thus increases the power of the statistical test.
To assess the power of any statistical test, the researcher must first understand the effect being examined. Effect sizes are defined in standardized terms for ease of comparison. Mean differences are stated in terms of standard deviations; thus, an effect size of .5 indicates that the mean difference is one-half of a standard deviation. For correlations, the effect size is based on the actual correlation between the variables.
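A short sketch of this standardization (the widely used Cohen's d), computed here on simulated data with the pooled standard deviation in the denominator:

```python
import numpy as np

def cohens_d(group1: np.ndarray, group2: np.ndarray) -> float:
    """Standardized mean difference: (m1 - m2) / pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_var = ((n1 - 1) * group1.var(ddof=1) +
                  (n2 - 1) * group2.var(ddof=1)) / (n1 + n2 - 2)
    return (group1.mean() - group2.mean()) / np.sqrt(pooled_var)

rng = np.random.default_rng(3)
a = rng.normal(loc=5.0, scale=2.0, size=100)
b = rng.normal(loc=4.0, scale=2.0, size=100)
print(cohens_d(a, b))  # ~0.5: the means differ by half a standard deviation
```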
Significance Level (α) As alpha (α) becomes more restrictive (e.g., moving from .10 to .05 to .01), power decreases.
Therefore, as the researcher reduces the chance of incorrectly saying an effect is significant when it is not, the prob- ability of correctly finding an effect decreases. Conventional guidelines suggest alpha levels of .05 or .01. Researchers should consider the impact of a particular alpha level on the power before selecting the alpha level. The relationship of these two probabilities is illustrated in later discussions.
Sample Size At any given alpha level, increased sample sizes always produce greater power for the statistical test.
As sample sizes increase, researchers must decide if the power is too high. By "too high" we mean that by increasing sample size, smaller and smaller effects (e.g., correlations) will be found to be statistically significant, until at very large sample sizes almost any effect is significant. The researcher must always be aware that sample size can affect the statistical test either by making it insensitive (at small sample sizes) or overly sensitive (at very large sample sizes).
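The sketch below, assuming the statsmodels library, illustrates both points for a two-sample t test: at a small effect size (d = .2), power climbs toward 1.0 as the per-group sample size grows, and nearly 400 cases per group are needed just to reach the conventional .80 power level:

```python
from statsmodels.stats.power import TTestIndPower

analysis = TTestIndPower()

# Power of a two-sample t test for a small effect (d = 0.2) at alpha = .05
for n in (50, 200, 1000, 5000):
    p = analysis.power(effect_size=0.2, nobs1=n, alpha=0.05)
    print(f"n per group = {n:5d}  power = {p:.3f}")

# Sample size needed per group to reach power = .80
n_required = analysis.solve_power(effect_size=0.2, alpha=0.05, power=0.80)
print(f"n per group for power .80: {n_required:.1f}")
```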