Statistical assumptions of substantive analyses across the general linear model: a mini-review
Linearity is important in a practical sense because Pearson's r,
which is fundamental to the vast majority of parametric statistical
procedures (Graham, 2008), captures only the linear relationship
among variables (Tabachnick and Fidell, 2001). Pearson's r therefore
underestimates the strength of the true relationship between two
variables when that relationship is non-linear (i.e., curvilinear;
Warner, 2008).
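As an illustration, the following Python sketch (hypothetical data; `numpy` assumed available) shows Pearson's r collapsing toward zero for a relationship that is perfectly curvilinear:

```python
import numpy as np

# Hypothetical data: y is a perfect (noise-free) curvilinear function of x.
x = np.linspace(-3, 3, 101)
y = x ** 2

# Pearson's r captures only the linear component of the relationship,
# so it is near zero here despite a deterministic association.
r = np.corrcoef(x, y)[0, 1]
print(r)
```

A scatterplot of these data would reveal the relationship immediately, which is why graphical screening is recommended before relying on r.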
Unless there is strong theory specifying non-linear relation-
ships, researchers may assume linear relationships in their data
(Cohen et al., 2003). However, linearity is not guaranteed and
should be validated with graphical methods (see Tabachnick
and Fidell, 2001). Non-linearity reduces the power of statistical
tests such as ANCOVA, MANOVA, MANCOVA, linear regres-
sion, and canonical correlation (Tabachnick and Fidell, 2001).
In the case of ANCOVA and MANCOVA, non-linearity results
in improper adjusted means (Stevens, 2002). If non-linearity is
detected, researchers may transform data, incorporate curvilin-
ear components, eliminate the variable producing non-linearity,
or conduct a non-linear analysis (cf. Tabachnick and Fidell, 2001;
Osborne and Waters, 2002; Stevens, 2002; Osborne, 2012), as long
as the process is clearly reported.
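One of the listed remedies, incorporating a curvilinear component, can be sketched as follows (hypothetical data; `numpy` assumed available). Adding a quadratic term recovers variance that a purely linear model misses:

```python
import numpy as np

# Hypothetical data: a curvilinear trend plus noise.
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200)
y = 2 + 0.5 * x + 0.3 * x ** 2 + rng.normal(0, 1, x.size)

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot

# Linear fit vs. a fit that adds a quadratic (curvilinear) component.
lin = np.polyval(np.polyfit(x, y, 1), x)
quad = np.polyval(np.polyfit(x, y, 2), x)
print(r_squared(y, lin), r_squared(y, quad))
```

As with any of these strategies, the transformation or added term should be reported explicitly.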
VARIANCE
Across parametric statistical procedures commonly used in quan-
titative research, at least five assumptions relate to variance. These
are: homogeneity of variance, homogeneity of regression, spheric-
ity, homoscedasticity, and homogeneity of variance-covariance
matrix.
Homogeneity of variance applies to univariate group analyses
(independent samples t test, ANOVA, ANCOVA) and assumes that
the variance of the DV is roughly the same at all levels of the IV
(Warner, 2008). Levene's test assesses this assumption; smaller test
statistics (non-significant results) indicate greater homogeneity.
Research (Boneau,
1960; Glass et al., 1972) indicates that univariate group analyses
are generally robust to moderate violations of homogeneity of
variance as long as the sample sizes in each group are approxi-
mately equal. However, with unequal sample sizes, heterogeneity
may compromise the validity of null hypothesis decisions. When the
larger variances come from the smaller groups, the risk of Type I
error increases; when the larger variances come from the larger
groups, the risk of Type II error increases. When the assumption of homogeneity
of variance is violated, researchers may conduct and report non-
parametric tests such as the Kruskal-Wallis. However, Maxwell and
Delaney (2004) noted that the Kruskal-Wallis test also assumes
equal variances and suggested that data be either transformed to
meet the assumption of homogeneity of variance or analyzed with
tests such as Brown-Forsythe F* or Welch's W.
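The checks above can be run in a few lines of Python (hypothetical data; `scipy` assumed available). Here Levene's test flags the heterogeneous variances, and the Kruskal-Wallis test is shown as one commonly reported fallback, with the caveat from Maxwell and Delaney (2004) noted:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
# Hypothetical groups with markedly unequal spread.
g1 = rng.normal(10, 1, 30)
g2 = rng.normal(10, 3, 30)
g3 = rng.normal(10, 6, 30)

# Levene's test: a significant result signals unequal variances.
w, p_levene = stats.levene(g1, g2, g3)

# One fallback when homogeneity fails: the Kruskal-Wallis test
# (though it, too, assumes similar spread; see Maxwell and Delaney, 2004).
h, p_kw = stats.kruskal(g1, g2, g3)
print(p_levene, p_kw)
```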
Homogeneity of regression applies to group analyses with
covariates, including ANCOVA and MANCOVA, and assumes that
the regression between covariate(s) and DV(s) in one group is the
same as the regression in other groups (Tabachnick and Fidell,
2001). This assumption can be examined graphically or by con-
ducting a statistical test on the interaction between the COV(s)
and the IV(s). Violation of this assumption can lead to very mis-
leading results if covariance is used (Stevens, 2002). For example,
in the case of heterogeneous slopes, group means that have been
adjusted by a covariate could indicate no difference when, in fact,
group differences might exist at different values of the covariate. If heterogeneity of regression exists, ANCOVA and MANCOVA are
inappropriate analytic strategies (Tabachnick and Fidell, 2001).
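The statistical test mentioned above, testing the COV-by-IV interaction, can be sketched as a partial F test between nested regression models (hypothetical two-group data; `numpy` and `scipy` assumed available):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Hypothetical two-group data generated with the SAME covariate-DV slope
# in both groups, which is what homogeneity of regression requires.
n = 60
cov = rng.normal(0, 1, n)                 # covariate
grp = np.repeat([0.0, 1.0], n // 2)       # group indicator (IV)
dv = 2.0 + 1.5 * cov + 0.8 * grp + rng.normal(0, 1, n)

def rss(X, y):
    """Residual sum of squares from an ordinary least squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

ones = np.ones(n)
X_reduced = np.column_stack([ones, cov, grp])           # no interaction
X_full = np.column_stack([ones, cov, grp, cov * grp])   # adds COV x IV term

# Partial F test on the interaction term: a significant F indicates
# heterogeneous slopes, i.e., the assumption is violated.
df_num = 1
df_den = n - X_full.shape[1]
F = ((rss(X_reduced, dv) - rss(X_full, dv)) / df_num) / (rss(X_full, dv) / df_den)
p = stats.f.sf(F, df_num, df_den)
print(F, p)
```

Because these hypothetical data were generated with parallel slopes, a non-significant interaction is the expected outcome.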
Sphericity applies to repeated measures analyses that involve
three or more measurement occasions (repeated measures
ANOVA) and assumes that the variances of the differences for all
pairs of repeated measures are equal (Stevens, 2002). Presuming
that data are multivariate normal, Mauchly's test can be used to
evaluate this assumption; a non-significant result supports sphericity
(Tabachnick and Fidell, 2001). Violating the
sphericity assumption increases the risk of Type I error (Box,
1954). To adjust for this risk and provide better control for Type I
error rate, the degrees of freedom for the repeated measures F test
may be corrected using and reporting one of three adjustments:
(a) Greenhouse-Geisser, (b) Huynh-Feldt, and (c) Lower-bound
(see Nimon and Williams, 2009). Alternatively, researchers may
conduct and report analyses that do not assume sphericity (e.g.,
MANOVA).
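The Greenhouse-Geisser correction named above can be sketched numerically (hypothetical repeated-measures data; `numpy` assumed available). One standard formulation estimates epsilon from the double-centered covariance matrix of the repeated measures:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical repeated-measures data: 20 subjects x 4 occasions.
data = rng.normal(0, 1, (20, 4)) + np.array([0.0, 0.2, 0.4, 0.6])
k = data.shape[1]

# Greenhouse-Geisser epsilon: 1 means sphericity holds;
# the lower bound is 1/(k - 1).
S = np.cov(data, rowvar=False)
centering = np.eye(k) - np.ones((k, k)) / k
S_dc = centering @ S @ centering
eps = np.trace(S_dc) ** 2 / ((k - 1) * np.sum(S_dc ** 2))
print(eps)
```

Multiplying both the numerator and denominator degrees of freedom of the repeated measures F test by this epsilon yields the corrected test.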
Homoscedasticity applies to multiple linear regression and
canonical correlation and assumes that the variability in scores for
one continuous variable is roughly the same at all values of another
continuous variable (Tabachnick and Fidell, 2001). Scatterplots
are typically used to examine homoscedasticity. Linear regression is
generally robust to slight violations of homoscedasticity; how-
ever, marked heteroscedasticity increases the risk of Type I error
(Osborne and Waters, 2002). Canonical correlation performs best
when relationships among pairs of variables are homoscedastic
(Tabachnick and Fidell, 2001). If the homoscedasticity assumption
is violated, researchers may delete outlying cases, transform data,
or conduct non-parametric tests (see Conover, 1999; Osborne,
2012), as long as the process is clearly reported.
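A scatterplot inspection can be supplemented with a crude numerical check (hypothetical heteroscedastic data; `numpy` and `scipy` assumed available): split the regression residuals at the median of the predictor and compare their spread with Levene's test:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Hypothetical heteroscedastic data: residual spread grows with x.
x = np.linspace(1, 10, 200)
y = 3 + 2 * x + rng.normal(0, x)   # noise standard deviation scales with x

# Residuals from the ordinary linear fit.
slope, intercept = np.polyfit(x, y, 1)
resid = y - (intercept + slope * x)

# Compare residual variance in the lower vs. upper half of x;
# a small p-value flags heteroscedasticity.
lower = resid[x <= np.median(x)]
upper = resid[x > np.median(x)]
w, p = stats.levene(lower, upper)
print(p)
```

This is only a screening device; the scatterplot itself remains the primary diagnostic.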
Homogeneity of variance-covariance matrix is a multivariate
generalization of homogeneity of variance. It applies to multivari-
ate group analyses (MANOVA and MANCOVA) and assumes that
the variance-covariance matrix is roughly the same at all levels
of the IV (Stevens, 2002). Box's M test assesses this assumption;
smaller statistics indicate greater homogeneity. Tabachnick
and Fidell (2001) provided the following guidelines for inter-
preting violations of this assumption: if sample sizes are equal,
heterogeneity is not an issue. However, with unequal sample sizes,
heterogeneity may compromise the validity of null hypothesis
decisions. When the larger variances come from the smaller groups,
the risk of Type I error increases, whereas when they come from the
larger groups, the risk of Type II error increases. If sample sizes are
unequal and Box's M test is significant at p < 0.001, researchers
should use Pillai's trace criterion or equalize sample sizes by random
deletion of cases if power can be retained.
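Box's M can be computed directly (hypothetical two-group, three-DV data; `numpy` and `scipy` assumed available). The sketch below uses the usual definition of M and the Box (1949) chi-square approximation, which is adequate for moderate samples:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical MANOVA-style data: 2 groups, 3 dependent variables.
data = [rng.normal(0, 1, (40, 3)), rng.normal(0, 1, (40, 3))]
n_dv = 3
ns = [grp.shape[0] for grp in data]
N, n_grp = sum(ns), len(data)

# Box's M compares each group's covariance matrix to the pooled matrix;
# smaller values indicate greater homogeneity.
covs = [np.cov(grp, rowvar=False) for grp in data]
pooled = sum((n - 1) * S for n, S in zip(ns, covs)) / (N - n_grp)
M = (N - n_grp) * np.log(np.linalg.det(pooled)) - sum(
    (n - 1) * np.log(np.linalg.det(S)) for n, S in zip(ns, covs))

# Box (1949) chi-square approximation to the null distribution of M.
c = ((2 * n_dv**2 + 3 * n_dv - 1) / (6 * (n_dv + 1) * (n_grp - 1))) * (
    sum(1 / (n - 1) for n in ns) - 1 / (N - n_grp))
df = n_dv * (n_dv + 1) * (n_grp - 1) / 2
p_value = stats.chi2.sf(M * (1 - c), df)
print(M, p_value)
```

Since both hypothetical groups are drawn from the same population, a non-significant result is the expected outcome here; the stringent p < 0.001 criterion discussed above reflects the test's sensitivity in large samples.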
DISCUSSION
With the advances in statistical software, it is easy for researchers to
use point and click methods to conduct a wide variety of statisti-
cal analyses on their datasets. However, the output from statistical
software packages typically does not fully indicate if necessary sta-
tistical assumptions have been met. I invite editors and reviewers to
use the information presented in this article as a basic checklist of
the statistical assumptions to be reported in scholarly reports. The
references cited in this article should also be helpful to researchers
who are unfamiliar with a particular assumption or how to test it.
Frontiers in Psychology | Quantitative Psychology and Measurement
Nimon · August 2012 | Volume 3 | Article 322