MANOVA: Multivariate Analysis of Variance
Review of ANOVA: Univariate Analysis of Variance
A univariate analysis of variance looks for the causal impact of a nominal-level independent variable (factor) on a single dependent variable measured at the interval level or better.
The basic question is whether there is a difference in scores on the dependent variable attributable to membership in one or another category of the independent variable.
- Analysis of Variance (ANOVA): required when there are three or more levels or conditions of the independent variable (but can be done when there are only two)
- What is the impact of ethnicity (IV) (Hispanic, African-American, Asian-Pacific Islander, Caucasian, etc.) on annual salary (DV)?
- What is the impact of three different methods of meeting a potential mate (IV) (online dating service, speed dating, setup by friends) on likelihood of a second date (DV)?
Basic Analysis of Variance Concepts
We are going to make two estimates of the common population variance, σ².
- The first estimate of the common variance σ² is called the “between” (or “among”) estimate; it is based on the variance of the IV category means about the grand mean. It estimates σ² only when the null hypothesis is true
- The second is called the “within” estimate, a weighted average of the variances within each of the IV categories. This is an unbiased estimate of σ² whether or not the null hypothesis is true
The ANOVA test, called the F test, compares the between estimate to the within estimate.
If the null hypothesis is true (the population means on the DV are equal across the levels of the IV), the ratio of the between to the within estimate of σ² should be close to one, because both are estimating the same variance.
If the null hypothesis is false and the population means are not equal, the F ratio will tend to exceed one; we reject the null hypothesis when F is significantly greater than unity (one).
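The two estimates and their ratio can be sketched in a few lines of Python. This is a minimal illustration on made-up scores (the three groups below are hypothetical, not data from these slides), and the function name `one_way_f` is our own.

```python
from statistics import mean, variance

def one_way_f(groups):
    """Return (between estimate, within estimate, F) for a list of groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(x for g in groups for x in g)
    # Between estimate: spread of the group means about the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ms_between = ss_between / (k - 1)
    # Within estimate: weighted average of the sample variances in each group
    ss_within = sum((len(g) - 1) * variance(g) for g in groups)
    ms_within = ss_within / (n_total - k)
    return ms_between, ms_within, ms_between / ms_within

# Hypothetical scores for three levels of an IV
groups = [[4.0, 5.0, 6.0], [6.0, 7.0, 8.0], [8.0, 9.0, 10.0]]
ms_between, ms_within, f_ratio = one_way_f(groups)
print(ms_between, ms_within, f_ratio)
```

Here the group means (5, 7, 9) are far apart relative to the spread inside each group, so the between estimate is much larger than the within estimate and F is well above one. Judging significance still requires the F distribution with (k − 1, N − k) degrees of freedom.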
Basic ANOVA Output
More Review of ANOVA
Even if we obtain a significant value of F and the overall difference of means is significant, the F statistic does not tell us how the mean scores differ among the levels of the IV.
We can run pairwise post-hoc tests that compare the means of the levels of the IV.
The type of test depends on whether or not the variances of the groups (conditions or levels of the IV) are equal. We test this using the Levene statistic.
- If it is significant at p < .05 (group variances are significantly different), we use an alternative post-hoc test such as Tamhane
- If it is not significant (group variances are not significantly different), we can use the Scheffé or a similar test
- In this example, variances are not significantly different (p > .05), so we use the Scheffé test
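As a rough sketch of what the Levene statistic measures: it is an ANOVA-style F ratio computed on each score's absolute deviation from its own group mean. The version below assumes the mean-centered form of the test and uses hypothetical scores; the function name `levene_w` is ours. It returns only the statistic, since the p-value needs the F distribution with (k − 1, N − k) degrees of freedom, which is not in the Python standard library.

```python
from statistics import mean

def levene_w(groups):
    """Mean-centered Levene statistic for a list of groups of scores."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # Absolute deviation of every score from its own group mean
    z = [[abs(x - mean(g)) for x in g] for g in groups]
    z_grand = mean(v for zg in z for v in zg)
    between = sum(len(zg) * (mean(zg) - z_grand) ** 2 for zg in z)
    within = sum((v - mean(zg)) ** 2 for zg in z for v in zg)
    return (n_total - k) / (k - 1) * between / within

# Hypothetical groups: same spread vs. very different spread
w_equal = levene_w([[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]])
w_unequal = levene_w([[9.0, 10.0, 11.0], [5.0, 10.0, 15.0]])
print(w_equal, w_unequal)
```

When the groups have the same spread the statistic is zero no matter how far apart their means are; it grows as the spreads diverge.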
Review of Factorial ANOVA
Two-way ANOVA applies when you have two nominal-level independent variables and one dependent variable measured at the interval level or better.
Each of the independent variables may have any number of levels or conditions (e.g., Treatment 1, Treatment 2, Treatment 3, … No Treatment).
In a two-way ANOVA you will obtain three F ratios:
- One of these will tell you if your first independent variable has a significant main effect on the DV
- A second will tell you if your second independent variable has a significant main effect on the DV
- The third will tell you if the interaction of the two independent variables has a significant effect on the DV, that is, if the impact of one IV depends on the level of the other
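The three F ratios can be sketched for the simplest case, a balanced design with the same number of scores in every cell. The data and the function name `two_way_f` are hypothetical, and the formulas below assume equal cell sizes.

```python
from statistics import mean

def two_way_f(cells):
    """cells maps (level_a, level_b) -> list of scores; equal cell sizes assumed."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # scores per cell
    grand = mean(x for v in cells.values() for x in v)
    a_mean = {a: mean(x for b in b_levels for x in cells[(a, b)]) for a in a_levels}
    b_mean = {b: mean(x for a in a_levels for x in cells[(a, b)]) for b in b_levels}
    # Main-effect sums of squares: marginal means about the grand mean
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    # Interaction: what is left of each cell mean after removing both main effects
    ss_ab = n * sum((mean(cells[(a, b)]) - a_mean[a] - b_mean[b] + grand) ** 2
                    for a in a_levels for b in b_levels)
    ss_within = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)
    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    ms_within = ss_within / (len(a_levels) * len(b_levels) * (n - 1))
    return {"A": ss_a / df_a / ms_within,
            "B": ss_b / df_b / ms_within,
            "AxB": ss_ab / (df_a * df_b) / ms_within}

# Hypothetical 2 x 2 design with two scores per cell
cells = {("a1", "b1"): [1.0, 3.0], ("a1", "b2"): [3.0, 5.0],
         ("a2", "b1"): [5.0, 7.0], ("a2", "b2"): [11.0, 13.0]}
f_ratios = two_way_f(cells)
print(f_ratios)
```

In this toy example the effect of B is larger at a2 than at a1, which is what the third (interaction) F ratio picks up.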
Review: Factorial ANOVA Example
Plots of Interaction Effects
MANOVA: What Kinds of Hypotheses Can It Test?
A MANOVA, or multivariate analysis of variance, is a way to test the hypothesis that one or more independent variables, or factors, have an effect on a set of two or more dependent variables.
- For example, you might wish to test the hypothesis that sex and ethnicity interact to influence a set of job-related outcomes including attitudes toward co-workers, attitudes toward supervisors, feelings of belonging in the work environment, and identification with the corporate culture
- As another example, you might want to test the hypothesis that three different methods of teaching writing result in significant differences in ratings of student creativity, student acquisition of grammar, and assessments of writing quality by an independent panel of judges
Why Should You Do a MANOVA?
You do a MANOVA instead of a series of one-at-a-time ANOVAs for two main reasons:
- Supposedly, to reduce the experiment-wise level of Type I error. Eight F tests at .05 each mean the experiment-wise probability of making at least one Type I error (rejecting the null hypothesis when it is in fact true) can be as high as 40% (8 × .05), or about 34% if the tests are independent. However, the so-called overall test or omnibus test protects against this inflated error probability only when the null hypothesis is true. If you follow up a significant multivariate test with a bunch of ANOVAs on the individual variables without adjusting the error rates for the individual tests, there’s no “protection”
- None of the individual ANOVAs may produce a significant main effect on its DV, but the DVs in combination might, which suggests that the variables are more meaningful taken together than considered separately
- MANOVA takes into account the intercorrelations among the DVs
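The arithmetic behind the inflated error rate is short enough to check directly. The 40% figure for eight tests is the additive upper bound (number of tests × alpha); if the tests are assumed to be independent, the exact probability of at least one Type I error is 1 − (1 − alpha)^k. The function names below are ours.

```python
def familywise_error(alpha, n_tests):
    """P(at least one Type I error) across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def additive_bound(alpha, n_tests):
    """Upper bound on the familywise error rate; holds without independence."""
    return min(1.0, alpha * n_tests)

exact = familywise_error(0.05, 8)
bound = additive_bound(0.05, 8)
print(round(exact, 3), round(bound, 3))
```

Either way, running eight unadjusted tests at .05 puts the chance of at least one false rejection far above the nominal 5%.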
Assumptions of MANOVA
1. Multivariate normality
- All of the DVs must be distributed normally (you can visualize this with histograms, and formal tests are available)
- Any linear combination of the DVs must be distributed normally
- Check out pairwise relationships among the DVs for nonlinear relationships using scatter plots
- All subsets of the variables must have a multivariate normal distribution
- These requirements are rarely if ever tested in practice
- MANOVA is assumed to be a robust test that can stand up to departures from multivariate normality in terms of Type I error rate
- Statistical power (power to detect a main or interaction effect) may be reduced when distributions are very plateau-like (platykurtic)
Assumptions of MANOVA, cont’d
2. Homogeneity of the covariance matrices
- In ANOVA we talked about the need for the variances of the dependent variable to be equal across levels of the independent variable
- In MANOVA, the univariate requirement of equal variances has to hold for each one of the dependent variables
In MANOVA we extend this concept and require that the “covariance matrices” be homogeneous.
- Computations in MANOVA require the use of matrix algebra, and each person’s “score” on the dependent variables is actually a “vector” of scores on DV1, DV2, DV3, …, DVn
- The covariance matrices (the variance of each DV on the diagonal and, off the diagonal, the covariance, or shared variance, between each pair of DVs) have to be equal across all levels of the independent variable
Assumptions of MANOVA, cont’d
- This homogeneity assumption is tested with a test that is similar to Levene’s test for the ANOVA case. It is called Box’s M, and it works the same way: it tests the null hypothesis that the covariance matrices of the dependent variables are equal across levels of the independent variable, so a significant result means they differ
- Putting this in English, what you don’t want is the case where, if your IV was, for example, ethnicity, all the people in the “other” category had scores on their 6 dependent variables clustered very tightly around their mean, whereas people in the “white” category had scores on the vector of 6 dependent variables clustered very loosely around the mean. You don’t want a tightly clustered set of distributions for one level of the IV and a widely dispersed set for another level
- If Box’s M is significant, it means you have violated an assumption of MANOVA. This is not much of a problem if you have equal cell sizes and large N; it is a much bigger issue with small sample sizes and/or unequal cell sizes (in factorial ANOVA, if there are unequal cell sizes, the sums of squares for the three sources (two main effects and interaction effect) won’t add up to the Total SS)
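To make the idea concrete, here is a minimal sketch of the Box's M statistic itself: it compares the log determinant of the pooled covariance matrix with the log determinants of the per-group covariance matrices. The data are hypothetical two-DV scores, the helper names are ours, and the sketch stops at the statistic (the significance test uses a chi-square or F approximation that is not reproduced here).

```python
import math
from statistics import mean

def cov_matrix(rows):
    """Sample covariance matrix of a list of score vectors."""
    n, p = len(rows), len(rows[0])
    m = [mean(r[j] for r in rows) for j in range(p)]
    return [[sum((r[i] - m[i]) * (r[j] - m[j]) for r in rows) / (n - 1)
             for j in range(p)] for i in range(p)]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def box_m(groups):
    """Box's M; assumes every group covariance matrix is non-singular."""
    k, p = len(groups), len(groups[0][0])
    n_total = sum(len(g) for g in groups)
    covs = [cov_matrix(g) for g in groups]
    pooled = [[sum((len(g) - 1) * s[i][j] for g, s in zip(groups, covs)) / (n_total - k)
               for j in range(p)] for i in range(p)]
    return (n_total - k) * math.log(det(pooled)) - sum(
        (len(g) - 1) * math.log(det(s)) for g, s in zip(groups, covs))

# Hypothetical two-DV score vectors
tight = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
shifted = [(x + 10, y + 10) for x, y in tight]  # same spread, different means
loose = [(3 * x, 3 * y) for x, y in tight]      # three times the spread
m_equal = box_m([tight, shifted])
m_unequal = box_m([tight, loose])
print(m_equal, m_unequal)
```

M is zero when the groups share the same covariance matrix, regardless of where their means sit, and grows as one group becomes more dispersed than another.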
Assumptions of MANOVA, cont’d
3. Independence of observations
- Subjects’ scores on the dependent measures should not be influenced by or related to scores of other subjects in the condition or level
- Can be tested with an intraclass correlation coefficient if lack of independence of observations is suspected
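One common form of the intraclass correlation can be sketched from the same between and within mean squares used in ANOVA. The sketch assumes subjects are nested in clusters of equal size (for example, classrooms or testing sessions, which are hypothetical here) and uses the one-way form ICC = (MSB − MSW) / (MSB + (n − 1) MSW); other ICC variants exist, and the function name `icc_oneway` is ours.

```python
from statistics import mean

def icc_oneway(clusters):
    """One-way ICC for equal-sized clusters of scores on a single DV."""
    k = len(clusters)
    n = len(clusters[0])  # subjects per cluster (assumed equal)
    grand = mean(x for c in clusters for x in c)
    ms_between = n * sum((mean(c) - grand) ** 2 for c in clusters) / (k - 1)
    ms_within = sum((x - mean(c)) ** 2 for c in clusters for x in c) / (k * (n - 1))
    return (ms_between - ms_within) / (ms_between + (n - 1) * ms_within)

# Hypothetical scores: subjects in the same cluster score almost alike
icc = icc_oneway([[1.0, 2.0], [5.0, 6.0], [9.0, 10.0]])
print(round(icc, 3))
```

A value near zero is consistent with independent observations; a value near one, as in this toy example, means scores within a cluster resemble each other far more than scores across clusters.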
MANOVA Example
Let’s test the hypothesis that region of the country (IV) has a significant impact on three DVs: percent of people who are Christian adherents, divorces per 1000 population, and abortions per 1000 population.
The hypothesis is that there will be a significant multivariate main effect for region. Another way to put this is that the vectors of means for the three DVs differ among regions of the country.
This is done with the General Linear Model/Multivariate procedure in SPSS (we will look first at an example where the analysis has already been done).
Computations are done using matrix algebra to compare the variability in B (the Between-Groups sums of squares and cross-products, or SSCP, matrix) with that in W (the Within-Groups SSCP matrix).
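One common way to summarize the comparison of B and W, and the statistic reported for this example, is Wilks' λ = |W| / |B + W|, where B + W is the total SSCP matrix. The sketch below computes it for hypothetical two-DV data (not the regional data set, which is not reproduced in these slides); the helper names are ours.

```python
from statistics import mean

def sscp(rows, center):
    """Sums of squares and cross-products of rows about a center vector."""
    p = len(center)
    return [[sum((r[i] - center[i]) * (r[j] - center[j]) for r in rows)
             for j in range(p)] for i in range(p)]

def det(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, d = len(a), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[piv][c] == 0:
            return 0.0
        if piv != c:
            a[c], a[piv] = a[piv], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return d

def wilks_lambda(groups):
    p = len(groups[0][0])
    all_rows = [r for g in groups for r in g]
    grand = [mean(r[j] for r in all_rows) for j in range(p)]
    total = sscp(all_rows, grand)  # T = B + W
    within = [[0.0] * p for _ in range(p)]
    for g in groups:
        center = [mean(r[j] for r in g) for j in range(p)]
        s = sscp(g, center)
        for i in range(p):
            for j in range(p):
                within[i][j] += s[i][j]
    return det(within) / det(total)

# Hypothetical data: two groups with identical vs. widely separated mean vectors
g1 = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
same_means = [(1.0, 0.0), (1.0, 2.0), (0.0, 1.0), (2.0, 1.0)]
far_apart = [(x + 10, y + 10) for x, y in g1]
lam_null = wilks_lambda([g1, same_means])
lam_effect = wilks_lambda([g1, far_apart])
print(lam_null, lam_effect)
```

λ runs from 1 (the group mean vectors are identical, so B contributes nothing) toward 0 (nearly all variability lies between groups), so smaller values indicate a stronger multivariate effect.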
MANOVA test of Our Hypothesis
MANOVA Test of our Hypothesis, cont’d
Box’s Test of Equality of Covariance Matrices
Looking at the Individual Dependent Variables
If the overall F test is significant, it is common practice to go ahead and look at the individual dependent variables with separate ANOVA tests.
- The experiment-wise alpha protection provided by the overall or omnibus F test does not extend to the univariate tests. You should divide your alpha level by the number of tests you intend to perform (a Bonferroni adjustment), so in this case, if you expect to look at F tests for the three dependent variables, you should require that p < .017 (.05/3)
This procedure ignores the fact that the variables may be intercorrelated and that the separate ANOVAs do not take these intercorrelations into account.
- You could get three significant F ratios, but if the variables are highly correlated you’re basically getting the same result over and over
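The adjustment itself is a one-line calculation. The sketch below applies it to three made-up p-values (the DV names and values are hypothetical, chosen only to show one test surviving the stricter cutoff and two failing it); the function names are ours.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance cutoff that keeps the overall level at alpha."""
    return alpha / n_tests

def flag_significant(p_values, alpha=0.05):
    """Map each DV name to True if its p-value beats the adjusted cutoff."""
    cutoff = bonferroni_threshold(alpha, len(p_values))
    return {name: p < cutoff for name, p in p_values.items()}

cutoff = bonferroni_threshold(0.05, 3)
example_p = {"dv1": 0.010, "dv2": 0.030, "dv3": 0.200}  # hypothetical p-values
flags = flag_significant(example_p)
print(round(cutoff, 3), flags)
```

Note that dv2 would count as significant at the unadjusted .05 level but not at the adjusted .017 level.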
Univariate ANOVA tests of Three Dependent Variables
Writing up More of Your Results
So far you have written the following:
- “A one-way MANOVA revealed a significant multivariate main effect for region, Wilks’ λ = .465, F(9, 95.066) = 3.9, p < .001, partial eta squared = .225. Power to detect the effect was .964. Thus Hypothesis 1 was confirmed.”
You continue to write:
- “Given the significance of the overall test, the univariate main effects were examined. Significant univariate main effects for region were obtained for percentage of Christian adherents, F(3, 41) = 3.944, p < .015, partial eta squared = .224, power = .794; and number of divorces per 1000 population, F(3, 41) = 8.789, p < .001, partial eta squared = .391, power = .991.”
Finally, Post-hoc Comparisons with Scheffé Test for the DVs that had Significant Univariate ANOVAs
Significant Pairwise Regional Differences on the Two Significant DVs
Writing up All of Your MANOVA Results
Your final paragraph will look like this:
“A one-way MANOVA revealed a significant multivariate main effect for region, Wilks’ λ = .465, F(9, 95.066) = 3.9, p < .001, partial eta squared = .225. Power to detect the effect was .964. Thus Hypothesis 1 was confirmed. Given the significance of the overall test, the univariate main effects were examined. Significant univariate main effects for region were obtained for percentage of Christian adherents, F(3, 41) = 3.944, p < .015, partial eta squared = .224, power = .794; and number of divorces per 1000 population, F(3, 41) = 8.789, p < .001, partial eta squared = .391, power = .991. Significant regional pairwise differences were obtained in number of divorces per 1000 population between the West and both the Northeast and Midwest. The mean number of divorces per 1000 population was 5.59 in the West, 3.6 in the Northeast, and 3.74 in the Midwest.”
You can present the pairwise results, the overall MANOVA F results, and the univariate F results in separate tables.
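As a sanity check on a write-up like this, several of the reported numbers can be recomputed from the others. The sketch below assumes the usual formulas: Rao's F approximation for Wilks' λ, partial eta squared = 1 − λ^(1/s) with s = min(number of DVs, hypothesis df) for the multivariate test, and partial eta squared = F·df1 / (F·df1 + df2) for the univariate tests. That SPSS computed the output this way is our assumption. The inputs (λ = .465, three DVs, hypothesis df = 3, error df = 41, and the two univariate F values) are taken from the paragraph above.

```python
import math

def rao_f_from_wilks(lam, p, df_h, df_e):
    """Rao's F approximation for Wilks' lambda.

    p = number of DVs, df_h = hypothesis df, df_e = error df.
    Valid when p**2 + df_h**2 - 5 > 0, as it is here.
    """
    t = math.sqrt((p ** 2 * df_h ** 2 - 4) / (p ** 2 + df_h ** 2 - 5))
    w = df_e + df_h - (p + df_h + 1) / 2
    df1 = p * df_h
    df2 = w * t - (p * df_h - 2) / 2
    root = lam ** (1 / t)
    return (1 - root) / root * df2 / df1, df1, df2

def partial_eta_sq_wilks(lam, p, df_h):
    """Multivariate partial eta squared from Wilks' lambda."""
    return 1 - lam ** (1 / min(p, df_h))

def partial_eta_sq_f(f, df1, df2):
    """Univariate partial eta squared recovered from an F ratio and its df."""
    return f * df1 / (f * df1 + df2)

f_multi, df1, df2 = rao_f_from_wilks(0.465, p=3, df_h=3, df_e=41)
eta_multi = partial_eta_sq_wilks(0.465, p=3, df_h=3)
eta_christian = partial_eta_sq_f(3.944, 3, 41)
eta_divorce = partial_eta_sq_f(8.789, 3, 41)
print(round(f_multi, 1), df1, round(df2, 3))
print(round(eta_multi, 3), round(eta_christian, 3), round(eta_divorce, 3))
```

Under these assumptions the recomputed values agree with the reported F(9, 95.066) = 3.9 and with the partial eta squared values of .225, .224, and .391, which is a quick way to catch transcription errors before submitting a write-up.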
Now You Try It!
Let’s test the hypothesis that region of the country and availability of an educated workforce have an impact on three dependent variables: % union members, per capita income, and unemployment rate.
Although a test will be performed for an interaction between region and workforce education level, no specific effect is hypothesized.
Go to the SPSS Data Editor.
Running a MANOVA in SPSS
Go to Analyze/General Linear Model/Multivariate.
Move Census Region and HS Educ into the Fixed Factors category (this is where the IVs go).
Move per capita income, unemployment rate, and % of workers who are union members into the Dependent Variables category.
Under Plots, create four plots: one for each of the two main effects (region, HS educ) and two for their interaction. Use the Add button to add each new plot.
- Move region into the horizontal axis window and click the Add button
- Move hscat4 (HS educ) into the horizontal axis window and click the Add button
- Move region into the horizontal axis window and hscat4 into the separate lines window and click Add
- Move hscat4 (HS educ) into the horizontal axis window and region into the separate lines window and click Add, then click Continue
Setting up MANOVA in SPSS
- Under Options, move all of the factors including the interactions into the Display Means for window
- Select descriptive statistics, estimates of effect size, observed power, and homogeneity tests
- Set the significance level to .05 and click Continue
- Click OK
Compare your output to the next several slides
MANOVA Main and Interaction Effects
Univariate Tests: ANOVAs on each of the Three DVs for Region, HS Educ
Pairwise Comparisons on the Significant Univariate Tests
We found that the only significant univariate main effect was for the effect of region on unemployment rate. Now let’s ask: what are the differences between regions in unemployment rate, considered two at a time?
What does Levene’s statistic say about the kind of post-hoc test we can do with respect to the region variable? According to the output, the group variances on unemployment rate are not significantly different, so we can do a Scheffé test.
Pairwise Difference of Means
Reporting the Differences
Lab # 9
Duplicate the preceding data analysis in SPSS. Write up the results (the tests of the hypothesis about the main effects of region and HS Educ on the three dependent variables of per capita income, unemployment rate, and % union members) as if you were writing for publication. Put your paragraph in a Word document, and illustrate your results with tables from the output as appropriate (for example, the overall multivariate F table and the table of mean scores broken down by regions). You can also use plots to illustrate significant effects.