## Applied Data Analysis Homework # 6

Part One:  Brief Response

The main and most important part of Chapter 13 was how to talk and think about factorial ANOVAs: the terminology, notation, calculations, and underlying logic. There were several other major points, but I thought the most important might be how to deal with unequal sample sizes: the problems they create and ways to resolve them.

I found section 13.8 (new edition) to be the most confusing. I get most of the individual concepts, like sampling fractions and fixed variables, and I’m a bit confused about how random sampling can generate levels, but I didn’t put it all together: I don’t understand expected mean squares.

Part Two: Problem Set

1a. Reported level of distraction had a significant effect on number of errors made (F(1, 131) = 65.696, p < .001). However, after controlling for level of distraction, there was still a significant effect of task type on number of errors (F(2, 131) = 139.638, p < .001). That is, although level of distraction had a significant effect on the number of errors made, the type of task significantly predicted number of errors above and beyond level of distraction.

1b. The full model from last week, not including distractibility, had F(2, 132) = 113.474, p < .001 and R² = .632. When distractibility was accounted for, both the F-value for task type and the R² increased (to F(2, 131) = 139.638 and R² = .755). This is because the extra factor, distractibility, accounts for some of the variation that had previously been considered error, decreasing the SS of error from 16670.400 to 11102.535. Decreasing the error increases the power of the test.
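As a quick check on the numbers in 1b, the sketch below recomputes R² and last week's F directly from the sums of squares reported in the SPSS output. The variable and function names are my own, and the corrected total SS (45331.926) is taken from Figure 2.

```python
# Sums of squares copied from the SPSS output quoted in this write-up.
ss_total = 45331.926         # corrected total SS (Figure 2)
ss_error_anova = 16670.400   # error SS without the covariate (last week's model)
ss_error_ancova = 11102.535  # error SS with distractibility in the model


def r_squared(ss_error, ss_total):
    """Proportion of total variation that is NOT left in the error term."""
    return 1.0 - ss_error / ss_total


r2_anova = r_squared(ss_error_anova, ss_total)
r2_ancova = r_squared(ss_error_ancova, ss_total)

# One-way ANOVA F for task type: between-groups MS over error MS.
df_task, df_error_anova = 2, 132
ms_task = (ss_total - ss_error_anova) / df_task
ms_error_anova = ss_error_anova / df_error_anova
f_anova = ms_task / ms_error_anova

print(f"R^2 without covariate: {r2_anova:.3f}")
print(f"R^2 with covariate:    {r2_ancova:.3f}")
print(f"One-way F(2, 132):     {f_anova:.3f}")
```

The smaller error SS is what drives both changes: it raises R² directly and shrinks the denominator of every F-ratio in the model.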

1c. Pattern recognition had a raw mean of 9.644 and a slightly larger adjusted mean of 9.709. The cognitive task had a raw mean of 38.778 and a smaller adjusted mean of 37.236. The driving simulation had a raw mean of 6.356 and a larger adjusted mean of 7.833. The means have changed because we have taken out the effect contributed by distractibility. The raw means are just the mean number of errors per task. The adjusted means are calculated with the effect of the covariate removed, holding it at its mean, 112.54. Figure 4 shows a plot of these means.

Descriptive Statistics (Dependent Variable: Number of Errors)

| Task Type | Mean | Std. Deviation | N |
|---|---|---|---|
| Pattern Recognition | 9.6444 | 4.51339 | 45 |
| Cognitive Task | 38.7778 | 18.05533 | 45 |
| Driving Simulation | 6.3556 | 5.70150 | 45 |
| Total | 18.2593 | 18.39288 | 135 |

Figure 1. Means are raw means.

Tests of Between-Subjects Effects (Dependent Variable: Number of Errors)

| Source | Type III Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|
| Corrected Model | 34229.391a | 3 | 11409.797 | 134.625 | .000 |
| Intercept | 1213.155 | 1 | 1213.155 | 14.314 | .000 |
| distract | 5567.865 | 1 | 5567.865 | 65.696 | .000 |
| tasktype | 23669.320 | 2 | 11834.660 | 139.638 | .000 |
| Error | 11102.535 | 131 | 84.752 | | |
| Total | 90341.000 | 135 | | | |
| Corrected Total | 45331.926 | 134 | | | |

a. R Squared = .755 (Adjusted R Squared = .749)

Figure 2. Showing a significant p-value for distractibility and for the omnibus test for type of task.

Task Type (Dependent Variable: Number of Errors)

| Task Type | Mean | Std. Error | 95% CI Lower Bound | 95% CI Upper Bound |
|---|---|---|---|---|
| Pattern Recognition | 9.709a | 1.372 | 6.994 | 12.423 |
| Cognitive Task | 37.236a | 1.385 | 34.495 | 39.977 |
| Driving Simulation | 7.833a | 1.384 | 5.095 | 10.572 |

a. Covariates appearing in the model are evaluated at the following values: Distractability = 112.5407.

Figure 3. These are the adjusted means.

Figure 4. Estimated marginal means for number of errors by

2a. Distractibility significantly predicted number of errors (β = .483, t(133) = 6.355, p < .001), accounting for just over 23% of the variance in number of errors, R² = .233. That is, the more distractible participants rated themselves, the more errors they tended to make, and about 23% of the variance in errors made can be accounted for by this relationship.

The β reported above (.483) represents the change in errors made, in terms of standard deviations, associated with a one standard deviation change in distractibility. The unstandardized coefficient for distractibility, b = .418, is the change in number of errors associated with a one raw unit change in distractibility. Both of these are statistically significantly different from zero, as indicated by p < .001 (see Figure 6), which means that the variation accounted for by distractibility is significant.
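The link between the two coefficients can be sketched numerically. In a regression with a single predictor, β equals b rescaled by the ratio of the predictor's standard deviation to the outcome's, and β² equals R². The standard deviations below are the ones SPSS reports in Figure 8; the variable names are my own.

```python
# Values copied from the SPSS output in this write-up.
b_distract = 0.418      # unstandardized coefficient (Figure 6)
sd_distract = 21.25248  # SD of Distractability (Figure 8)
sd_errors = 18.39288    # SD of Number of Errors (Figure 8)

# Standardized coefficient: change in SDs of errors per one-SD change
# in distractibility.
beta_distract = b_distract * sd_distract / sd_errors

# With one predictor, beta is the Pearson correlation, so beta^2 = R^2.
r2_single_predictor = beta_distract ** 2

print(f"beta = {beta_distract:.3f}")
print(f"R^2  = {r2_single_predictor:.3f}")
```

The results agree with the reported β = .483 and R² = .233 to three decimals. Small differences in later digits would only reflect the rounding of b to .418.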

Model Summary

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate |
|---|---|---|---|---|
| 1 | .483a | .233 | .227 | 16.16918 |

a. Predictors: (Constant), Distractability

Figure 5.

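Coefficients (a)

| Model | Predictor | B | Std. Error | Beta | t | Sig. |
|---|---|---|---|---|---|---|
| 1 | (Constant) | -28.750 | 7.526 | | -3.820 | .000 |
| 1 | Distractability | .418 | .066 | .483 | 6.355 | .000 |

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient.

a. Dependent Variable: Number of Errors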

Figure 6.

3. An augmented model including Distractibility (b = .309, t(131) = 8.105, p < .001) and the orthogonal contrasts shown in Figure 7, namely the Cognitive Task vs. the average of the Pattern Recognition and Driving Simulation contrast (b = 9.488, t(131) = 16.696, p < .001) and the Pattern Recognition vs. Driving Simulation contrast (b = -.938, t(131) = -.962, p = .338), provided a reasonably good fit (R² = .755), significantly predicting variance in errors made (F(3, 131) = 134.625, p < .001). In other words, this model accounted for over 75% of the variation in number of errors, which was statistically significant. Note that in the correlation table, Figure 9, the two contrasts correlate at .000, showing that they are indeed orthogonal.
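The orthogonality claim can also be checked from the contrast codes themselves. The sketch below assumes the codes in Figure 7 are in the order Pattern Recognition, Cognitive Task, Driving Simulation, and uses the equal group size of 45 from Figure 1; the names are my own.

```python
import statistics

# Contrast codes in the order (PR, CT, DS), as read from Figure 7.
cc1 = (-1, 2, -1)  # Cognitive Task vs. average of PR and DS
cc2 = (-1, 0, 1)   # Pattern Recognition vs. Driving Simulation
n_per_group = 45

# Each contrast must sum to zero, and with equal n the two are
# orthogonal when the sum of their cross-products is zero.
sum_cc1 = sum(cc1)
sum_cc2 = sum(cc2)
cross_product = sum(a * b for a, b in zip(cc1, cc2))

# Expand the codes to one value per participant and compare the sample
# standard deviations with the ones SPSS reports in Figure 8.
cc1_column = [code for code in cc1 for _ in range(n_per_group)]
cc2_column = [code for code in cc2 for _ in range(n_per_group)]
sd_cc1 = statistics.stdev(cc1_column)
sd_cc2 = statistics.stdev(cc2_column)

print(sum_cc1, sum_cc2, cross_product)
print(f"SD of CC1 = {sd_cc1:.5f}, SD of CC2 = {sd_cc2:.5f}")
```

A zero cross-product with equal group sizes is what produces the .000 correlation between the contrasts in Figure 9, and the standard deviations reproduce the 1.41948 and .81954 shown in Figure 8.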

3a. The unstandardized coefficient for Distractibility (.309) means that for every increase of one unit in Distractibility, participants tended to make .309 more errors, holding task type constant. The unstandardized coefficient for the Pattern Recognition vs. Driving Simulation contrast (-.938) equals half the number of errors that can be attributed to the shift between the pattern recognition task and the driving simulation task. That is, it is half of the difference between the adjusted mean number of errors in the driving simulation task and the pattern recognition task. The unstandardized coefficient for the Cognitive Task vs. the average of the Pattern Recognition and Driving Simulation contrast (9.488) is one third the number of errors that can be attributed to the shift between the cognitive task and the average of the pattern recognition and driving simulation tasks. The value under B for the constant in the Coefficients table, Figure 12, is not interpretable in the same way as that coefficient from homework 5; it can only be used to calculate the adjusted means.
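To make the "half" and "one third" interpretations concrete, here is a minimal sketch that recovers both contrast coefficients from the adjusted means in Figure 3. It relies on the general result that, for orthogonal contrasts with equal group sizes, each coefficient is the sum of code times mean divided by the sum of squared codes. The function name is my own.

```python
# Adjusted means from Figure 3, in the order (PR, CT, DS).
adjusted_means = (9.709, 37.236, 7.833)

# Contrast codes from Figure 7, same order.
cc1 = (-1, 2, -1)  # Cognitive Task vs. average of PR and DS
cc2 = (-1, 0, 1)   # Pattern Recognition vs. Driving Simulation


def contrast_coefficient(codes, means):
    """b = sum(c_i * mean_i) / sum(c_i^2) for orthogonal, equal-n contrasts."""
    numerator = sum(c * m for c, m in zip(codes, means))
    denominator = sum(c * c for c in codes)
    return numerator / denominator


b_cc1 = contrast_coefficient(cc1, adjusted_means)
b_cc2 = contrast_coefficient(cc2, adjusted_means)

# The same numbers written out as the verbal interpretations:
pr, ct, ds = adjusted_means
one_third_of_ct_vs_rest = (ct - (pr + ds) / 2) / 3
half_of_ds_minus_pr = (ds - pr) / 2

print(f"CT v PR DS: b = {b_cc1:.3f}")
print(f"PR v DS:    b = {b_cc2:.3f}")
```

The divisors come from the spacing of the codes: the PR v DS codes are 2 units apart (-1 to 1), and the CT v PR DS codes are 3 units apart (-1 to 2).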

3b. The number of errors in the Cognitive Task (adjusted mean = 37.24) was significantly higher than the number of errors in the Pattern Recognition and Driving Simulation tasks, even after accounting for the effects of distractibility (b = 9.488, t(131) = 16.696, p < .001). Number of errors in the Pattern Recognition task (adjusted mean = 9.71) was not significantly different from number of errors in the Driving Simulation task (adjusted mean = 7.83) after controlling for the effect of distractibility (b = -.938, t(131) = -.962, p = .338). That is, after controlling for distractibility, the number of errors in the Cognitive Task is significantly higher than in the Pattern Recognition and Driving Simulation tasks, and the Pattern Recognition and Driving Simulation tasks are not significantly different from each other.

The p-value for pattern vs. driving, p = .338, means that the coefficient for that contrast is not significantly different from zero. The p-value for cognitive vs. pattern and driving, p < .001, means that the coefficient for that contrast, 9.488, is significantly different from zero. In each case, the test of the coefficient tells us whether the comparison coded into that contrast is significant.

3c. These results are similar to what the ANCOVA gave us except that they give us more insight into what’s going on between the tasks. The adjusted means for the tasks are the same, R² is the same, and the F values for the whole models are the same (134.625). The current t-value for distractibility, squared, equals the F from the ANCOVA within rounding: 8.105² ≈ 65.69, against F = 65.696. The differences are that the ANCOVA is an omnibus test when it comes to the differences between the task types, so we can’t tell exactly where the differences are until we do the regression. The degrees of freedom are also different.
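The two equivalences in 3c can be reproduced from the reported output. This is only a sketch using the rounded values printed by SPSS, so the squared t matches the ANCOVA F to about two decimals rather than exactly. Variable names are my own.

```python
# Regression output (Figure 12) and ANCOVA output (Figure 2).
t_distract = 8.105          # t for Distractability in the regression
f_distract_ancova = 65.696  # F for distract in the ANCOVA

# A t-test on a single coefficient (1 df) is the same test as the
# corresponding F-test: t squared equals F.
t_squared = t_distract ** 2

# Whole-model F from the regression ANOVA table (Figure 11, model 2).
ss_regression, df_regression = 34229.391, 3
ss_residual, df_residual = 11102.535, 131
f_model = (ss_regression / df_regression) / (ss_residual / df_residual)

print(f"t^2 = {t_squared:.3f}  vs  ANCOVA F = {f_distract_ancova}")
print(f"Whole-model F(3, 131) = {f_model:.3f}")
```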

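Contrast codes (PR = Pattern Recognition, CT = Cognitive Task, DS = Driving Simulation)

| | PR | CT | DS | Check |
|---|---|---|---|---|
| CC1 | -1 | 2 | -1 | 0 |
| CC2 | -1 | 0 | 1 | 0 |
| Check | -1 | 0 | 1 | 0 |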

Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| Number of Errors | 18.2593 | 18.39288 | 135 |
| Distractability | 112.5407 | 21.25248 | 135 |
| CT v PR DS | .0000 | 1.41948 | 135 |
| PR v DS | .0000 | .81954 | 135 |

Figure 7.    Figure 8.

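Correlations

| Statistic | Variable | Number of Errors | Distractability | CT v PR DS | PR v DS |
|---|---|---|---|---|---|
| Pearson Correlation | Number of Errors | 1.000 | .483 | .792 | -.073 |
| Pearson Correlation | Distractability | .483 | 1.000 | .167 | -.088 |
| Pearson Correlation | CT v PR DS | .792 | .167 | 1.000 | .000 |
| Pearson Correlation | PR v DS | -.073 | -.088 | .000 | 1.000 |
| Sig. (1-tailed) | Number of Errors | . | .000 | .000 | .199 |
| Sig. (1-tailed) | Distractability | .000 | . | .027 | .154 |
| Sig. (1-tailed) | CT v PR DS | .000 | .027 | . | .500 |
| Sig. (1-tailed) | PR v DS | .199 | .154 | .500 | . |
| N | Number of Errors | 135 | 135 | 135 | 135 |
| N | Distractability | 135 | 135 | 135 | 135 |
| N | CT v PR DS | 135 | 135 | 135 | 135 |
| N | PR v DS | 135 | 135 | 135 | 135 |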

Figure 9.

Model Summary (c)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change |
|---|---|---|---|---|---|---|---|---|---|
| 1 | .483a | .233 | .227 | 16.16918 | .233 | 40.392 | 1 | 133 | .000 |
| 2 | .869b | .755 | .749 | 9.20609 | .522 | 139.638 | 2 | 131 | .000 |

a. Predictors: (Constant), Distractability
b. Predictors: (Constant), Distractability, PR v DS, CT v PR DS
c. Dependent Variable: Number of Errors

Figure 10.

ANOVA (c)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 10560.071 | 1 | 10560.071 | 40.392 | .000a |
| 1 | Residual | 34771.855 | 133 | 261.443 | | |
| 1 | Total | 45331.926 | 134 | | | |
| 2 | Regression | 34229.391 | 3 | 11409.797 | 134.625 | .000b |
| 2 | Residual | 11102.535 | 131 | 84.752 | | |
| 2 | Total | 45331.926 | 134 | | | |

a. Predictors: (Constant), Distractability
b. Predictors: (Constant), Distractability, PR v DS, CT v PR DS
c. Dependent Variable: Number of Errors

Figure 11.

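Coefficients (a)

| Model | Predictor | B | Std. Error | Beta | t | Sig. | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | -28.750 | 7.526 | | -3.820 | .000 | | |
| 1 | Distractability | .418 | .066 | .483 | 6.355 | .000 | 1.000 | 1.000 |
| 2 | (Constant) | -16.499 | 4.361 | | -3.783 | .000 | | |
| 2 | Distractability | .309 | .038 | .357 | 8.105 | .000 | .964 | 1.037 |
| 2 | CT v PR DS | 9.488 | .568 | .732 | 16.696 | .000 | .972 | 1.029 |
| 2 | PR v DS | -.938 | .974 | -.042 | -.962 | .338 | .992 | 1.008 |

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.

a. Dependent Variable: Number of Errors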

Figure 12.

4a. Testing the reduced model (just distractibility, from question 2, or see model 1 in Figure 12) against the full model (distractibility plus the Cognitive Task vs. the average of the Pattern Recognition and the Driving Simulation contrast, and the Pattern Recognition vs. Driving Simulation contrast; model 2 in Figure 12), we can conclude that the two contrasts explain a significant amount of variance that is not accounted for by distractibility, ΔR² = .522, ΔF(2, 131) = 139.638, p < .001. That is, adding the two contrasts to our model explains a significantly larger amount of the variation in number of errors made, above and beyond that accounted for by distractibility.
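The model-comparison test can be sketched from the residual sums of squares in Figure 11. The F for the change is the drop in residual SS per added predictor, divided by the full model's mean square error. Variable names are my own.

```python
# Residual sums of squares from Figure 11.
ss_total = 45331.926
ss_resid_reduced = 34771.855  # model 1: distractibility only
ss_resid_full = 11102.535     # model 2: distractibility + two contrasts

df_added = 2         # two contrasts added to the model
df_resid_full = 131  # 135 - 3 predictors - 1

# Variation newly explained by the contrasts.
ss_change = ss_resid_reduced - ss_resid_full
delta_r2 = ss_change / ss_total

# F change: mean square for the added predictors over the full-model MSE.
f_change = (ss_change / df_added) / (ss_resid_full / df_resid_full)

print(f"Delta R^2 = {delta_r2:.3f}")
print(f"F change({df_added}, {df_resid_full}) = {f_change:.3f}")
```

This F change equals the ANCOVA's omnibus F for task type (139.638 in Figure 2), since both test the same two degrees of freedom over and above distractibility.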

5a. The Homogeneity of Regression assumption is not met in this data set. The model summary table (Figure 14) shows a significant change in F when the covariate interactions are added to the model, ΔR² = .061, ΔF(2, 129) = 21.442, p < .001. That is, the relationship between distractibility and number of errors differs across the task groups (the regression slopes are not parallel), so distractibility cannot be legitimately covaried out; it would be unwise to use ANCOVA here.

5b. The tolerance and VIF statistics for model 2 (shown in Figure 16) are all problematic, except for the statistics for distractibility. Those for model 1 look fine. The correlation table, Figure 13, shows only two problematic correlations; each contrast is highly correlated with its respective covariate interaction. This is to be expected, since the contrasts are factors of the interactions. Looking at the data set, two of the distance values are over three, but all of the Cook’s D and leverage values look fine. Overall I would say that collinearity is not much of a problem here, except that groups are not supposed to differ on the covariate, and some of the adjusted means were higher, some lower than their raw means, suggesting that the groups were different on the covariate.
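As a rough illustration of how the two collinearity statistics relate, the sketch below recomputes VIF as 1 / tolerance for model 2 in Figure 16 and flags values above 10. The cutoff of 10 is a common rule of thumb that I am assuming here, not something taken from the SPSS output. Because the printed tolerances are rounded to three decimals, the recomputed VIFs match the reported ones only to within about 1%.

```python
# (tolerance, reported VIF) for model 2, copied from Figure 16.
collinearity_model2 = {
    "Distractability": (0.909, 1.100),
    "CT v PR DS": (0.036, 28.064),
    "PR v DS": (0.028, 35.400),
    "cov_int1cc1": (0.035, 28.511),
    "cov_int2cc2": (0.028, 35.334),
}

VIF_CUTOFF = 10.0  # assumed rule-of-thumb threshold, not from the source

recomputed_vif = {}
flagged = []
for predictor, (tolerance, reported_vif) in collinearity_model2.items():
    # VIF is the reciprocal of tolerance.
    vif = 1.0 / tolerance
    recomputed_vif[predictor] = vif
    if vif > VIF_CUTOFF:
        flagged.append(predictor)
    print(f"{predictor:16s} 1/tolerance = {vif:7.3f}  reported = {reported_vif:7.3f}")

print("Flagged:", flagged)
```

Under that assumed cutoff, every predictor except distractibility is flagged, which is the pattern described in 5b.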

5c. The results of 5a would probably have kept me from running an ANCOVA in the first place and, having done so, I would not be confident in the results. I would also be somewhat cautious interpreting the results based on 5b, at least until I had talked to someone who knew more than I do about collinearity. (See Figures 13-17 below.)

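Correlations

| Statistic | Variable | Number of Errors | Distractability | CT v PR DS | PR v DS | cov_int1cc1 | cov_int2cc2 |
|---|---|---|---|---|---|---|---|
| Pearson Correlation | Number of Errors | 1.000 | .483 | .792 | -.073 | .838 | -.059 |
| Pearson Correlation | Distractability | .483 | 1.000 | .167 | -.088 | .208 | -.080 |
| Pearson Correlation | CT v PR DS | .792 | .167 | 1.000 | .000 | .981 | .012 |
| Pearson Correlation | PR v DS | -.073 | -.088 | .000 | 1.000 | .011 | .986 |
| Pearson Correlation | cov_int1cc1 | .838 | .208 | .981 | .011 | 1.000 | .022 |
| Pearson Correlation | cov_int2cc2 | -.059 | -.080 | .012 | .986 | .022 | 1.000 |
| Sig. (1-tailed) | Number of Errors | . | .000 | .000 | .199 | .000 | .249 |
| Sig. (1-tailed) | Distractability | .000 | . | .027 | .154 | .008 | .178 |
| Sig. (1-tailed) | CT v PR DS | .000 | .027 | . | .500 | .000 | .446 |
| Sig. (1-tailed) | PR v DS | .199 | .154 | .500 | . | .448 | .000 |
| Sig. (1-tailed) | cov_int1cc1 | .000 | .008 | .000 | .448 | . | .400 |
| Sig. (1-tailed) | cov_int2cc2 | .249 | .178 | .446 | .000 | .400 | . |
| N | Number of Errors | 135 | 135 | 135 | 135 | 135 | 135 |
| N | Distractability | 135 | 135 | 135 | 135 | 135 | 135 |
| N | CT v PR DS | 135 | 135 | 135 | 135 | 135 | 135 |
| N | PR v DS | 135 | 135 | 135 | 135 | 135 | 135 |
| N | cov_int1cc1 | 135 | 135 | 135 | 135 | 135 | 135 |
| N | cov_int2cc2 | 135 | 135 | 135 | 135 | 135 | 135 |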

Figure 13.

Model Summary (c)

| Model | R | R Square | Adjusted R Square | Std. Error of the Estimate | R Square Change | F Change | df1 | df2 | Sig. F Change |
|---|---|---|---|---|---|---|---|---|---|
| 1 | .869a | .755 | .749 | 9.20609 | .755 | 134.625 | 3 | 131 | .000 |
| 2 | .903b | .816 | .809 | 8.03698 | .061 | 21.442 | 2 | 129 | .000 |

a. Predictors: (Constant), PR v DS, CT v PR DS, Distractability
b. Predictors: (Constant), PR v DS, CT v PR DS, Distractability, cov_int1cc1, cov_int2cc2
c. Dependent Variable: Number of Errors

Figure 14.

ANOVA (c)

| Model | Source | Sum of Squares | df | Mean Square | F | Sig. |
|---|---|---|---|---|---|---|
| 1 | Regression | 34229.391 | 3 | 11409.797 | 134.625 | .000a |
| 1 | Residual | 11102.535 | 131 | 84.752 | | |
| 1 | Total | 45331.926 | 134 | | | |
| 2 | Regression | 36999.426 | 5 | 7399.885 | 114.562 | .000b |
| 2 | Residual | 8332.500 | 129 | 64.593 | | |
| 2 | Total | 45331.926 | 134 | | | |

a. Predictors: (Constant), CC2, CC1, Distractability
b. Predictors: (Constant), CC2, CC1, Distractability, cov_int1, cov_int2
c. Dependent Variable: Number of Errors

Figure 15.

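Coefficients (a)

| Model | Predictor | B | Std. Error | Beta | t | Sig. | Tolerance | VIF |
|---|---|---|---|---|---|---|---|---|
| 1 | (Constant) | -16.499 | 4.361 | | -3.783 | .000 | | |
| 1 | Distractability | .309 | .038 | .357 | 8.105 | .000 | .964 | 1.037 |
| 1 | CT v PR DS | 9.488 | .568 | .732 | 16.696 | .000 | .972 | 1.029 |
| 1 | PR v DS | -.938 | .974 | -.042 | -.962 | .338 | .992 | 1.008 |
| 2 | (Constant) | -11.137 | 3.896 | | -2.858 | .005 | | |
| 2 | Distractability | .255 | .034 | .295 | 7.443 | .000 | .909 | 1.100 |
| 2 | CT v PR DS | -7.150 | 2.591 | -.552 | -2.760 | .007 | .036 | 28.064 |
| 2 | PR v DS | -4.091 | 5.040 | -.182 | -.812 | .419 | .028 | 35.400 |
| 2 | cov_int1cc1 | .146 | .022 | 1.318 | 6.538 | .000 | .035 | 28.511 |
| 2 | cov_int2cc2 | .025 | .045 | .122 | .543 | .588 | .028 | 35.334 |

B and Std. Error are unstandardized coefficients; Beta is the standardized coefficient; Tolerance and VIF are collinearity statistics.

a. Dependent Variable: Number of Errors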

Figure 16.

Descriptive Statistics

| Variable | Mean | Std. Deviation | N |
|---|---|---|---|
| Number of Errors | 18.2593 | 18.39288 | 135 |
| Distractability | 112.5407 | 21.25248 | 135 |
| CC1 | .0000 | 1.41948 | 135 |
| CC2 | .0000 | .81954 | 135 |
| cov_int1 | 4.9926 | 166.42866 | 135 |
| cov_int2 | -1.5259 | 91.49821 | 135 |

Figure 17.
