Objective 1: Testing for homogeneity of variance
Testing for homogeneity of variance means testing whether two
standard deviations are significantly different from one another. Since
the exam was so much fun, let's use the exam data once again (last time,
I promise).
a) run PROC MEANS to get the standard deviations for males and females for variable FITNESS
b) use a new DATA statement to create a new SAS data set.
c) manually define the two standard deviations (e.g., STD1 = 24) and the sample size of the two gender groups. The larger standard deviation should go in the numerator. (Note that, in this case, the sample sizes are equal. Next week, we'll do the same exercise with unequal sample sizes -- see Levine, 10-8, if you want to read ahead.)
d) compute v, where v = std1/std2
e) compute the standard error for v: SEv = v * sqrt(2/((2*n) - 5.5))
f) create a two-sided confidence interval around v

Are the standard deviations significantly different? How do you know?
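Steps b) through f) can be sketched in a single DATA step, roughly as follows. This is only a sketch: the data set name VTEST, the values for STD2 and N, and the use of 1.96 as the critical value (which assumes a 95% interval based on the normal distribution) are placeholders that are not given in this handout. Substitute the numbers from your own PROC MEANS output.

```sas
/* Sketch only: every number below is a placeholder, not real lab data */
data vtest;                              /* hypothetical data set name */
   std1 = 24;                            /* larger standard deviation (example value from step c) */
   std2 = 20;                            /* smaller standard deviation (placeholder) */
   n    = 30;                            /* sample size per group (placeholder; groups are equal) */
   v    = std1 / std2;                   /* step d: ratio of the standard deviations */
   sev  = v * sqrt(2 / ((2*n) - 5.5));   /* step e: standard error of v */
   lb   = v - 1.96 * sev;                /* step f: lower bound (assumes a 95% interval) */
   ub   = v + 1.96 * sev;                /* step f: upper bound (assumes a 95% interval) */
run;

proc print data=vtest;                   /* display v, SEv, and the interval */
run;
```

PROC PRINT is used here only to display the computed values so you can read off the interval.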
Objective 2: The t-test
In objective 1, we tested differences in standard deviations. Now, we will test the differences between group means. Remember that the t-test can only be used if your independent variable has two levels (e.g., gender). The t-test will not work if the independent variable has more than two levels.
Here's the SAS code for the t-test:
   proc ttest data=lab7;   * the "data=lab7" option is not needed if you have already specified your data set;
      class GENDER;        * this defines your independent variable;
      var FITNESS;         * this defines your dependent variable(s);
   run;

Easy, right? Look at the output. The Prob > |T| value is your p-value. Is there a significant gender difference in fitness scores? How do you know?
Now look at the last line of your output for the t-test. Notice that when you run a t-test, SAS automatically conducts a test for homogeneity of variance. Does the SAS output match your output from Objective 1?
Objective 3: Confidence intervals for testing mean differences
Unfortunately, PROC TTEST will not give you the standard error for the mean difference, so we'll have to compute it ourselves.
a) run PROC MEANS again for the variable FITNESS (by gender), this time asking for the variance. Note the new line in the SAS code below:
   proc means data=lab7   /* "data=lab7" refers to the fitness SAS data set */
      N MEAN VAR STD;     /* this is new; there is no semicolon after lab7 because this is all part of the first statement */
      class GENDER;
      var FITNESS;
   run;

b) use a new DATA statement to create a new SAS data set.
c) manually define the two variances (e.g., VAR1 = 100) and the two sample sizes for males and females.
d) compute the standard error, where SE = sqrt((var1/n1) + (var2/n2))
e) create a two-sided 95% confidence interval for the t distribution (look up the critical value in Shavelson), where LB and UB are defined as follows:
   LB = (mean1 - mean2) - error band
   UB = (mean1 - mean2) + error band
f) create one-sided confidence intervals, where the WC and the BC are defined as follows:
   WC = (mean1 - mean2) - error band
   BC = (mean1 - mean2) + error band
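As a rough sketch, steps b) through f) might look like the DATA step below. Several things here are assumptions rather than part of the handout: the data set name MEANDIFF, every numeric value (including the two group means, which you will also need to enter by hand), and the reading of "error band" as the critical value times SE. The critical values T2 and T1 are placeholders; look up the correct ones for your degrees of freedom.

```sas
/* Sketch only: every number below is a placeholder, not real lab data */
data meandiff;                             /* hypothetical data set name */
   mean1 = 50;  var1 = 100;  n1 = 30;      /* group 1: mean, variance, sample size (placeholders) */
   mean2 = 45;  var2 = 90;   n2 = 30;      /* group 2: mean, variance, sample size (placeholders) */
   diff  = mean1 - mean2;                  /* difference between the group means */
   se    = sqrt((var1/n1) + (var2/n2));    /* step d: standard error of the difference */
   t2    = 2.00;                           /* placeholder two-sided critical value; look up your own */
   t1    = 1.67;                           /* placeholder one-sided critical value; look up your own */
   lb    = diff - t2 * se;                 /* step e: two-sided lower bound */
   ub    = diff + t2 * se;                 /* step e: two-sided upper bound */
   wc    = diff - t1 * se;                 /* step f: WC as defined in the handout */
   bc    = diff + t1 * se;                 /* step f: BC as defined in the handout */
run;

proc print data=meandiff;                  /* display SE and the interval bounds */
run;
```

Keeping the critical values as named variables makes it easy to swap in the looked-up values without touching the interval formulas.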