Comparing Two Means (Part 2)
Ray Block, Jr.
PS #585 (Research Methods)
Fall 2003
Today’s Blueprint
Last Class:
- Correlation (A Recap)
- Comparing Two Means
- The Logic Behind Means Difference Tests
Today’s Class:
- Comparing the Means of 2 Dependent Populations
- Comparing the Means of 2 Independent Populations
- Difference Tests
- Confidence Intervals (T-Interval)
T-Tests: A Brief Review
What are T-Tests?
- They are tests of significance
- They determine whether there is a significant difference between
the mean values of two groups
- Two Types:
- T-Tests for Independent Means (AKA: The Pooled T-Tests)
- T-Tests for Dependent Means (AKA: The Paired T-Tests)
- T-Tests for Independent Means = Tests the difference between the means
of 2 unrelated groups
- Who is in the second sample doesn’t depend on who is in the first
sample
- T-Tests for Dependent Means = Tests the difference between the means
of 2 related groups
- Who is in the second sample depends on who is in the first sample
What Purpose Do T-Tests Serve?
- Allow researchers to compare groups:
- Determine whether one group is meaningfully different from another
group
- In an experiment, researchers can determine whether differences between
groups are due to the manipulation
All T-Tests Assume the Following:
- Data in each group follow a normal distribution
- For the pooled test, the variances of the two groups are equal
(homogeneity of variance)
- The samples are independent
- But what if samples are not independent?!
- If samples are not independent, then we say that they are dependent,
correlated, or paired
- Ways That Pairing Can Occur:
- When subjects in one group are “matched” with a similar subject in
the second group.
- When subjects serve as their own control by receiving both of two
different treatments.
- When, in “before and after” studies, the same subjects are measured
twice.
When To Use T-Tests:
- When doing an experiment
- When doing a comparative observational study
- Experiments = Manipulating one variable to observe its effect on another
variable
- Example of a simple experiment:
- One dependent variable (Y)
- One independent variable (X) manipulated 2 ways
- Treatment group
- Control group
- Comparative Observation = A research study in which two or more groups
are compared with respect to some measurement or response
- The groups, determined by their natural characteristics, are merely
“observed”
Comparing the Means of 2 Independent Populations
A Pooled T-Test Example
Research Question: Do male and female drivers differ with respect
to their fastest reported driving speed?
- Data:
- Sample of n1 = 17 males report average of 102.1 mph
- Sample of n2 = 21 females report average of 85.7 mph
- The difference in the sample means is 102.06 - 85.71 = 16.35 mph
| Gender | N | Mean (mph) | Standard Deviation (mph) | Minimum Speed (mph) | Maximum Speed (mph) |
| Female | 21 | 85.71 | 9.39 | 75.00 | 105.00 |
| Male | 17 | 102.06 | 17.05 | 75.00 | 145.00 |
Research Question (in statistical notation)
- Let $\mu_M$ = the average fastest speed of all male students
- And let $\mu_F$ = the average fastest speed of all female students
- Then we want to know whether $\mu_M = \mu_F$
- This is equivalent to wanting to know whether $\mu_M - \mu_F = 0$
Set Up The Hypotheses:
- In general, we can always compare two means by seeing how their difference
compares to 0:
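- Null hypothesis: $H_0: \mu_M - \mu_F = 0$
- This is equivalent to saying that $\mu_M = \mu_F$ (male and female population means are equal)
- Alternative hypothesis: $H_a: \mu_M - \mu_F \neq 0$
- This is equivalent to saying that $\mu_M \neq \mu_F$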
Make Initial Assumption:
- Assume null hypothesis is true
- That is, assume $\mu_M - \mu_F = 0$
- Or, equivalently, assume $\mu_M = \mu_F$
Determine Significance Test (P-Value):
- P-value = “How likely is it that our sample means would differ by
16.35 mph or more if the difference in population means really is 0?”
- We conventionally choose a significance level of 0.05
Calculating the P-Value:
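- The P-value is determined by standardizing, that is, by calculating
the two-sample test statistic:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - 0}{\mathrm{s.e.}(\bar{x}_1 - \bar{x}_2)}$$
- …and comparing the value of the test statistic to the appropriate
sampling distribution
- Here $\bar{x}_1, s_1, n_1$ and $\bar{x}_2, s_2, n_2$ are the sample means,
sample standard deviations, and sample sizes of the two groups
- If the variances of the measurements of the two groups are not equal,
estimate the standard error of the difference as:
$$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
- Then the sampling distribution is an approximate t distribution with
a complicated formula for d.f.
- If the variances of the measurements of the two groups are equal,
estimate the standard error of the difference as:
$$\mathrm{s.e.}(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$
where:
$$s_p = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}}$$
- Then the sampling distribution is a t distribution with n1+n2-2 degrees
of freedom.
- Note: Assume variances are equal only if neither sample standard deviation
is more than twice the other.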
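The two versions of the standard error described above can be checked against the summary statistics of the driving-speed example. What follows is a minimal sketch, not the course's own code: the function name `two_sample_t` and its arguments are hypothetical, only the Python standard library is used, and males are taken as sample 1 and females as sample 2. The d.f. for the unequal-variances case uses the Welch-Satterthwaite approximation, which is assumed here to be the "complicated formula" mentioned above.

```python
import math

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2, pooled):
    """Return (t, se, df) for H0: mu1 - mu2 = 0, from summary statistics."""
    v1, v2 = sd1 ** 2 / n1, sd2 ** 2 / n2  # per-group variance of the sample mean
    if pooled:
        # Equal variances assumed: pool the two sample variances.
        sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
        se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
        df = n1 + n2 - 2
    else:
        # Unequal variances: unpooled standard error, approximate d.f.
        se = math.sqrt(v1 + v2)
        df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    t = (mean1 - mean2) / se
    return t, se, df

# Summary statistics from the example: (mean, standard deviation, n)
males = (102.06, 17.05, 17)
females = (85.71, 9.39, 21)

t_unpooled, se_unpooled, df_unpooled = two_sample_t(*males, *females, pooled=False)
t_pooled, se_pooled, df_pooled = two_sample_t(*males, *females, pooled=True)

# Rule of thumb from the note: is the larger SD at most twice the smaller?
sd_ratio = max(males[1], females[1]) / min(males[1], females[1])
equal_variances_plausible = sd_ratio <= 2

print(f"unpooled: t = {t_unpooled:.2f}, se = {se_unpooled:.2f}, df = {df_unpooled:.1f}")
print(f"pooled:   t = {t_pooled:.2f}, se = {se_pooled:.2f}, df = {df_pooled}")
print(f"SD ratio = {sd_ratio:.2f}, equal variances plausible: {equal_variances_plausible}")
```

With these inputs the unpooled version gives t of about 3.54 on roughly 23.7 d.f., and the pooled version gives t of about 3.75 on 36 d.f. The sign is positive because the difference is taken as male minus female.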
2-Sample T-Test in SPSS:
- Analyze ==> Compare Means ==> Independent-Samples T Test
- Select one or more continuous, numeric test variables. A separate
t test is computed for each variable
- Example: Place the dependent variable (driving speed) in the
“Test variable” box
- Select a dichotomous grouping variable (a categorical variable that
divides cases into two groups)
- Example: Place the independent variable (gender) in the
“grouping variable” box
- Set the confidence level for the interval (default is 95%) in the
“options” box
- Click Define Groups and specify the values of the grouping variable
that define the two groups
- Example: The grouping variable can be a string (alphanumeric) variable
or a numeric variable that uses numeric codes to represent categories (e.g.,
0=Male, 1=Female)
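2-Sample T-Test Results:
Two sample T for Fastest
Gender    N     Mean    StDev   SE Mean
female   21    85.71     9.39       2.0
male     17    102.1     17.1       4.1
95% CI for mu (female) - mu (male): (-25.9, -6.8)
T-Test mu (female) = mu (male) (vs not =): T = -3.54  P = 0.0017  DF = 23
- Note: DF = 23, rather than n1+n2-2 = 36, indicates that this output comes
from the unequal-variances version of the test, not the pooled version
- Note: T is negative because the output takes the difference as female
minus male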
Determine Significance Test:
- Our obtained P-value (0.0017) is smaller than the significance level (0.05)
- This tells us that our sample result is not likely if the null hypothesis
is true
- Therefore, we can reject the null hypothesis
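The comparison of the P-value with the significance level can be reproduced from the reported T and DF. The sketch below is illustrative only: the helper names `t_pdf` and `two_sided_p` are hypothetical, and the tail area of the t distribution is approximated by numerical integration (Simpson's rule) so that nothing beyond the Python standard library is needed. A statistics package would normally supply this value directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def two_sided_p(t, df, steps=2000):
    """Two-sided P-value: area in both tails beyond |t| (steps must be even)."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_b = total * h / 3          # Simpson's rule on [0, |t|]
    return 2 * (0.5 - area_0_to_b)       # the density is symmetric about 0

ALPHA = 0.05                             # conventional significance level
p_value = two_sided_p(-3.54, 23)         # T and DF as reported in the output
reject_null = p_value < ALPHA

print(f"P is approximately {p_value:.4f}; reject the null hypothesis: {reject_null}")
```

The result agrees with the reported P = 0.0017 to within rounding, and since it is far below 0.05 the null hypothesis is rejected.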
Make A Decision:
- There is sufficient evidence, at the 0.05 level of significance, to
conclude that the average reported fastest driving speed of all male college
students differs from the average reported fastest driving speed of all female
students
How Confident Are You?
- We can be “such-and-such” confident that the difference in the population
means falls in the interval:
$$(\bar{x}_1 - \bar{x}_2) \pm t^* \times \mathrm{s.e.}(\bar{x}_1 - \bar{x}_2)$$
- Where the t* multiplier depends on the confidence level and is obtained
from the appropriate t distribution.
- Interpreting a confidence interval for the difference in 2 means:
| If the Confidence Interval contains... | Then, we can conclude... |
| Zero | The two means may not differ |
| Only positive numbers | The first mean is larger than the second |
| Only negative numbers | The first mean is smaller than the second |
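The interval and the interpretation rules in the table above can be sketched in a few lines. This is a hedged illustration, not the software's own procedure: the helper names `t_pdf`, `central_area`, and `t_multiplier` are hypothetical, the t* multiplier is found by bisection on a numerically integrated t distribution (standard library only), and the unpooled standard error with 23 d.f. is assumed, matching the reported output. Males are taken as the first group.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def central_area(b, df, steps=2000):
    """Area under the t density between -b and +b (Simpson's rule)."""
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 2 * total * h / 3

def t_multiplier(conf_level, df):
    """Find t* by bisection so that the central area equals conf_level."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if central_area(mid, df) < conf_level:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

diff = 102.06 - 85.71                               # male mean minus female mean
se = math.sqrt(17.05 ** 2 / 17 + 9.39 ** 2 / 21)    # unpooled standard error
t_star = t_multiplier(0.95, 23)
lower, upper = diff - t_star * se, diff + t_star * se

# Apply the interpretation table to the interval.
if lower > 0:
    conclusion = "The first mean is larger than the second"
elif upper < 0:
    conclusion = "The first mean is smaller than the second"
else:
    conclusion = "The two means may not differ"

print(f"t* = {t_star:.3f}, 95% CI = ({lower:.1f}, {upper:.1f})")
print(conclusion)
```

Because the difference here is male minus female, the interval comes out positive, roughly 6.8 to 25.9. It is the mirror image of the reported interval for female minus male, (-25.9, -6.8), and leads to the same conclusion.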
Comparing the Means of 2 Dependent Populations
A Paired T-Test Example
Research Question: Do males earn higher average starting salaries
than females?
- ...The real question is whether males and females in the same job
earn different average salaries. It is better, then, to compare the difference
in salaries within “pairs” of males and females.
What is the P-Value?
- P-value = How likely is it that a paired sample would have a difference
as large as $2,000 if the true difference were 0?
Hypotheses for Paired T-test
- Does the average difference of the population, $\mu_d$, differ from 0?
- Null hypothesis: $H_0: \mu_d = 0$
- Alternative hypothesis: $H_a: \mu_d \neq 0$
Data analyzed as Paired T-Test
Paired T for M - F
              N     Mean    StDev   SE Mean
M             4     41.5     26.2      13.1
F             4     39.5     26.1      13.1
Difference    4    2.000    0.816     0.408
95% CI for mean difference: (0.701, 3.299)
T-Test of mean difference = 0 (vs not = 0): T-Value = 4.90  P-Value = 0.016
Interpret Results
- P = 0.016. This is smaller than 0.05
- Therefore, reject the null hypothesis
- Sufficient evidence to conclude that average starting salaries differ
between males and females
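The paired test statistic can be recomputed from the summary line for the differences, and contrasting it with an independent-samples calculation on the same numbers shows why pairing matters. This is a minimal sketch using only the figures reported in the output; the variable names are hypothetical, and the independent-samples calculation is included purely for contrast, since it would be the wrong analysis for paired data.

```python
import math

n = 4                               # number of male-female pairs
mean_diff, sd_diff = 2.000, 0.816   # mean and SD of the M - F differences
sd_m, sd_f = 26.2, 26.1             # SDs of the male and female salaries

# Paired test: a one-sample t-test on the differences, with n - 1 d.f.
se_paired = sd_diff / math.sqrt(n)
t_paired = mean_diff / se_paired
df_paired = n - 1

# For contrast only: treating the two columns as independent samples
# uses the much larger between-person variation in the standard error.
se_indep = math.sqrt(sd_m ** 2 / n + sd_f ** 2 / n)
t_indep = mean_diff / se_indep

print(f"paired:      se = {se_paired:.3f}, t = {t_paired:.2f}, df = {df_paired}")
print(f"independent: se = {se_indep:.2f}, t = {t_indep:.3f}")
```

The paired standard error is 0.408, giving t of about 4.90 on 3 d.f., as in the output. Ignoring the pairing inflates the standard error to about 18.5 and shrinks t to about 0.11, so the same $2,000 average difference would go undetected.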