Measures of Spread
Ray Block, Jr.
PS 585 (Research Methods)
Fall 2003
Today’s Blueprint
Last Class
n Univariate Data Analysis (Part 1)
n Statistical Models
n Measures of Central Tendency
Today’s Class
n Univariate Data Analysis (Part 2)
n Statistical Models (A Recap)
n Measures of Variability
Statistical Models (A Recap)
Recap
What Models Are:
n Symbolic representations of social phenomena
n Statistical models use mathematical/statistical symbols
What Purpose Do [Statistical] Models Serve:
n Discuss significant relationships among concepts
n Enable researchers to form testable propositions about
relationships between variables
n Summarize data
The Goal of Statistical Modeling:
n To build a model that best represents the real-world
phenomena of interest
n The degree to which a statistical model represents the
data collected is known as the fit of the model to the data
How Do You Build Statistical Models?
n Observe some facts about the world
n Speculate about the process(es) that produced those facts
n Collect data that represent the process(es)
n Reduce the process(es) to a statistical model using the
data you collected
Statistical Models Fall Into 2 Categories
n Models measuring Central Tendencies
n Models measuring Variability
Central Tendencies (The 4 “Ms”)
1) The Midpoint/Midrange
n Description: Picking the middle slice of bread
n Level of measurement: Ordinal, Interval, Ratio
n Shape of Distribution: N/A
n Research Objective: Crude measure of central tendency
n Note: Seldom used in social science
2) The Mode (Mo)
n Description: Maximum Frequency
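n Level of Measurement: Nominal (the only one of these measures available
for nominal data; can also be found for Ordinal, Interval, or Ratio data)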
n Shape of Distribution: Most appropriate for bimodal or
multimodal distributions
n Research Objective: Fast, simple, but rough measure of
central tendency
3) The Median (Mdn)
n Description: Middlemost Value
n Level of Measurement: Ordinal, Interval, or Ratio
n Shape of Distribution: Most appropriate for highly skewed distributions
n Research Objective: Precise measure of central tendency
n Note: Sometimes used to split distributions into categories
(i.e. high vs. low)
4) The [Arithmetic] Mean (X-Bar)
n Description: Center of Gravity
n Level of Measurement: Interval or Ratio
n Shape of Distribution: Most appropriate for unimodal,
symmetrical distributions
n Research Objective: Precise measure of central tendency
n Note: Most commonly used measure of central tendency. Used
for hypothesis tests and other statistical operations
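A minimal Python sketch of why the median suits highly skewed distributions while the mean suits symmetrical ones. The scores below are hypothetical, made up for illustration, and are not from the lecture.

```python
from statistics import mean, median

# Hypothetical scores: a roughly symmetrical distribution
symmetric = [2, 3, 4, 5, 6]
# The same scores with the top value replaced by an extreme one (skewed)
skewed = [2, 3, 4, 5, 60]

sym_mean, sym_median = mean(symmetric), median(symmetric)
skew_mean, skew_median = mean(skewed), median(skewed)

print(sym_mean, sym_median)    # mean and median coincide
print(skew_mean, skew_median)  # mean is pulled toward the extreme score
```

In the skewed list a single extreme score moves the mean well away from most of the data, while the median does not move at all.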
Finding the mode, median, and mean:
n Arrange scores from highest to lowest
n The mode is the most frequent score
n The median is the middlemost value in the ordered list
of scores
n If there is an odd number of scores, then the median is in
the exact middle of the list
n If there is an even number of scores, then the median
is halfway between the two middlemost scores
n Determine the sum of the scores
n Calculate the mean by dividing the sum by the number
of scores
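The steps above can be sketched in Python as follows. The scores are hypothetical, and the variable names are chosen only for illustration.

```python
from collections import Counter

# Hypothetical scores (not from the lecture)
scores = [7, 3, 9, 3, 5, 8]

# Arrange scores from highest to lowest
ordered = sorted(scores, reverse=True)

# Mode: the most frequent score
# (if several scores tie, this reports only one of them)
mode = Counter(ordered).most_common(1)[0][0]

# Median: the middlemost value in the ordered list
n = len(ordered)
mid = n // 2
if n % 2 == 1:
    median = ordered[mid]                           # odd N: exact middle
else:
    median = (ordered[mid - 1] + ordered[mid]) / 2  # even N: halfway point

# Mean: the sum of the scores divided by the number of scores
total = sum(ordered)
mean = total / n

print(ordered, mode, median, mean)
```

With six scores the list has no single middle value, so the median falls halfway between the third and fourth scores in the ordered list.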
Measures of Variability
AKA: Measures of “Spread,” “Width,” or “Dispersion”
n In data analysis, the purpose of calculating measures
of dispersion is to discover the extent to which scores differ, cluster,
or scatter around a measure of central tendency
Some Measures of Spread:
n The Range
n The Mean Deviation
n The Variance
n The Standard Deviation
n Standard Error
Measures of Variability
n The Range is the difference between the highest and the
lowest score: R = H – L
n Where:
n R = Range
n H = Highest score in a distribution
n L = Lowest score in a distribution
n Advantages:
n Quick and easy to calculate
n Disadvantage:
n Crude measure of variability
n Why? Because it depends only on the lowest and highest values
in the distribution and ignores every score in between
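A short Python sketch of R = H – L, using two hypothetical distributions to show why the range is a crude measure. The function name `score_range` is an illustrative choice.

```python
def score_range(scores):
    """Return R = H - L, the highest score minus the lowest score."""
    return max(scores) - min(scores)

# Hypothetical distributions (not from the lecture)
clustered = [1, 5, 5, 5, 5, 9]   # most scores sit at the center
scattered = [1, 2, 4, 6, 8, 9]   # scores are spread throughout

print(score_range(clustered))
print(score_range(scattered))
```

Both lists share the same highest and lowest scores, so they have the same range even though their scores are distributed very differently.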
Measures of Variability
n Deviation = The distance between any given raw score
and the mean (Xi – X-Bar)
n Mean Deviation = The average absolute distance between the raw
scores and the mean: MD = Σ|Xi – X-Bar| / N
Where:
n MD = Mean Deviation
n Σ|Xi – X-Bar| = Sum of absolute deviations (disregarding
plus or minus signs)
n N = Total number of scores
Step-by-Step Illustration:
n Take the following list of numbers (arranged from highest
to lowest):
n Step 1: Find the mean of the distribution
n Step 2: Subtract the mean from each raw score
n Take the absolute values (ignore the signs)
n Add up these absolute deviations
n Step 3: To get MD, divide Σ|Xi – X-Bar| by N to adjust
for the number of cases involved
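The three steps can be sketched in Python. The list of scores here is hypothetical; it is a stand-in for illustration and is not the list used on the original slide.

```python
# Hypothetical scores, arranged from highest to lowest
scores = [9, 8, 6, 4, 2, 1]
n = len(scores)

# Step 1: find the mean of the distribution
mean = sum(scores) / n

# Step 2: subtract the mean from each raw score, take absolute values,
# and add up the absolute deviations
abs_deviations = [abs(x - mean) for x in scores]
sum_abs = sum(abs_deviations)

# Step 3: divide the sum of absolute deviations by N
md = sum_abs / n

print(mean, abs_deviations, sum_abs, md)
```

On average, a score in this list lies about 2.67 points away from the mean of 5.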
n Note: Mean deviations are no longer widely used in social
sciences. However, calculating MD is not a complete waste of time
n …Here’s why…
n Recall that we took the absolute values to avoid getting
minus signs: Σ|Xi – X-Bar|
n We use absolute values so that the positive and negative
deviations in Σ(Xi – X-Bar) do not cancel each other out (the signed
deviations always sum to zero)
n We can also get around this sign-canceling issue by squaring
each deviation before summing: Σ(Xi – X-Bar)²
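A quick Python check of the sign-canceling problem, using hypothetical scores: the raw deviations sum to zero, while the squared deviations do not.

```python
# Hypothetical scores (not from the lecture)
scores = [9, 8, 6, 4, 2, 1]
mean = sum(scores) / len(scores)

# Signed deviations: positive and negative values cancel out
deviations = [x - mean for x in scores]
sum_dev = sum(deviations)

# Squaring each deviation first removes the signs
squared = [d ** 2 for d in deviations]
sum_sq = sum(squared)

print(sum_dev)  # zero, whatever the spread of the scores
print(sum_sq)   # positive whenever the scores differ from the mean
```

Note that each deviation is squared before the sum is taken; squaring the sum itself would only give zero again.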
n Therefore, the variance = The mean of the squared deviations:
S² = Σ(Xi – X-Bar)² / N
n Where:
n S² = Variance
n Σ(Xi – X-Bar)² = Sum of squared deviations from the mean
n N = Total number of observations
n Variance = The average squared difference between the mean and
the observations made
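A minimal Python sketch of the variance as defined here, with N in the denominator. The scores are hypothetical. The comparison against `statistics.pvariance` is included as a sanity check; note that `statistics.variance` divides by N – 1 instead and would give a different answer.

```python
import math
import statistics

# Hypothetical scores (not from the lecture)
scores = [9, 8, 6, 4, 2, 1]
n = len(scores)
mean = sum(scores) / n

# Sum of squared deviations from the mean
sum_sq = sum((x - mean) ** 2 for x in scores)

# Variance: the mean of the squared deviations (divide by N)
variance = sum_sq / n

print(variance)
# The standard library's population variance uses the same N denominator
print(math.isclose(variance, statistics.pvariance(scores)))
```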
n Caveat:
n Squaring the deviations alters the units of measurement
n We need to bring the units back to their original non-squared
values
n The simplest way to do this is to take the square root
of the variance
n Standard deviation = square root of the variance:
S = √(Σ(Xi – X-Bar)² / N)
n Squared values are hard to interpret (it doesn’t make sense to
talk in terms of squared units)
n Standard deviations restate the variance in the original units
of measurement
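A short Python sketch of the standard deviation as the square root of the variance, again with hypothetical scores and N in the denominator. `statistics.pstdev` is used only as a cross-check.

```python
import math
import statistics

# Hypothetical scores (not from the lecture)
scores = [9, 8, 6, 4, 2, 1]
n = len(scores)
mean = sum(scores) / n

# Variance: mean of the squared deviations (in squared units)
variance = sum((x - mean) ** 2 for x in scores) / n

# Standard deviation: square root of the variance (in original units)
sd = math.sqrt(variance)

print(variance, sd)
print(math.isclose(sd, statistics.pstdev(scores)))
```

If the scores were measured in years, the variance would be in squared years, while the standard deviation is back in years.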
References
n FYI:
n Levin, Jack and James Alan Fox. 2003. Elementary Statistics
in Social Research, 9th Edition. Boston, MA: Pearson Education Group, Inc.
n Salkind, Neil. 2003. Exploring Research, 5th Edition.
Upper Saddle River, NJ: Prentice Hall.