Central Tendency

Ray Block, Jr.
PS #585 (Research Methods)
Fall 2003

Today’s Blueprint
Last Class(es)
n    Hypothesis Testing
n    Confidence Intervals
n    The General Idea
n    The Notion of Error    
Today’s Class
n    Univariate Data Analysis (Part 1)
n    Statistical Models
n    Measures of Central Tendency

Statistical Models

Statistical Models
n    What are Models?
n    Abstractions from reality that order and simplify our view of reality
n    What Purpose Do They Serve?
n    Discuss significant relationships among concepts
n    Enable researchers to form testable propositions about relationships between variables
n    Summarize data
n    Remember:
n    You cannot test theories
n    You can test models based on theories

Statistical Models
n    Model Building:
n    Models are symbolic representations of real-world phenomena
n    The goal is to build a model that best represents the real-world phenomena of interest

Statistical Models
n    How to Build Models:
n    Observe some facts about the world
n    Speculate about the process(es) that produced those facts (i.e., the data-generating process)
n    Collect data that represent the process(es)
n    Reduce the process(es) to a statistical model using the data you collected

Statistical Models
n    We use the models we build to make inferences and predictions about real-world processes
n    We want our models to be accurate so that our inferences and predictions will also be accurate
n    If we want our inferences and predictions to be accurate, then the models we build must accurately represent the data we collect
n    The degree to which a statistical model represents the data collected is known as the fit of the model to the data

Statistical Models
n    We will discuss several simple statistical models
n    These models fall into one of the following general categories:
n    Measures of Central Tendency
n    Measures of Spread

Measures of Central Tendency
Think: The 4 “Ms”

Measures of Central Tendency
n    Measures of Central Tendency (The 4 “Ms”):
n    The Midpoint
n    The Mode
n    The Median
n    The (Arithmetic) Mean

Measures of Central Tendency
n    The midpoint is the value that falls equidistant from the lowest and highest points on a scale
n    The Midpoint is not used very often
n    It is a very rough estimate of the average
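A minimal sketch of the midpoint, assuming the "lowest and highest points" are the smallest and largest observed scores. The function name `midpoint` and the sample scores are illustrative, not from the lecture.

```python
def midpoint(scores):
    """Return the value halfway between the lowest and highest score."""
    return (min(scores) + max(scores)) / 2


scores = [2, 4, 4, 10]      # hypothetical scores
print(midpoint(scores))     # (2 + 10) / 2 = 6.0
```

Only the two extreme scores enter the calculation, which is why the midpoint is such a rough estimate.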

Measures of Central Tendency
n    The mode (“Maximum Frequency” or “Mo”) is the most frequently occurring number in a list of numbers
n    It is the closest thing to what people mean when they say something is “average” or “typical”
n    Calculating Mo:
n    The mode can easily be found by inspection, rather than through computation
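For longer lists, inspection can be replaced by a simple frequency count. The sketch below is one way to do this with Python's standard library; the helper name `modes` and the sample data are assumptions made for illustration. It returns every value tied for the highest frequency, so it also works when there is more than one mode.

```python
from collections import Counter


def modes(scores):
    """Return all values that occur with the maximum frequency."""
    counts = Counter(scores)            # value -> number of occurrences
    top = max(counts.values())          # the maximum frequency
    return sorted(v for v, c in counts.items() if c == top)


print(modes([3, 1, 3, 2, 3, 2]))                        # [3]
print(modes(["red", "blue", "red", "blue", "green"]))   # ['blue', 'red']
```

Because the mode only counts occurrences, it works on category labels as well as numbers.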

Measures of Central Tendency
n    The median (“middlemost value” or “Mdn(x)”) is the number that falls in the middle of an ordered list of numbers
n    Calculating Mdn(x):
n    The median position can be found by inspection or by the following formula:

\[ \text{Position of Mdn}(x) = \frac{N + 1}{2} \]

n    Where:
n    N = Total number of scores (observations)

n    Interpreting Mdn(x):
n    It’s not the average; it’s the halfway point
n    When no scores are tied with the median, there are just as many numbers above the median as below it
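A minimal sketch of the position rule, assuming the formula (N + 1) / 2, which agrees with the worked example later in the lecture. The function names and sample scores are illustrative. Averaging the two middle scores when N is even is a common convention and an assumption here; the slides do not cover that case.

```python
def median_position(n):
    """Position of the median in an ordered list of n scores (1-based)."""
    return (n + 1) / 2


def median(scores):
    """Return the middlemost score; average the two middle scores if N is even."""
    ordered = sorted(scores)
    pos = median_position(len(ordered))
    if pos.is_integer():
        return ordered[int(pos) - 1]    # convert 1-based position to 0-based index
    # Even N: the position falls halfway between two scores (assumed convention)
    lower = ordered[int(pos) - 1]
    upper = ordered[int(pos)]
    return (lower + upper) / 2


print(median_position(7))       # 4.0, i.e. the fourth score
print(median([9, 1, 5]))        # 5
print(median([1, 3, 5, 7]))     # 4.0
```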

Measures of Central Tendency
n    The most commonly used measure of central tendency is the (arithmetic) mean (“Center of Gravity” or “x-bar”)
n    Calculating X-Bar:
n    The mean = sum of scores divided by the total number of scores:

\[ \bar{X} = \frac{\sum X_i}{N} \]

n    Where:
n    X-Bar = [Arithmetic] Mean
n    Σ = Sum
n    Xi = Each individual value of X
n    N = Total number of scores (observations)

n    Interpreting X-Bar:
n    It’s not necessarily the most typical score, nor the halfway point
n    It is a kind of center that balances high numbers with low numbers
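The "center of gravity" idea can be checked numerically: deviations above the mean exactly cancel deviations below it. The sketch below assumes a small made-up list of scores; the function name `mean` is illustrative.

```python
def mean(scores):
    """Sum of the scores divided by the number of scores."""
    return sum(scores) / len(scores)


scores = [1, 2, 3, 10]                      # hypothetical scores
x_bar = mean(scores)
deviations = [x - x_bar for x in scores]    # distance of each score from the mean

print(x_bar)              # 4.0
print(deviations)         # [-3.0, -2.0, -1.0, 6.0]
print(sum(deviations))    # 0.0
```

One high score (10) is balanced by three low ones, which is why the mean here sits above three of the four scores.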

Measures of Central Tendency
n    Uses:
n    The mean is the most important measure of central tendency in statistics
n    Most measures of spread are based on the mean
n    Why? Because the mean is the number that minimizes the sum of squared distances from the scores in the distribution
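A small numerical check of the squared-distance property, using made-up scores. The helper `sum_of_squares` and the candidate centers are assumptions for illustration. The sketch compares a few candidates and is not a proof.

```python
def sum_of_squares(scores, center):
    """Sum of squared distances between each score and a candidate center."""
    return sum((x - center) ** 2 for x in scores)


scores = [1, 2, 3, 10]                  # hypothetical scores
x_bar = sum(scores) / len(scores)       # the mean, 4.0
candidates = [2.5, 3.0, x_bar, 5.0]     # 2.5 is the median of these scores

for c in candidates:
    print(c, sum_of_squares(scores, c))
# 2.5 59.0
# 3.0 54.0
# 4.0 50.0
# 5.0 54.0
```

Of the candidates tried, the mean gives the smallest total, which is the property that measures of spread built on squared deviations rely on.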

Measures of Central Tendency
n    Step-by-Step Illustration:
n    Suppose that a volunteer canvasses houses in her neighborhood collecting money for a local charity. She receives the following donations (in dollars):
[Table of donation amounts not reproduced here: seven scores totaling $80]

n    Here are the steps you would use to calculate the mode, median and the mean:
n    Step 1: Arrange the scores from highest to lowest:








n    Step 2: Find the most frequently occurring score:
n    By inspection, mode = $5









n    Step 3: Find the middlemost score:
n    By inspection: Because there are 7 scores (an odd number) the fourth score from either end is the median
n    By the formula: (N+1)/2 = (7+1)/2 = 4, so the median is the fourth score from either end
n    In both cases, median score = $10







n    Step 4: Determine the Sum of the Scores












n    Therefore, ΣXi = 80

n    Step 5: Determine the Mean by dividing the Sum by the Number of the Scores
n    X-Bar = ΣXi / N = 80 / 7 ≈ $11.43
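The five steps can be sketched in a few lines of Python. The slide's donation figures are not reproduced in this text, so the values below are hypothetical, chosen only to agree with the results stated in the steps (7 scores, mode $5, median $10, sum $80).

```python
import statistics

# Hypothetical donations (in dollars), chosen to match the stated results
donations = [5, 10, 25, 15, 5, 2, 18]

# Step 1: arrange the scores from highest to lowest
ordered = sorted(donations, reverse=True)

# Step 2: find the most frequently occurring score
mode = statistics.mode(donations)

# Step 3: find the middlemost score using the position (N + 1) / 2
position = (len(ordered) + 1) // 2      # 4 when N = 7
median = ordered[position - 1]          # 1-based position to 0-based index

# Step 4: determine the sum of the scores
total = sum(donations)

# Step 5: divide the sum by the number of scores
mean = total / len(donations)

print(ordered)          # [25, 18, 15, 10, 5, 5, 2]
print(mode)             # 5
print(median)           # 10
print(total)            # 80
print(round(mean, 2))   # 11.43
```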


Measures of Central Tendency
n    The mode, median, and mean provide different pictures of “charitable” giving in the neighborhood
n    The mode suggests that donations are typically small
n    The median suggests that the average donation is more generous
n    The mean paints the most generous picture of the average donation

Measures of Central Tendency
n    Which Measure You Use Depends On:
n    The Data’s Level of Measurement
n    The Distribution of Data
n    The Research Objective

Measures of Central Tendency
n    The Data’s Level of Measurement:

                  Mode    Median    Mean
Nominal            ✓
Ordinal            ✓        ✓
Interval/Ratio     ✓        ✓         ✓

n    For nominal level data, you can only use the mode
n    For ordinal level data, you can use the mode or the median
n    For interval/ratio level data, you can use the mode, median, or mean
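The rule can be written as a simple lookup. This is only a sketch of the rule as stated; the names `ALLOWED_MEASURES` and `permissible_measures` are made up for illustration.

```python
# Measures of central tendency permitted at each level of measurement
ALLOWED_MEASURES = {
    "nominal": ["mode"],
    "ordinal": ["mode", "median"],
    "interval/ratio": ["mode", "median", "mean"],
}


def permissible_measures(level):
    """Return the measures that may be used for a given level of measurement."""
    return ALLOWED_MEASURES[level.lower()]


print(permissible_measures("Nominal"))   # ['mode']
print(permissible_measures("ordinal"))   # ['mode', 'median']
```

Each level allows everything the level below it allows, plus one more measure.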



Measures of Central Tendency
n    Distribution of Data = Shape of the Distribution
n    Symmetric
n    Skewed
n    Positive (right)
n    Negative (left)
n    Unimodal, multimodal



Measures of Central Tendency
n    Unimodal = One mode
n    Multimodal = More than one mode











n    In a unimodal symmetric distribution (see image above to the left), the mean, median, and mode are identical
n    In a bimodal distribution (see image to the right), there are two modes, so the mode, median, and mean generally differ
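The contrast can be seen with two small made-up lists, one symmetric with a single peak and one with two peaks. The data are hypothetical, and `statistics.multimode` (Python 3.8+) is used because it returns every mode rather than just one.

```python
import statistics

# Hypothetical scores: one peak at 3, symmetric around it
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]

# Hypothetical scores: two peaks, at 2 and at 8
bimodal = [1, 2, 2, 2, 3, 8, 8, 8, 9]

for name, data in [("symmetric", symmetric), ("bimodal", bimodal)]:
    print(name)
    print("  mode(s):", statistics.multimode(data))
    print("  median: ", statistics.median(data))
    print("  mean:   ", round(statistics.mean(data), 2))
```

For the symmetric list all three measures equal 3. For the bimodal list the modes are 2 and 8, the median is 3, and the mean is about 4.78, so no single number describes the center well.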

Measures of Central Tendency
n    Research Objective
n    For the Mode, the goal is to obtain a fast, simple, but rough measure of central tendency
n    For the Median, the goal is to obtain a precise measure of central tendency
n    It can sometimes be used for more advanced statistical operations or for splitting distributions into categories (for example, low versus high)
n    For the Mean, the goal is to obtain a precise measure of central tendency
n    It can often be used for more advanced statistical operations, including hypothesis tests

References
FYI:
n    Levin, Jack and James Alan Fox. 2003. Elementary Statistics in Social Research, 9th Edition. Boston, MA: Pearson Education Group, Inc.