3 Descriptive Measures
Definition: Mean
The mean of a data set is the sum of the observations divided by the number of observations.
Definition: Median
The median of a data set is the middle value of the ordered data set.
If the number of observations is odd, the median is the middle value.
If the number of observations is even, the median is the average of the two middle values.
Definition: Mode
The mode of a data set is the most commonly occurring value.
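As a quick illustration of the three measures of center defined above, the following sketch uses Python's standard `statistics` module. The data set is hypothetical and chosen only so that the number of observations is even.

```python
from statistics import mean, median, mode

# Hypothetical data set, used only for illustration.
data = [2, 3, 3, 5, 7, 10]

data_mean = mean(data)      # sum of the observations / number of observations
data_median = median(data)  # even count: average of the two middle values (3 and 5)
data_mode = mode(data)      # most commonly occurring value

print("mean:", data_mean)
print("median:", data_median)
print("mode:", data_mode)
```

Here the median is not itself an observation, because the count is even and the two middle values differ.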
Definition: Sample mean
For a variable x, the mean for a sample of size n is called a sample mean and
is denoted $\bar{x}$.
Definition: Population mean
For a variable x, the mean for a population of size N is called a population mean
and is denoted $\mu$.
Definition: Measure of dispersion
A measure of dispersion describes the amount of variation or spread in a data set.
Definition: Range
= Maximum data value - minimum data value
Definition: Standard deviation
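Roughly, how far the data values are from the mean on average. (It is not the exact average distance, because the deviations are squared before they are averaged.)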
Definition: Deviation from the mean
How far an observation $x_i$ is from the sample mean: $x_i - \bar{x}$
Definition: Squared deviation
The square of the deviation from the mean: $(x_i - \bar{x})^2$
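Definition: Sample Variance
$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Definition: Standard Deviation
$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$
Shortcut Formula:
$s = \sqrt{\frac{\sum x_i^2 - (\sum x_i)^2 / n}{n - 1}}$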
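The sketch below checks, on a small hypothetical sample, that the defining formula and the shortcut formula for the sample standard deviation agree. It assumes the usual sample formulas with n - 1 in the denominator; the variable names are illustrative only.

```python
import math
from statistics import stdev

# Hypothetical sample, used only for illustration.
data = [2, 3, 3, 5, 7, 10]
n = len(data)
x_bar = sum(data) / n

# Defining formula: sum the squared deviations from the mean, divide by n - 1.
squared_deviations = [(x - x_bar) ** 2 for x in data]
sample_variance = sum(squared_deviations) / (n - 1)
s_defining = math.sqrt(sample_variance)

# Shortcut formula: needs only the sum of x and the sum of x squared.
sum_x = sum(data)
sum_x_squared = sum(x * x for x in data)
s_shortcut = math.sqrt((sum_x_squared - sum_x ** 2 / n) / (n - 1))

print(s_defining, s_shortcut, stdev(data))
```

The shortcut avoids computing each deviation separately, which is convenient by hand. In floating-point arithmetic it can lose precision when the mean is large relative to the spread.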
Theorem: The more variation there is in a data set, the larger its standard deviation.
Theorem: (Three-Standard-Deviations Rule) Almost all of the observations in any data set
lie within three standard deviations to either side of the mean.
Chebychev's Rule
For any data set and any number k > 1, at least $100(1 - 1/k^2)$ percent of the data lie
within k standard deviations of the mean.
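A small sketch of Chebychev's Rule follows. The function names `chebychev_bound` and `percent_within` are hypothetical helpers, the data set is made up, and the standard deviation used is the population version (`pstdev`), which is an assumption of this example.

```python
from statistics import mean, pstdev

def chebychev_bound(k):
    """Minimum percentage of data within k standard deviations of the mean (k > 1)."""
    return 100 * (1 - 1 / k ** 2)

def percent_within(data, k):
    """Actual percentage of observations within k standard deviations of the mean."""
    center = mean(data)
    spread = pstdev(data)
    inside = [x for x in data if abs(x - center) <= k * spread]
    return 100 * len(inside) / len(data)

# Hypothetical data set with one extreme value.
data = [2, 3, 3, 5, 7, 10, 4, 6, 5, 40]

for k in (2, 3):
    print(k, chebychev_bound(k), percent_within(data, k))
```

For k = 2 the rule guarantees at least 75 percent, and for k = 3 at least about 88.9 percent. The actual percentages are usually higher, since the rule is a worst-case bound.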
Empirical Rule (abridged)
For an approximately bell-shaped distribution, roughly 99.7 percent of the data lie within
3 standard deviations of the mean.
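The Empirical Rule can be checked by simulation. The sketch below draws hypothetical bell-shaped data from a normal distribution with an arbitrary center of 50 and spread of 10 (both assumptions of this example) and counts the share of values within three standard deviations of the mean.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
data = [rng.gauss(50, 10) for _ in range(100_000)]  # hypothetical bell-shaped data

n = len(data)
center = sum(data) / n
spread = math.sqrt(sum((x - center) ** 2 for x in data) / n)

# Count the observations within 3 standard deviations of the mean.
within_3 = sum(1 for x in data if abs(x - center) <= 3 * spread)
fraction_within_3 = within_3 / n

print(f"{100 * fraction_within_3:.2f}% within 3 standard deviations")
```

The printed share should come out close to 99.7 percent, though the exact figure depends on the random draw.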
Definition: Percentile
The nth percentile divides the bottom n percent of the data, arranged in increasing
order, from the top (100 - n) percent of the data.
Notation: $P_n$
(Note: deciles, quintiles, and quartiles are defined similarly.)
Definition: Quartiles
Quartiles divide the ordered data set into four parts.
1st quartile (Q1) - the median of the data at or below the median
2nd quartile (Q2) - the median of the data set
3rd quartile (Q3) - the median of the data at or above the median
Definition: Interquartile Range
Denoted IQR, this is the difference between the first and third quartiles; that is,
IQR = Q3 - Q1.
Roughly speaking, the IQR gives the range of the middle 50% of the observations.
Definition: Five-Number Summary
Consists of the minimum, maximum, and quartiles written in increasing order:
Min, Q1, Q2, Q3, Max.
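The quartile definitions above can be sketched in code as follows. This example reads "at or below the median" by position: when the number of observations is odd, the middle observation is included in both halves. That reading, the helper names `quartiles` and `five_number_summary`, and the data are assumptions of this sketch. Statistical software often uses other quartile conventions and may give slightly different values.

```python
from statistics import median

def quartiles(data):
    """Return (Q1, Q2, Q3) using the median-of-halves method."""
    ordered = sorted(data)
    n = len(ordered)
    half = (n + 1) // 2            # half size; includes the middle value when n is odd
    lower = ordered[:half]         # data at or below the median
    upper = ordered[n - half:]     # data at or above the median
    return median(lower), median(ordered), median(upper)

def five_number_summary(data):
    """Return (Min, Q1, Q2, Q3, Max)."""
    q1, q2, q3 = quartiles(data)
    return min(data), q1, q2, q3, max(data)

# Hypothetical data set, used only for illustration.
data = [2, 3, 3, 5, 7, 10, 12]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1  # range of roughly the middle 50% of the observations

print(five_number_summary(data), iqr)
```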
Definition: Outlier
An observation that lies "well outside" the range of the rest of the data.
Definition: Population standard deviation
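Or, standard deviation of the variable x, denoted by $\sigma_x$, or just $\sigma$:
$\sigma = \sqrt{\frac{\sum (x_i - \mu)^2}{N}}$,
where N = population size.
Shortcut: $\sigma = \sqrt{\frac{\sum x_i^2}{N} - \mu^2}$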
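The following sketch treats a small hypothetical data set as an entire population and computes the population standard deviation two ways. It assumes the usual population formulas, which divide by N rather than by N - 1. The standard library's `pstdev` is used as a cross-check.

```python
import math
from statistics import pstdev

# Hypothetical data, treated here as the whole population.
population = [2, 3, 3, 5, 7, 10]
N = len(population)
mu = sum(population) / N  # population mean

# Defining formula: average squared deviation from the population mean.
sigma_defining = math.sqrt(sum((x - mu) ** 2 for x in population) / N)

# Shortcut formula: mean of the squares minus the square of the mean.
sigma_shortcut = math.sqrt(sum(x * x for x in population) / N - mu ** 2)

print(sigma_defining, sigma_shortcut, pstdev(population))
```

Dividing by N instead of N - 1 makes this value slightly smaller than the sample standard deviation computed from the same numbers.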
Definition: Parameter
A parameter is a descriptive measure for a population.
($\mu$, $\sigma$)
Definition: Statistic
A statistic is a descriptive measure for a sample.
($\bar{x}$, s)
Definition: Standardized Variable
The variable $z = \frac{x - \mu}{\sigma}$
is the standardized version of x or the standardized variable corresponding to x (called the "z-value").
Note: The z-value for a data point is called the z-score.
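A minimal sketch of standardizing a variable is given below. It assumes the data form the whole population, so the population mean and population standard deviation are used. The helper name `z_scores` and the data are hypothetical.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(x - mu) / sigma for x in values]

# Hypothetical population with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)

print(z)
```

A z-score tells how many standard deviations an observation lies from the mean. The standardized values always have mean 0 and standard deviation 1.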