3 Descriptive Measures


Definition: Mean
The mean of a data set is the sum of the observations divided by the number of observations.

Definition: Median
The median of a data set is the middle value of the data set when its observations are arranged in increasing order.
If the number of observations is odd, the median is the middle value.
If the number of observations is even, the median is the average of the two middle values.

Definition: Mode
The mode of a data set is the most commonly occurring value.
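
As a quick illustration of the three measures of center defined above, here is a minimal Python sketch using the standard-library statistics module. The data set is hypothetical and chosen only so the arithmetic is easy to check by hand.

```python
from statistics import mean, median, multimode

# Hypothetical data set, chosen only for illustration.
data = [2, 5, 3, 5, 8, 1]

# Mean: sum of the observations divided by the number of observations.
data_mean = mean(data)

# Median: the data are ordered first; with an even number of
# observations, the two middle values are averaged.
data_median = median(data)

# Mode: multimode returns a list, since several values can tie
# for most commonly occurring.
data_modes = multimode(data)

print(data_mean, data_median, data_modes)
```

Here the ordered data are 1, 2, 3, 5, 5, 8, so the two middle values are 3 and 5.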

Definition: Sample mean
For a variable x, the mean for a sample of size n is called a sample mean and is denoted x-bar.

Definition: Population mean
For a variable x, the mean for a population of size N is called a population mean and is denoted μ (mu).

Definition: Measure of dispersion
A measure of dispersion describes the amount of variation or spread in a data set.

Definition: Range
Range = maximum data value - minimum data value

Definition: Standard deviation
Roughly speaking, the standard deviation measures how far, on average, the data values are from the mean.

Definition: Deviation from the mean
For an observation x in a sample with mean x-bar, the deviation from the mean is x - x-bar.

Definition: Squared deviation
The squared deviation of an observation x is (x - x-bar)^2.

Definition: Sample Variance
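The sample variance, denoted s^2 (s-squared), is the sum of the squared deviations divided by n - 1:
$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$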

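Definition: Sample Standard Deviation
The sample standard deviation, denoted s, is the square root of the sample variance:
$s = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$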

Shortcut Formula:
$s = \sqrt{\frac{\sum x^2 - (\sum x)^2 / n}{n - 1}}$
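
To see that the defining formula and the usual shortcut (computing) formula for the sample standard deviation agree, the following Python sketch evaluates both. The function names and the data set are hypothetical, and the formulas assumed are the standard ones, written out in the docstrings.

```python
import math
from statistics import stdev

def sample_sd_defining(xs):
    """s = sqrt( sum((x - x_bar)^2) / (n - 1) )."""
    n = len(xs)
    x_bar = sum(xs) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))

def sample_sd_shortcut(xs):
    """s = sqrt( (sum(x^2) - (sum(x))^2 / n) / (n - 1) )."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

# Hypothetical sample: mean 5, sum of squared deviations 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

s_defining = sample_sd_defining(data)
s_shortcut = sample_sd_shortcut(data)

# Both should match the library's sample standard deviation.
print(s_defining, s_shortcut, stdev(data))
```

The shortcut needs only the sum of the observations and the sum of their squares, which is convenient by hand. In floating-point arithmetic it can lose accuracy when the mean is large relative to the spread, so the defining formula is usually the safer choice in code.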

Theorem: The more variation there is in a data set, the larger its standard deviation.

Theorem: (Three-Standard-Deviations Rule) Almost all of the observations in any data set lie within three standard deviations to either side of the mean.

Chebychev's Rule: For any data set and any number k > 1, at least 100(1 - 1/k^2) percent of the data lie within k standard deviations of the mean.
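
The bound in Chebychev's Rule is easy to tabulate and to check against a small data set. The sketch below is illustrative only: the function names and the data are hypothetical, and it assumes the data are treated as a whole population (so the population standard deviation is used).

```python
from statistics import mean, pstdev

def chebychev_bound(k):
    """Minimum percentage of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the rule is informative only for k > 1")
    return 100 * (1 - 1 / k ** 2)

def percent_within(xs, k):
    """Actual percentage of xs lying within k standard deviations of the mean."""
    center = mean(xs)
    spread = pstdev(xs)  # assumption: data treated as a population
    inside = [x for x in xs if abs(x - center) <= k * spread]
    return 100 * len(inside) / len(xs)

# Hypothetical data set with mean 5 and standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]

for k in (1.5, 2, 3):
    print(k, chebychev_bound(k), percent_within(data, k))
```

For k = 2 the rule guarantees at least 75 percent, and for k = 3 at least about 88.9 percent. The actual percentage for a given data set is often much higher, since the rule must hold for every possible data set.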

Empirical Rule (abridged)
For an approximately bell-shaped distribution, roughly 99.7 percent of the data lies within 3 standard deviations from the mean.

Definition: Percentile
The nth percentile divides the bottom n percent of the data, arranged in increasing order, from the top (100 - n) percent of the data.
Notation: Pn
(Note: deciles, quintiles, and quartiles are defined similarly; they divide the data into 10, 5, and 4 parts, respectively.)

Definition: Quartiles
The quartiles divide the ordered data set into four parts of roughly equal size.
1st quartile (Q1) - the median of the data at or below the median
2nd quartile (Q2) - the median of the data set
3rd quartile (Q3) - the median of the data at or above the median

Definition: Interquartile Range
Denoted IQR, this is the difference between the first and third quartiles; that is, IQR = Q3 - Q1.
Roughly speaking, the IQR gives the range of the middle 50% of the observations.

Definition: Five-Number Summary
Consists of the minimum, maximum, and quartiles written in increasing order:   Min, Q1, Q2, Q3, Max.
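
The quartiles, interquartile range, and five-number summary can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function names and data are hypothetical, and the split into lower and upper halves is done by position, so the middle observation is included in both halves when the number of observations is odd. Other quartile conventions (including those used by many software packages) can give slightly different values.

```python
from statistics import median

def quartiles(xs):
    """Return (Q1, Q2, Q3) using medians of the lower and upper halves."""
    ordered = sorted(xs)
    n = len(ordered)
    half = (n + 1) // 2            # includes the middle value when n is odd
    lower = ordered[:half]         # data at or below the median position
    upper = ordered[n - half:]     # data at or above the median position
    return median(lower), median(ordered), median(upper)

def five_number_summary(xs):
    """Return (Min, Q1, Q2, Q3, Max)."""
    q1, q2, q3 = quartiles(xs)
    return min(xs), q1, q2, q3, max(xs)

# Hypothetical data set with nine observations.
data = [1, 3, 4, 6, 7, 9, 12, 15, 20]

q1, q2, q3 = quartiles(data)
iqr = q3 - q1                      # IQR = Q3 - Q1
summary = five_number_summary(data)

print(q1, q2, q3, iqr, summary)
```

With these data the median is 7, the lower half is 1, 3, 4, 6, 7 and the upper half is 7, 9, 12, 15, 20.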

Definition: Outlier
An observation of the variable that lies "well outside" the range of the rest of the data.

Definition: Population standard deviation
The population standard deviation, or standard deviation of the variable x, is denoted σ_x, or just σ, where N = population size:
$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$

Shortcut:
$\sigma = \sqrt{\frac{\sum x^2}{N} - \mu^2}$
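
The population standard deviation differs from the sample version in dividing by N rather than n - 1. The sketch below, with hypothetical function names and data, evaluates the standard defining formula and the standard shortcut formula (both written out in the docstrings) and compares them with the library routine.

```python
import math
from statistics import pstdev

def population_sd_defining(xs):
    """sigma = sqrt( sum((x - mu)^2) / N )."""
    N = len(xs)
    mu = sum(xs) / N
    return math.sqrt(sum((x - mu) ** 2 for x in xs) / N)

def population_sd_shortcut(xs):
    """sigma = sqrt( sum(x^2) / N - mu^2 )."""
    N = len(xs)
    mu = sum(xs) / N
    return math.sqrt(sum(x * x for x in xs) / N - mu ** 2)

# Hypothetical population: mean 9, squared deviations 36, 4, 4, 100.
population = [3, 7, 7, 19]

sigma_defining = population_sd_defining(population)
sigma_shortcut = population_sd_shortcut(population)

print(sigma_defining, sigma_shortcut, pstdev(population))
```

Here the squared deviations sum to 144, so the variance is 144 / 4 = 36 and the standard deviation is 6.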

Definition: Parameter
A parameter is a descriptive measure for a population. (μ, σ)

Definition: Statistic
A statistic is a descriptive measure for a sample. (x-bar, s).

Definition: Standardized Variable
The variable $z = \frac{x - \mu}{\sigma}$ is the standardized version of x, or the standardized variable corresponding to x (called the "z-value").

Note: The z-value for a data point is called the z-score.
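
Standardizing can be sketched in a few lines of Python. The function name and data are hypothetical, and the data are assumed to be an entire population, so that the population mean and population standard deviation are used, with z = (x - mu) / sigma.

```python
from statistics import mean, pstdev

def z_scores(xs):
    """Return the z-score (x - mu) / sigma for each observation in xs."""
    mu = mean(xs)
    sigma = pstdev(xs)  # assumption: xs is the whole population
    return [(x - mu) / sigma for x in xs]

# Hypothetical population with mean 9 and standard deviation 6.
population = [3, 7, 7, 19]

z = z_scores(population)
print(z)

# A standardized variable always has mean 0 and standard deviation 1
# (up to floating-point rounding).
print(mean(z), pstdev(z))
```

A z-score tells how many standard deviations an observation lies from the mean: the observation 3 is one standard deviation below the mean, so its z-score is -1.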