- Figure: Stemplot
- Figure: Histogram (bar graph)
- Figure: Frequency polygon (line graph)
Measures of Location
- Quartiles divide an ordered dataset into four equal quarters
- Percentiles divide an ordered dataset into hundredths
  - $i = \frac{k}{100}(n + 1)$ where $i$=index, $k$=percentile, $n$=total datapoints
  - $\text{percentile} = \frac{x + 0.5y}{n} \times 100$ where $x$=datapoints from bottom to datapoint (exclusive), $y$=datapoints equal to measured datapoint, $n$=total datapoints
- Median refers to the middle datum in an ordered dataset (simple average of the two middle points if the number of data is even)
- Interquartile range (IQR) refers to the spread between the first and third quartiles ($IQR = Q_3 - Q_1$)
  - Potential outliers lie more than $1.5 \times IQR$ below the first quartile or above the third quartile
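As a rough illustration of these location measures, the sketch below assumes the common textbook location formula $i = \frac{k}{100}(n + 1)$ and the usual $1.5 \times IQR$ fences for flagging potential outliers. The dataset and the helper names (`percentile_index`, `value_at_index`, `percentile_of_value`) are hypothetical, and other percentile conventions exist that give slightly different answers.

```python
def percentile_index(k, n):
    """1-based location of the k-th percentile in an ordered list of n values."""
    return k / 100 * (n + 1)


def value_at_index(ordered, i):
    """Value at 1-based location i.

    Assumption: when i is not a whole number, average the two neighbouring
    values. Locations below 1 or above n are not handled in this sketch.
    """
    lower = int(i)
    if i == lower:
        return ordered[lower - 1]
    return (ordered[lower - 1] + ordered[lower]) / 2


def percentile_of_value(ordered, value):
    """Percentile of a value: (x + 0.5y) / n * 100."""
    x = sum(1 for v in ordered if v < value)   # datapoints strictly below
    y = sum(1 for v in ordered if v == value)  # datapoints equal to the value
    return (x + 0.5 * y) / len(ordered) * 100


# Hypothetical dataset, sorted because all location measures need ordered data.
data = sorted([1, 2, 4, 5, 7, 8, 9, 11, 12, 14, 40])
n = len(data)

q1 = value_at_index(data, percentile_index(25, n))
median = value_at_index(data, percentile_index(50, n))
q3 = value_at_index(data, percentile_index(75, n))
iqr = q3 - q1

# Fences for potential outliers, measured from the quartiles.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
outliers = [v for v in data if v < lower_fence or v > upper_fence]

print(q1, median, q3, iqr, outliers)
```

With this dataset the quartile locations fall on whole numbers (3, 6, and 9), so no averaging is needed, and only the value 40 lies outside the fences.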
Box Plots
Measures of Center
- Mean refers to the simple average
  - $\bar{x} = \frac{\sum x}{n}$ is the sample mean
  - $\mu = \frac{\sum x}{N}$ is the population mean
- Median refers to the middle value
- Mode refers to the most frequent value (a dataset with two modes is bimodal; with more than two, multimodal)
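A minimal sketch of the three measures of center, using Python's standard `statistics` module on a hypothetical sample chosen so that it has two modes:

```python
import statistics

# Hypothetical sample with two equally frequent values (3 and 7).
data = [2, 3, 3, 5, 7, 7, 8]

mean = statistics.mean(data)        # sum of values divided by count
median = statistics.median(data)    # middle value of the ordered data
modes = statistics.multimode(data)  # every value tied for most frequent

print(mean, median, modes)
```

`statistics.multimode` is used here rather than `statistics.mode` because it returns all tied values, which makes the bimodal case visible.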
Law of Large Numbers
The sample mean $\bar{x}$ approaches the population mean $\mu$ as the sample size increases.
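One way to see the law of large numbers is a small simulation. The sketch below assumes a fair six-sided die (population mean 3.5) and a fixed random seed; both are illustrative choices, not part of the notes.

```python
import random

rng = random.Random(0)  # fixed seed so the run is repeatable
population_mean = 3.5   # mean of a fair six-sided die


def sample_mean(n):
    """Mean of n simulated die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


# Absolute gap between the sample mean and the population mean per sample size.
errors = {n: abs(sample_mean(n) - population_mean) for n in (10, 1_000, 100_000)}

for n, err in errors.items():
    print(f"n={n:>7}: |sample mean - 3.5| = {err:.4f}")
```

The gap is not guaranteed to shrink at every step, but it tends toward zero as the sample size grows.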
Sampling Distributions
- A sampling distribution shows the probability of each possible value of a statistic across a large number of samples of the same size
- Statistics are calculated from samples ($\bar{x}$ is a statistic from a sample that estimates the population parameter $\mu$)
Grouped Frequency Tables
- The mean of a grouped frequency table can be calculated as $\bar{x} = \frac{\sum f m}{\sum f}$ where $f$=interval frequency and $m$=interval midpoint
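As a sketch of the grouped-mean calculation, the example below weights each interval midpoint by its frequency. The table and the `(lower, upper, frequency)` tuple layout are hypothetical.

```python
# Hypothetical grouped frequency table: (lower bound, upper bound, frequency).
intervals = [(0, 10, 3), (10, 20, 5), (20, 30, 2)]


def grouped_mean(table):
    """Approximate mean: sum of frequency * midpoint, divided by total frequency."""
    total_frequency = sum(f for _, _, f in table)
    weighted_sum = sum(f * (lo + hi) / 2 for lo, hi, f in table)
    return weighted_sum / total_frequency


print(grouped_mean(intervals))
```

The result is an estimate, since every value in an interval is treated as if it sat at the midpoint.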
Skewness
- Symmetrical distributions have the same mean and median
- Left-skewed distributions have a longer left tail (mean < median)
- Right-skewed distributions have a longer right tail (mean > median)
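The mean is pulled toward the longer tail, which a quick check on two hypothetical samples can show. The helper `mean_minus_median` is an illustrative name, and the sign of the difference is only a rule of thumb for skew, not a formal skewness measure.

```python
import statistics

# Hypothetical samples: one with a long right tail, one with a long left tail.
right_skewed = [1, 2, 2, 3, 3, 3, 4, 20]
left_skewed = [1, 17, 18, 18, 19, 19, 19, 20]


def mean_minus_median(values):
    """Positive suggests right skew, negative suggests left skew."""
    return statistics.mean(values) - statistics.median(values)


print(mean_minus_median(right_skewed))
print(mean_minus_median(left_skewed))
```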
Measures of Spread
Standard Deviation
- Standard deviation measures how far values are spread from the mean
  - $s$ is the sample standard deviation
  - $\sigma$ is the population standard deviation
- Deviation refers to the difference between a value and the mean ($x - \bar{x}$ or $x - \mu$)
- Variance refers to the average of the squared deviations
  - Sample variance: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
  - Population variance: $\sigma^2 = \frac{\sum (x - \mu)^2}{N}$
  - Therefore, standard deviation is the square root of variance ($s = \sqrt{s^2}$, $\sigma = \sqrt{\sigma^2}$)
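The sketch below computes variance and standard deviation step by step on a hypothetical dataset. It assumes the usual divisors: $n - 1$ for a sample and $N$ for a population.

```python
import math

# Hypothetical dataset with mean 5.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = sum(data) / len(data)
squared_deviations = [(x - mean) ** 2 for x in data]

# Population formulas divide by N; sample formulas divide by n - 1.
population_variance = sum(squared_deviations) / len(data)
sample_variance = sum(squared_deviations) / (len(data) - 1)

# Standard deviation is the square root of the variance.
population_sd = math.sqrt(population_variance)
sample_sd = math.sqrt(sample_variance)

print(population_variance, population_sd)
print(sample_variance, sample_sd)
```

The standard library offers the same quantities as `statistics.pvariance`, `statistics.variance`, `statistics.pstdev`, and `statistics.stdev`.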
Sampling Variability
- Sampling variability refers to how much a statistic varies between samples
- Standard error of the mean is the standard deviation of the sample mean across samples, so it measures sampling variability ($\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$)
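A minimal sketch of the standard error of the mean, assuming the formula $\frac{\sigma}{\sqrt{n}}$ and hypothetical values for the standard deviation and sample sizes:

```python
import math


def standard_error(sd, n):
    """Standard error of the mean: standard deviation divided by sqrt(n)."""
    return sd / math.sqrt(n)


# Hypothetical population standard deviation of 10.
se_25 = standard_error(10, 25)
se_100 = standard_error(10, 100)

print(se_25, se_100)
```

Quadrupling the sample size halves the standard error, so sample means vary less between larger samples.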
Comparing Values
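- The $z$-score is a metric that compares values from different datasets by expressing each as a number of standard deviations from its mean
  - Sample: $x = \bar{x} + zs$, $z = \frac{x - \bar{x}}{s}$
  - Population: $x = \mu + z\sigma$, $z = \frac{x - \mu}{\sigma}$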
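To illustrate comparing values from different datasets, the sketch below converts two hypothetical exam scores to $z$-scores. The exam means and standard deviations are made up for the example.

```python
def z_score(x, mean, sd):
    """Number of standard deviations x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


def value_from_z(z, mean, sd):
    """Inverse relationship: recover the raw value from a z-score."""
    return mean + z * sd


# Hypothetical scores: 82 on an exam with mean 70 and SD 8,
# and 90 on an exam with mean 84 and SD 6.
z_first = z_score(82, 70, 8)
z_second = z_score(90, 84, 6)

print(z_first, z_second)
```

Although 90 is the higher raw score, the 82 sits further above its own mean in standard deviation units.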
Chebyshev's Rule
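- For any dataset regardless of distribution, at least $1 - \frac{1}{k^2}$ of the data lie within $k$ SD of the mean ($k > 1$)
  - At least 75% are within 2 SD
  - At least 89% are within 3 SD
  - At least 95% are within 4.5 SD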
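The percentages in Chebyshev's rule come from the bound $1 - \frac{1}{k^2}$ for $k > 1$. The sketch below evaluates that bound and checks it against a hypothetical, strongly skewed dataset; `chebyshev_bound` and `fraction_within` are illustrative helper names.

```python
import statistics


def chebyshev_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2


def fraction_within(values, k):
    """Observed fraction of values within k population SDs of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    inside = [v for v in values if abs(v - mean) <= k * sd]
    return len(inside) / len(values)


# Hypothetical skewed dataset: the bound must still hold.
data = [1, 1, 1, 2, 2, 3, 4, 5, 8, 20]

for k in (2, 3, 4.5):
    print(k, round(chebyshev_bound(k), 4), fraction_within(data, k))
```

The observed fraction is usually well above the bound, because the bound is a worst-case guarantee.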
Empirical Rule
- For symmetric, bell-shaped distributions
  - About 68% are within 1 SD
  - About 95% are within 2 SD
  - About 99.7% are within 3 SD
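The empirical rule's percentages can be checked against the normal distribution, which is the standard model for a bell-shaped dataset. This sketch uses `statistics.NormalDist` from the standard library (Python 3.8 or later); `share_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def share_within(k):
    """Probability that a normal value falls within k SD of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


# Percentages within 1, 2, and 3 standard deviations, rounded to one decimal.
shares = {k: round(share_within(k) * 100, 1) for k in (1, 2, 3)}

print(shares)
```

The rounded figures are 68.3%, 95.4%, and 99.7%, which the rule states as 68, 95, and 99.7.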



