Unit 3 Glossary
Vocabulary Words
95% confidence interval: A range of parameter values that could plausibly have produced our observed data, built so that we are 95% confident it contains the true parameter.
The Central Limit Theorem (CLT): The mathematical fact that if a sample mean (or proportion or correlation) is calculated from a reasonably large number of samples, its distribution is approximately Normal.
The Standard Normal Distribution: The Bell Curve with a mean of 0 and a standard deviation of 1.
confidence level: Our choice of how confident we want to be, which sets how much chance we are willing to take that we got an unlucky sample; e.g., a 95% confidence level accepts a 5% chance.
inconsistent: Data that would be unlikely to occur if the null hypothesis were true; i.e., the statistic results in a small p-value.
least-squares estimate: The sample mean is the least-squares estimate of the true mean, because it is the number that has the smallest sum of squared error to the observed values.
null distribution: The collection of values of a statistic that could have been observed by chance if the null hypothesis were true.
null hypothesis: A claim about the parameters that would be true if no interesting trends were present.
p-value: The probability of getting a statistic at least as extreme as what we saw, in a world where the null hypothesis is true.
random noise: The random amount that a generated quantitative value falls from the mean.
reasonably large number of samples: A general rule of thumb is that more than 30 samples is sufficient, or more than 15 if your original data is not very skewed.
residuals: The observed value minus the true mean. Also called the ‘random noise’ or the ‘deviation from the mean’.
standardized: The process of subtracting a mean and dividing by a standard deviation; i.e., calculating a z-score.
statistical model: A process that generates data with some randomness.
sum of squared error (SSE): The total of the squared distances from the observed values to the proposed mean.
sum of squared residuals (SSR): Another name for the sum of squared error (SSE); the total of the squared distances from the observed values to the proposed mean.
tails: The parts of the Bell Curve at either end.
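The least-squares estimate and sum of squared error (SSE) entries above can be illustrated with a small sketch. The observed values and the list of candidate means below are made up for illustration; the point is only that, among the proposed means tried, the sample mean gives the smallest SSE.

```python
def sse(values, proposed_mean):
    """Sum of squared distances from each observed value to a proposed mean."""
    return sum((v - proposed_mean) ** 2 for v in values)

# Hypothetical observed data (not from the course materials).
observed = [2.0, 4.0, 4.0, 6.0, 9.0]
sample_mean = sum(observed) / len(observed)

# Try several proposed means, including the sample mean itself.
candidates = [3.0, 4.0, 5.0, 6.0, 7.0]
sse_by_candidate = {c: sse(observed, c) for c in candidates}

# The candidate with the smallest SSE is the least-squares choice.
best = min(sse_by_candidate, key=sse_by_candidate.get)

for c in candidates:
    print(f"proposed mean {c}: SSE = {sse_by_candidate[c]}")
print(f"sample mean = {sample_mean}, smallest SSE at {best}")
```

Trying a finer grid of candidates would show the same pattern: the SSE grows as the proposed mean moves away from the sample mean in either direction.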
Key Skills and Concepts
- State a null hypothesis in symbols or specific words.
- Estimate how uncommon a particular observed statistic is, among values simulated from the null distribution.
- Quickly estimate areas under the Bell Curve without the help of technology.
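
The second skill, judging how uncommon an observed statistic is among values simulated from the null distribution, might look like the sketch below. Everything specific here is an assumption made for illustration: the null hypothesis is that a coin is fair, the observed statistic is 16 heads in 20 flips, and the helper names `simulate_null_heads` and `simulated_p_value` are hypothetical. Ties with the observed value are counted as extreme, which is one common convention.

```python
import random

def simulate_null_heads(n_flips, n_sims, rng):
    """Simulate the number of heads in n_flips fair-coin flips, n_sims times."""
    return [sum(rng.random() < 0.5 for _ in range(n_flips)) for _ in range(n_sims)]

def simulated_p_value(null_stats, observed, null_center):
    """Fraction of simulated statistics at least as far from the null center as the observed one."""
    observed_distance = abs(observed - null_center)
    extreme = sum(abs(s - null_center) >= observed_distance for s in null_stats)
    return extreme / len(null_stats)

rng = random.Random(0)   # fixed seed so the sketch is repeatable
n_flips = 20             # assumed sample size
observed_heads = 16      # assumed observed statistic

# Build the null distribution by simulation, then see how unusual 16 heads is.
null_stats = simulate_null_heads(n_flips, 10_000, rng)
p_value = simulated_p_value(null_stats, observed_heads, null_center=n_flips * 0.5)
print(f"Simulated p-value: {p_value:.4f}")
```

With these assumed numbers the simulated p-value should come out small (around 0.01), so 16 heads in 20 flips would be inconsistent with a fair coin.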