Sampling Distributions

Normal distribution results
Consider independent $X_{1}, \dots, X_{n} \sim N (μ, σ^{2})$. Then, with $\bar{X} = \sum X_{i} / n$ and $S^{2} = \sum (X_{i} - \bar{X})^{2} / (n - 1)$:

$\bar{X} \sim N (μ, σ^{2} / n)$
$\sum (X_{i} - μ)^{2} / σ^{2} \sim χ_{n}^{2}$
$\sum (X_{i} - \bar{X})^{2} / σ^{2} \sim χ_{n - 1}^{2}$
$(n - 1) S^{2} / σ^{2} \sim χ_{n - 1}^{2}$
$\frac{\bar{X} - μ}{S / \sqrt{n}} \sim t_{n - 1}$
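
The results above can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `simulate_normal_results` and the values of `MU`, `SIGMA`, `N` are illustrative choices, not part of the notes. It relies on the standard facts that a $χ_{k}^{2}$ variable has mean $k$ and variance $2k$, and that a $t_{ν}$ variable has variance $ν / (ν - 2)$ for $ν > 2$.

```python
import numpy as np

def simulate_normal_results(mu=5.0, sigma=2.0, n=10, reps=100_000, seed=0):
    """Draw `reps` samples of size n from N(mu, sigma^2) and compute each statistic."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu, sigma, size=(reps, n))      # one sample per row
    xbar = x.mean(axis=1)                          # sample means
    s2 = x.var(axis=1, ddof=1)                     # sample variances (divisor n - 1)
    return {
        "xbar": xbar,
        "chi_n": ((x - mu) ** 2).sum(axis=1) / sigma**2,   # should behave like chi^2_n
        "chi_n1": (n - 1) * s2 / sigma**2,                 # should behave like chi^2_{n-1}
        "t": (xbar - mu) / np.sqrt(s2 / n),                # should behave like t_{n-1}
    }

# Illustrative parameter values (assumptions, chosen only for the demo).
MU, SIGMA, N = 5.0, 2.0, 10
sims = simulate_normal_results(MU, SIGMA, N)

# Compare simulated moments with the theoretical ones.
print("mean of xbar:", sims["xbar"].mean(), "theory:", MU)
print("var of xbar:", sims["xbar"].var(), "theory:", SIGMA**2 / N)
print("mean of chi_n:", sims["chi_n"].mean(), "theory:", N)
print("mean of chi_n1:", sims["chi_n1"].mean(), "theory:", N - 1)
print("var of chi_n1:", sims["chi_n1"].var(), "theory:", 2 * (N - 1))
print("var of t:", sims["t"].var(), "theory:", (N - 1) / (N - 3))
```

Matching moments does not prove the distributions agree, but a clear mismatch would flag an error in a formula.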

sampling distributions are fundamental to statistical inference
- relating statistics from the sample to the population parameters
- we use arguments based on the sampling distribution of the statistic to state how close the estimate is likely to be to the parameter
  - the sampling distribution of a statistic is the probability distribution of that statistic
    - = the distribution its values would follow if samples of the same size were repeatedly drawn from the population
      - but we don't actually draw them; the distribution can be derived mathematically
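
The repeated-sampling idea can be imitated on a computer, even though in practice the distribution is derived mathematically. A minimal sketch, assuming NumPy and a hypothetical normal population with mean 50 and sd 10 (both values, and the helper name `sampling_distribution`, are made up for illustration):

```python
import numpy as np

def sampling_distribution(statistic, n, reps=10_000, seed=0):
    """Approximate the sampling distribution of `statistic` for samples of size n.

    Assumption: the population is N(50, 10^2); this is only an example population.
    """
    rng = np.random.default_rng(seed)
    samples = rng.normal(loc=50.0, scale=10.0, size=(reps, n))  # one sample per row
    # Apply the statistic to every row, giving one value per sample.
    return np.apply_along_axis(statistic, 1, samples)

# One value of the statistic per simulated sample of size 25.
means = sampling_distribution(np.mean, n=25)
variances = sampling_distribution(lambda s: s.var(ddof=1), n=25)

print(means[:5])       # the sample mean differs from sample to sample
print(variances[:5])   # so does the sample variance
```

A histogram of `means` or `variances` would then approximate the sampling distribution of $\bar{X}$ or $S^{2}$ for this population and sample size.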
sampling distributions of
- $\bar{X}$
- $S^{2}$
sampling distribution of the sample mean ($\bar{X}$)
- assumes the population is infinite or very large compared to the sample size
  - otherwise the formulas need a finite-population correction
- $\bar{x}$
  - $X_{1}, X_{2}, . . ., X_{n}$ are n independently drawn observations from a population with mean $μ$ and sd $σ$
  - $\bar{X} = \frac{X_{1} + X_{2} + \dots + X_{n}}{n}$
    - n = sample size, not the number of samples drawn
    - $E (\bar{X}) = E (\frac{X_{1} + X_{2} + . . . + X_{n}}{n}) = μ$
      - $E (X_{1}) = E (X_{2}) = \dots = E (X_{n}) = μ$
        
        each observation is drawn from the same population, so each has expectation $μ$
      - $\bar{X}$ is an unbiased estimate of $μ$
      - the expectation of a sum is always the sum of the expectations
      - we sometimes write $E (\bar{X}) = μ$ as $μ_{\bar{X}} = μ$
        
        mean of the sampling distribution of the sample mean = mean of the population
        
        $\bar{X}$ takes different values in different samples, but the mean of those values equals the population mean
        
        "mean of $\bar{X}$" here means the expectation of $\bar{X}$ as a random variable
    - $\mathrm{Var} (\bar{X}) = \mathrm{Var} (\frac{X_{1} + X_{2} + \dots + X_{n}}{n}) = \frac{σ^{2}}{n}$
      - if the random variables are independent, the variance of their sum is the sum of their variances; dividing the sum by $n$ scales the variance by $1 / n^{2}$, so $n σ^{2} / n^{2} = σ^{2} / n$
      - often denoted as $σ_{\bar{X}}^{2} = \frac{σ^{2}}{n}$
        
        $σ_{\bar{X}} = \frac{σ}{\sqrt{n}}$
        
        the sample mean varies less from sample to sample than a single observation does: its sd is $σ / \sqrt{n}$, which shrinks as $n$ grows
    - $\bar{X}$ is normally distributed if we are sampling from a normally distributed population
    - $\bar{X} \sim N (μ, \frac{σ^{2}}{n})$
      - for the mean of n observations
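      - $Z = \frac{\bar{X} - μ}{σ / \sqrt{n}}$
        
        for a normal population, $Z \sim N (0, 1)$ exactly for every $n$
        
        for other populations with finite variance, $Z$ tends to the standard normal distribution $N (0, 1)$ as $n \to \infty$ (central limit theorem)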
- $μ_{\bar{x}}$ = mean of $\bar{x}$
- $σ_{\bar{x}}$ = standard deviation of $\bar{x}$
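
The two moment formulas, $E (\bar{X}) = μ$ and $\mathrm{Var} (\bar{X}) = σ^{2} / n$, can be illustrated by simulation. A rough sketch, assuming NumPy and an example normal population with $μ = 10$ and $σ = 3$ (these values and the helper `xbar_moments` are assumptions for the demo):

```python
import numpy as np

def xbar_moments(n, mu=10.0, sigma=3.0, reps=50_000, seed=0):
    """Return the simulated mean and variance of the sample mean for sample size n."""
    rng = np.random.default_rng(seed)
    # Each row is one sample of size n; take the mean of every row.
    xbar = rng.normal(mu, sigma, size=(reps, n)).mean(axis=1)
    return xbar.mean(), xbar.var()

# n is the sample size; `reps` is the number of samples drawn.
results = {n: xbar_moments(n) for n in (4, 16, 64)}

for n, (m, v) in results.items():
    print(f"n={n:3d}  mean of xbar={m:.3f}  var of xbar={v:.4f}  sigma^2/n={9.0 / n:.4f}")
```

Quadrupling the sample size should cut the variance of $\bar{X}$ to roughly a quarter and its standard deviation to roughly a half, while the mean of $\bar{X}$ stays near $μ$.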

If $S_{1}^{2}$ and $S_{2}^{2}$ are the sample variances of two independent random samples of size $n_{1}$ and $n_{2}$ from two normal populations with variances $σ_{1}^{2}$ and $σ_{2}^{2}$, then

$F = \frac{S_{1}^{2} / σ_{1}^{2}}{S_{2}^{2} / σ_{2}^{2}} \sim F_{n_{1} - 1, n_{2} - 1}$
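
The variance-ratio result can also be checked by simulation. A minimal sketch, assuming NumPy and two independent normal populations; the sample sizes, standard deviations, and the name `simulate_f_ratio` are illustrative assumptions. It uses the standard fact that an $F_{d_{1}, d_{2}}$ variable has mean $d_{2} / (d_{2} - 2)$ for $d_{2} > 2$, and compares against draws from NumPy's own F generator.

```python
import numpy as np

def simulate_f_ratio(n1=8, n2=12, sigma1=2.0, sigma2=5.0, reps=200_000, seed=0):
    """Simulate (S1^2 / sigma1^2) / (S2^2 / sigma2^2) for two independent normal samples."""
    rng = np.random.default_rng(seed)
    s1_sq = rng.normal(0.0, sigma1, size=(reps, n1)).var(axis=1, ddof=1)
    s2_sq = rng.normal(0.0, sigma2, size=(reps, n2)).var(axis=1, ddof=1)
    return (s1_sq / sigma1**2) / (s2_sq / sigma2**2)

N1, N2 = 8, 12                      # illustrative sample sizes
f_sim = simulate_f_ratio(N1, N2)

# Reference draws from F with (n1 - 1, n2 - 1) degrees of freedom.
f_ref = np.random.default_rng(1).f(N1 - 1, N2 - 1, size=200_000)

d2 = N2 - 1
print("simulated mean:", f_sim.mean(), "theory:", d2 / (d2 - 2))
print("simulated median:", np.median(f_sim), "reference median:", np.median(f_ref))
```

Note that the population standard deviations differ here; the ratio still follows the same F distribution because each sample variance is divided by its own $σ^{2}$.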