Change of Variables
Distribution function technique: $F_Y(y) = P(Y \le y) = P(h(X_1,\dots,X_n) \le y)$, $f_Y(y) = \frac{\partial F_Y(y)}{\partial y}$
Transformation technique: $f_Y(y) = f_X(h^{-1}(y)) \cdot \left|\frac{d\,h^{-1}(y)}{dy}\right|$ if $h$ is one-to-one on a single region;
$f_Y(y) = f_X(h_1^{-1}(y))\left|\frac{d\,h_1^{-1}(y)}{dy}\right| + f_X(h_2^{-1}(y))\left|\frac{d\,h_2^{-1}(y)}{dy}\right|$ if there are two regions.
Multivariate: $f_{Y_1,\dots,Y_n}(y_1,\dots,y_n) = f_{X_1,\dots,X_n}(g_1(y_1,\dots,y_n),\dots,g_n(y_1,\dots,y_n)) \cdot |J|$, where
$|J| = \det\begin{bmatrix} \frac{\partial X_1}{\partial Y_1} & \frac{\partial X_1}{\partial Y_2} \\ \frac{\partial X_2}{\partial Y_1} & \frac{\partial X_2}{\partial Y_2} \end{bmatrix}$ (then integrate out the dummy variables).
Order Statistics
$f_{X_{(1)}}(x) = n[1-F_X(x)]^{n-1} f_X(x)$
$f_{X_{(n)}}(x) = n[F_X(x)]^{n-1} f_X(x)$
$f_{X_{(r)}}(x) = \frac{n!}{(r-1)!\,(n-r)!}[F_X(x)]^{r-1} f_X(x)[1-F_X(x)]^{n-r}$
Median of a sample of size $2n+1$: $f_{\tilde X}(x) = f_{X_{(n+1)}}(x) = \frac{(2n+1)!}{n!\,n!}[F_X(x)]^n f_X(x)[1-F_X(x)]^n$
Unbiasedness
$\mathrm{bias}(\hat\theta) = E(\hat\theta - \theta) = E(\hat\theta) - \theta$
$\mathrm{MSE}(\hat\theta) = E[(\hat\theta - \theta)^2] = \mathrm{Var}(\hat\theta) + [E(\hat\theta) - \theta]^2$
Efficiency (compare unbiased estimators)
$\mathrm{efficiency}(\hat\theta) = \frac{\mathrm{CRLB}}{\mathrm{Var}(\hat\theta)} \le 1$; if $\mathrm{Var}(\hat\theta) = \mathrm{CRLB}$, then $\hat\theta$ is the UMVUE.
CRLB: $\mathrm{Var}(\hat\theta) \ge \dfrac{1}{n\,E\!\left[\left(\frac{\partial \ln f(x;\theta)}{\partial\theta}\right)^2\right]}$
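As a quick sanity check of the order-statistic formulas, here is a minimal Python sketch (numpy assumed available; the Uniform(0,1) example is ours, not the sheet's). For the maximum of $n$ iid Uniform(0,1) draws, $F_{X_{(n)}}(x) = x^n$:

```python
import numpy as np

# Monte Carlo check of the order-statistic formula for the maximum of
# n iid Uniform(0, 1) draws: F_{X_(n)}(x) = [F_X(x)]^n = x^n on (0, 1).
rng = np.random.default_rng(0)
n, reps = 5, 100_000
maxima = rng.uniform(0, 1, size=(reps, n)).max(axis=1)

# Compare the empirical CDF at a few points with the exact x^n.
for x in (0.5, 0.8, 0.95):
    print(f"x={x}: empirical {np.mean(maxima <= x):.4f} vs exact {x**n:.4f}")
```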
| Distribution | PDF / PMF | $E(X)$ | $E(X^2)$ | $\mathrm{Var}(X)$ |
|---|---|---|---|---|
| Uniform$(a,b)$ | $\frac{1}{b-a},\ a \le x \le b$ | $\frac{a+b}{2}$ | $\frac{a^2+ab+b^2}{3}$ | $\frac{(b-a)^2}{12}$ |
| Beta$(\alpha,\beta)$ | $\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\Gamma(\beta)} x^{\alpha-1}(1-x)^{\beta-1},\ 0<x<1$ | $\frac{\alpha}{\alpha+\beta}$ | $\frac{\alpha(\alpha+1)}{(\alpha+\beta)(\alpha+\beta+1)}$ | $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ |
| Gamma$(k,\theta)$ | $\frac{1}{\Gamma(k)\theta^k} x^{k-1} e^{-x/\theta},\ x>0$ | $k\theta$ | $k(k+1)\theta^2$ | $k\theta^2$ |
| Poisson$(\lambda)$ | $\frac{\lambda^x e^{-\lambda}}{x!},\ x=0,1,2,\dots$ | $\lambda$ | $\lambda(\lambda+1)$ | $\lambda$ |
| Binomial$(n,p)$ | $\binom{n}{x} p^x (1-p)^{n-x},\ x=0,1,\dots,n$ | $np$ | $np(1-p)+n^2p^2$ | $np(1-p)$ |
| Geometric$(p)$ | $(1-p)^{x-1}p,\ x=1,2,3,\dots$ | $\frac{1}{p}$ | $\frac{2-p}{p^2}$ | $\frac{1-p}{p^2}$ |
| $\chi^2(k)$ | $\frac{1}{2^{k/2}\Gamma(k/2)} x^{k/2-1} e^{-x/2},\ x>0$ | $k$ | $k(k+2)$ | $2k$ |
| Exponential$(\lambda)$ | $\lambda e^{-\lambda x},\ x>0$ | $\frac{1}{\lambda}$ | $\frac{2}{\lambda^2}$ | $\frac{1}{\lambda^2}$ |
| Normal$(\mu,\sigma^2)$ | $\frac{1}{\sqrt{2\pi}\,\sigma} e^{-\frac{1}{2\sigma^2}(x-\mu)^2}$ | $\mu$ | $\mu^2+\sigma^2$ | $\sigma^2$ |
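The moment columns can be spot-checked numerically. A sketch assuming scipy is available (parameter values arbitrary), using $E(X^2) = \mathrm{Var}(X) + [E(X)]^2$:

```python
from scipy import stats

# Spot-check the moment columns above with scipy.stats.
k, theta = 3.0, 2.0
g = stats.gamma(a=k, scale=theta)                # Gamma(k, theta)
assert abs(g.mean() - k * theta) < 1e-8          # E(X) = k*theta
assert abs(g.var() - k * theta**2) < 1e-8        # Var(X) = k*theta^2
assert abs(g.moment(2) - k * (k + 1) * theta**2) < 1e-8  # E(X^2)

n, p = 10, 0.3
b = stats.binom(n, p)
assert abs(b.moment(2) - (n * p * (1 - p) + (n * p) ** 2)) < 1e-8
print("moment columns check out")
```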
Consistency
$\hat\theta$ is a consistent estimator of the parameter $\theta$ if and only if, for every $\epsilon > 0$,
$\lim_{n\to\infty} P(|\hat\theta - \theta| < \epsilon) = \lim_{n\to\infty} P(\theta - \epsilon \le \hat\theta \le \theta + \epsilon) = 1.$
If $\hat\theta$ is an unbiased estimator of $\theta$ and $\mathrm{Var}(\hat\theta) \to 0$ as $n \to \infty$, then $\hat\theta$ is a consistent estimator of $\theta$ (by Chebyshev's inequality, below).
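A small simulation illustrating the definition (our example, numpy assumed): the sample mean of Exponential(1) draws is consistent for the true mean 1, so $P(|\bar X - 1| < \epsilon)$ climbs toward 1 as $n$ grows:

```python
import numpy as np

# Consistency of the sample mean for Exponential(rate=1), true mean = 1.
rng = np.random.default_rng(1)
eps, reps = 0.05, 2000
for n in (10, 100, 1000):
    xbar = rng.exponential(1.0, size=(reps, n)).mean(axis=1)
    print(n, np.mean(np.abs(xbar - 1.0) < eps))  # -> 1 as n grows
```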
Chebyshev's Theorem
$P(|X-\mu| < k\sigma) \ge 1 - \frac{1}{k^2}$, or equivalently $P(|X-\mu| \ge k\sigma) \le \frac{1}{k^2}$
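A quick numeric check of the bound (our example, deliberately skewed so the bound is loose but valid):

```python
import numpy as np

# Chebyshev check for Exponential(1), where mu = sigma = 1, with k = 2.
rng = np.random.default_rng(2)
x = rng.exponential(1.0, size=200_000)
k = 2.0
print(np.mean(np.abs(x - 1.0) >= k))  # ~0.05, well below 1/k^2 = 0.25
```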
Sufficiency
If the conditional distribution of the sample given $\hat\theta$ depends on $\theta$, then $\hat\theta$ is not sufficient.
$f(X_1=x_1,\dots,X_n=x_n \mid \hat\theta) = \dfrac{f(X_1=x_1,\dots,X_n=x_n,\,\hat\theta)}{g(\hat\theta)} = \dfrac{f(X_1=x_1,\dots,X_n=x_n)}{g(\hat\theta)}$, where $g(\hat\theta)$ is the pdf of $\hat\theta$.
Factorization theorem: $\hat\theta$ is a sufficient estimator iff the joint pdf factors as
$f(x_1,\dots,x_n;\theta) = g(\hat\theta,\theta)\cdot h(x_1,\dots,x_n).$
Method of Moments (match $k$ sample moments to $k$ parameters)
Set $E(X) = \bar X = \frac{1}{n}\sum X_i$, $E(X^2) = \overline{X^2} = \frac{1}{n}\sum X_i^2$, etc., and solve for the parameters.
Method of Maximum Likelihood
$L(\theta) = L(\theta; x_1,\dots,x_n) = f(x_1,\dots,x_n;\theta) = \prod_{i=1}^n f(x_i;\theta)$
$\ell(\theta) = \ln L(\theta) = \ln\prod_{i=1}^n f(x_i;\theta) = \sum_{i=1}^n \ln f(x_i;\theta)$
Set $\frac{d\ell(\theta)}{d\theta} = 0$ to find the critical point $\hat\theta$; check $\left.\frac{d^2\ell(\theta)}{d\theta^2}\right|_{\theta=\hat\theta} < 0$ to confirm a maximum.
With two or more parameters: if the Hessian $\begin{bmatrix} \frac{\partial^2\ell}{\partial\theta_1^2} & \frac{\partial^2\ell}{\partial\theta_1\partial\theta_2} \\ \frac{\partial^2\ell}{\partial\theta_2\partial\theta_1} & \frac{\partial^2\ell}{\partial\theta_2^2} \end{bmatrix}$ is negative definite at $(\hat\theta_1,\hat\theta_2)$, then $\hat\theta_1,\hat\theta_2$ are the MLEs of $\theta_1,\theta_2$.
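A minimal MLE sketch (our example, numpy/scipy assumed): for Exponential($\lambda$), $\ell(\lambda) = n\ln\lambda - \lambda\sum x_i$, and solving $d\ell/d\lambda = 0$ gives the closed form $\hat\lambda = 1/\bar x$; the numeric optimum should agree:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# MLE for Exponential(lambda): maximize l(lambda) = n ln(lambda) - lambda*sum(x).
rng = np.random.default_rng(3)
x = rng.exponential(scale=1 / 2.5, size=500)  # true lambda = 2.5

neg_loglik = lambda lam: -(len(x) * np.log(lam) - lam * x.sum())
numeric = minimize_scalar(neg_loglik, bounds=(1e-6, 50), method="bounded")
print(numeric.x, 1 / x.mean())  # numeric optimum matches 1/xbar
```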
Bayesian Estimation
Prior: $g(\theta)$ = prior belief about $\theta$
Likelihood: $L(\theta) = f(x;\theta)$ = likelihood of the data given $\theta$
Posterior: $h(\theta \mid x) = \dfrac{f(x,\theta)}{f(x)} = \dfrac{f(x;\theta)\,g(\theta)}{f(x)} = \dfrac{L(\theta)\,g(\theta)}{f(x)}$
$L(\theta)\,g(\theta)$ = unnormalized posterior; $f(x) = \int_{\text{all }\theta} L(\theta)\,g(\theta)\,d\theta$ = marginal likelihood
$\hat\theta_B = E(\theta \mid x) = \int_{\text{all }\theta} \theta\, h(\theta \mid x)\,d\theta$ = Bayes estimate
Useful finite geometric sum: $\sum_{n=0}^{x} a r^n = \frac{a(1-r^{x+1})}{1-r}$
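A conjugate-prior sketch (our example, scipy assumed): with a Beta$(a,b)$ prior on a binomial proportion $\theta$ and $x$ successes in $n$ trials, the normalized posterior is Beta$(a+x,\,b+n-x)$, so the Bayes estimate is $E(\theta\mid x) = \frac{a+x}{a+b+n}$:

```python
from scipy import stats

# Beta prior g(theta), binomial likelihood L(theta) -> Beta posterior.
a, b = 2.0, 2.0          # prior Beta(a, b)
n, x = 20, 14            # data: 14 successes in 20 trials
post = stats.beta(a + x, b + n - x)          # h(theta | x)
print(post.mean(), (a + x) / (a + b + n))    # both give the Bayes estimate
```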
Confidence Intervals
$\mu = \bar X \pm z_{1-\alpha/2}\cdot\frac{\sigma}{\sqrt n}$ (known $\sigma$)
$\mu = \bar X \pm t_{n-1,\alpha/2}\cdot\frac{S}{\sqrt n}$ (unknown $\sigma$)
$\mu = \bar X \pm z_{1-\alpha/2}\cdot\frac{S}{\sqrt n}$ (CLT, large $n$)
$\mu_1-\mu_2 = (\bar X_1-\bar X_2) \pm z_{1-\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$ (known variances)
$\mu_1-\mu_2 = (\bar X_1-\bar X_2) \pm t_{\nu,\alpha/2}\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}$ ($n_1,n_2 \ge 30$)
$\mu_1-\mu_2 = (\bar X_1-\bar X_2) \pm t_{n_1+n_2-2,\alpha/2}\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}$, where $S_p^2 = \frac{(n_1-1)S_1^2+(n_2-1)S_2^2}{n_1+n_2-2}$
$\sigma^2 \in \left(\frac{\sum(X_i-\mu)^2}{\chi^2_{1-\alpha/2,\,n}},\ \frac{\sum(X_i-\mu)^2}{\chi^2_{\alpha/2,\,n}}\right)$ (known $\mu$)
$\sigma^2 \in \left(\frac{(n-1)S^2}{\chi^2_{1-\alpha/2,\,n-1}},\ \frac{(n-1)S^2}{\chi^2_{\alpha/2,\,n-1}}\right)$
$p = \hat p \pm z_{1-\alpha/2}\sqrt{\frac{\hat p(1-\hat p)}{n}}$
$p_1-p_2 = (\hat p_1-\hat p_2) \pm z_{1-\alpha/2}\sqrt{\frac{\hat p_1(1-\hat p_1)}{n_1}+\frac{\hat p_2(1-\hat p_2)}{n_2}}$
$\frac{\sigma_1^2}{\sigma_2^2} \in \left(\frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{1-\alpha/2,\,n_1-1,\,n_2-1}},\ \frac{S_1^2}{S_2^2}\cdot\frac{1}{F_{\alpha/2,\,n_1-1,\,n_2-1}}\right)$
Hypothesis Testing
Test function: $\phi(x_1,\dots,x_n) = \begin{cases} 1 & \text{if } (x_1,\dots,x_n) \in C \text{ (reject } H_0\text{)} \\ 0 & \text{otherwise} \end{cases}$
Type I error $= \alpha = P(\text{reject } H_0 \mid H_0 \text{ true})$
Type II error $= \beta = P(\text{fail to reject } H_0 \mid H_0 \text{ false})$
Power $= 1-\beta = P(\text{reject } H_0 \mid H_1 \text{ true})$; power function: $\pi(\theta) = P(\text{reject } H_0 \mid \theta)$
Neyman–Pearson Lemma (simple $H_0$ vs simple $H_1$)
$L_0 = \prod_{i=1}^n f(x_i;\theta_0)$ (likelihood under $H_0$), $L_1 = \prod_{i=1}^n f(x_i;\theta_1)$ (likelihood under $H_1$).
Reject $H_0$ when $\frac{L_0}{L_1} \le k$, choosing $k$ so that $P(\text{reject } H_0 \mid H_0) = \alpha$. Critical region: the set of samples with $L_0/L_1 \le k$.
Likelihood Ratio Test (general)
$\Lambda = \frac{\max L_0}{\max L} = \frac{L(\tilde\theta)}{L(\hat\theta)} = \frac{\prod_{i=1}^n f(x_i;\theta_0)}{\prod_{i=1}^n f(x_i;\hat\theta)}$
Reject $H_0$ if $\Lambda \le k$ (equivalently, $-2\ln\Lambda \ge c$), with $k$ (or $c$) chosen so the test has level $\alpha$. Asymptotically $-2\ln\Lambda \sim \chi^2_{df}$ under $H_0$.
Z/T Tests for the Mean ($t_{n-1}$ when $\sigma$ is unknown)
$C = \left\{(x_1,\dots,x_n) : |\bar x - \mu_0| \ge z_{1-\alpha/2}\frac{\sigma}{\sqrt n}\right\}$ (two-sided); one-sided: reject for $z \le -z_{1-\alpha}$ when $H_1:\theta<\theta_0$, and $z \ge z_{1-\alpha}$ when $H_1:\theta>\theta_0$.
Difference in Means (unknown variances)
$Z_{obs} = \dfrac{\bar x - \bar y - \delta_0}{\sqrt{\frac{S_1^2}{n_1}+\frac{S_2^2}{n_2}}}$ ($n \ge 30$); $t = \dfrac{\bar x - \bar y - \delta_0}{\sqrt{S_p^2\left(\frac{1}{n_1}+\frac{1}{n_2}\right)}} \sim t_{n_1+n_2-2}$ ($n < 30$, pooled variance)
Tests for Variances
Known $\mu$: $\chi^2 = \frac{n\hat\sigma^2}{\sigma_0^2} = \frac{\sum(x_i-\mu)^2}{\sigma_0^2} \sim \chi^2_n$; unknown $\mu$: $\chi^2 = \frac{(n-1)s^2}{\sigma_0^2} \sim \chi^2_{n-1}$.
Reject $H_0$ if $\chi^2_{obs} < \chi^2_{\alpha/2}$ or $\chi^2_{obs} > \chi^2_{1-\alpha/2}$.
$F = \frac{S_1^2}{S_2^2} \sim F_{n_1-1,\,n_2-1}$; reject $H_0$ if $F_{obs} > F_{1-\alpha/2}$ or $F_{obs} < F_{\alpha/2}$.
Binomial Proportion Test (Exact)
Two-sided: reject if $X \ge K_{\alpha/2}$ or $X \le K'_{\alpha/2}$, where $K_{\alpha/2}$ is the smallest $k$ with $P(X \ge k \mid H_0) \le \alpha/2$ and $K'_{\alpha/2}$ is the largest $k$ with $P(X \le k \mid H_0) \le \alpha/2$.
One-sided: reject if $X \ge K_\alpha$, where $P(X \ge K_\alpha \mid H_0) \le \alpha$.
Binomial Proportion (Normal Approximation)
$Z = \dfrac{X - n\theta_0}{\sqrt{n\theta_0(1-\theta_0)}} \approx N(0,1)$. Continuity correction: $Z = \dfrac{(x \pm \frac12) - n\theta_0}{\sqrt{n\theta_0(1-\theta_0)}}$, using $-\frac12$ if $x > n\theta_0$ and $+\frac12$ if $x < n\theta_0$.
Chi-squared Tests
Reject $H_0$ if $\chi^2_{obs} \ge \chi^2_{1-\alpha,\,df}$.
$\sum_{i=1}^k Z_i^2 = \sum_i \left(\dfrac{x_i - n_i\theta_i}{\sqrt{n_i\theta_i(1-\theta_i)}}\right)^2 \sim \chi^2_k$
$\sum_i \sum_j \dfrac{(f_{ij}-E_{ij})^2}{E_{ij}} \sim \chi^2_{df}$, with $df = (k-1)(c-1)$, or $k(c-1)$ if the $\theta_j$'s are given.
To test for association (independence): $E_{ij} = n\hat\pi_{ij} = n\hat\pi_{i\cdot}\hat\pi_{\cdot j}$, with $df = \dim(H_a) - \dim(H_0)$ (free parameters under the alternative minus under the null).
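Returning to the two-sample mean tests above, a pooled t-test sketch with simulated data (scipy assumed; `equal_var=True` uses $S_p^2$ and $df = n_1+n_2-2$, matching the formulas):

```python
import numpy as np
from scipy import stats

# Pooled two-sample t-test of H0: mu1 = mu2 on simulated normal data.
rng = np.random.default_rng(4)
x = rng.normal(10.0, 2.0, size=15)
y = rng.normal(11.0, 2.0, size=12)
t_obs, pval = stats.ttest_ind(x, y, equal_var=True)
print(t_obs, pval)  # reject H0 when pval < alpha
```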
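For the independence test, a sketch on a made-up 2×3 contingency table (scipy assumed): `chi2_contingency` computes $E_{ij} = n\hat\pi_{i\cdot}\hat\pi_{\cdot j}$ and the statistic $\sum_i\sum_j (f_{ij}-E_{ij})^2/E_{ij}$ with $df = (k-1)(c-1)$:

```python
import numpy as np
from scipy import stats

# Chi-squared test of independence on a 2x3 table of observed counts f_ij.
f = np.array([[20, 30, 25],
              [30, 20, 25]])
chi2, pval, df, expected = stats.chi2_contingency(f, correction=False)
print(chi2, df, pval)
print(expected)  # the E_ij table
```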
Goodness-of-Fit Test
Step 1. Estimate the parameter(s) of the assumed distribution
Step 2. Compute the probability for each observation under the assumed distribution
Step 3. Compute the expected frequencies
Step 4. Test the goodness-of-fit of the assumed distribution to the observed data
$df = k - 1 - (\text{number of parameters estimated})$; see the sketch below.
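A worked sketch of Steps 1–4 (made-up count data, Poisson assumed; numpy/scipy assumed available):

```python
import numpy as np
from scipy import stats

# Goodness-of-fit for an assumed Poisson model on grouped count data.
counts = np.array([35, 40, 15, 7, 3])   # observed frequencies of 0,1,2,3,>=4
values = np.arange(5)
n = counts.sum()

# Step 1: estimate lambda by its MLE, the sample mean (lumped cell taken as 4).
lam = (values * counts).sum() / n

# Step 2: probability of each category under Poisson(lam); lump the tail.
probs = stats.poisson.pmf(values, lam)
probs[-1] = 1 - probs[:-1].sum()        # P(X >= 4)

# Step 3: expected frequencies.
expected = n * probs

# Step 4: chi-squared statistic, df = k - 1 - 1 (one parameter estimated).
chi2 = ((counts - expected) ** 2 / expected).sum()
df = len(counts) - 1 - 1
print(chi2, stats.chi2.sf(chi2, df))    # reject if p-value < alpha
```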