Method of maximum likelihood

Finds the value of θ under which the observed data are most probable, compared with all other values of θ

If $x_1, \dots, x_n$ are observed values of a random sample from a population with the parameter θ, the likelihood function of θ is

$$L(\theta) = L(\theta; x_1, \dots, x_n) = f(x_1, \dots, x_n; \theta) = \prod_{i=1}^{n} f(x_i; \theta)$$

The maximum likelihood estimate (MLE) of θ is the value of θ that maximizes the likelihood function L(θ)

In the regular case, we use the log-likelihood function, since we then only need to differentiate a sum of functions instead of a product

$$l(\theta) = \ln L(\theta) = \ln \prod_{i=1}^{n} f(x_i; \theta) = \sum_{i=1}^{n} \ln f(x_i; \theta)$$

By a lemma (ln is strictly increasing), the $\hat{\theta}$ that maximizes $L(\theta)$ also maximizes $l(\theta)$
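A minimal numerical sketch of this lemma, assuming the Exponential(θ) density $f(x;\theta) = \frac{1}{\theta}e^{-x/\theta}$ used later in these notes. The data values, the grid, and the function names are made up for illustration. It maximizes $L$ and $l$ over the same grid and checks that both pick the same θ.

```python
import math

def likelihood(theta, xs):
    """L(theta): product of Exponential(theta) densities (1/theta) * exp(-x/theta)."""
    return math.prod(math.exp(-x / theta) / theta for x in xs)

def log_likelihood(theta, xs):
    """l(theta): sum of the log densities, -ln(theta) - x/theta."""
    return sum(-math.log(theta) - x / theta for x in xs)

xs = [0.8, 1.9, 0.4, 2.6, 1.3]              # hypothetical sample, mean 1.4
grid = [0.05 * k for k in range(1, 201)]    # candidate thetas 0.05, 0.10, ..., 10.0

best_L = max(grid, key=lambda t: likelihood(t, xs))
best_l = max(grid, key=lambda t: log_likelihood(t, xs))

print(best_L, best_l)  # the same grid point, at (or next to) the sample mean
```

The grid search is only a stand-in for calculus here; it is meant to show that taking logs changes the height of the curve but not where its peak sits.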

Properties

  1. Sufficiency: if a sufficient statistic for θ exists, then the MLE is a function of it
  2. Asymptotic efficiency: under regularity conditions, the MLE is asymptotically efficient
  3. Invariance principle: if $\hat{\theta}$ is the MLE of θ, then $g(\hat{\theta})$ is the MLE of the function $g(\theta)$
  4. Lack of uniqueness: there could be more than one MLE

Example

An experiment with 6 coin tosses, 2 heads

General pmf with an arbitrary parameter $p$ (the number of heads in 6 tosses is Binomial(6, $p$)):

$$f(2; p) = \binom{6}{2} p^2 (1-p)^4$$

Different values of p give us different probabilities of getting that sample

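$$p = \tfrac{1}{4}: \quad f(2; 1/4) = 15 \cdot \tfrac{1}{16} \cdot \tfrac{81}{256} \approx 0.297$$

$$p = \tfrac{1}{3}: \quad f(2; 1/3) = 15 \cdot \tfrac{1}{9} \cdot \tfrac{16}{81} \approx 0.329$$

So $p = 1/3$ makes the observed sample more likely than $p = 1/4$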
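The comparison can be sketched in a few lines of Python. This is an illustration only: the function name `binom_likelihood` and the grid of candidate values are assumptions, not part of the notes. It evaluates the likelihood at the two values of $p$ above and then scans a fine grid to see where the likelihood peaks.

```python
from math import comb

def binom_likelihood(p, n=6, k=2):
    """Probability of k heads in n tosses when P(heads) = p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

print(round(binom_likelihood(1 / 4), 4))  # 0.2966
print(round(binom_likelihood(1 / 3), 4))  # 0.3292

# Scan candidate values of p on a grid (0.001, 0.002, ..., 0.999)
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=binom_likelihood)
print(p_hat)  # close to 2/6, the observed proportion of heads
```

The grid maximum lands next to $2/6$, which matches what differentiating $\ln f(2;p)$ gives for a binomial sample.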

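Finding the MLE of θ for $X_1, \dots, X_n$ iid Exponential(θ)

$$
\begin{aligned}
L(\theta) &= \prod_{i=1}^{n} \frac{1}{\theta} e^{-x_i/\theta} = \theta^{-n} e^{-\sum x_i/\theta} \\
l(\theta) &= \ln L(\theta) = \ln\left(\theta^{-n} e^{-\sum x_i/\theta}\right) = -n \ln\theta - \frac{\sum x_i}{\theta} \\
\frac{dl(\theta)}{d\theta} &= -\frac{n}{\theta} + \frac{\sum x_i}{\theta^2} = 0 \quad \text{for a critical point} \\
\theta &= \frac{\sum x_i}{n} = \bar{X} = \hat{\theta} \\
\left.\frac{d^2 l(\theta)}{d\theta^2}\right|_{\theta = \hat{\theta} = \bar{X}} &= \frac{n}{\theta^2} - \frac{2\sum x_i}{\theta^3}\,\Bigg|_{\theta = \bar{X}} = \frac{n}{\bar{X}^2} - \frac{2n\bar{X}}{(\bar{X})^3} = -\frac{n\bar{X}}{(\bar{X})^3} = -\frac{n}{(\bar{X})^2} < 0
\end{aligned}
$$

So $l(\theta)$ has a maximum at $\hat{\theta} = \bar{X}$, which is the MLE of θ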
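As a quick check of this derivation, here is a small sketch with made-up data (the sample and the function names are hypothetical). It evaluates the first and second derivatives of $l(\theta)$ at the sample mean and compares the log-likelihood there with nearby values.

```python
import math

def log_lik(theta, xs):
    """l(theta) = -n ln(theta) - sum(x) / theta."""
    return -len(xs) * math.log(theta) - sum(xs) / theta

def score(theta, xs):
    """dl/dtheta = -n / theta + sum(x) / theta^2."""
    return -len(xs) / theta + sum(xs) / theta**2

def second_derivative(theta, xs):
    """d2l/dtheta2 = n / theta^2 - 2 sum(x) / theta^3."""
    return len(xs) / theta**2 - 2 * sum(xs) / theta**3

xs = [2.1, 0.3, 4.7, 1.2, 0.9, 3.0]   # hypothetical observations
theta_hat = sum(xs) / len(xs)         # the sample mean

print(score(theta_hat, xs))               # approximately 0: a critical point
print(second_derivative(theta_hat, xs))   # negative: a maximum
print(log_lik(theta_hat, xs) > log_lik(0.9 * theta_hat, xs))  # True
print(log_lik(theta_hat, xs) > log_lik(1.1 * theta_hat, xs))  # True
```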

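An irregular case: $X_1, \dots, X_n$ iid Uniform(0, θ)

Ignoring the support and differentiating does not work:

$$
\begin{aligned}
L(\theta) &= \prod_{i=1}^{n} f(x_i; \theta) = \prod_{i=1}^{n} \frac{1}{\theta} = \theta^{-n} \\
l(\theta) &= \ln \theta^{-n} = -n \ln\theta \\
\frac{dl(\theta)}{d\theta} &= -\frac{n}{\theta} = 0 \quad \text{has no solution}
\end{aligned}
$$

Writing the support explicitly with an indicator:

$$L(\theta) = \prod_{i=1}^{n} \frac{1}{\theta} I[0 < x_i < \theta] = \frac{1}{\theta^n} I[0 < x_1, \dots, x_n < \theta]$$

We aim to increase $L(\theta)$ by decreasing θ to its lowest possible value. θ cannot go below the largest observation $X_{(n)}$, so $\hat{\theta} = X_{(n)}$ is the MLE of θ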
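A small sketch of why the maximum sits at the largest observation, using a made-up sample. One assumption is made loudly here: the support is treated as including its right endpoint (θ ≥ max x), so that the maximum is actually attained at $X_{(n)}$. The function name is hypothetical.

```python
def uniform_likelihood(theta, xs):
    """L(theta) = theta^(-n) if every observation fits in (0, theta], else 0.

    Assumption: the right endpoint is included so the maximum is attained.
    """
    if theta <= 0 or min(xs) <= 0 or max(xs) > theta:
        return 0.0
    return theta ** (-len(xs))

xs = [0.7, 2.4, 1.1, 3.6, 0.2]   # hypothetical observations
theta_hat = max(xs)              # the largest order statistic X_(n)

print(uniform_likelihood(theta_hat - 0.1, xs))  # 0.0: theta too small for the data
print(uniform_likelihood(theta_hat, xs) > uniform_likelihood(theta_hat + 0.1, xs))  # True
```

Below $X_{(n)}$ the likelihood is zero, and above it the likelihood decreases in θ, so no derivative is needed to locate the maximum.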

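Case: 2+ parameters (with Hessian)
MLEs of $\mu$, $\mu^2$, $\sigma$, $\sigma^2$ for $N(\mu, \sigma^2)$

$$
\begin{aligned}
L(\theta) = L(\mu, \sigma^2; \tilde{X}) &= (2\pi\sigma^2)^{-n/2} \exp\left(-\frac{1}{2\sigma^2} \sum (x_i - \mu)^2\right) \\
l(\mu, \sigma^2) &= -\frac{n}{2} \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum (x_i - \mu)^2 \\
&= -\frac{n}{2} \ln(2\pi) - \frac{n}{2} \ln(\sigma^2) - \frac{1}{2\sigma^2} \sum (x_i - \mu)^2
\end{aligned}
$$

$$\frac{\partial l}{\partial \mu} = -\frac{1}{2\sigma^2} \sum 2(x_i - \mu)(-1) = \frac{\sum (x_i - \mu)}{\sigma^2} = 0 \implies \hat{\mu} = \bar{X}$$

$$\frac{\partial l}{\partial \sigma^2} = -\frac{n}{2\sigma^2} + \frac{\sum (x_i - \mu)^2}{2(\sigma^2)^2} = \frac{-n\sigma^2 + \sum (x_i - \mu)^2}{2(\sigma^2)^2} = 0 \implies \hat{\sigma}^2 = \frac{\sum (x_i - \mu)^2}{n}$$

Using $\hat{\mu} = \bar{X}$, we have $\hat{\sigma}^2 = \frac{1}{n} \sum (x_i - \bar{X})^2$

If the Hessian evaluated at $(\hat{\mu}, \hat{\sigma}^2)$ is negative definite,

$$\begin{bmatrix} \frac{\partial^2 l}{\partial \mu^2} & \frac{\partial^2 l}{\partial \mu \, \partial \sigma^2} \\ \frac{\partial^2 l}{\partial \sigma^2 \, \partial \mu} & \frac{\partial^2 l}{\partial (\sigma^2)^2} \end{bmatrix} < 0,$$

then our $\hat{\mu}, \hat{\sigma}^2$ are MLEs of $\mu, \sigma^2$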

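Then, by the invariance principle, the MLE of $\mu^2$ is $\widehat{\mu^2} = g_1(\hat{\mu}) = g_1(\bar{x}) = \bar{X}^2$
And the MLE of $\sigma$ is $\hat{\sigma} = g_2(\hat{\sigma}^2) = \sqrt{\frac{1}{n} \sum (x_i - \bar{X})^2}$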
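The two-parameter case can be sketched numerically as follows. The sample and variable names are made up. The second derivatives are obtained by differentiating the log-likelihood $l(\mu, \sigma^2)$ once more by hand, which is an added step rather than something stated in the notes. For a 2×2 symmetric matrix, negative definiteness is checked here as "top-left entry negative and determinant positive".

```python
import math
import statistics

xs = [4.2, 5.1, 3.8, 6.0, 4.9]   # hypothetical observations
n = len(xs)

# MLEs from setting the two partial derivatives to zero
mu_hat = sum(xs) / n
sigma2_hat = sum((x - mu_hat) ** 2 for x in xs) / n   # divides by n, not n - 1

# Invariance principle: plug the MLEs into g1(mu) = mu^2 and g2(s2) = sqrt(s2)
mu_sq_hat = mu_hat ** 2
sigma_hat = math.sqrt(sigma2_hat)

# Second partial derivatives of l(mu, sigma^2), evaluated at the estimates
h_mm = -n / sigma2_hat
h_ms = -sum(x - mu_hat for x in xs) / sigma2_hat ** 2          # approximately 0
h_ss = n / (2 * sigma2_hat ** 2) - sum((x - mu_hat) ** 2 for x in xs) / sigma2_hat ** 3

negative_definite = h_mm < 0 and (h_mm * h_ss - h_ms ** 2) > 0
print(mu_hat, sigma2_hat, mu_sq_hat, sigma_hat, negative_definite)

# Cross-check against the population variance from the standard library
print(math.isclose(sigma2_hat, statistics.pvariance(xs)))  # True
```

Note that $\hat{\sigma}^2$ divides by $n$, so it matches `statistics.pvariance` rather than the $n-1$ version in `statistics.variance`.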

Case: an interval of solutions
$X_1, \dots, X_n$ iid Uniform(θ, θ+1)

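$$
\begin{aligned}
L(\theta) = \prod_{i=1}^{n} f(x_i; \theta) &= \prod_{i=1}^{n} 1 \cdot I[\theta < x_i < \theta + 1] \\
&= 1 \cdot I[\theta < x_1, \dots, x_n < \theta + 1] \\
&= 1 \cdot I[\theta < X_{(1)}] \, I[X_{(n)} < \theta + 1]
\end{aligned}
$$

$L$ is maximized (it equals 1) when both inequalities are true:

$$X_{(n)} - 1 < \theta < X_{(1)}$$

So any $\hat{\theta} \in [X_{(n)} - 1, X_{(1)}]$ is an MLE of θ (the endpoints are included if the support is taken to be closed); the MLE is not unique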
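A short sketch of this non-unique case with a made-up sample. The function name is hypothetical, and the support is assumed closed so that the endpoints count. The likelihood is either 0 or 1, and every θ between $X_{(n)} - 1$ and $X_{(1)}$ gives the value 1.

```python
def shifted_uniform_likelihood(theta, xs):
    """L(theta) = 1 if all observations lie in [theta, theta + 1], else 0.

    Assumption: closed support, so the interval endpoints are allowed.
    """
    return 1.0 if theta <= min(xs) and max(xs) <= theta + 1 else 0.0

xs = [3.42, 3.05, 3.77, 3.61, 3.18]   # hypothetical observations
lo, hi = max(xs) - 1, min(xs)         # X_(n) - 1 and X_(1)

mid = (lo + hi) / 2
print(shifted_uniform_likelihood(mid, xs))        # 1.0: inside the interval
print(shifted_uniform_likelihood(lo - 0.01, xs))  # 0.0: theta + 1 misses X_(n)
print(shifted_uniform_likelihood(hi + 0.01, xs))  # 0.0: theta exceeds X_(1)
```

Because the likelihood is flat on the whole interval, any point in it, such as the midpoint, is an equally valid MLE.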