Best Linear Unbiased Estimator

Linear Regression Model

The model is specified as follows:

\begin{matrix} (1) & \begin{cases} Y = X^{T} β + e \\ E (e ∣ X) = 0 \end{cases} \\ (2) & \begin{cases} E (Y^{2}) < \infty \\ E (‖ X ‖^{2}) < \infty \\ E (X X^{T}) ≻ O \end{cases} \end{matrix}

Least Squares Estimator

In matrix form, the estimator is

\hat{β} \equiv {(X^{T} X)}^{- 1} (X^{T} Y)
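A minimal NumPy sketch of the matrix formula above. The simulated data, the seed, and the "true" coefficients are hypothetical and serve only as an illustration; the helper name `ols` is likewise an assumption, not something defined in these notes.

```python
import numpy as np

def ols(X, y):
    """Least squares estimate: solve the normal equations (X'X) b = X'y."""
    # Solving the linear system avoids forming the explicit inverse of X'X.
    return np.linalg.solve(X.T @ X, X.T @ y)

rng = np.random.default_rng(0)                          # illustrative seed
n = 200
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # intercept + one regressor
beta = np.array([1.0, 2.0])                             # hypothetical true coefficients
y = X @ beta + rng.normal(size=n)

beta_hat = ols(X, y)

# Cross-check against NumPy's built-in least squares solver.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes should agree up to floating-point error whenever $X^{T} X$ is invertible, which is what the no-perfect-collinearity condition guarantees.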

Expectation of LS Estimator

If the sample is independent and identically distributed, then

E [Y_{i} ∣ X_{1}, \dots, X_{n}] = E [Y_{i} ∣ X_{i}] = X_{i}^{T} β

It follows that (using $E [e ∣ X] = 0$)

E [Y ∣ X] = [\begin{matrix} ⋮ \\ E [Y_{i} ∣ X] \\ ⋮ \end{matrix}] = [\begin{matrix} ⋮ \\ X_{i}^{T} β \\ ⋮ \end{matrix}] = X β

The conditional expectation of the estimator is

\begin{aligned} E [\hat{β} ∣ X] & = E [(X^{T} X)^{- 1} X^{T} Y ∣ X] \\ & = (X^{T} X)^{- 1} X^{T} E [Y ∣ X] \\ & = (X^{T} X)^{- 1} X^{T} X β \\ & = β \end{aligned}

If $(X, e)$ have a joint normal distribution^[1], then by the law of iterated expectations

E (\hat{β}) = E [E [\hat{β} ∣ X]] = E [β] = β
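The unbiasedness result can be checked numerically with a small Monte Carlo sketch. Everything here is an illustrative assumption: the design matrix is drawn once and then held fixed (to mimic conditioning on $X$), the errors are standard normal, and the coefficients are made up.

```python
import numpy as np

rng = np.random.default_rng(1)                          # illustrative seed
n, reps = 50, 5000
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # held fixed: we condition on X
beta = np.array([1.0, -0.5])                            # hypothetical true coefficients

# A = (X'X)^{-1} X', so that beta_hat = A @ y for any outcome vector y.
A = np.linalg.solve(X.T @ X, X.T)

# Draw many error vectors with E[e | X] = 0; each row is one simulated sample.
E = rng.normal(size=(reps, n))
Y = X @ beta + E                                        # shape (reps, n)

beta_hats = Y @ A.T                                     # one estimate per row, shape (reps, 2)
mean_beta_hat = beta_hats.mean(axis=0)                  # should be close to beta
```

Averaging over more replications should bring `mean_beta_hat` closer to `beta`; any single estimate in `beta_hats` can still be far from it.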

Variance of LS Estimator

The conditional variance of the estimator is

\begin{aligned} \mathrm{Var} [\hat{β} ∣ X] & = \mathrm{Var} [(X^{T} X)^{- 1} X^{T} Y ∣ X] \\ & = (X^{T} X)^{- 1} X^{T} \mathrm{Var} [Y ∣ X] X (X^{T} X)^{- 1} \\ & = (X^{T} X)^{- 1} X^{T} \mathrm{Var} [X β + e ∣ X] X (X^{T} X)^{- 1} \\ & = (X^{T} X)^{- 1} X^{T} \mathrm{Var} [e ∣ X] X (X^{T} X)^{- 1} \\ & = (X^{T} X)^{- 1} X^{T} Ω X (X^{T} X)^{- 1} \end{aligned}

In particular, if the error term satisfies the homoskedasticity assumption, that is,

Ω = \mathrm{Var} [e ∣ X] = σ^{2} I_{n}

then the conditional variance of the estimator is

\mathrm{Var} [\hat{β} ∣ X] = (X^{T} X)^{- 1} σ^{2}
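The sketch below evaluates the general sandwich form for a known error covariance $Ω$ and confirms that it collapses to $(X^{T} X)^{- 1} σ^{2}$ when $Ω = σ^{2} I_{n}$. The function name `sandwich_variance`, the value of `sigma2`, and the heteroskedastic pattern are all hypothetical choices made for illustration.

```python
import numpy as np

def sandwich_variance(X, Omega):
    """Conditional variance (X'X)^{-1} X' Omega X (X'X)^{-1} for known Omega."""
    XtX_inv = np.linalg.inv(X.T @ X)
    return XtX_inv @ X.T @ Omega @ X @ XtX_inv

rng = np.random.default_rng(2)                          # illustrative seed
n = 30
X = np.column_stack([np.ones(n), rng.normal(size=n)])
sigma2 = 1.5                                            # hypothetical error variance

# Homoskedastic case: Omega = sigma^2 I_n, so the sandwich simplifies.
V_homo = sandwich_variance(X, sigma2 * np.eye(n))
V_simple = sigma2 * np.linalg.inv(X.T @ X)

# Hypothetical heteroskedastic case: error variance grows with the regressor.
Omega_het = np.diag(0.5 + X[:, 1] ** 2)
V_het = sandwich_variance(X, Omega_het)
```

In practice $Ω$ is unknown, so this function is only a way to see the algebra at work, not an estimator.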

Gauss-Markov Theorem. Take the homoskedastic linear regression model. If $\tilde{β}$ is a linear unbiased estimator of $β$, then

\mathrm{Var} [\tilde{β} ∣ X] \geq (X^{T} X)^{- 1} σ^{2}

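That is, the least squares estimator has the smallest conditional variance among all linear unbiased estimators, where the inequality is understood in the positive semidefinite sense.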
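A numerical illustration of the theorem, under stated assumptions: the competitor is a weighted least squares estimator with arbitrary positive weights (a hypothetical choice, picked only because it is linear and unbiased), and $σ^{2}$ is set to 1. The check is that the difference of the two conditional variance matrices has no negative eigenvalue.

```python
import numpy as np

rng = np.random.default_rng(3)                          # illustrative seed
n, sigma2 = 40, 1.0
X = np.column_stack([np.ones(n), rng.normal(size=n)])

# Hypothetical competitor: beta_tilde = A @ Y with A = (X'WX)^{-1} X'W,
# where W = diag(w) holds arbitrary positive weights.
w = rng.uniform(0.5, 2.0, size=n)
XtW = X.T * w                                           # X' W via broadcasting
A = np.linalg.solve(XtW @ X, XtW)

# A linear estimator A @ Y is unbiased for every beta exactly when A X = I.
unbiased = np.allclose(A @ X, np.eye(X.shape[1]))

# Under homoskedasticity, Var[A Y | X] = sigma^2 A A'.
V_tilde = sigma2 * A @ A.T
V_ols = sigma2 * np.linalg.inv(X.T @ X)

# Gauss-Markov: V_tilde - V_ols is positive semidefinite.
min_eig = np.linalg.eigvalsh(V_tilde - V_ols).min()
```

A `min_eig` that is nonnegative up to rounding error is consistent with the theorem; this is a check for one competitor, not a proof.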

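However, the homoskedasticity assumption is almost never satisfied in practice; even when it is, $σ^{2}$ is an unknown parameter, so it cannot be used directly to compute the standard error of the estimator ( $S E [\hat{β} ∣ X]$ ) for statistical inference. It is therefore necessary to construct an estimator of $σ^{2}$, and to derive the conditional variance of the estimator without the homoskedasticity assumption.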
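As a hedged sketch of what such estimators can look like: the code below uses the degrees-of-freedom-corrected residual variance $s^{2} = \hat{e}^{T} \hat{e} / (n - k)$ for the simple standard errors, and the White (HC0) plug-in, which replaces $Ω$ with $\mathrm{diag}(\hat{e}_{i}^{2})$, for the robust ones. These are common textbook choices assumed here for illustration, not necessarily the ones these notes go on to use; the simulated data are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)                          # illustrative seed
n = 500
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
beta = np.array([1.0, 2.0])                             # hypothetical true coefficients
e = rng.normal(size=n) * np.sqrt(0.5 + x ** 2)          # hypothetical heteroskedastic errors
y = X @ beta + e

k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat

# Simple (homoskedastic) standard errors: s^2 (X'X)^{-1}.
s2 = resid @ resid / (n - k)
se_simple = np.sqrt(np.diag(s2 * XtX_inv))

# Robust (HC0) standard errors: Omega replaced by diag(resid^2).
meat = (X.T * resid ** 2) @ X                           # X' diag(resid^2) X
V_hc0 = XtX_inv @ meat @ XtX_inv
se_robust = np.sqrt(np.diag(V_hc0))
```

With errors whose variance depends on the regressor, as simulated here, the two sets of standard errors will generally differ, and only the robust ones remain valid.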

👉 Homoskedasticity and Heteroskedasticity

Basic Assumptions and Model Properties

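No.	Assumption	Expression
①	Linear model	$Y = X β + e$
②	Zero conditional mean	$E [e ∣ X] = 0$
③	Random sampling
④	No perfect collinearity among the regressors
⑤	Homoskedastic errors	$\mathrm{Var} [e ∣ X] = σ^{2} I_{n}$
⑥	Normally distributed errors	$e ∣ X \sim N (0, σ^{2} I_{n})$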

Tip

①-⑤ are called the Gauss–Markov assumptions
①-⑥ are called the classical linear model (CLM) assumptions

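①② together form the linear CEF model; ③ ensures that the sample is independent and identically distributed; ④ ensures that the estimator has a numerical solution (that is, $X^{T} X$ is invertible); ①-④ are enough to establish unbiasedness. ⑤ yields the ideal conditional variance of the estimator but is unrealistic, which is why the various robust standard errors are needed. ⑥ is used for statistical inference, but assuming it directly is unconvincing; it is better to rely on the central limit theorem and asymptotic theory.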

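flowchart LR
    inference[Statistical inference] --> se[Standard errors]
    inference --> normal[Normal distribution]
    se -- homoskedasticity assumption --> simple[Simple standard errors]
    se -- heteroskedasticity assumption --> robust[Robust standard errors]
    normal --> bold[Bold assumption]
    normal --> asymptotic[Asymptotic theory]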

Quote

The CLM assumptions are very strong, and a primary focus in theoretical and applied econometrics has been to conduct inference using OLS in a variety of settings – cross-sectional data, time series data, panel data, and data with a spatial structure – while imposing few assumptions. It is very difficult to get anywhere without relying on asymptotics. Therefore, we replace the CLM assumptions and rely on application of the law of large numbers and central limit theorem. (Wooldridge, 2023)

Asymptotic Properties

[1] This is a sufficient but not necessary condition; if $X$ follows a discrete distribution, the expectation and variance of $\hat{β}$ may not exist.