\( \newcommand{\bm}[1]{\boldsymbol{\mathbf{#1}}} \DeclareMathOperator{\tr}{tr} \DeclareMathOperator{\var}{var} \DeclareMathOperator{\cov}{cov} \DeclareMathOperator{\corr}{corr} \newcommand{\indep}{\perp\!\!\!\perp} \newcommand{\nindep}{\perp\!\!\!\perp\!\!\!\!\!\!/\;\;} \)

2.3 Inference

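Subject to standard regularity conditions, the maximum likelihood estimator \(\hat{\beta}\) is asymptotically normally distributed with mean \(\beta\) and variance-covariance matrix \({\cal I}(\beta)^{-1}\), the inverse of the Fisher information matrix. For ‘large enough \(n\)’ we treat this as an approximation to the distribution of \(\hat{\beta}\).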

Therefore, standard errors are given by \[ s.e.(\hat{\beta}_k)=[{\cal I}(\hat{\beta})^{-1}]_{kk}^{{1\over 2}} =[(X^T\hat{W}X)^{-1}]_{kk}^{{1\over 2}} \qquad k=0,\ldots ,p, \] where the diagonal matrix \(\hat{W}={\rm diag}(\hat{w})\) is evaluated at \(\hat{\beta}\), that is, \(\hat{w}_i=(\widehat{\var}(Y_i)g'(\hat{\mu}_i)^2)^{-1}\), where \(\hat{\mu}_i\) and \(\widehat{\var}(Y_i)\) are evaluated at \(\hat{\beta}\) for \(i = 1, \ldots, n\). Furthermore, if \(\var(Y_i)\) depends on an unknown dispersion parameter, then this too must be estimated in order to compute the standard errors.
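As an illustration of the standard error formula, the following is a minimal sketch for one particular case that is assumed here and not taken from the text: a Poisson response with log link, for which \(\var(Y_i)=\mu_i\) and \(g'(\mu_i)=1/\mu_i\), so that \(w_i=\mu_i\) and there is no dispersion parameter to estimate. The function name `poisson_glm_standard_errors` and the use of Fisher scoring to obtain the estimate are choices made for the example only.

```python
import numpy as np


def poisson_glm_standard_errors(X, y, n_iter=50, tol=1e-10):
    """Fit a Poisson log-link GLM by Fisher scoring.

    Returns the coefficient estimates and their standard errors,
    computed as the square roots of the diagonal of (X^T W X)^{-1}.
    Assumes X has full column rank and includes any intercept column.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    beta = np.zeros(X.shape[1])

    for _ in range(n_iter):
        eta = X @ beta                 # linear predictor
        mu = np.exp(eta)               # inverse of the log link
        w = mu                         # w_i = 1 / (var(Y_i) g'(mu_i)^2) = mu_i
        z = eta + (y - mu) / mu        # working response
        xtwx = X.T @ (w[:, None] * X)  # X^T W X at the current beta
        beta_new = np.linalg.solve(xtwx, X.T @ (w * z))
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break

    # Re-evaluate the weights at the final estimate before inverting.
    mu_hat = np.exp(X @ beta)
    information = X.T @ (mu_hat[:, None] * X)
    covariance = np.linalg.inv(information)
    return beta, np.sqrt(np.diag(covariance))
```

For an intercept-only model the result can be checked by hand: the estimate is \(\log \bar{y}\) and the information is \(\sum_i \hat{\mu}_i = \sum_i y_i\), so the standard error is \((\sum_i y_i)^{-1/2}\).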

The asymptotic distribution of the maximum likelihood estimator can be used to provide approximate large-sample confidence intervals, using \[ {{\hat{\beta}_k-\beta_k}\over{s.e.(\hat{\beta}_k)}}\;\;{\buildrel{\rm asymp}\over\sim}\;\; N(0,1). \]
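Inverting this approximate pivot gives an interval of the form \(\hat{\beta}_k \pm z_{1-\alpha/2}\, s.e.(\hat{\beta}_k)\), where \(z_{1-\alpha/2}\) is the standard normal quantile. A minimal sketch of that calculation follows; the function name `wald_confidence_interval` and the default 95% level are assumptions made for illustration.

```python
from statistics import NormalDist


def wald_confidence_interval(beta_hat, se, level=0.95):
    """Approximate large-sample confidence interval for one coefficient.

    beta_hat: the estimate of beta_k
    se:       its standard error
    level:    the confidence level, strictly between 0 and 1
    """
    # Upper standard normal quantile, e.g. about 1.96 for level = 0.95.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return beta_hat - z * se, beta_hat + z * se
```

For example, `wald_confidence_interval(1.0, 0.5)` returns an interval centred at 1.0 with half-width of roughly 0.98. The approximation is only as good as the asymptotic normality it relies on, so for small \(n\) the actual coverage may differ from the nominal level.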