# Regression Sub-Vectors

## 17 Likelihood Properties

In this section we present some general properties of the likelihood which hold broadly, not just in normal regression. Suppose that a random vector $y$ has the conditional density $f(y \mid x, \theta)$ where the function $f$ is known, and the parameter vector $\theta$ takes values in a parameter space $\Theta$. The log-likelihood function for a random sample $\{y_i \mid x_i : i = 1, \ldots, n\}$ takes the form

$$\log L(\theta) = \sum_{i=1}^{n} \log f(y_i \mid x_i, \theta).$$

A key property is that the expected log-likelihood is maximized at the true value of the parameter vector. At this point it is useful to make a notational distinction between a generic parameter value $\theta$ and its true value $\theta_0$. Set $X = (x_1, \ldots, x_n)$.

**Theorem 5.23**

$$\theta_0 = \operatorname*{argmax}_{\theta \in \Theta} \mathbb{E}\left(\log L(\theta) \mid X\right)$$

The proof is presented in Section 5.20.

This motivates estimating $\theta$ by finding the value which maximizes the log-likelihood function. This is the maximum likelihood estimator (MLE):

$$\hat{\theta} = \operatorname*{argmax}_{\theta \in \Theta} \log L(\theta).$$

The score of the likelihood function is the vector of partial derivatives with respect to the parameters, evaluated at the true values,

$$\left.\frac{\partial}{\partial \theta} \log L(\theta)\right|_{\theta = \theta_0} = \sum_{i=1}^{n} \left.\frac{\partial}{\partial \theta} \log f(y_i \mid x_i, \theta)\right|_{\theta = \theta_0}.$$

The covariance matrix of the score is known as the Fisher information:

$$\mathcal{I} = \operatorname{var}\left(\frac{\partial}{\partial \theta} \log L(\theta_0) \mid X\right).$$

Some important properties of the score and information are now presented.

**Theorem 5.24** If $\log f(y \mid x, \theta)$ is twice differentiable and the support of $y$ does not depend on $\theta$, then

1. $\mathbb{E}\left(\left.\dfrac{\partial}{\partial \theta} \log L(\theta)\right|_{\theta = \theta_0} \mid X\right) = 0$
2. $\mathcal{I} = \displaystyle\sum_{i=1}^{n} \mathbb{E}\left(\frac{\partial}{\partial \theta} \log f(y_i \mid x_i, \theta_0) \, \frac{\partial}{\partial \theta} \log f(y_i \mid x_i, \theta_0)' \mid x_i\right) = -\mathbb{E}\left(\frac{\partial^2}{\partial \theta \, \partial \theta'} \log L(\theta_0) \mid X\right)$

The proof is presented in Section 5.20.

The first result says that the score is mean zero. The second result shows that the variance of the score equals the negative expectation of the second derivative matrix. This is known as the Information Matrix Equality.

We now establish the famous Cramér-Rao Lower Bound.
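The two parts of Theorem 5.24 can be checked numerically by simulation. The sketch below is only an illustration under assumptions that are not in the text: it uses a Poisson regression with conditional mean $\exp(x_i'\theta)$ as the density $f$, and the names `score`, `neg_hessian` and `mle` are hypothetical helpers. Holding $X$ fixed, it draws many samples of $y$, evaluates the score at $\theta_0$ in each, and compares the simulated mean and covariance of the score with zero and with minus the second-derivative matrix. It also computes the MLE for one sample by Newton's method.

```python
import numpy as np

# Assumed model (not from the text): y_i | x_i ~ Poisson(exp(x_i' theta)).
# Its log density is y * x'theta - exp(x'theta) - log(y!).

def score(theta, y, X):
    """Score of the log-likelihood: sum_i (y_i - exp(x_i' theta)) x_i."""
    return X.T @ (y - np.exp(X @ theta))

def neg_hessian(theta, X):
    """Minus the second-derivative matrix: sum_i exp(x_i' theta) x_i x_i'."""
    mu = np.exp(X @ theta)
    return (X * mu[:, None]).T @ X

def mle(y, X, steps=25):
    """Maximize the log-likelihood by Newton's method, starting from zero."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        theta = theta + np.linalg.solve(neg_hessian(theta, X), score(theta, y, X))
    return theta

rng = np.random.default_rng(0)
n, reps = 200, 20000
theta0 = np.array([0.5, -0.3])                       # "true" value, chosen arbitrarily
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # regressors, held fixed
mu0 = np.exp(X @ theta0)

# Draw `reps` independent samples of y given the same X.
Y = rng.poisson(mu0, size=(reps, n))

# Each row is the score evaluated at theta0 for one simulated sample.
scores = (Y - mu0) @ X

mean_score = scores.mean(axis=0)           # part 1: should be near zero
cov_score = np.cov(scores, rowvar=False)   # simulated variance of the score
info = neg_hessian(theta0, X)              # part 2: should be close to cov_score

# For this model the Hessian does not depend on y, so its conditional
# expectation given X equals the matrix computed above.
rel_gap = np.linalg.norm(cov_score - info) / np.linalg.norm(info)

theta_hat = mle(Y[0], X)                   # MLE from the first simulated sample

print("mean score:", mean_score)
print("relative gap between var(score) and information:", rel_gap)
print("MLE:", theta_hat)
```

The Poisson model is convenient here because the second-derivative matrix is non-random given $X$, so the right-hand side of the Information Matrix Equality needs no simulation. For a density whose Hessian depends on $y$, the same check would average the Hessian over the simulated samples.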
**Theorem 5.25 (Cramér-Rao)** Under the assumptions of Theorem 5.24, if $\tilde{\theta}$ is an unbiased estimator of $\theta$, then

$$\operatorname{var}\left(\tilde{\theta} \mid X\right) \geq \mathcal{I}^{-1}.$$

The proof is presented in Section 5.20.

Theorem 5.25 shows that the inverse of the information matrix is a lower bound for the covariance matrix of unbiased estimators. This result is similar to the Gauss-Markov Theorem which established a lower bound for unbiased estimators in homoskedastic linear regression.

**Sir Ronald A. Fisher**

The British statistician Ronald Fisher (1890-1962) is one of the core founders of modern statistical theory. His contributions include p