# Solving for Least Squares with One Regressor

## 5.8 Normal Regression Model

The normal regression model is the linear regression model with an independent normal error

$$y = x'\beta + e \tag{5.4}$$

$$e \sim N(0, \sigma^2).$$

As we learned in Section 5.7, the normal regression model holds when $(y, x)$ are jointly normally distributed. Normal regression, however, does not require joint normality. All that is required is that the conditional distribution of $y$ given $x$ is normal (the marginal distribution of $x$ is unrestricted). In this sense the normal regression model is broader than joint normality. Notice that for notational convenience we have written (5.4) so that $x$ contains the intercept.

Normal regression is a parametric model, where likelihood methods can be used for estimation, testing, and distribution theory. The likelihood is the name for the joint probability density of the data, evaluated at the observed sample, and viewed as a function of the parameters. The maximum likelihood estimator is the value which maximizes this likelihood function. Let us now derive the likelihood of the normal regression model.

First, observe that model (5.4) is equivalent to the statement that the conditional density of $y$ given $x$ takes the form

$$f(y \mid x) = \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{1}{2\sigma^2}\left(y - x'\beta\right)^2\right).$$

Under the assumption that the observations are mutually independent, this implies that the conditional density of $(y_1, \ldots, y_n)$ given $(x_1, \ldots, x_n)$ is

$$\begin{aligned}
f(y_1, \ldots, y_n \mid x_1, \ldots, x_n) &= \prod_{i=1}^{n} f(y_i \mid x_i) \\
&= \prod_{i=1}^{n} \frac{1}{(2\pi\sigma^2)^{1/2}} \exp\left(-\frac{1}{2\sigma^2}\left(y_i - x_i'\beta\right)^2\right) \\
&= \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i'\beta\right)^2\right) \\
&\overset{\text{def}}{=} L(\beta, \sigma^2)
\end{aligned}$$

and is called the likelihood function. For convenience, it is typical to work with the natural logarithm

$$\log f(y_1, \ldots, y_n \mid x_1, \ldots, x_n) = -\frac{n}{2}\log(2\pi\sigma^2) - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\left(y_i - x_i'\beta\right)^2 \overset{\text{def}}{=} \log L(\beta, \sigma^2) \tag{5.5}$$

which is called the log-likelihood function.

The maximum likelihood estimator (MLE) $(\hat\beta_{\text{mle}}, \hat\sigma^2_{\text{mle}})$ is the value which maximizes the log-likelihood. (It is equivalent to maximize the likelihood or the log-likelihood.
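See Exercise 5.16.) We can write the maximization problem as

$$(\hat\beta_{\text{mle}}, \hat\sigma^2_{\text{mle}}) = \underset{\beta \in \mathbb{R}^k,\ \sigma^2 > 0}{\operatorname{argmax}} \log L(\beta, \sigma^2). \tag{5.6}$$

In most applications of maximum likelihood, the MLE must be found by numerical methods. However, in the case of the normal regression model we can find an explicit expression for $\hat\beta_{\text{mle}}$ and $\hat\sigma^2_{\text{mle}}$ as functions of the data.

The maximizers $(\hat\beta_{\text{mle}}, \hat\sigma^2_{\text{mle}})$ of (5.6) jointly solve the first-order conditions (FOC)

$$0 = \left.\frac{\partial}{\partial\beta}\log L(\beta, \sigma^2)\right|_{\beta = \hat\beta_{\text{mle}},\ \sigma^2 = \hat\sigma^2_{\text{mle}}} = \frac{1}{\hat\sigma^2_{\text{mle}}}\sum_{i=1}^{n} x_i\left(y_i - x_i'\hat\beta_{\text{mle}}\right) \tag{5.7}$$

$$0 = \left.\frac{\partial}{\partial\sigma^2}\log L(\beta, \sigma^2)\right|_{\beta = \hat\beta_{\text{mle}},\ \sigma^2 = \hat\sigma^2_{\text{mle}}} = -\frac{n}{2\hat\sigma^2_{\text{mle}}} + \frac{1}{2\hat\sigma^4_{\text{mle}}}\sum_{i=1}^{n}\left(y_i - x_i'\hat\beta_{\text{mle}}\right)^2. \tag{5.8}$$

The first FOC (5.7) is proportional to the first-order conditions for the least-squares minimization problem of Section 3.6. It follows that the MLE satisfies

$$\hat\beta_{\text{mle}} = \left(\sum_{i=1}^{n} x_i x_i'\right)^{-1}\left(\sum_{i=1}^{n} x_i y_i\right) = \hat\beta_{\text{ols}}.$$

That is, the MLE for $\beta$ is algebraically identical to the OLS estimator.

Solving the second FOC (5.8) for $\hat\sigma^2_{\text{mle}}$ we find

$$\hat\sigma^2_{\text{mle}} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i'\hat\beta_{\text{mle}}\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - x_i'\hat\beta_{\text{ols}}\right)^2 = \frac{1}{n}\sum_{i=1}^{n}\hat e_i^2 = \hat\sigma^2_{\text{ols}}.$$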
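
The closed-form result can be checked numerically. The following is a minimal sketch, not part of the original text, assuming simulated data with an intercept and a single regressor (so k = 2). The function names `log_likelihood` and `normal_mle`, the seed, the sample size, and the true parameter values are all illustrative choices. It computes the estimator from the normal equations, evaluates the two first-order conditions (5.7) and (5.8) at that estimate, and evaluates the log-likelihood (5.5).

```python
import math
import random


def log_likelihood(beta, sigma2, xs, ys):
    """Normal regression log-likelihood (5.5); each x is a list of regressors."""
    n = len(ys)
    ssr = sum((y - sum(b * xj for b, xj in zip(beta, x))) ** 2
              for x, y in zip(xs, ys))
    return -0.5 * n * math.log(2 * math.pi * sigma2) - ssr / (2 * sigma2)


def normal_mle(xs, ys):
    """Closed-form MLE for k = 2 (intercept plus one regressor)."""
    n = len(ys)
    # Entries of the symmetric 2x2 matrix sum_i x_i x_i' and of sum_i x_i y_i.
    a = sum(x[0] * x[0] for x in xs)
    b = sum(x[0] * x[1] for x in xs)
    d = sum(x[1] * x[1] for x in xs)
    p = sum(x[0] * y for x, y in zip(xs, ys))
    q = sum(x[1] * y for x, y in zip(xs, ys))
    det = a * d - b * b
    # beta_hat = (sum x x')^{-1} (sum x y), using the explicit 2x2 inverse.
    beta = [(d * p - b * q) / det, (a * q - b * p) / det]
    resid = [y - beta[0] * x[0] - beta[1] * x[1] for x, y in zip(xs, ys)]
    # sigma2_hat = average squared residual (divides by n, not n - k).
    sigma2 = sum(e * e for e in resid) / n
    return beta, sigma2, resid


# Simulated sample (assumed design: y = 1 + 2 x + e, e ~ N(0, 1.5^2)).
rng = random.Random(0)
n = 200
xs = [[1.0, rng.gauss(0.0, 1.0)] for _ in range(n)]
ys = [1.0 + 2.0 * x[1] + rng.gauss(0.0, 1.5) for x in xs]

beta_hat, sigma2_hat, resid = normal_mle(xs, ys)

# FOC (5.7): (1 / sigma2) * sum_i x_i e_i, one entry per coefficient.
score_beta = [sum(x[j] * e for x, e in zip(xs, resid)) / sigma2_hat
              for j in range(2)]
# FOC (5.8): -n / (2 sigma2) + sum_i e_i^2 / (2 sigma2^2).
score_sigma2 = (-n / (2 * sigma2_hat)
                + sum(e * e for e in resid) / (2 * sigma2_hat ** 2))

ll_hat = log_likelihood(beta_hat, sigma2_hat, xs, ys)
print("beta_hat:", beta_hat)
print("sigma2_hat:", sigma2_hat)
print("scores:", score_beta, score_sigma2)
print("log-likelihood at the MLE:", ll_hat)
```

Both scores should be zero up to floating-point rounding, and moving either parameter away from the estimate should lower the log-likelihood. Note that the variance estimate divides by n rather than n - k, matching the formula for the MLE above.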