# 5.16 Likelihood Ratio Test

In the previous section we described the t-test as the standard method to test a hypothesis on a single coefficient in a regression. In many contexts, however, we want to simultaneously assess a set of coefficients. In the normal regression model, this can be done by an F test, which can be derived from the likelihood ratio test.

Partition the regressors as $x_i = (x_{1i}', x_{2i}')'$ and similarly partition the coefficient vector as $\beta = (\beta_1', \beta_2')'$. Then the regression model can be written as

$$ y_i = x_{1i}'\beta_1 + x_{2i}'\beta_2 + e_i. \qquad (5.18) $$

Let $k = \dim(x_i)$, $k_1 = \dim(x_{1i})$, and $q = \dim(x_{2i})$, so that $k = k_1 + q$. Partition the variables so that the hypothesis is that the second set of coefficients is zero, or

$$ H_0 : \beta_2 = 0. \qquad (5.19) $$

If $H_0$ is true, then the regressors $x_{2i}$ can be omitted from the regression. In this case we can write (5.18) as

$$ y_i = x_{1i}'\beta_1 + e_i. \qquad (5.20) $$

We call (5.20) the null model. The alternative hypothesis is that at least one element of $\beta_2$ is non-zero and is written as

$$ H_1 : \beta_2 \neq 0. $$

When models are estimated by maximum likelihood, a well-accepted testing procedure is to reject $H_0$ in favor of $H_1$ for large values of the likelihood ratio, the ratio of the maximized likelihood functions under $H_1$ and $H_0$, respectively. We now construct this statistic in the normal regression model. Recall from (5.9) that the maximized log-likelihood equals

$$ \log L(\widehat{\beta}, \widehat{\sigma}^2) = -\frac{n}{2}\log\left(2\pi\widehat{\sigma}^2\right) - \frac{n}{2}. $$

We similarly need to calculate the maximized log-likelihood for the constrained model (5.20). By the same steps as in the derivation of the unconstrained MLE, we can find that the MLE for (5.20) is OLS of $y_i$ on $x_{1i}$. We can write this estimator as

$$ \widetilde{\beta}_1 = \left(X_1'X_1\right)^{-1} X_1'y $$

with residual $\widetilde{e}_i = y_i - x_{1i}'\widetilde{\beta}_1$ and error variance estimate

$$ \widetilde{\sigma}^2 = \frac{1}{n}\sum_{i=1}^n \widetilde{e}_i^2. $$

We use the tildes "~" rather than the hats "^" above the constrained estimates to distinguish them from the unconstrained estimates.
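The unconstrained and constrained estimators above can be illustrated numerically. The following is a minimal NumPy sketch, not from the text: the sample size, partition dimensions, coefficient values, and simulated data are all illustrative assumptions. It computes both OLS fits and the two variance estimates $\widehat{\sigma}^2$ and $\widetilde{\sigma}^2$.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k1, q = 100, 2, 3          # illustrative sample size and partition dimensions
k = k1 + q

# Simulated data in which beta_2 = 0, so H0 is true by construction
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, q))
X = np.hstack([X1, X2])
beta1 = np.array([1.0, -0.5])  # hypothetical coefficient values
y = X1 @ beta1 + rng.standard_normal(n)

# Unconstrained MLE: OLS of y on all regressors; sigma^2_hat = (1/n) * sum of squared residuals
bhat = np.linalg.lstsq(X, y, rcond=None)[0]
sig2_hat = np.mean((y - X @ bhat) ** 2)

# Constrained MLE under H0: OLS of y on x_{1i} only
btil = np.linalg.lstsq(X1, y, rcond=None)[0]
sig2_til = np.mean((y - X1 @ btil) ** 2)

# Dropping regressors cannot lower the residual variance
print(sig2_til >= sig2_hat)   # True
```

Because the null model is nested in the unconstrained model, $\widetilde{\sigma}^2 \geq \widehat{\sigma}^2$ always holds, which is why the statistics below are non-negative.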
You can calculate, similarly to (5.9), that the maximized constrained log-likelihood is

$$ \log L(\widetilde{\beta}_1, \widetilde{\sigma}^2) = -\frac{n}{2}\log\left(2\pi\widetilde{\sigma}^2\right) - \frac{n}{2}. $$

A classic testing procedure is to reject $H_0$ for large values of the ratio of the maximized likelihoods. Equivalently, the test rejects $H_0$ for large values of twice the difference in the log-likelihood functions. (Multiplying the likelihood difference by two turns out to be a useful scaling.) This equals

$$ LR = 2\left(\left(-\frac{n}{2}\log\left(2\pi\widehat{\sigma}^2\right) - \frac{n}{2}\right) - \left(-\frac{n}{2}\log\left(2\pi\widetilde{\sigma}^2\right) - \frac{n}{2}\right)\right) = n \log\left(\frac{\widetilde{\sigma}^2}{\widehat{\sigma}^2}\right). \qquad (5.21) $$

The likelihood ratio test rejects for large values of $LR$, or equivalently (see Exercise 5.22), for large values of

$$ F = \frac{\left(\widetilde{\sigma}^2 - \widehat{\sigma}^2\right)/q}{\widehat{\sigma}^2/(n-k)}. \qquad (5.22) $$

This is known as the F statistic for the test of the hypothesis $H_0$ against $H_1$.

To develop an appropriate critical value, we need the null distribution of $F$. Recall from (3.29) that $n\widehat{\sigma}^2 = e'Me$ where $M = I_n - P$ with $P = X\left(X'X\right)^{-1}X'$. Similarly, under $H_0$, $n\widetilde{\sigma}^2 = e'M_1e$ where $M_1 = I_n - P_1$ with $P_1 = X_1\left(X_1'X_1\right)^{-1}X_1'$. You can calculate that $M_1 - M = P - P_1$ is idempotent with rank $q$. Furthermore, $(M_1 - M)M = 0$. It follows that

$$ \frac{e'(M_1 - M)e}{\sigma^2} \sim \chi^2_q $$

and is independent of $e'Me$. Hence

$$ F = \frac{e'(M_1 - M)e/q}{e'Me/(n-k)} \sim \frac{\chi^2_q/q}{\chi^2_{n-k}/(n-k)} \sim F_{q,n-k}, $$

an exact F distribution with degrees of freedom $q$ and $n-k$, respectively. Thus under $H_0$, the F statistic has an exact F distribution.

The critical values are selected from the upper tail of the F distribution. For a given significance level $\alpha$ (typically $\alpha = 0.05$) we select the critical value $c$ so that $P\left(F_{q,n-k} \geq c\right) = \alpha$. (For example, in MATLAB the expression is `finv(1-α,q,n-k)`.) The test rejects $H_0$ in favor of $H_1$ if $F > c$ and does not reject $H_0$ otherwise. The p-value of the test is $p = 1 - G_{q,n-k}(F)$ where $G_{q,n-k}(u)$ is the $F_{q,n-k}$ distribution function. (In MATLAB, the p-value is computed as `1-fcdf(F,q,n-k)`.) It is equivalent to reject $H_0$ if $F > c$ or $p < \alpha$. In Stata, the command to test multiple coefficients takes the form `test X1 X2` where `X1` and `X2` are the names of the variables whose coefficients are tested.
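The steps above, from the two variance estimates to the test decision, can be sketched in Python using SciPy's F distribution in place of MATLAB's `finv`/`fcdf`. This is an illustrative sketch with simulated data, not the text's own code; the sample size, dimensions, and coefficients are assumptions.

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(1)
n, k1, q = 100, 2, 3          # illustrative dimensions
k = k1 + q
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, q))
X = np.hstack([X1, X2])
y = X1 @ np.array([1.0, -0.5]) + rng.standard_normal(n)  # beta_2 = 0, so H0 holds

# Unconstrained and constrained variance estimates (1/n times sum of squared residuals)
sig2_hat = np.mean((y - X @ np.linalg.lstsq(X, y, rcond=None)[0]) ** 2)
sig2_til = np.mean((y - X1 @ np.linalg.lstsq(X1, y, rcond=None)[0]) ** 2)

LR = n * np.log(sig2_til / sig2_hat)                     # equation (5.21)
F = ((sig2_til - sig2_hat) / q) / (sig2_hat / (n - k))   # equation (5.22)

alpha = 0.05
c = f_dist.ppf(1 - alpha, q, n - k)   # critical value, analogous to finv(1-alpha,q,n-k)
p = f_dist.sf(F, q, n - k)            # p-value, analogous to 1-fcdf(F,q,n-k)

# Rejecting when F > c is the same decision as rejecting when p < alpha
print((F > c) == (p < alpha))         # True
```

Since the distribution function $G_{q,n-k}$ is strictly increasing, $F > c$ and $p < \alpha$ are always the same decision, which the final line confirms numerically.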
Stata then reports the F statistic for the hypothesis that the coefficients are jointly zero along with the p-value.
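The projection-matrix algebra behind the null distribution, that $M_1 - M = P - P_1$ is idempotent with rank $q$ and that $(M_1 - M)M = 0$, can be checked numerically. The following NumPy sketch uses small illustrative dimensions and random regressors (all assumptions, not from the text).

```python
import numpy as np

rng = np.random.default_rng(2)
n, k1, q = 60, 2, 3           # illustrative dimensions
X1 = rng.standard_normal((n, k1))
X2 = rng.standard_normal((n, q))
X = np.hstack([X1, X2])

# Projection matrices for the full and null regressor sets, and their annihilators
P = X @ np.linalg.solve(X.T @ X, X.T)
P1 = X1 @ np.linalg.solve(X1.T @ X1, X1.T)
M = np.eye(n) - P
M1 = np.eye(n) - P1

D = M1 - M                    # equals P - P1
# D is idempotent, its rank (= trace, for an idempotent matrix) is q, and (M1 - M) M = 0
print(np.allclose(D @ D, D), np.isclose(np.trace(D), q), np.allclose(D @ M, 0))
# → True True True
```

Because an idempotent matrix has rank equal to its trace, checking $\operatorname{tr}(D) = q$ confirms the rank claim without an explicit rank computation.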