# Cross-Validation Model Selection

as and variance.

12.18 Covariance Matrix Estimation

Estimation of the asymptotic variance matrix $V_\beta$ is done using similar techniques as for least-squares estimation. The estimator is constructed by replacing the population moment matrices by sample counterparts. Thus

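$$\hat V_\beta = \left(\hat Q_{xz}\hat Q_{zz}^{-1}\hat Q_{zx}\right)^{-1}\left(\hat Q_{xz}\hat Q_{zz}^{-1}\,\hat\Omega\,\hat Q_{zz}^{-1}\hat Q_{zx}\right)\left(\hat Q_{xz}\hat Q_{zz}^{-1}\hat Q_{zx}\right)^{-1} \tag{12.42}$$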

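where

$$\hat Q_{zz} = \frac{1}{n}\sum_{i=1}^{n} z_i z_i' = \frac{1}{n} Z'Z$$

$$\hat Q_{xz} = \frac{1}{n}\sum_{i=1}^{n} x_i z_i' = \frac{1}{n} X'Z$$

$$\hat\Omega = \frac{1}{n}\sum_{i=1}^{n} z_i z_i' \hat e_i^2$$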

CHAPTER 12. INSTRUMENTAL VARIABLES 427

The homoskedastic variance matrix can be estimated by

$$\hat V_\beta^0 = \left(\hat Q_{xz}\hat Q_{zz}^{-1}\hat Q_{zx}\right)^{-1}\hat\sigma^2$$

$$\hat\sigma^2 = \frac{1}{n}\sum_{i=1}^{n}\hat e_i^2.$$

Standard errors for the coefficients are obtained as the square roots of the diagonal elements of $n^{-1}\hat V_\beta$. Confidence intervals, t-tests, and Wald tests may all be constructed from the coefficient estimates and covariance matrix estimate exactly as for least-squares regression.
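The estimators above can be sketched numerically. The following is a minimal illustration, assuming NumPy and simulated data; the function name `tsls_with_covariance`, the sample size, and the data-generating design are hypothetical choices made for this example and do not come from the text.

```python
import numpy as np


def tsls_with_covariance(y, X, Z):
    """2SLS estimate with robust and homoskedastic covariance estimates.

    y: (n,) outcome; X: (n, k) regressors; Z: (n, l) instruments, l >= k.
    """
    n = len(y)
    Q_zz = Z.T @ Z / n                  # (l, l) sample moment of instruments
    Q_xz = X.T @ Z / n                  # (k, l) cross moment
    Q_zx = Q_xz.T
    A = Q_xz @ np.linalg.inv(Q_zz)      # Q_xz Q_zz^{-1}
    bread = np.linalg.inv(A @ Q_zx)     # (Q_xz Q_zz^{-1} Q_zx)^{-1}

    beta = bread @ A @ (Z.T @ y / n)    # 2SLS estimator

    # Residuals use the original regressors X, not first-stage fitted values.
    e = y - X @ beta

    # Omega_hat = (1/n) sum_i z_i z_i' e_i^2
    Omega = (Z * e[:, None] ** 2).T @ Z / n

    V_robust = bread @ (A @ Omega @ A.T) @ bread   # equation (12.42)
    V_homo = bread * np.mean(e ** 2)               # homoskedastic estimator

    # Standard errors: square roots of the diagonal of V / n.
    se_robust = np.sqrt(np.diag(V_robust) / n)
    se_homo = np.sqrt(np.diag(V_homo) / n)
    return beta, V_robust, V_homo, se_robust, se_homo


# Simulated example (assumed design): one endogenous regressor, two instruments.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=(n, 2))
u = rng.normal(size=n)
x = z @ np.array([1.0, 0.5]) + u
e = 0.8 * u + rng.normal(size=n)        # error correlated with x through u
y = 1.0 + 2.0 * x + e

X = np.column_stack([np.ones(n), x])
Z = np.column_stack([np.ones(n), z])
beta, V_robust, V_homo, se_robust, se_homo = tsls_with_covariance(y, X, Z)

# t-ratio for the slope against the true value used in the simulation.
t_slope = (beta[1] - 2.0) / se_robust[1]
```

Because the simulated error is homoskedastic, the two sets of standard errors should be close in this design; they can diverge when the error variance depends on the instruments.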

In Stata, the ivregress command by default calculates the covariance matrix estimator using the homoskedastic variance matrix. To obtain covariance matrix estimation and standard errors with the robust estimator $\hat V_\beta$, use the “,r” option.

Theorem 12.3 Under Assumption 12.2, as $n \to \infty$,

$$\hat V_\beta^0 \xrightarrow{p} V_\beta^0$$

$$\hat V_\beta \xrightarrow{p} V_\beta.$$

To prove Theorem 12.3 the key is to show $\hat\Omega \xrightarrow{p} \Omega$, as the other convergence results were established in the proof of consistency. We defer this to Exercise 12.6.

It is important that the covariance matrix be constructed using the correct residual formula $\hat e_i = y_i - x_i'\hat\beta_{2sls}$. This is different from what would be obtained if the “two-stage” computation method is used. To see this, let’s walk through the two-stage method. First, we estimate the reduced form

$$x_i = \hat\Gamma' z_i + \hat u_i$$

to obtain the predicted values $\hat x_i = \hat\Gamma' z_i$. Second, we regress $y_i$ on $\hat x_i$ to obtain the 2SLS estimator $\hat\beta_{2sls}$. This latter regression takes the form

$$y_i = \hat x_i'\hat\beta_{2sls} + \hat v_i \tag{12.43}$$

where $\hat v_i$ are least-squares residuals. The covariance matrix (and standard errors) reported by this regression are constructed using the residual $\hat v_i$. For example, the homoskedastic formula is

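$$\hat V_\beta = \left(\frac{1}{n}\hat X'\hat X\right)^{-1}\hat\sigma_v^2 = \left(\hat Q_{xz}\hat Q_{zz}^{-1}\hat Q_{zx}\right)^{-1}\hat\sigma_v^2$$

$$\hat\sigma_v^2 = \frac{1}{n}\sum_{i=1}^{n}\hat v_i^2$$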

which is proportional to the variance estimate $\hat\sigma_v^2$ rather than $\hat\sigma^2$. This is important because the residual $\hat v_i$ differs from $\hat e_i$. We can see this because the regression (12.43) uses the regressor $\hat x_i$ rather than $x_i$. Indeed, we can calculate that

$$\hat v_i = y_i - x_i'\hat\beta_{2sls} + (x_i - \hat x_i)'\hat\beta_{2sls} = \hat e_i + \hat u_i'\hat\beta_{2sls} \ne \hat e_i.$$

This means that standard errors reported by the regression (12.43) will be incorrect.
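The residual identity above can be checked numerically. The following is a minimal sketch, assuming NumPy and a simulated design; the variable names, seed, and coefficient values are hypothetical and chosen only for illustration. It runs the two stages by least squares, confirms that the second-stage residual equals the correct residual plus the first-stage residual times the coefficient, and compares the two variance estimates.

```python
import numpy as np

# Simulated data (assumed design): intercept plus one endogenous regressor,
# instrumented by an intercept and two excluded instruments.
rng = np.random.default_rng(1)
n = 2000
Z = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
u = rng.normal(size=n)
x = Z @ np.array([0.5, 1.0, -1.0]) + u
y = 1.0 + 2.0 * x + 0.8 * u + rng.normal(size=n)
X = np.column_stack([np.ones(n), x])

# First stage: regress X on Z to get Gamma_hat, fitted values, and residuals.
Gamma_hat, *_ = np.linalg.lstsq(Z, X, rcond=None)   # shape (l, k)
X_hat = Z @ Gamma_hat
U_hat = X - X_hat

# Second stage: regress y on the fitted values to get the 2SLS coefficients.
beta_2sls, *_ = np.linalg.lstsq(X_hat, y, rcond=None)

v_hat = y - X_hat @ beta_2sls   # residual reported by the second-stage regression
e_hat = y - X @ beta_2sls       # correct 2SLS residual

# Identity from the text: v_hat_i = e_hat_i + u_hat_i' beta_2sls
identity_holds = np.allclose(v_hat, e_hat + U_hat @ beta_2sls)

sigma2_v = np.mean(v_hat ** 2)  # variance estimate used by the second stage
sigma2_e = np.mean(e_hat ** 2)  # variance estimate the correct formula uses
print(identity_holds, sigma2_v, sigma2_e)
```

In this particular design the second-stage variance estimate is much larger than the correct one, but the direction and size of the gap depend on the coefficients and on the correlation between the structural and reduced-form errors.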

This problem is avoided if the 2SLS estimator is constructed directly and the standard errors calculated with the correct formula rather than taking the “two-s