# Empirical Distribution Function

This is a more compact argument (often described as more elegant) but such manipulations should

not be done without understanding the notation and the applicability of each step of the argument.

Regardless, the finiteness of the covariance matrix means that we can then apply the multivariate CLT (Theorem 6.14).

**Theorem 7.2** Under Assumption 7.2,

$$\Omega < \infty \tag{7.6}$$

and

$$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} x_i e_i \xrightarrow{d} N\left(0, \Omega\right) \tag{7.7}$$

as $n \to \infty$.

Putting together (7.1), (7.5), and (7.7),

$$\sqrt{n}\left(\widehat{\beta} - \beta\right) \xrightarrow{d} Q_{xx}^{-1} N\left(0, \Omega\right) = N\left(0, Q_{xx}^{-1} \Omega Q_{xx}^{-1}\right)$$

as $n \to \infty$, where the final equality follows from the property that linear combinations of normal vectors are also normal (Theorem 5.4).

We have derived the asymptotic normal approximation to the distribution of the least-squares

estimator.

**Theorem 7.3 Asymptotic Normality of Least-Squares Estimator**

Under Assumption 7.2, as $n \to \infty$,

$$\sqrt{n}\left(\widehat{\beta} - \beta\right) \xrightarrow{d} N\left(0, V\right)$$

where

$$V = Q_{xx}^{-1} \Omega Q_{xx}^{-1}, \tag{7.8}$$

$Q_{xx} = E\left(x_i x_i'\right)$, and $\Omega = E\left(x_i x_i' e_i^2\right)$.
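As a numerical illustration of Theorem 7.3, the following sketch simulates a simple heteroskedastic design and checks that the Monte Carlo covariance of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$ is close to the sandwich matrix $V = Q_{xx}^{-1} \Omega Q_{xx}^{-1}$. The data-generating process (intercept plus one regressor, the error scale $0.5 + |z_i|$, and all parameter values) is our own assumption for illustration, not from the text:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical DGP (our choice): y_i = 1 + 2 z_i + e_i, x_i = (1, z_i),
# e_i = (0.5 + |z_i|) u_i with z_i, u_i independent standard normal.
beta = np.array([1.0, 2.0])

def draw(n):
    z = rng.standard_normal(n)
    x = np.column_stack([np.ones(n), z])          # regressors x_i
    e = (0.5 + np.abs(z)) * rng.standard_normal(n)  # E[e_i | x_i] = 0
    return x, x @ beta + e

# Population quantities for this DGP, computed analytically:
c = np.sqrt(2 / np.pi)                # E|z| for standard normal z
Qxx = np.eye(2)                       # E[x x'] = I since E z = 0, E z^2 = 1
Omega = np.diag([1.25 + c, 3.25 + 2 * c])  # E[x x' e^2]
V = np.linalg.inv(Qxx) @ Omega @ np.linalg.inv(Qxx)  # sandwich form (7.8)

# Monte Carlo distribution of sqrt(n)(bhat - beta).
n, reps = 400, 2000
root_n_err = np.empty((reps, 2))
for r in range(reps):
    x, y = draw(n)
    bhat = np.linalg.solve(x.T @ x, x.T @ y)  # least-squares estimator
    root_n_err[r] = np.sqrt(n) * (bhat - beta)

mc_V = np.cov(root_n_err, rowvar=False)
print(np.round(mc_V, 2))   # Monte Carlo covariance, should be close to V
print(np.round(V, 2))
```

Since $Q_{xx} = I$ in this design, the sandwich collapses to $V = \Omega$; with a non-identity design the two inverse factors matter.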

In the stochastic order notation, Theorem 7.3 implies that

$$\widehat{\beta} = \beta + O_p\left(n^{-1/2}\right)$$

which is stronger than (7.4).
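The $O_p\left(n^{-1/2}\right)$ rate can be seen directly in simulation: quadrupling the sample size should roughly halve the estimation error. The design below (homoskedastic, one regressor, our own illustrative choice) checks this:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical DGP (our choice): y = 1 + 2 z + e, standard normal z and e.
beta = np.array([1.0, 2.0])

def rmse(n, reps=500):
    """Root mean squared error of the least-squares estimator at sample size n."""
    errs = np.empty(reps)
    for r in range(reps):
        z = rng.standard_normal(n)
        x = np.column_stack([np.ones(n), z])
        y = x @ beta + rng.standard_normal(n)
        bhat = np.linalg.solve(x.T @ x, x.T @ y)
        errs[r] = np.linalg.norm(bhat - beta)
    return np.sqrt(np.mean(errs**2))

# If bhat - beta = Op(n^{-1/2}), the error at n = 100 should be about
# twice the error at n = 400.
r1, r2 = rmse(100), rmse(400)
print(round(r1 / r2, 2))   # ratio should be near 2
```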

The matrix $V = Q_{xx}^{-1} \Omega Q_{xx}^{-1}$ is the variance of the asymptotic distribution of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$. Consequently, $V$ is often referred to as the asymptotic covariance matrix of $\widehat{\beta}$. The expression $V = Q_{xx}^{-1} \Omega Q_{xx}^{-1}$ is called a sandwich form, as the matrix $\Omega$ is sandwiched between two copies of $Q_{xx}^{-1}$.

It is useful to compare the variance of the asymptotic distribution given in (7.8) and the finite-sample conditional variance in the CEF model as given in (4.10):

$$V_{\widehat{\beta}} = \operatorname{var}\left(\widehat{\beta} \mid X\right) = \left(X'X\right)^{-1} \left(X'DX\right) \left(X'X\right)^{-1}. \tag{7.9}$$

CHAPTER 7. ASYMPTOTIC THEORY FOR LEAST SQUARES 225

Notice that $V_{\widehat{\beta}}$ is the exact conditional variance of $\widehat{\beta}$ and $V$ is the asymptotic variance of $\sqrt{n}\left(\widehat{\beta} - \beta\right)$. Thus $V$ should be (roughly) $n$ times as large as $V_{\widehat{\beta}}$, or $V \approx n V_{\widehat{\beta}}$. Indeed, multiplying (7.9) by $n$ and distributing, we find

$$n V_{\widehat{\beta}} = \left(\frac{1}{n} X'X\right)^{-1} \left(\frac{1}{n} X'DX\right) \left(\frac{1}{n} X'X\right)^{-1}$$

which looks like an estimator of $V$. Indeed, as $n \to \infty$,

$$n V_{\widehat{\beta}} \xrightarrow{p} V.$$
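The matrix $D$ in (7.9) collects the unknown conditional error variances, so $n V_{\widehat{\beta}}$ is not directly computable. A natural feasible version replaces $D$ with the diagonal matrix of squared least-squares residuals (the standard White plug-in). The sketch below, using the same hypothetical heteroskedastic design as above (our own choice of DGP), checks that this plug-in is close to the population $V$ at a large sample size:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical heteroskedastic DGP (our choice, for illustration).
n = 100_000
z = rng.standard_normal(n)
x = np.column_stack([np.ones(n), z])
e = (0.5 + np.abs(z)) * rng.standard_normal(n)
y = x @ np.array([1.0, 2.0]) + e

bhat = np.linalg.solve(x.T @ x, x.T @ y)
ehat = y - x @ bhat                       # least-squares residuals

# Feasible analogue of n*V_bhat: replace unknown D with diag(ehat^2).
Qxx_hat = x.T @ x / n                     # (1/n) X'X
Omega_hat = (x * (ehat**2)[:, None]).T @ x / n  # (1/n) X' diag(ehat^2) X
Qinv = np.linalg.inv(Qxx_hat)
nV_hat = Qinv @ Omega_hat @ Qinv          # sandwich plug-in

# Population V for this DGP (computed analytically; Qxx = I here).
c = np.sqrt(2 / np.pi)
V = np.diag([1.25 + c, 3.25 + 2 * c])
print(np.round(nV_hat, 2))                # should be close to V
print(np.round(V, 2))
```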

The expression $V_{\widehat{\beta}}$ is useful for practical inference (such as computation of standard errors and tests) since it is the variance of the estimator $\widehat{\beta}$, while $V$ is useful for asymptotic theory as it is well defined in the limit as $n$ goes to infinity. We will make use of both symbols and it will be advisable to adhere to this convention.

There is a special case where $\Omega$ and $V$ simplify. Suppose that

$$\operatorname{cov}\left(x_i x_i', e_i^2\right) = 0. \tag{7.10}$$

Condition (7.10) holds in the homoskedastic linear regression model, but is somewhat broader.

Under (7.10) the asymptotic variance formulae simplify as

$$\Omega = E\left(x_i x_i'\right) E\left(e_i^2\right) = Q_{xx} \sigma^2$$

$$V = Q_{xx}^{-1} \Omega Q_{xx}^{-1} = Q_{xx}^{-1} \sigma^2 \equiv V^0$$

where $\sigma^2 = E\left(e_i^2\right)$.
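The simplification is easy to check numerically. In the sketch below the error is drawn independently of the regressors (a homoskedastic design of our own choosing), so condition (7.10) holds and the sample analogue of $\Omega$ should match $Q_{xx} \sigma^2$:

```python
import numpy as np

rng = np.random.default_rng(4)

# Homoskedastic illustration (our choice): e independent of x, so
# cov(x_i x_i', e_i^2) = 0 and Omega = E[x x'] E[e^2] = Qxx * sigma^2.
n = 500_000
z = rng.standard_normal(n)
x = np.column_stack([np.ones(n), z])
sigma = 1.5
e = sigma * rng.standard_normal(n)        # drawn independently of x

Omega_hat = (x * (e**2)[:, None]).T @ x / n   # sample analogue of E[x x' e^2]
Qxx_hat = x.T @ x / n                         # sample analogue of E[x x']
print(np.round(Omega_hat, 2))
print(np.round(Qxx_hat * sigma**2, 2))        # should match Omega_hat
```

Under heteroskedasticity (for instance the scale $0.5 + |z_i|$ used earlier) the two matrices diverge, which is exactly why the general sandwich form is needed.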