
Non-Central Chi-Square and F Distributions

Inference with Clustered Samples

In this section we give some cautionary remarks and general advice about cluster-robust inference in econometric practice. There has been remarkably little theoretical research about the properties of cluster-robust methods (until quite recently), so these remarks may become dated rather quickly.

In many respects cluster-robust inference should be viewed similarly to heteroskedasticity-robust inference, where a "cluster" in the cluster-robust case is interpreted similarly to an "observation" in the heteroskedasticity-robust case. In particular, the effective sample size should be viewed as the number of clusters, not the "sample size" n. This is because the cluster-robust covariance matrix estimator effectively treats each cluster as a single observation, and estimates the covariance matrix based on the variation across cluster means. Hence if there are only G = 50 clusters, inference should be viewed as (at best) similar to heteroskedasticity-robust inference with n = 50 observations. This is a bit unsettling, for if the number of regressors is large (say k = 20), then the covariance matrix will be estimated quite imprecisely.

Furthermore, most cluster-robust theory (for example, the work of Chris Hansen (2007)) assumes that the clusters are homogeneous, including the assumption that the cluster sizes are all identical. This turns out to be a very important simplification. When this is violated (when, for example, cluster sizes are highly heterogeneous) the regression should be viewed as roughly equivalent to the heteroskedasticity-robust case with an extremely high degree of heteroskedasticity. Cluster sums have variances which are proportional to the cluster sizes, so if the cluster sizes are heterogeneous, so are the variances of the cluster sums. This also has a large effect on finite sample inference.
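The statement that the estimator treats each cluster as a single observation can be made concrete with a small numerical sketch. The code below is an illustration only, not the text's own implementation: it assumes the basic cluster-robust covariance estimator with no small-sample adjustment, and the function names (`cluster_robust_cov`, `hc0_cov`), the cluster sizes, and the simulated data are all hypothetical. Each cluster enters the middle ("meat") matrix only through one summed score vector, so when every observation is its own cluster the estimator reduces to the heteroskedasticity-robust (White) estimator.

```python
import numpy as np

def ols(X, y):
    """Least squares coefficients, residuals, and (X'X)^{-1}."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    return beta, resid, XtX_inv

def cluster_robust_cov(X, y, groups):
    """Cluster-robust covariance without small-sample adjustment (assumed form)."""
    _, resid, XtX_inv = ols(X, y)
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(groups):
        idx = groups == g
        # One k-vector per cluster: the sum of x_i * e_i over its members.
        s_g = X[idx].T @ resid[idx]
        meat += np.outer(s_g, s_g)
    return XtX_inv @ meat @ XtX_inv

def hc0_cov(X, y):
    """Heteroskedasticity-robust (White) covariance, for comparison."""
    _, resid, XtX_inv = ols(X, y)
    Xe = X * resid[:, None]
    return XtX_inv @ (Xe.T @ Xe) @ XtX_inv

# Hypothetical data: five clusters of very unequal size with a common cluster shock.
rng = np.random.default_rng(0)
sizes = np.array([5, 10, 200, 3, 50])
groups = np.repeat(np.arange(len(sizes)), sizes)
n = groups.size
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
u = rng.normal(size=len(sizes))[groups]      # shared within each cluster
y = 1.0 + 0.5 * x + u + rng.normal(size=n)

V_cluster = cluster_robust_cov(X, y, groups)        # built from 5 cluster scores
V_single = cluster_robust_cov(X, y, np.arange(n))   # each observation its own cluster
V_hc0 = hc0_cov(X, y)                               # agrees with V_single
```

Here `V_cluster` is assembled from only five score vectors even though there are 268 observations, which is the sense in which the number of clusters, not n, plays the role of the sample size.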
When clusters are heterogeneous then cluster-robust inference is similar to heteroskedasticity-robust inference with highly heteroskedastic observations.

Put together, if the number of clusters G is small and the number of observations per cluster is highly varied, then we should interpret inferential statements with a great degree of caution. Unfortunately, small G with heterogeneous cluster sizes is commonplace. Many empirical studies on U.S. data cluster at the "state" level, meaning that there are 50 or 51 clusters (the District of Columbia is typically treated as a state). The number of observations varies considerably across states since the populations are highly unequal. Thus when you read empirical papers with individual-level data but clustered at the "state" level you should be very cautious, and recognize that this is equivalent to inference with a small number of extremely heterogeneous observations.

A further complication occurs when we are interested in treatment, as in the tracking example given in the previous section. In many cases (including Duflo, Dupas and Kremer (2011)) the interest is in the effect of a specific treatment which is applied at the cluster level (in their case, treatment applies to schools). In many cases (not, however, Duflo, Dupas and Kremer (2011)), the number of treated clusters is small relative to the total number of clusters; in an extreme case there is just a single treated cluster. Based on the reasoning given above, these applications should be interpreted as equivalent to heteroskedasticity-robust inference with a sparse dummy variable as discussed in Section 4.16. As discussed there, standard error estimates can be erroneously small. In the extreme of a single treated cluster (in the example, if only a single school was tracked) the coefficient on tracking will be very imprecisely estimated, yet will have a misleadingly small cluster standard error.
In general, reported standard errors will greatly understate the imprecision of parameter estimates.
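
The single-treated-cluster case can be explored with a small Monte Carlo sketch. Everything in it is an assumption made for illustration: the design (G = 20 equal-sized clusters of 10 observations, one treated cluster, a true treatment effect of zero, standard normal cluster and individual shocks), the helper name `ols_cluster_se`, and the use of the cluster-robust estimator without small-sample adjustment. Cluster labels are assumed to be the integers 0, ..., G-1. Because the treatment dummy fits the treated cluster's mean exactly, that cluster's residuals sum to zero and it contributes nothing to the estimated variance, so the reported standard error reflects only the variation among control clusters.

```python
import numpy as np

def ols_cluster_se(X, y, groups):
    """OLS coefficients and cluster-robust standard errors (no adjustment).

    Assumes `groups` holds integer labels 0, ..., G-1.
    """
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ (X.T @ y)
    resid = y - X @ beta
    G = groups.max() + 1
    scores = np.zeros((G, X.shape[1]))
    # Accumulate x_i * e_i into the row belonging to each observation's cluster.
    np.add.at(scores, groups, X * resid[:, None])
    V = XtX_inv @ (scores.T @ scores) @ XtX_inv
    return beta, np.sqrt(np.diag(V))

rng = np.random.default_rng(1)
G, m, reps = 20, 10, 500                 # hypothetical design
groups = np.repeat(np.arange(G), m)
D = (groups == 0).astype(float)          # only cluster 0 is treated
X = np.column_stack([np.ones(G * m), D])

est = np.empty(reps)
se = np.empty(reps)
for r in range(reps):
    u = rng.normal(size=G)[groups]       # cluster-level shock
    y = u + rng.normal(size=G * m)       # true treatment effect is zero
    beta, s = ols_cluster_se(X, y, groups)
    est[r], se[r] = beta[1], s[1]

true_sd = est.std(ddof=1)                # actual sampling variability of the estimate
mean_se = se.mean()                      # what the cluster standard error reports
ratio = mean_se / true_sd
reject_rate = np.mean(np.abs(est / se) > 1.96)   # nominal 5% test of a true null
```

Comparing `mean_se` with `true_sd`, and `reject_rate` with the nominal 0.05, gives a rough sense of how far the reported precision can be from the actual precision in this design; other designs and small-sample corrections will give different magnitudes.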