Cluster-Robust Inference: Bootstrap Procedures

3.8 Supplementary Appendix

3.8.3 Auxiliary Lemmas and Results

3.8.3.4 Cluster-Robust Inference: Bootstrap Procedures

In this section, we introduce two bootstrap procedures that are suitable for cluster-robust inference. The first algorithm uses a multiplier-bootstrap method to compute studentized and cluster-robust standard errors. This method has been previously described in Kline and Santos (2012) and Callaway et al. (2018). The second procedure is a bootstrap Hausman-type test, which provides bootstrapped p-values.

Let $\{V_i\}_{i=1}^{n}$ be a sequence of i.i.d. random variables with zero mean and unit variance, which is independent of the original sample. One example is i.i.d. Bernoulli random variables with $P(V = v_0) = 1 - v_0/\sqrt{5}$ and $P(V = 1 - v_0) = v_0/\sqrt{5}$, where $v_0 = (\sqrt{5}+1)/2$, as suggested by Mammen (1993). Now, given a generic ATT estimator, $\hat{\tau}$, and an estimator of its influence function, $\hat{\eta}(\cdot)$, we compute the clustered standard errors as follows:

Algorithm 3.8.2

1. In iteration $b$, draw a realization of $V_b$ for each cluster. All observations within the same cluster share the same value of $V_b$.

2. Calculate a bootstrap estimate for the ATT as

$$\hat{\tau}_b^{*} = \hat{\tau} + \mathbb{E}_n[V_b \cdot \hat{\eta}(W)].$$

Form a bootstrap draw of the limiting distribution as

$$\hat{R}_b^{*} = \sqrt{n}\,(\hat{\tau}_b^{*} - \hat{\tau}).$$

3. Repeat Steps 1-2 $B$ times.

4. Calculate the bootstrapped standard error, $\hat{\sigma}^{*}$, as the bootstrap interquartile range normalized by the interquartile range of the standard normal distribution:

$$\hat{\sigma}^{*} = \frac{q_{0.75}(\hat{R}^{*}) - q_{0.25}(\hat{R}^{*})}{z_{0.75} - z_{0.25}},$$

where $q_p(\hat{R}^{*})$ is the $p$-th sample quantile of the $\hat{R}_b^{*}$ in the $B$ draws, and $z_p$ is the $p$-th quantile of the standard normal distribution.
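The steps of Algorithm 3.8.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the text: the function names `mammen_weights` and `cluster_multiplier_se`, the arguments `eta_hat` (estimated influence-function values, one per observation) and `cluster_ids`, and the simulated inputs are all hypothetical. The sketch uses the two-point weights of Mammen (1993) described above and reads $\mathbb{E}_n$ as the sample average over all $n$ observations.

```python
import numpy as np
from statistics import NormalDist


def mammen_weights(size, rng):
    """Draw i.i.d. two-point multipliers with mean 0 and variance 1."""
    v0 = (np.sqrt(5.0) + 1.0) / 2.0
    p_low = v0 / np.sqrt(5.0)  # P(V = 1 - v0); P(V = v0) = 1 - p_low
    return np.where(rng.random(size) < p_low, 1.0 - v0, v0)


def cluster_multiplier_se(tau_hat, eta_hat, cluster_ids, n_boot=999, seed=0):
    """Sketch of Algorithm 3.8.2 (names and defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    eta_hat = np.asarray(eta_hat, dtype=float)
    n = eta_hat.shape[0]
    # Map every observation to the index of its cluster.
    _, cluster_index = np.unique(np.asarray(cluster_ids), return_inverse=True)
    cluster_index = cluster_index.ravel()
    n_clusters = int(cluster_index.max()) + 1

    r_star = np.empty(n_boot)
    for b in range(n_boot):
        # Step 1: one multiplier per cluster, shared by its observations.
        v_obs = mammen_weights(n_clusters, rng)[cluster_index]
        # Step 2: bootstrap estimate and draw of the limiting distribution.
        tau_star = tau_hat + np.mean(v_obs * eta_hat)
        r_star[b] = np.sqrt(n) * (tau_star - tau_hat)

    # Step 4: bootstrap IQR divided by the standard normal IQR.
    q25, q75 = np.quantile(r_star, [0.25, 0.75])
    z = NormalDist()
    sigma_star = (q75 - q25) / (z.inv_cdf(0.75) - z.inv_cdf(0.25))
    return sigma_star, r_star


# Simulated placeholder inputs: 50 clusters of 8 observations each.
rng = np.random.default_rng(1)
cluster_ids = np.repeat(np.arange(50), 8)
eta_hat = rng.normal(size=cluster_ids.size)  # stand-in for estimated influence values
sigma_star, r_star = cluster_multiplier_se(0.3, eta_hat, cluster_ids)
print(f"sigma_star = {sigma_star:.3f}")
```

Because $\hat{\sigma}^{*}$ is computed from draws of $\sqrt{n}(\hat{\tau}_b^{*} - \hat{\tau})$, it is on the scale of the limiting distribution; dividing it by $\sqrt{n}$ would put it on the scale of $\hat{\tau}$ itself.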

Given the two DR DID estimators, $\hat{\tau}_{dr}$ based on (3.3.1) and $\hat{\tau}_{sz}$ based on (3.4.1), and their respective linear expansions, $\hat{\eta}_{dr}(\cdot)$ given in (3.3.11) and $\hat{\eta}_{sz}(\cdot)$ given in (3.4.3), we conduct a cluster-robust Hausman-type test as follows:

Algorithm 3.8.3

1. Calculate the Hausman test statistic, $T_n$, following (3.4.2).

2. In iteration $b$, generate a realization of $V_b$ for each cluster. Observations within the same cluster share the same value of $V_b$.

3. Calculate bootstrap estimates of the ATT, for $j \in \{dr, sz\}$, and of the variance term as

$$\hat{\tau}_{j,b}^{*} = \hat{\tau}_j + \mathbb{E}_n[V_b \cdot \hat{\eta}_j(W)],$$

$$\hat{V}_b^{*} = \mathbb{E}_n[V_b \cdot (\hat{\eta}_{eff}(W) - \hat{\eta}_{sz}(W))^2].$$

Form a bootstrap test statistic, $T_b^{*}$, as

$$T_b^{*} = n\,(\hat{\tau}_{dr,b}^{*} - \hat{\tau}_{sz,b}^{*})^2 / \hat{V}_b^{*}.$$

4. Repeat Steps 2-3 $B$ times.

5. Calculate the bootstrapped p-value, $p^{*}$, as the proportion of the bootstrap test statistics, $\{T_b^{*}\}_{b=1}^{B}$, that are greater than or equal to $T_n$.
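The bootstrap loop and p-value of Algorithm 3.8.3 can be sketched as below. This is a literal transcription of the steps as printed, offered only as an illustration: the function name `bootstrap_hausman_pvalue`, its arguments, and the simulated inputs are hypothetical. The statistic $T_n$ from (3.4.2) is taken as a given input, and $\hat{\eta}_{eff}$ is passed as a separate array `eta_eff`; setting it equal to `eta_dr` in the usage lines is purely an assumption for the example. Note that, transcribed literally, the variance term weights a squared difference by multipliers that can be negative, so it is not guaranteed to be positive; the formula should be checked against (3.4.2) before any applied use.

```python
import numpy as np


def mammen_weights(size, rng):
    """Two-point multipliers with mean 0 and variance 1 (Mammen, 1993)."""
    v0 = (np.sqrt(5.0) + 1.0) / 2.0
    return np.where(rng.random(size) < v0 / np.sqrt(5.0), 1.0 - v0, v0)


def bootstrap_hausman_pvalue(t_n, tau_dr, tau_sz, eta_dr, eta_sz, eta_eff,
                             cluster_ids, n_boot=999, seed=0):
    """Sketch of Steps 2-5 of Algorithm 3.8.3 (names are assumptions)."""
    rng = np.random.default_rng(seed)
    eta_dr = np.asarray(eta_dr, dtype=float)
    eta_sz = np.asarray(eta_sz, dtype=float)
    eta_eff = np.asarray(eta_eff, dtype=float)
    n = eta_dr.shape[0]
    _, idx = np.unique(np.asarray(cluster_ids), return_inverse=True)
    idx = idx.ravel()
    n_clusters = int(idx.max()) + 1

    t_star = np.empty(n_boot)
    for b in range(n_boot):
        # Step 2: one multiplier per cluster, shared within the cluster.
        v = mammen_weights(n_clusters, rng)[idx]
        # Step 3: bootstrap estimates for both estimators.
        tau_dr_star = tau_dr + np.mean(v * eta_dr)
        tau_sz_star = tau_sz + np.mean(v * eta_sz)
        # Variance term exactly as printed in Step 3 (may be negative).
        v_star = np.mean(v * (eta_eff - eta_sz) ** 2)
        t_star[b] = n * (tau_dr_star - tau_sz_star) ** 2 / v_star

    # Step 5: share of bootstrap statistics at least as large as T_n.
    p_value = float(np.mean(t_star >= t_n))
    return p_value, t_star


# Simulated placeholder inputs; t_n = 1.7 is a made-up value, not from (3.4.2).
rng = np.random.default_rng(2)
cluster_ids = np.repeat(np.arange(40), 10)
eta_dr = rng.normal(size=cluster_ids.size)
eta_sz = eta_dr + 0.5 * rng.normal(size=cluster_ids.size)
p_value, t_star = bootstrap_hausman_pvalue(
    t_n=1.7, tau_dr=0.30, tau_sz=0.28,
    eta_dr=eta_dr, eta_sz=eta_sz, eta_eff=eta_dr,
    cluster_ids=cluster_ids, n_boot=499, seed=0,
)
print(f"bootstrap p-value = {p_value:.3f}")
```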

References

Abadie, A. (2003). Semiparametric instrumental variable estimation of treatment response models. Journal of Econometrics, 113(2):231–263.

Abadie, A. (2005). Semiparametric difference-in-difference estimators. Review of Economic Studies, 72(1):1–19.

Abbring, J. H. and van den Berg, G. J. (2003). The nonparametric identification of treatment effects in duration models. Econometrica, 71(5):1491–1517.

Ackerberg, D., Chen, X., Hahn, J., and Liao, Z. (2014). Asymptotic efficiency of semiparametric two-step GMM. The Review of Economic Studies, 81(3):919–943.

Angrist, J. D., Imbens, G. W., and Rubin, D. B. (1996). Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 91(434):444–455.

Angrist, J. D. and Krueger, A. B. (1992). The effect of age at school entry on educational attainment: an application of instrumental variables with moments from two samples. Journal of the American Statistical Association, 87(418):328–336.

Anstrom, K. J. and Tsiatis, A. A. (2001). Utilizing propensity scores to estimate causal treatment effects with censored time-lagged data. Biometrics, 57(4):1207–1218.

Arellano, M. and Meghir, C. (1992). Female labour supply and on-the-job search: an empirical model estimated using complementary data sets. The Review of Economic Studies, 59(3):537–559.

Athey, S. and Imbens, G. W. (2006). Identification and inference in nonlinear difference-in-differences models. Econometrica, 74(2):431–497.

Beran, R. (1981). Nonparametric regression with randomly censored survival data.

Bernasconi, D. P., Antolini, L., et al. (2022). A causal inference approach to compare leukaemia treatment outcome in the absence of randomization and with dependent censoring. International Journal of Epidemiology, 51(1):314–323.

Beyhum, J., Florens, J.-P., and Van Keilegom, I. (2021). A nonparametric instrumental approach to endogeneity in competing risks models. arXiv preprint arXiv:2105.00946.

Bickel, P. J., Klaassen, C. A., Ritov, Y., and Wellner, J. A. (1998). Efficient and Adaptive Estimation for Semiparametric Models. Springer-Verlag, New York.

Bierens, H. J. (1982). Consistent model specification tests. Journal of Econometrics, 20(1):105–134.

Blundell, R., Chen, X., and Kristensen, D. (2007). Semi-nonparametric IV estimation of shape-invariant Engel curves. Econometrica, 75(6):1613–1669.

Bouaziz, O. and Lopez, O. (2010). Conditional density estimation in a censored single-index regression model. Bernoulli, 16(2):514–542.

Braekers, R. and Veraverbeke, N. (2005). A copula-graphic estimator for the conditional survival function under dependent censoring. Canadian Journal of Statistics, 33(3):429–447.

Bücher, A., El Ghouch, A., and Van Keilegom, I. (2021). Single-index quantile regression models for censored data. In Advances in Contemporary Statistics and Econometrics, pages 177–196. Springer.

Buchinsky, M., Li, F., and Liao, Z. (2022). Estimation and inference of semiparametric models using data from several sources. Journal of Econometrics, 226(1):80–103.

Busso, M., DiNardo, J., and McCrary, J. (2014). New evidence on the finite sample properties of propensity score reweighting and matching estimators. The Review of Economics and Statistics, 96(5):885–895.

Callaway, B., Li, T., and Oka, T. (2018). Quantile treatment effects in difference in differences models under dependence restrictions and with only two time periods. Journal of Econometrics, 206(2):395–413.

Callaway, B. and Sant’Anna, P. H. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2):200–230.

Canay, I. A., Santos, A., and Shaikh, A. M. (2013). On the testability of identification in some nonparametric models with endogeneity. Econometrica, 81(6):2535–2559.

Card, D. and Krueger, A. B. (1994). Minimum wages and employment: A case study of the fast-food industry in New Jersey and Pennsylvania. The American Economic Review, 84(4):772.

Cattaneo, M. D., Jansson, M., and Ma, X. (2020). Simple local polynomial density estimators. Journal of the American Statistical Association, 115(531):1449–1455.

Chang, N.-C. (2020). Double/debiased machine learning for difference-in-differences models. The Econometrics Journal, 23(2):177–191.

Chen, J. and Roth, J. (2023). Log-like? Identified ATEs defined with zero-valued outcomes are (arbitrarily) scale-dependent. Working Paper.

Chen, X., Hong, H., and Tarozzi, A. (2008). Semiparametric efficiency in GMM models with auxiliary data. The Annals of Statistics, 36(2):808–843.

Chen, X., Linton, O., and Van Keilegom, I. (2003). Estimation of semiparametric models when the criterion function is not smooth. Econometrica, 71(5):1591–1608.

Chen, Y.-H. (2010). Semiparametric marginal regression analysis for dependent competing risks under an assumed copula. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 72(2):235–251.

Chernozhukov, V., Chetverikov, D., and Kato, K. (2014). Gaussian approximation of suprema of empirical processes. The Annals of Statistics, 42(4):1564–1597.

Chernozhukov, V., Fernández-Val, I., and Galichon, A. (2010). Quantile and probability curves without crossing. Econometrica, 78(3):1093–1125.

Chernozhukov, V., Fernández-Val, I., and Melly, B. (2013a). Inference on counterfactual distributions. Econometrica, 81(6):2205–2268.

Chernozhukov, V., Lee, S., and Rosen, A. M. (2013b). Intersection bounds: Estimation and inference. Econometrica, 81(2):667–737.

Chiang, C.-T. and Huang, M.-Y. (2012). New estimation and inference procedures for a single-index conditional distribution model. Journal of Multivariate Analysis, 111:271–285.

Claeskens, G. and Van Keilegom, I. (2003). Bootstrap confidence bands for regression curves and their derivatives. The Annals of Statistics, 31(6):1852–1884.

Cragg, J. G. and Donald, S. G. (1996). On the asymptotic properties of LDU-based tests of the rank of a matrix. Journal of the American Statistical Association, 91(435):1301–1309.

Crommen, G., Beyhum, J., and Van Keilegom, I. (2022). A Gaussian model for survival data subject to dependent censoring and confounding. arXiv preprint arXiv:2208.04184.

Czado, C. and Van Keilegom, I. (2021). Dependent censoring based on copulas. arXiv preprint arXiv:2104.06872.

Dabrowska, D. M. (1989). Uniform consistency of the kernel conditional Kaplan-Meier estimate. The Annals of Statistics, pages 1157–1167.

Dawid, A. P. (1979). Conditional independence in statistical theory. Journal of the Royal Statistical Society: Series B (Methodological), 41(1):1–15.

Delecroix, M., Hristache, M., and Patilea, V. (2006). On semiparametric M-estimation in single-index regression. Journal of Statistical Planning and Inference, 136(3):730–769.

Deresa, N. W. and Van Keilegom, I. (2020). Flexible parametric model for survival data subject to dependent censoring. Biometrical Journal, 62(1):136–156.

Deresa, N. W., Van Keilegom, I., and Antonio, K. (2022). Copula-based inference for bivariate survival data with left truncation and dependent censoring. Insurance: Mathematics and Economics, 107:1–21.

Domínguez, M. A. and Lobato, I. N. (2004). Consistent estimation of models defined by conditional moment restrictions. Econometrica, 72(5):1601–1615.

Einmahl, U. and Mason, D. M. (2005). Uniform in bandwidth consistency of kernel-type function estimators. The Annals of Statistics, 33(3):1380–1403.

Escanciano, J. C. (2006a). A consistent diagnostic test for regression models using projections. Econometric Theory, 22(6):1030–1051.

Escanciano, J. C. (2006b). Goodness-of-fit tests for linear and nonlinear time series models. Journal of the American Statistical Association, 101(474):531–541.

Fan, J., Heckman, N. E., and Wand, M. P. (1995). Local polynomial kernel regression for generalized linear models and quasi-likelihood functions. Journal of the American Statistical Association, 90(429):141–150.

Fan, Y. and Guerre, E. (2016). Multivariate local polynomial estimators: Uniform boundary properties and asymptotic linear representation. In Essays in Honor of Aman Ullah. Emerald Group Publishing Limited.

Fan, Y. and Liu, R. (2018). Partial identification and inference in censored quantile regression. Journal of Econometrics, 206(1):1–38.

Fan, Y., Sherman, R., and Shum, M. (2014). Identifying treatment effects under data combination. Econometrica, 82(2):811–822.

Firpo, S. (2007). Efficient semiparametric estimation of quantile treatment effects. Econometrica, 75(1):259–276.

Firpo, S., Fortin, N., and Lemieux, T. (2009a). Supplement to ‘unconditional quantile regressions’. Econometrica Supplemental Material, 77.

Firpo, S., Fortin, N. M., and Lemieux, T. (2009b). Unconditional quantile regressions. Econometrica, 77(3):953–973.

Firpo, S. P., Fortin, N. M., and Lemieux, T. (2018). Decomposing wage distributions using recentered influence function regressions. Econometrics, 6(2):28.

Fortin, N., Lemieux, T., and Firpo, S. (2011). Decomposition methods in economics. In Handbook of Labor Economics, volume 4, pages 1–102. Elsevier, Amsterdam.

Frandsen, B. R. (2015). Treatment effects with censoring and endogeneity. Journal of the American Statistical Association, 110(512):1745–1752.

Frölich, M. (2006). Non-parametric regression for binary dependent variables. The Econometrics Journal, 9(3):511–540.

Frölich, M. and Melly, B. (2013). Unconditional quantile treatment effects under endogeneity. Journal of Business & Economic Statistics, 31(3):346–357.

Gatta, G., Capocaccia, R., et al. (2005). Childhood cancer survival trends in Europe: a EUROCARE working group study. Journal of Clinical Oncology, 23(16):3742–3751.

Genest, C. and MacKay, J. (1986a). The joy of copulas: Bivariate distributions with uniform marginals. The American Statistician, 40(4):280–283.

Genest, C. and MacKay, R. J. (1986b). Copules archimédiennes et familles de lois bidimensionnelles dont les marges sont données. Canadian Journal of Statistics, 14(2):145–159.

Giné, E. and Mason, D. M. (2007). On local U-statistic processes and the estimation of densities of functions of several sample variables. The Annals of Statistics, 35(3):1105–1145.

Giné, E. and Zinn, J. (1984). Some limit theorems for empirical processes. The Annals of Probability, pages 929–989.

González-Manteiga, W. and Crujeiras, R. M. (2013). An updated review of goodness-of-fit tests for regression models. Test, 22(3):361–411.

Graham, B. S., Pinto, C. C. d. X., and Egel, D. (2016). Efficient estimation of data combination models by the method of auxiliary-to-study tilting (AST). Journal of Business & Economic Statistics, 34(2):288–301.

Guerre, E., Perrigne, I., and Vuong, Q. (2000). Optimal nonparametric estimation of first-price auctions. Econometrica, 68(3):525–574.

Guggenberger, P. (2010a). The impact of a Hausman pretest on the asymptotic size of a hypothesis test. Econometric Theory, 26(2):369–382.

Guggenberger, P. (2010b). The impact of a Hausman pretest on the size of a hypothesis test: The panel data case. Journal of Econometrics, 156(2):337–343.

Hahn, J. (1998). On the role of the propensity score in efficient semiparametric estimation of average treatment effects. Econometrica, pages 315–331.

Hájek, J. (1971). Discussion of ‘An essay on the logical foundations of survey sampling, Part I’, by D. Basu. In Godambe, V. P. and Sprott, D. A., editors, Foundations of Statistical Inference. Holt, Rinehart, and Winston, Toronto.

Härdle, W. and Stoker, T. M. (1989). Investigating smooth multiple regression by the method of average derivatives. Journal of the American Statistical Association, 84(408):986–995.

Hausman, J. A. (1978). Specification tests in econometrics. Econometrica, pages 1251–1271.

Heckman, J. J., Ichimura, H., and Todd, P. (1997). Matching as an econometric evaluation estimator: Evidence from evaluating a job training programme. The Review of Economic Studies, 64(4):605–654.

Heckman, J. J., Lochner, L. J., and Todd, P. E. (2006). Earnings functions, rates of return and treatment effects: The Mincer equation and beyond. Handbook of the Economics of Education, 1:307–458.

Heckman, J. J. and Vytlacil, E. (2001). Policy-relevant treatment effects. American Economic Review, 91(2):107–111.

Heckman, J. J. and Vytlacil, E. (2005). Structural equations, treatment effects, and econometric policy evaluation. Econometrica, 73(3):669–738.

Hirano, K., Imbens, G. W., and Ridder, G. (2003). Efficient estimation of average treatment effects using the estimated propensity score. Econometrica, 71(4):1161–1189.

Hirukawa, M., Murtazashvili, I., and Prokhorov, A. (2020). Yet another look at the omitted variable bias. Working Paper.

Source: Essays on the Econometrics of Causal Inference, by Qi Xu (pages 194-200).