Using Data from the Healthcare Cost and Utilization Project for State Health Policy Research
Appendix
This appendix supplements the main text. Section 1 describes the wild cluster bootstrap method used to estimate p-values in the example estimating the impacts of the Medicaid expansion in Connecticut. Section 2 presents descriptive statistics that supplement the Connecticut analysis. Section 3 discusses why the precision of the state-level cesarean delivery rates shown in Figure 1 of the main text changes over time, and then provides the complete set of cesarean delivery rates for the 18 states we considered.
Section 1: Wild cluster bootstrap procedure for the difference-in-differences estimates
In Table 2, we implement a difference-in-differences research design for Connecticut and use four nearby states as comparison states. Because policy treatment varies at the state level, we were concerned about serial correlation within states. However, with only 5 state clusters, we face a small-number-of-clusters problem, which can lead to over-rejection of the null hypothesis under standard clustering procedures. To address this, we implement a wild cluster bootstrap procedure (Cameron, Gelbach, and Miller 2008; Cameron and Miller 2015) using the Stata program CLUSTSE. Our main analysis weights state-sex-age group-race/ethnicity-year cells by their population. CLUSTSE syntax, however, does not accept weighted regressions: standard Stata commands for implementing weighted regressions are not accepted, and “manually” weighting the regression by multiplying all outcomes and right-hand side variables by the square root of the population is not possible, because the command does not allow suppression of the constant term (the manually weighted regression replaces the constant with the square root of the population). To obtain population-weighted estimates, we instead duplicate cells so that, for any two cells A and B, the ratio of the number of observations of cell A to that of cell B equals the ratio of their populations. We then estimate standard OLS and implement a wild cluster bootstrap with 200 replications.
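The cell-duplication step can be illustrated with a minimal sketch. It is not the Stata code used for the analysis: the data are simulated, the variable names (`pop`, `x`, `y`) are hypothetical, and small integer populations are assumed so that each cell can be repeated exactly `pop` times. Under those assumptions, OLS on the duplicated rows returns the same coefficients as population-weighted least squares.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical cell-level data: one row per cell (simulated, for illustration only).
n_cells = 12
x = rng.normal(size=n_cells)
y = 1.0 + 2.0 * x + rng.normal(size=n_cells)
pop = rng.integers(1, 6, size=n_cells)  # assumed small integer cell populations

X = np.column_stack([np.ones(n_cells), x])  # constant plus one regressor

# (a) Population-weighted least squares: solve (X'WX) b = X'Wy.
w = pop.astype(float)
beta_wls = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# (b) Unweighted OLS after repeating each cell `pop` times, so the ratio of
#     row counts between any two cells equals the ratio of their populations.
X_dup = np.repeat(X, pop, axis=0)
y_dup = np.repeat(y, pop)
beta_dup, *_ = np.linalg.lstsq(X_dup, y_dup, rcond=None)

# The two coefficient vectors agree up to floating-point error.
print(np.allclose(beta_wls, beta_dup))
```

The equivalence concerns the coefficients only. Duplication inflates the number of rows, so conventional standard errors from the duplicated data should not be used directly; in the procedure described above, inference comes from the wild cluster bootstrap with states as clusters.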
Section 2: Descriptive statistics for the Connecticut Medicaid expansion analysis
Appendix Table 1. Sample Composition of Individual (Discharge) Hospital Characteristics for Connecticut and Comparison States, Nationwide versus State Files
                                  Pre Period 2006-2009            Post Period 2010-2011
                                  NIS             SID             NIS             SID
                                  Est (SE)        Est (SE)        Est (SE)        Est (SE)
Connecticut
  Individual Characteristics
    Age
      19-34                       0.29 (0.010)    0.30 (0.001)    0.29 (0.009)    0.29 (0.001)
      35-49                       0.33 (0.003)    0.33 (0.001)    0.30 (0.006)    0.30 (0.001)
      50-64                       0.38 (0.012)    0.38 (0.001)    0.41 (0.014)    0.40 (0.001)
    Male                          0.39 (0.006)    0.39 (0.001)    0.40 (0.008)    0.40 (0.001)
    Race
      White, Non-Hispanic         0.68 (0.026)    0.66 (0.001)    0.68 (0.058)    0.65 (0.001)
      Black, Non-Hispanic         0.15 (0.027)    0.14 (<0.001)   0.16 (0.039)    0.15 (0.001)
      Other, Non-Hispanic         0.05 (0.010)    0.06 (<0.001)   0.05 (0.008)    0.06 (0.001)
      Hispanic                    0.12 (0.014)    0.14 (<0.001)   0.11 (0.022)    0.06 (<0.001)
Comparison States (NY NJ RI VT)
  Individual Characteristics (Discharge Level)
    Age
      19-34                       0.30 (0.007)    0.31 (<0.001)   0.30 (0.008)    0.31 (<0.001)
      35-49                       0.33 (0.003)    0.33 (<0.001)   0.30 (0.004)    0.30 (<0.001)
      50-64                       0.37 (0.006)    0.37 (<0.001)   0.39 (0.008)    0.39 (<0.001)
    Male                          0.41 (0.009)    0.41 (<0.001)   0.42 (0.010)    0.41 (<0.001)
    Race
      White, Non-Hispanic         0.56 (0.027)    0.53 (<0.001)   0.52 (0.037)    0.53 (<0.001)
      Black, Non-Hispanic         0.19 (0.014)    0.20 (<0.001)   0.23 (0.024)    0.21 (<0.001)
      Other, Non-Hispanic         0.11 (0.015)    0.10 (<0.001)   0.11 (0.018)    0.11 (<0.001)
      Hispanic                    0.14 (0.016)    0.14 (<0.001)   0.15 (0.018)    0.14 (<0.001)
Source: Healthcare Cost and Utilization Project (HCUP) Nationwide Inpatient Sample (NIS) and State Inpatient Database (SID) 2006-2011.
Section 3: Cesarean Delivery Rates
A. Precision of State Level Cesarean Delivery Rates across Time
As demonstrated in Figure 1 of the main text, state-level estimates from the NIS can have substantially different confidence intervals across years. This is because state is not part of the sample strata. The absence of state from the strata means that state-level sampling fractions are not fixed and can vary over time. Even if state population sizes are relatively constant over time, the state-level sample size can therefore change simply because of the sample design. When the sample size in a particular state is smaller, the standard errors and confidence intervals will tend to be larger. To demonstrate this, Appendix Figure 1 plots the standard errors underlying the confidence intervals surrounding the cesarean delivery rates presented in Figure 1 as a function of the unweighted number of inpatient delivery discharges (the denominator of the rates shown in Figure 1). The figure shows that there is substantial variation in the number of observations within states across years in the NIS. For example, the number of inpatient deliveries in Arizona varies between 8,401 and 27,215. The figure also shows a strong negative relationship between the number of observations and the size of the standard error. While it is possible for the sample size to vary because of changes in population size, the number of inpatient deliveries in the population does not change substantially over time. For example, the number of unweighted deliveries nearly doubled in South Dakota from 2008 to 2009 (532 to 944), and the standard error fell by more than half (0.038 to 0.016). However, vital statistics suggest that the number of live births in South Dakota was virtually unchanged from 2008 to 2009 (12,071 versus 11,934).
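As a rough check on the South Dakota example, the sketch below compares the reported change in the standard error with the change that sample size alone would produce under simple random sampling, where the standard error of a proportion is sqrt(p(1-p)/n). The simple-random-sampling formula is an assumption made for illustration; the NIS is a complex sample, and this is not the variance estimator behind Figure 1.

```python
import math

# Figures quoted in the text for South Dakota, 2008 and 2009.
n_2008, n_2009 = 532, 944        # unweighted inpatient deliveries
se_2008, se_2009 = 0.038, 0.016  # reported standard errors

# Under simple random sampling with an unchanged rate p, SE = sqrt(p(1-p)/n),
# so the ratio of the two standard errors depends only on the sample sizes.
srs_ratio = math.sqrt(n_2008 / n_2009)

# Ratio actually reported.
observed_ratio = se_2009 / se_2008

print(f"SE ratio implied by sample size alone (SRS): {srs_ratio:.3f}")
print(f"SE ratio reported:                           {observed_ratio:.3f}")
```

Under this simplification, moving from 532 to 944 deliveries would shrink the standard error by about 25 percent, whereas the reported standard error fell by about 58 percent. The gap is consistent with features of the complex sample design, beyond the raw number of discharges, also contributing to the change.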
Appendix Figure 1. Number of Inpatient Deliveries in the NIS Sample (Unweighted) vs. Standard Errors for Cesarean Deliveries
Notes: Nationwide Inpatient Sample (NIS) 2004-2011. States included are Arizona, Maryland, Massachusetts, Nebraska, New Jersey, and South Dakota. Each point represents a state-year.
[Figure: scatter plot of the standard error of the cesarean delivery rate (axis range 0 to 0.05) against the number of inpatient deliveries, unweighted sample size (axis range 0 to 35,000), with separate series for SD, NE, MD, MA, AZ, and NJ.]
While sample size likely plays the most important role in the size of the standard errors, other factors are also influential. For example, the standard error from a complex sample is influenced by heterogeneity in the weights. More variance in the weights corresponds to larger standard errors. Given that states are not strata in the sample design, variation in the weights can fluctuate for a given state over time.
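The link between weight heterogeneity and precision can be sketched with Kish's approximation, in which the effective sample size is (sum of weights)^2 / (sum of squared weights). This is a standard textbook approximation that assumes the weights are unrelated to the outcome; it is offered as an illustration and is not the variance estimator used for the NIS estimates. The weight values below are hypothetical.

```python
import numpy as np

def kish_effective_n(weights):
    """Kish effective sample size: (sum w)^2 / sum(w^2)."""
    w = np.asarray(weights, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

def kish_deff(weights):
    """Approximate design effect from unequal weighting: n / n_eff."""
    w = np.asarray(weights, dtype=float)
    return len(w) / kish_effective_n(w)

# Hypothetical weights for 1,000 discharges in two state-years.
equal = np.full(1000, 5.0)                                         # no variation
unequal = np.concatenate([np.full(500, 2.0), np.full(500, 8.0)])  # same total weight

for label, w in [("equal", equal), ("unequal", unequal)]:
    deff = kish_deff(w)
    # The standard error is inflated by roughly sqrt(deff) relative to equal weights.
    print(f"{label}: n_eff = {kish_effective_n(w):.1f}, "
          f"deff = {deff:.2f}, SE inflation = {np.sqrt(deff):.3f}")
```

In this example both sets of weights sum to 5,000, but the unequal weights give an effective sample size of about 735 rather than 1,000 (a design effect of 1.36), so the standard error would be roughly 17 percent larger for the same number of observations.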
B. Complete set of cesarean delivery rates for the 18 states from State Inpatient Databases, 2004-2011.
[Figure: one panel per state, each plotting NIS and SID cesarean delivery rates (percent, axis range 0 to 40) by year, 2004-2011. Panels: Arizona, Colorado, Iowa, Kentucky, Maryland, Massachusetts, Nebraska, New Jersey, New York, Nevada, North Carolina, Oregon, South Carolina, South Dakota, Utah, Vermont, Washington, Wisconsin.]