IRMSE, and they have better coverage properties. The difference is more prominent when conditional treatment effects are considered. This is partially due to the fact that the component in the expansion of the conditional TEBFs, which is associated with the single-index estimation, is of a lower order than the component appearing in the overall TEBFs, even though both are negligible in the first order. To improve the performance of our bootstrap procedure for conditional TEBFs, one may consider adding the influence functions associated with first stage estimation when constructing the bootstrap processes. This is left for future research.
Overall, the results from finite-sample studies align with the theoretical predictions discussed in Sections 1.4 and 1.5.
Table 1.1: Monte Carlo results for the conditional and overall TEBFs
(a)       Lower Conditional TEBF                               Upper Conditional TEBF
Feasible  Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
CATE      0.040            0.032            0.051  0.929       0.040            0.032            0.051  0.934
CDTE      0.072            0.062            0.081  0.933       0.073            0.063            0.082  0.928
CQTE      0.066            0.057            0.133  0.924       0.067            0.057            0.134  0.936
CCHTE     0.186            0.151            0.259  0.937       0.186            0.151            0.259  0.937
Oracle    Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
CATE      0.026            0.021            0.033  0.938       0.025            0.020            0.032  0.942
CDTE      0.046            0.040            0.051  0.947       0.047            0.039            0.051  0.927
CQTE      0.043            0.036            0.086  0.932       0.045            0.039            0.089  0.938
CCHTE     0.114            0.096            0.138  0.938       0.114            0.096            0.138  0.938

(b)       Lower Conditional TEBF                               Upper Conditional TEBF
Feasible  Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
CATE      0.039            0.032            0.051  0.932       0.039            0.031            0.050  0.936
CDTE      0.070            0.061            0.079  0.927       0.071            0.062            0.080  0.923
CQTE      0.062            0.053            0.125  0.925       0.063            0.054            0.127  0.930
CCHTE     0.192            0.157            0.267  0.939       0.190            0.156            0.259  0.943
Oracle    Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
CATE      0.025            0.020            0.032  0.936       0.025            0.020            0.032  0.938
CDTE      0.046            0.040            0.050  0.948       0.046            0.039            0.050  0.929
CQTE      0.040            0.034            0.081  0.929       0.044            0.038            0.085  0.941
CCHTE     0.118            0.100            0.144  0.939       0.119            0.102            0.144  0.940

(c)       Lower Overall TEBF                                   Upper Overall TEBF
Feasible  Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
ATE       0.023            0.018            0.028  0.958       0.023            0.019            0.028  0.956
DTE       0.041            0.036            0.044  0.965       0.041            0.036            0.044  0.963
QTE       0.037            0.033            0.074  0.961       0.038            0.032            0.074  0.965
CHTE      0.103            0.091            0.122  0.966       0.102            0.088            0.120  0.967
Oracle    Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
ATE       0.021            0.018            0.026  0.954       0.021            0.018            0.026  0.963
DTE       0.039            0.034            0.042  0.954       0.039            0.035            0.042  0.950
QTE       0.036            0.031            0.071  0.948       0.038            0.034            0.074  0.949
CHTE      0.097            0.085            0.115  0.951       0.096            0.085            0.114  0.957

(d)       Lower Overall TEBF                                   Upper Overall TEBF
Feasible  Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
ATE       0.022            0.018            0.028  0.963       0.022            0.019            0.027  0.957
DTE       0.040            0.035            0.042  0.965       0.040            0.035            0.043  0.966
QTE       0.035            0.031            0.069  0.968       0.036            0.032            0.071  0.965
CHTE      0.107            0.096            0.127  0.965       0.104            0.091            0.123  0.968
Oracle    Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate   Avg. Intg. Bias  Med. Intg. Bias  IRMSE  Cvg. Rate
ATE       0.021            0.017            0.026  0.954       0.021            0.017            0.025  0.966
DTE       0.038            0.034            0.041  0.955       0.038            0.034            0.041  0.954
QTE       0.034            0.030            0.068  0.957       0.036            0.032            0.071  0.958
CHTE      0.100            0.088            0.119  0.952       0.099            0.087            0.118  0.957
Notes: Simulations are based on 1,000 Monte Carlo experiments with samples of size n = 1,000. Panels (a) and (c) present results for the conditional and overall TEBFs with θ = (1, 1.5). Panels (b) and (d) correspond to θ = (1, 2). In each panel, “feasible” represents results generated with the single-index parameters estimated following Section 1.4.1, whereas the results in the “oracle” sub-panel correspond to those generated using the true single-index parameters. “Avg. Intg. Bias”, “Med. Intg. Bias”, “IRMSE”, and “Cvg. Rate” stand for the average integrated bias, median integrated bias, integrated root mean squared error, and 95% empirical coverage probability, respectively. The empirical coverage probability is based on bootstrap confidence sets computed with 1,999 multiplier bootstrap replications.
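As a rough aid for reading the coverage columns, the simulation noise in an empirical coverage rate can be approximated as follows. This is a back-of-the-envelope sketch that treats the 1,000 Monte Carlo replications as independent Bernoulli draws at the nominal 95% level; the function name is illustrative and is not part of the procedures described in this chapter.

```python
import math


def coverage_mc_se(nominal: float, n_reps: int) -> float:
    """Approximate Monte Carlo standard error of an empirical coverage rate.

    Assumes each replication covers the truth independently with
    probability `nominal` (a simplifying assumption for illustration).
    """
    return math.sqrt(nominal * (1.0 - nominal) / n_reps)


# Nominal 95% coverage with 1,000 Monte Carlo experiments, as in Table 1.1.
se = coverage_mc_se(0.95, 1000)
print(f"approximate simulation s.e. of coverage: {se:.4f}")  # about 0.0069
```

Under this approximation, two simulation standard errors correspond to roughly 0.014, which gives a sense of scale when comparing the reported coverage rates with the nominal 0.95.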
Table 1.2: Summary statistics
                            GHS-2000 (No. Obs. = 514)                     AHOPCA ALL-2008 (No. Obs. = 536)
Statistics                  Mean   St. Dev.  Pctl(25)  Median  Pctl(75)   Mean   St. Dev.  Pctl(25)  Median  Pctl(75)
Follow-up Duration, Years   4      3.84      0.47      2.73    7.59       2.29   2.05      0.71      1.55    3.47
Abandonments                21%    0.41      0         0       0          15%    0.35      0         0       0
Age                         7.67   4.36      3.79      6.79    11.26      7.67   4.94      3.45      6.01    11.82
White Blood Cell Count      4.64   10.01     0.45      1.13    4.24       4.28   8.52      0.53      1.11    4.07
Time to Hospital            4.05   2.85      1.5       4       6          2.87   2.03      1         3       4
Male                        61%    0.49      0         1       1          53%    0.5       0         1       1
CNS                         6%     0.23      0         0       0          13%    0.34      0         0       0
Lineage                     92%    0.27      1         1       1          93%    0.26      1         1       1
Living Condition            62%    0.49      0         1       1          46%    0.5       0         0       1
Family Type                 45%    0.5       0         0       1          14%    0.35      0         0       0
Phone at Home               36%    0.48      0         0       1          42%    0.49      0         0       1
Note: Summary statistics for the two ALL treatment protocols. The left panel describes the GHS-2000 group (2000-2007). The right panel describes the AHOPCA ALL-2008 group (2008-2015). “Male”, “CNS”, “Lineage”, “Living Condition”, “Family Type”, and “Phone at Home” are dummy variables. They indicate, respectively, whether the subject is male, whether the central nervous system is involved, the type of tumor lineage, whether the patient lives in an urban neighborhood, whether the patient lives in a united family, and whether the patient has a home phone.
ditions, home phone ownership, and distance to the hospital. Table 1.2 summarizes these characteristics for patients undergoing each of the two protocols. Instead of performing multiple imputations as in Bernasconi et al. (2022), we remove cases with missing values. The table shows that patients in the AHOPCA ALL-2008 group are less likely to abandon treatment, more likely to live in a rural neighborhood, and tend to live closer to the clinic. To account for these imbalances across the treatment groups, Bernasconi et al. (2022) rely on an inverse probability of treatment and censoring weighting strategy, which further depends on the assumption that, conditional on the baseline covariates, the potential EFS and abandonment are mutually independent. Such a restriction, however, is not necessary for our proposed methodology.
The main finding of Bernasconi et al. (2022) is that the AHOPCA ALL-2008 protocol leads to better potential EFS in the first three years and that the difference tapers off in the long term (approximately 5 years). Do these results carry over to the case of dependent censoring? To address this question, we consider two scenarios that are characterized by different ranges of the copula parameter θ. Specifically, we assume that the true copula belongs to the Gumbel family and that the indexing parameter lies in (1, 1.5) in the first case and in (1, 2) in the second. Both scenarios feature a mild positive correlation pattern between the EFS and withdrawal time, and both encompass independent censoring as a limiting case. When mapped to Kendall’s τ, the maximum levels of positive correlation under the two scenarios are 1/3 and 1/2, respectively.
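The mapping from the Gumbel copula parameter to Kendall's τ used above follows the standard closed form τ = 1 − 1/θ for θ ≥ 1. A minimal sketch of this calculation (the function name is illustrative) is:

```python
def gumbel_kendall_tau(theta: float) -> float:
    """Kendall's tau implied by a Gumbel copula with parameter theta >= 1."""
    if theta < 1.0:
        raise ValueError("the Gumbel copula requires theta >= 1")
    return 1.0 - 1.0 / theta


# theta = 1 is the independence copula; 1.5 and 2 are the upper ends
# of the two parameter ranges considered in the text.
for theta in (1.0, 1.5, 2.0):
    print(theta, gumbel_kendall_tau(theta))
```

Evaluating the function at θ = 1 gives τ = 0 (independent censoring), while θ = 1.5 and θ = 2 give the maximum values 1/3 and 1/2 quoted in the text.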
In the first step of the analysis, we assess the validity of the index sufficiency assumption. In light of Remark 1.1, we can implement the specification test (Algorithm 1.9.2) as presented in Section 1.9.2.2. The null hypothesis is that the single-index assumption holds for the joint distribution of (T_d, C_d), d ∈ {0, 1}. The bootstrap test fails to reject the null for either treatment group at the 10% level, indicating that our methodology can be applied in this context.
Estimation of the BGFs and TEBFs closely follows the procedures described in Section 1.6. As in Bernasconi et al. (2022), we consider two different time frames: 3 and 5 years post-treatment. For the shorter period, the TEBFs are estimated over the index set U3, which is {0.05, 0.06, ..., 0.29, 0.3} when the (C)QTE is considered and {0, 0.1, ..., 2.9, 3.0} for all other types of treatment effects. For the longer period, the index set U5 is {0.05, 0.06, ..., 0.44, 0.45} for the (C)QTE and {0, 0.1, ..., 4.9, 5.0} for all other types of treatment effects. For all of our analyses of the conditional treatment effect, we fix the conditioning set at the “representative” observation, which is the sample average of the baseline covariates.
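The index sets described above can be written out explicitly. The sketch below assumes evenly spaced grids with the step sizes implied by the listed elements (0.01 for the (C)QTE grids and 0.1 for the others); the helper `grid` and the variable names are illustrative and do not come from the original implementation.

```python
def grid(start: float, stop: float, step: float, ndigits: int) -> list:
    """Evenly spaced grid from start to stop (inclusive), rounded to avoid float drift."""
    n_steps = round((stop - start) / step)
    return [round(start + i * step, ndigits) for i in range(n_steps + 1)]


# Three-year horizon: quantile levels for the (C)QTE, years for the other effects.
U3_QTE = grid(0.05, 0.30, 0.01, ndigits=2)
U3_OTHER = grid(0.0, 3.0, 0.1, ndigits=1)

# Five-year horizon.
U5_QTE = grid(0.05, 0.45, 0.01, ndigits=2)
U5_OTHER = grid(0.0, 5.0, 0.1, ndigits=1)

print(len(U3_QTE), len(U3_OTHER), len(U5_QTE), len(U5_OTHER))
```

Under these assumed step sizes, the grids contain 26, 31, 41, and 51 points, respectively.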
Figure 1.2: Estimates of the potential EFS curves
Notes: The top plot depicts the unconditional potential survival curve estimates, whereas the bottom plot depicts the conditional survival curve estimates. The solid curves represent SICG estimates for the two protocols under the independence copula. For each protocol, the shaded area is bounded from above and below by SICG estimates using Gumbel copula parameters of 1 and 1.5, respectively. Peterson’s worst-case bounds for the treated and control groups are depicted with dot-dash and dashed curves, respectively.
Figure 1.3: Distributional treatment effect estimates
Notes: The top and bottom plots represent overall and conditional DTE estimates along with their 95% uniform confidence bands, respectively. The solid black lines and the dark gray area depict the DTE estimates and their uniform confidence bands under the independent censoring mechanism. Dashed (dot-dash) lines and the light gray area depict the upper (lower) bound of the DTE and the corresponding uniform confidence bands, with a Gumbel copula and θ = (1, 1.5). The confidence bands are computed following the bootstrap procedures in Algorithms 1.5.1 and 1.9.1, respectively, with 1,999 bootstrap replications. Peterson’s worst-case bounds for the DTE are delineated with dot-dash and dashed curves, respectively.
We turn now to a discussion of the estimation results. Figure 1.2 presents the estimated potential EFS curves.
Our findings mirror the original results of Bernasconi et al. (2022), revealing that the new protocol improves survival prospects in the initial years following treatment. However, this beneficial effect appears to taper off over a year earlier than previously indicated. Moreover, if we relax the independent censoring condition, the beneficial effect may vanish completely. This is evidenced by the overlap of the estimated identified sets of potential EFS, even under the stricter configuration, θ_a. It is also worth noting that our identified set is considerably narrower than the one derived from the no-information bounds, which shows that our approach offers a flexible middle ground between independent censoring and the most robust worst-case approach.
Table 1.3: Estimation results for conditional and overall TEBFs
(a)
Treatment Effect Estimators under Independent Censoring
           ATE(3)            mad.DTE           mad.QTE           mad.CHTE          CATE(3)           mad.CDTE          mad.CQTE          mad.CCHTE
           0.078             -0.062            0.630             -0.071            0.051             -0.061            0.370             -0.071
           [-0.157, 0.314]   [-0.189, 0.139]   [-1.61, 2.15]     [-0.244, 0.189]   [-0.082, 0.184]   [-0.139, 0.097]   [-0.784, 1.034]   [-0.187, 0.142]
Treatment Effect Estimators with θ_a = (1, 1.5)
           ATE(3)            mad.DTE           mad.QTE           mad.CHTE          CATE(3)           mad.CDTE          mad.CQTE          mad.CCHTE
Lower Bd.  -0.097            -0.094            -0.490            -0.145            -0.128            -0.100            -0.460            -0.121
           [-0.329, 0.136]   [-0.222, 0.128]   [-1.564, 1.324]   [-0.325, 0.181]   [-0.26, 0.004]    [-0.177, 0.076]   [-1.038, 0.728]   [-0.24, 0.119]
Upper Bd.  0.198             0.090             0.800             0.129             0.193             0.096             0.680             0.152
           [-0.034, 0.431]   [-0.168, 0.221]   [-1.377, 2.247]   [-0.235, 0.323]   [0.059, 0.327]    [-0.108, 0.175]   [-0.512, 1.232]   [-0.16, 0.277]
Treatment Effect Estimators with θ_b = (1, 2)
           ATE(3)            mad.DTE           mad.QTE           mad.CHTE          CATE(3)           mad.CDTE          mad.CQTE          mad.CCHTE
Lower Bd.  -0.205            -0.124            -0.720            -0.196            -0.230            -0.124            -0.680            -0.174
           [-0.442, 0.032]   [-0.251, 0.127]   [-1.648, 1.138]   [-0.386, 0.19]    [-0.366, -0.095]  [-0.202, 0.078]   [-1.226, 0.606]   [-0.296, 0.122]
Upper Bd.  0.268             0.140             1.040             0.208             0.270             0.141             0.830             0.245
           [0.034, 0.502]    [-0.165, 0.274]   [-1.329, 2.439]   [-0.255, 0.429]   [0.134, 0.406]    [-0.091, 0.222]   [-0.451, 1.331]   [-0.152, 0.386]
(b)
Treatment Effect Estimators under Independent Censoring
           ATE(5)            mad.DTE           mad.QTE           mad.CHTE          CATE(5)           mad.CDTE          mad.CQTE          mad.CCHTE
           0.169             -0.064            2.860             -0.099            0.018             -0.061            -1.310            0.073
           [-0.26, 0.598]    [-0.211, 0.158]   [-6.271, 9.041]   [-0.332, 0.248]   [-0.227, 0.263]   [-0.151, 0.129]   [-4.084, 3.144]   [-0.228, 0.231]
Treatment Effect Estimators with θ_a = (1, 1.5)
           ATE(5)            mad.DTE           mad.QTE           mad.CHTE          CATE(5)           mad.CDTE          mad.CQTE          mad.CCHTE
Lower Bd.  -0.219            -0.122            -2.320            -0.211            -0.359            -0.100            -2.290            -0.133
           [-0.651, 0.213]   [-0.268, 0.146]   [-7.76, 5.69]     [-0.451, 0.24]    [-0.596, -0.123]  [-0.189, 0.088]   [-5.017, 2.877]   [-0.298, 0.164]
Upper Bd.  0.407             0.090             3.550             0.163             0.288             0.144             1.060             0.299
           [-0.027, 0.841]   [-0.189, 0.242]   [-4.023, 7.643]   [-0.317, 0.439]   [0.048, 0.529]    [-0.12, 0.234]    [-1.185, 2.285]   [-0.222, 0.486]
Treatment Effect Estimators with θ_b = (1, 2)
           ATE(5)            mad.DTE           mad.QTE           mad.CHTE          CATE(5)           mad.CDTE          mad.CQTE          mad.CCHTE
Lower Bd.  -0.458            -0.152            -3.220            -0.273            -0.570            -0.124            -2.830            -0.192
           [-0.877, -0.039]  [-0.296, 0.144]   [-8.706, 5.696]   [-0.524, 0.252]   [-0.815, -0.325]  [-0.21, 0.087]    [-5.504, 2.734]   [-0.356, 0.164]
Upper Bd.  0.537             0.157             4.090             0.323             0.427             0.199             1.430             0.442
           [0.108, 0.966]    [-0.183, 0.309]   [-3.954, 8.114]   [-0.355, 0.644]   [0.187, 0.668]    [0.107, 0.291]    [-1.097, 2.577]   [0.241, 0.644]
Notes: The results in Panels (a) and (b) are generated using the index sets U3 and U5, respectively. Estimates of treatment effects are displayed in the first row of each block. Except for the ATE and CATE, the value with the maximum absolute deviation (mad) from 0 over the index set is reported for each treatment effect. Numbers in square brackets represent the corresponding 95% bootstrap confidence intervals, based on 1,999 bootstrap replications. These are calculated following Algorithms 1.5.1 and 1.9.1 for the overall and conditional cases, respectively.
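The “mad” summary described in the notes can be sketched as follows: given a treatment effect estimate evaluated at each point of the index set, report the signed value that lies farthest from zero. The function name and the input values are hypothetical and serve only to illustrate the rule; they are not estimates from the study.

```python
def mad_from_zero(values):
    """Return the element with the largest absolute value, keeping its sign.

    If several elements tie in absolute value, the first one is returned.
    """
    return max(values, key=abs)


# Hypothetical effect estimates over a small index set.
effects = [0.010, -0.062, 0.030]
print(mad_from_zero(effects))  # -0.062
```

Reporting the signed value rather than its absolute value explains why negative entries can appear in the mad columns of Table 1.3.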
The findings are further validated by the overall and conditional DTE estimates illustrated in Figure 1.3. According to the uniform confidence bands of the (C)DTE under independent censoring, the newer protocol does not statistically significantly outperform the older one, even in the shorter post-treatment period. The introduction of dependent censoring does not alter this conclusion. That said, this conclusion is valid only under the maintained ranges of θ, which include the special case of non-informative censoring. In order to generalize this to other dependence patterns, we would need to perform robustness checks with corresponding levels of θ.
More comprehensive results are compiled in Table 1.3. Here, we not only report the “mean” values for (C)DTE as captured by (C)ATE, but also provide the values with the maximum absolute deviations from zero over their index sets for other types of TEBFs. Additionally, the table includes 95% uniform confidence sets corresponding to these reported TEBF estimates. Although Table 1.3 does not conclusively demonstrate treatment effect nullity across the
entire Gumbel copula family, it does suggest that there are no significant differences between the two protocols, on average and uniformly over the index sets, regardless of the type of policy effect under consideration. This observation aligns with the findings depicted in Figures 1.2 and 1.3. Furthermore, this conclusion remains valid across different correlation scenarios and analysis periods.
In sum, when we deviate from the conditionally independent censoring mechanism, we do not find sufficient evidence that AHOPCA ALL-2008 leads to more favorable early-year survival prospects.