Alternative Regression Methods for G-computation in Modeling 30-Day Mortality

Online Supplemental Material

Alternative regression methods for use with G-computation

In the primary analysis described in the paper, when using G-computation, we used restricted cubic smoothing splines to model the relationship between the log-odds of 30-day mortality and the patient’s age, the year the patient underwent surgery, and the hospital’s operative volume for that surgical procedure in the year in which the patient underwent surgery.
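As a rough illustration of how a restricted cubic spline enters a regression model, the sketch below builds a spline basis for a continuous covariate such as hospital volume. It assumes the truncated-power parameterization popularized by Harrell, and it assumes five knots at fixed percentiles; neither the parameterization nor the number or placement of knots is stated in this supplement. The function name `rcs_basis` and the example volumes are hypothetical.

```python
import numpy as np

def rcs_basis(x, knots):
    """Restricted cubic spline basis (assumed truncated-power form).

    Returns a matrix with k - 1 columns for k knots: the covariate itself
    plus k - 2 nonlinear terms. The fitted curve is constrained to be
    linear below the first knot and above the last knot.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(knots, dtype=float)
    k = len(t)
    if k < 3:
        raise ValueError("at least three knots are required")

    def pos_cubed(u):
        # Truncated cubic: (u)_+^3
        return np.maximum(u, 0.0) ** 3

    last_gap = t[-1] - t[-2]
    scale = (t[-1] - t[0]) ** 2  # keeps the nonlinear terms on the scale of x
    columns = [x]
    for j in range(k - 2):
        term = (
            pos_cubed(x - t[j])
            - pos_cubed(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cubed(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        columns.append(term / scale)
    return np.column_stack(columns)

# Hypothetical example: annual hospital volumes and five percentile-based knots
volume = np.linspace(1.0, 300.0, 60)
knots = np.percentile(volume, [5, 27.5, 50, 72.5, 95])
basis = rcs_basis(volume, knots)  # columns would enter the logistic model as covariates
```

The columns of `basis` would replace the single volume term in the logistic regression model, so the log-odds can bend between the knots while remaining linear in the tails.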

We also considered three other methods for estimating the logistic regression model that related the log-odds of 30-day mortality to patient characteristics and hospital volume. We examined the following three logistic regression models: (a) all four continuous variables (age, Charlson score, year of procedure, and hospital volume) were assumed to have a linear relationship with the log-odds of death; (b) multivariable fractional polynomials (MFPs) were used to model the relationship between each of the four continuous covariates and the log-odds of patient death [1,2], using the selection algorithm described by Royston and Sauerbrei to select the most appropriate FP transformation for each of these variables [3]; (c) generalized additive models (GAMs) were used in which local regression was used to estimate the smoothed relationships between each of the four continuous predictor variables and the log-odds of patient death [4]. Finally, for comparative purposes, we used boosted regression trees of depth four to predict the probability of 30-day death conditional on the five covariates [5-7]. We used sequences of 10,000 regression trees.
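The fractional polynomial transformations referred to in model (b) can be sketched as follows. The sketch assumes the conventional candidate power set, with a power of 0 denoting the natural logarithm and a repeated power in a second-degree FP adding a term multiplied by log(x). It shows only the transformations, not the Royston and Sauerbrei selection algorithm; the function names are hypothetical.

```python
import numpy as np

# Conventional candidate powers for fractional polynomials (0 means log x)
FP_POWERS = (-2, -1, -0.5, 0, 0.5, 1, 2, 3)

def fp1_transform(x, power):
    """First-degree FP transformation of a strictly positive covariate."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("fractional polynomials require positive values")
    return np.log(x) if power == 0 else x ** power

def fp2_transform(x, p1, p2):
    """Second-degree FP terms; a repeated power adds x^p * log(x)."""
    x = np.asarray(x, dtype=float)
    first = fp1_transform(x, p1)
    if p1 == p2:
        second = first * np.log(x)
    else:
        second = fp1_transform(x, p2)
    return np.column_stack([first, second])

# Hypothetical example: the reciprocal transformation of hospital volume
volume = np.array([1.0, 2.0, 4.0, 40.0])
reciprocal_volume = fp1_transform(volume, -1)
```

In a selection procedure, each candidate transformation would be fitted in turn and compared by deviance; that step is omitted here.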

When using three of the logistic regression models (linear relationships, restricted cubic smoothing splines, and MFPs), our approach can be described as parametric G-computation, since we are using a parametric regression model to predict the potential outcomes. When using the other two approaches (GAMs and boosted regression trees), our approach can be described as non-parametric G-computation, because a non-parametric method is used to predict the potential outcomes. We included an ensemble-based method because, in a recent study, ensemble-based G-computation based on boosted regression trees was found to perform well for estimating the effects of binary treatments on outcomes [8].
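A minimal sketch of parametric G-computation, under stated assumptions, is given below. It fits a logistic model that is linear in the log-odds (as in model (a)), predicts each patient's probability of death at the observed hospital volume and again under a counterfactual volume, and takes the difference in expected deaths. The data are simulated, the covariate scaling is arbitrary, and the regionalization scenario (every patient treated at a hospital with a volume of at least 50) is purely hypothetical; none of these come from the paper.

```python
import numpy as np

def fit_logistic(X, y, n_iter=50, tol=1e-10):
    """Maximum-likelihood logistic regression via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ beta)))
        weights = p * (1.0 - p)
        gradient = X.T @ (y - p)
        hessian = X.T @ (X * weights[:, None])
        step = np.linalg.solve(hessian, gradient)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

def design(age, volume):
    """Intercept, age, and volume (rescaled for numerical stability)."""
    return np.column_stack([np.ones(len(age)), (age - 70.0) / 10.0, volume / 100.0])

def expected_deaths(beta, age, volume):
    """Sum of predicted death probabilities with volume set as given."""
    p = 1.0 / (1.0 + np.exp(-(design(age, volume) @ beta)))
    return p.sum()

# Simulated cohort (illustrative only)
rng = np.random.default_rng(0)
n = 20000
age = rng.normal(70.0, 10.0, n)
volume = rng.integers(1, 101, n).astype(float)
true_logit = -2.5 + 0.3 * (age - 70.0) / 10.0 - 1.5 * volume / 100.0
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

beta = fit_logistic(design(age, volume), y)

# Potential outcomes: observed volumes versus the hypothetical scenario
regional_volume = np.maximum(volume, 50.0)
deaths_observed = expected_deaths(beta, age, volume)
deaths_regional = expected_deaths(beta, age, regional_volume)
lives_saved = deaths_observed - deaths_regional
```

Replacing `design` with a spline or fractional polynomial basis would give the other parametric variants; a confidence interval would typically require a resampling step such as the bootstrap, which is not shown.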

Results when using alternative regression methods with G-computation

The estimated average annual numbers of lives saved due to regionalization of surgical services are reported in the Table. The Figure shows, for each regression model, the relationship between hospital surgical volume and the log-odds of 30-day mortality for a patient whose covariates (age, sex, Charlson score, and year of operation) were set to the sample median. There is one panel for each of the three procedures. For resection of the colon or rectum and for pancreaticoduodenectomy, the multivariable fractional polynomial (MFP) analysis found that the volume-outcome relationship was linear, whereas for esophagectomy, the log-odds of 30-day mortality were linearly related to the reciprocal of hospital volume. Note that in the panel for pancreaticoduodenectomy, the line for the linear relationship and the line for the MFP model are superimposed on one another. For pancreaticoduodenectomy, the restricted cubic smoothing splines found, as did the MFP analysis, that the relationship was approximately linear across the range of volume. For esophagectomy, however, both the MFP analysis and the restricted cubic spline analysis found that mortality initially decreased rapidly with increasing volume; after this initial period of rapid decrease, further increases in volume yielded increasingly marginal improvements in patient outcomes. The generalized additive models found a ‘bumpy’ relationship between hospital volume and death, which is counterintuitive: a priori, one would expect a smooth, non-increasing relationship between volume and the risk of death. Finally, boosted regression trees described a non-smooth relationship between hospital volume and mortality.

An advantage of fractional polynomials in this setting is that they allow for transformations of the variable denoting hospital volume under which the relationship between volume and the log-odds of death approaches a horizontal asymptote. While ensemble-based methods may offer advantages in high-dimensional data, it is not clear that they offer these advantages in low-dimensional data, particularly in settings in which one would expect smooth relationships between the principal covariate of interest and the outcome. For these reasons, in data such as ours, we suspect that restricted cubic smoothing splines and fractional polynomials offer the greatest advantages for modeling the volume-outcome relationship.
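The horizontal asymptote can be made explicit for the reciprocal transformation, the form reported above for esophagectomy. The notation below is assumed for illustration: V denotes hospital volume, X the remaining covariates, and β0, β1, and γ the regression coefficients; none of these symbols appear in the original supplement.

```latex
\begin{align*}
\operatorname{logit}\,\Pr(\text{death} \mid V, X)
  &= \beta_0 + \beta_1 V^{-1} + \gamma^{\top} X, \\
\frac{\partial}{\partial V}\,\operatorname{logit}\,\Pr(\text{death} \mid V, X)
  &= -\beta_1 V^{-2}, \\
\lim_{V \to \infty} \operatorname{logit}\,\Pr(\text{death} \mid V, X)
  &= \beta_0 + \gamma^{\top} X .
\end{align*}
```

With β1 > 0, the log-odds of death fall steeply at low volumes, the slope shrinks in proportion to 1/V², and the curve levels off at β0 + γᵀX, which matches the pattern of rapid initial decrease followed by increasingly marginal gains.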

Reference List

1. Royston P, Altman DG. Regression using fractional polynomials of continuous covariates: parsimonious parametric modelling (with discussion). Applied Statistics 1994; 43(3):429-467.

2. Royston P, Altman DG. Approximating statistical functions by using fractional polynomial regression. The Statistician 1997; 46:411-422.

3. Royston P, Sauerbrei W. Multivariable Model-Building. John Wiley & Sons, Ltd: West Sussex, 2008.

4. Hastie TJ, Tibshirani RJ. Generalized Additive Models. Chapman & Hall: London, 1990.

5. Freund Y, Schapire R. Experiments with a new boosting algorithm. In: Machine Learning: Proceedings of the Thirteenth International Conference. Morgan Kaufmann: San Francisco, California, 1996; pp 148-156.

6. Bühlmann P, Hothorn T. Boosting algorithms: Regularization, prediction and model fitting. Statistical Science 2007; 22:477-505.

7. Hastie T, Tibshirani R, Friedman J. The Elements of Statistical Learning. Data Mining, Inference, and Prediction. Springer-Verlag: New York, NY, 2001.

8. Austin PC. Using ensemble-based methods for directly estimating causal effects: An investigation of tree-based G-computation. Multivariate Behavioral Research 2012; 47(1):115-135. doi:10.1080/00273171.2012.640600


Table. Average annual number of lives saved due to regionalization or concentration, by surgical procedure.

| Relationship between hospital volume, patient characteristics, and the log-odds of 30-day mortality | Resection of colon or rectum | Esophagectomy | Pancreaticoduodenectomy |
|---|---|---|---|
| Linear relationship | 14.4 (7.3, 21.6) | 1.6 (0.3, 2.9) | 3.7 (1.9, 5.5) |
| Restricted cubic smoothing splines | 20.2 (12.3, 28.2) | 2.0 (0.7, 3.2) | 3.6 (1.8, 5.4) |
| Fractional polynomials | 14.7 (4.0, 25.4) | 1.6 (0.4, 2.8) | 3.7 (1.8, 5.6) |
| Generalized additive models (GAMs) | 17.5 (7.0, 28.0) | 1.9 (0.4, 3.3) | 3.4 (1.5, 5.2) |
| Boosted regression trees of depth four | 12.0 (5.3, 18.8) | 1.5 (0, 2.9) | 3.1 (1.4, 4.7) |

Each cell reports the expected annual number of lives saved due to regionalization (95% confidence interval).

Figure. Relationship between surgical volume and 30-day mortality.

[Figure not reproduced. Three panels: Resection of colon or rectum; Esophagectomy; Pancreaticoduodenectomy. Horizontal axis: hospital surgical volume. Vertical axis: log-odds of 30-day death. Lines: restricted cubic smoothing splines, fractional polynomials, linear relationship, generalized additive model, boosted regression trees.]
