• Tidak ada hasil yang ditemukan

Appendix: Estimation of natural indirect effects under unmeasured confounding and measurement error in the mediator

N/A
N/A
Protected

Academic year: 2024

Membagikan "Appendix: Estimation of natural indirect effects under unmeasured confounding and measurement error in the mediator"

Copied!
22
0
0

Teks penuh

(1)

Appendix: Estimation of natural indirect effects under unmeasured confounding and measurement error in the mediator

Alternate Figure 2

A M Y

W

C

(a) unmeasured confounding ofA-Y (vi- olation of M4)

A M Y

U C

(b) unmeasured confounding of M-Y (violation of M3)

A M Y

C U

W

(c) unmeasured confounding ofA-Y and M-Y (violation of M3 and M4)

A M Y

C U

W

M

(d) measurement error inM and unmea- sured confounding ofA-Y andM-Y

Figure A1: Directed Acyclic Graphs (DAGs) with indirect effects in red.

(2)

A1. Proofs for all results

A1.1 Result 1

Under DAG (a), the true NIE (i.e. W is observed) is given by,

N IEtruth(a, a)M1−M= 40 Z

c

Z

m

Z

w

E[Y|m, a, c, w](fM(m|a, c)−fM(m|a, c))fW(w|c)fC(c)dwdmdc

= Z

c

Z

m

(fM(m|a, c)−fM(m|a, c)) Z

w

E[Y|m, a, c, w]fW(w|c)dw

fC(c)dmdc

= Z

c

Z

m

(fM(m|a, c)−fM(m|a, c)) Z

w

(E[Y|m, a, c, w]−

E[Y|M = 0, a, c, w])fW(w|c)dw

fC(c)dmdc

M40(c)

= Z

c

Z

m

(fM(m|a, c)−fM(m|a, c)) Z

w

γ1(a, m, c)fW(w|c)dw

fC(c)dmdc

= Z

c

Z

m

γ1(a, m, c)(fM(m|a, c)−fM(m|a, c))fC(c)dmdc

The naive NIE which does not use information on W and incorrectly assumes M1-M4 holds is given by,

N IEnaive(a, a) = Z

c

Z

m

E[Y|M =m, A=a, C =c](fM(m|a, c)−fM(m|a, c))dmdc

= Z

c

Z

m

(E[Y|M =m, A=a, C=c]−E[Y|M = 0, A=a, C=c])(fM(m|a, c)−fM(m|a, c))dm

∗∗= Z

c

Z

m

γ1(a, m, c)(fM(m|a, c)−fM(m|a, c))dm

=N IEtruth(a, a)

**We want to show that γ1(a, m, c) can be identified without conditioning on W. To do so, we first recognize that the no additive interaction assumption encodes the following,

E[Y|M =m, A=a, C=c, W =w] =θA(a) +θM(m) +θC(c) +θW(w)

A,M(a, m) +θA,C(a, c) +θM,C(m, c) +θA,W(a, w) +θW,C(w, c) +θA,M,C(a, m, c) +θA,W,C(a, w, c)

so that

(3)

where θA(a), θM(m), and θW(w) are arbitrary functions of a, m,and w, respectively, with the property that θA(0) = θM(0) = θW(0) = 0. Likewise, θA,M(a, m) and θA,W(a, w) are arbitrary functions of interactions between A −M and A −W such that θA,M(0,0) = θA,M(0, m) = θA,M(a,0) = 0 and θA,W(0,0) = θA,W(0, w) = θA,W(w,0) = 0. This idea can be extended to the other terms encoding interactions.

The marginal model is given by,

E[Y|M =m, A=a, C=c] =E[E[Y|M =m, A=a, C=c, W =w]|A=a, M =m, C =c]

= Z

w

E[Y|M =m, A=a, C =c, W =w]fW(w|a, m, c)dw

= Z

w

E[Y|M =m, A=a, C =c, W =w]fW(w|a, c)dw

M40

= θA(a) +θM(m) +θC(c) +E[θW(w)|A=a, C=c]

A,M(a, m) +θA,C(a, c) +θM,C(m, c) +E[θA,W(a, w)|A=a, C=c]

+E[θW,C(w, c)|A=a, C=c] +θA,M,C(a, m, c) +E[θA,W,C(a, w, c)|A=a, C=c]

So that the marginal contrast (i.e. W not observed) is equal to,

E[Y|M =m, A=a, C=c]−E[Y|M = 0, A=a, C =c] =θM(m) +θA,M(a, m) +θM,C(m, c) +θA,M,C(a, m, c)

1(a, m, c)

A1.2 Result 2

Under DAG (a), the true RRN IE is (i.e. W is observed),

RRN IEtruth(a, a|c, w)M1−M4= 00 R

mP r[Y = 1|a, M =m, w, c]fM(m|a, c) dm R

mP r[Y = 1|a, M =m, w, c]fM(m|a, c) dm

= R

mP r[Y = 1|a, M =m, w, c]fM(m|a, c) dm R

mP r[Y = 1|a, M =m, w, c]fM(m|a, c) dm× P[Y = 1|a, M = 0, w, c]

P[Y = 1|a, M = 0, w, c]

(4)

The naive NIE which does not use information on W and incorrectly assumes M1-M4 holds is given by,

RRN IEnaive(a, a|c) = R

mP r[Y = 1|a, M =m, c]fM(m|a, c) dm R

mP r[Y = 1|a, M =m, c]fM(m|a, c) dm

= R

mP r[Y = 1|a, M =m, c]fM(m|a, c) dm R

mP r[Y = 1|a, M =m, c]fM(m|a, c) dm× P[Y = 1|a, M = 0, c]

P[Y = 1|a, M = 0, c]

= R

mP r[Y = 1|a, M =m, c]

P r[Y = 1|a, M = 0, c]fM(m|a, c)dm R

mP r[Y = 1|a, M =m, c]

P r[Y = 1|a, M = 0, c]fM(m|a, c) dm

∗∗= R

mγ2(a, m, c)fM(m|a, c) dm R

mγ2(a, m, c)fM(m|a, c) dm

=RRN IEtruth(a, a|c, w)

**We want to show that γ2(a, m, c) can be identified without conditioning on W. First note that M400 encodes a no M-W interaction assumption;

P r(Y = 1|M =m, A=a, C =c, W =w) = exp(θA(a) +θM(m) +θC(c) +θW(w)

A,M(a, m) +θA,C(a, c) +θM,C(m, c) +θA,W(a, w) +θW,C(w, c) +θA,M,C(a, m, c) +θA,W,C(a, w, c))

so that,

γ2(a, m) =P r[Y = 1|M =m, A=a, C=c, W =w]/P r[Y = 1|M = 0, A=a, C =c, W =w]

= exp(θM(m) +θA,M(a, m) +θM,C(m, c) +θA,M,C(a, m, c))

(5)

The marginal model is given by,

E[Y|M =m, A=a, C=c] =E[E[Y|M =m, A=a, C=c, W =w]|A=a, M =m, C =c]

= Z

w

E[Y|M =m, A=a, W =w, C =c]fW(w|a, m, c)dw

= Z

w

E[Y|M =m, A=a, W =w, C =c]fW(w|a, c)dw

= Z

w

P r[Y = 1|M =m, A=a, W =w, C =c]fW(w|a, c)dw

= Z

w

exp(θA(a) +θM(m) +θC(c) +θW(w) +θA,M(a, m) +θA,C(a, c) +θM,C(m, c) +θA,W(a, w) +θW,C(w, c) +θA,M,C(a, m, c) +θA,W,C(a, w, c))fW(w|a, c)dw

= exp(θA(a) +θM(m) +θC(c) +θA,M(a, m) +θA,C(a, c) +θM,C(m, c) +θA,M,C(a, m, c))

×E[exp(θW(w) +θA,W(a, w) +θW,C(w, c) +θA,W,C(a, w, c))|A=a, C =c]

So that the marginal ratio (i.e. W not observed) is equal to,

E[Y|M =m, A=a, C=c]/E[Y|M = 0, A=a, C=c] = exp(θM(m) +θA,M(a, m) +θM,C(m, c) +θA,M,C(a, m, c))

2(a, m, c)

A1.3 Result 3

M30(c) and (d) imply the following modeling assumptions:

E[Y|A, M, C, U] =θ0mM +θa(A) +θTc(C) +θu(U) (1) E[M|A, C, U] =β0aA+βc(C) +βu(U) (2)

Note that interactions between A, C and M, Care allowed. This is omitted from the above equations for simplicity, but the proof proceeds similarly.

(6)

Under DAG (b), the true NIE (i.e. U is observed) is given by,

N IEtruth(a, a)M1−M3

0

= Z

c,m,u

E[Y|m, a, c, u](fM(m|a, c, u)−fM(m|a, c, u))dmdfU(u|c)dfC(c)

= Z

m,u

E[Y|m, a, c, u](fM(m|a, c, u)−fM(m|a, c, u))dmdfU(u|c)dfC(c)

eq1= Z

c,m,u

(fM(m|a, c, u)−fM(m|a, c, u))(θmM +θa(A) +θTc(C) +θu(U))dmdfU(u|c)dfC(c)

= Z

c,u

E[M|a, c, u]−E[M|a, c, u]dfU(u|c)dfC(c)

eq2= θmβa(a−a)

Now, it suffices to show that we can find unbiased estimators for θm and βa. The former follows directly from GENIUS. Under the conditions given in assumption M30,

θm = E[{A−E[A|C]}{M −E[M|A, C]}Y] E[{A−E[A, C]}{M−E[M|A, C]}M]

= E[{A−E[A|C]}{M−E[M|A, C]}Y] var(A)[var(M|A= 1, C)−var(M|A= 0, C)]

The latter follows from the fact that A⊥U|C=c ∀c,

E[M|A=a, C=c] =E[E[M|A=a, C =c, U =u]|A=a, C=c]

eq2= Z

u

0aa+βc(c) +βu(U))f(u|a, c)

M30(b)

= Z

u

0aa+βc(c) +βu(U))f(u|c)

0aa+βc(c) +E[βu(U)|C =c]

0aa+βc(c)c

That is, βa can be consistently estimated by the OLS estimator, ˆβa from the model E[M|A = a, C = c] =β0aa+βc(c)c.

(7)

A1.4 Result 4

M300(f) implies the following new modeling assumption for the outcome model:

E[Y|A, M, C, U] =θ0mM +θa(A) +θTc(C) +θu(U) +θw(W) (3)

This proof follows similarly to that of Result 3. Under DAG (c), the true NIE (i.e. U,W are observed) is given by,

N IEtruth(a, a)M1−M3

00

= Z

m,u,w,c

E[Y|m, a, c, u, w](fM(m|a, c, u)−fM(m|a, c, u))dmdfW(w|c)dfU(u|c)dfC(c)

= Z

c,m,u

(fM(m|a, c, u)−fM(m|a, c, u))

× Z

w

E[Y|M =m, A=a, C=c, U=u, W =w]fW(w)dw

dmdfW(w|c)dfU(u|c)dfC(c)

= Z

c,m,u

(fM(m|a, c, u)−fM(m|a, c, u)) Z

w

(E[Y|M =m, A=a, C=c, U =u, W =w]− E[Y|M = 0, A=a, C =c, U =u, W =w])fW(w|c)dw

dmdfU(u|c)dfC(c)

eq3= θm

Z

m,u

m(fM(m|a, c, u)−fM(m|a, c, u))dmdfU(u|c)dfC(c)

eq2= θmβa(a−a)

Similar to the proof of Result 3, it suffices to show that we can find unbiased estimators for θm and βa.

(8)

For the latter, a slight update is made to the GENIUS proof,

E[(A−E(A|C))(M −E(M|A, C))Y] =E[E[(A−E[A|C])(M −E[M|A|C])Y|A, M, C, U, W]]

=E[(A−E[A|C]))(M −E[M|A|C])E[Y|A, M, C, U, W]]

eq3= E[(A−E[A|C]))(M−E[M|A, C]){θ0mM +θa(A) +θc(C) +θu(U) +θw(W)}]

=E[(A−E[A|C]))(M −E[M|A, C])θmM] (4) +E[(A−E[A|C]))(M−E[M|A, C])θa(A)] (5) +E[(A−E[A|C]))(M−E[M|A, C])θc(C)] (6) +E[(A−E[A|C]))(M−E[M|A, C])θu(U)] (7) +E[(A−E[A|C]))(M−E[M|A, C])θw(W)] (8)

=E[(A−E[A|C]))(M −E[M|A, C])θmM]

=⇒ θm= E[(A−E(A|C))(M −E(M|A, C))Y] E[(A−E[A|C]))(M −E[M|A, C])M]

The terms (4) - (7) are identical to the GENIUS proof. That is, (5)-(7) are identically zero. Here we show (8) is identically zero:

(4) =E[(A−E[A|C]))(M−E[M|A, C])θw(W)]

=E[E[(A−E[A|C]))(M−E[M|A, C])θw(W)|A, C]]

=E[(A−E[A|C])E[(M −E[M|A, C])θw(W)|A, C]]

=E[(A−E[A|C])E[M−E[M|A, C]|A, C]

| {z }

=0

E[θw(W)|A, C]

= 0

(9)

Thus, we recover the GENIUS identifying formula for θm. Forβa, the fact that A⊥U|C,

E[M|A=a, C=c] =E[E[M|A=a, C=c, U =u]|A=a]

eq2= Z

u

0aa+βc(c) +βu(U))f(u|a, c)

M3000(b)

= Z

u

0aa+βc(c) +βu(U))f(u|c)

0aa+βc(c) +E[βu(U)|C=c]

0aa+βc(c)

That is, βa can be consistently estimated by the OLS estimator, ˆβa from the model E[M|A = a] = β0aa+βc(c).

A1.5 Result 5

The proof follows from Result 4. Under DAG (d), the true NIE is given by,

N IEtruth(a, a) =θmβa(a−a)

where θm comes from the model in equation (3) andβa comes from the model in (2). It suffices to show that we can find unbiased estimators for θm and βa even in the presence of measurement error.

Recall the following from the proof in Result 4 it follows that,

E[(A−E(A|C))(M −E(M|A|C))Y] =E[(A−E(A|C))(M−E(M|A, C))βmM]. (9)

(10)

E[{A−E[A|C]}{M−E[M|A, C]}Y]

(E1−3)

= E[{A−E[A, C]}{M−E[M|A, C]}Y] +E[{A−E[A|C]}{−E[|A, C]}Y]

=E[{A−E[A|C]}{M−E[M|A, C]}Y] +E[{A−E[A|C]}Y]·E{−E[]}

=E[{A−E[A|C]}{M−E[M|A, C]}Y]

(9)=E[{A−E[A|C]}{M−E[M|A, C]}θmM]

=E[{A−E[A|C]}{M−E[M|A, C]}θmM] +E[{A−E[A|C]}θmM]·E{−E[]}

(E1−3)

= E[{A−E[A|C]}{M−E[M|A, C]}θmM]

=E[{A−E[A|C]}{M−E[M|A, C]}θmM]. Therefore

θm = E[{A−E[A|C]}{M−E[M|A, C]}Y] E[{A−E[A|C]}{M−E[M|A, C]}M]

= E[{A−E[A|C]}{M−E[M|A, C]}Y] var(A)[var(M|A= 1, C)−var(M|A= 0, C)]

We recover the MR GENIUS identifying formula for θm. For βa, the fact that A⊥U|C,

E[M|A=a, C =c] =E[E[M|A=a, C=c, U=u]|A=a, C =c]

E1= E[E[M|A=a, C=c, U =u]|A=a, C=c] +E[|A=a, C=c, U =u]

E2= E[E[M|A=a, C=c, U =u]|A=a, C=c] +E[]

E3= E[E[M|A=a, C=c, U =u]|A=a, C=c]

eq2= Z

u

0aa+βc(c) +βu(U))f(u|a, c)

M3000(b)

= Z

u

0aa+βc(c) +βu(U))f(u|c)

0aa+βc(c) +E[βu(U)|C=c]

0aa+βc(c)

That is, βa can be consistently estimated by the OLS estimator, ˆβa from the model E[M|A =a, C = c] =β0aa+βc(c).

(11)

A1.6 Result 6

We want to show,

E[(A−E(A|C))(M −E(M|A, C)){ Y exp(−θmM)

E[Y exp(−θmM)|A, C]−1}] = 0

E[(A−E(A|C))(M−E(M|A, C)) Y exp(−θmM)

E[Y exp(−θmM)|A, C]] (Term 1)

−E[(A−E(A|C))(M−E(M|A, C))] (Term 2)

Term 1 =E[E[(A−E(A|C))(M−E(M|A, C)) Y exp(−θmM)

E[Y exp(−θmM)|A, C]|A, C, M, W, U]]

=E[(A−E(A|C))(M−E(M|A, C))E[Y exp(−θmM)|A, C, W, U] E[Y exp(−θmM)|A, C] ]

=E[(A−E(A|C))(M−E(M|A, C)) exp(θa(A) +θc(C) +θu(U) +θw(W))

exp(θa(A) +θc(C))E[exp(θu(U))|C]E[exp(θw(W))|A, C]]

=E[(A−E(A|C))(M−E(M|A, C)) exp(θu(U) +θw(W))

E[exp(θu(U))|C]E[exp(θw(W))|A, C]]

=E[E[(A−E(A|C))(M−E(M|A, C)) exp(θu(U) +θw(W))

E[exp(θu(U))|C]E[exp(θw(W))|A, C]|A, C]]

=E[(A−E(A|C))E[exp(θu(U) +θw(W))(M−E(M|A, C))|A, C] 1

E[exp(θu(U))|C]E[exp(θw(W))|A, C]]

=E[(A−E(A|C))E[exp(θu(U))(M−E(M|A, C))|A, C] E[exp(θw(W))|A, C]

E[exp(θu(U))|C]E[exp(θw(W))|A, C]]

=E[(A−E(A|C))E[exp(θu(U))(M−E(M|A, C))|A, C] 1

E[exp(θu(U))|C]]

=E[(A−E(A|C))E[E[exp(θu(U))(M −E(M|A, C))|U, A, C]] 1

E[exp(θu(U))|C]]

=E[(A−E(A|C))E[exp(θu(U))(E[M|A, C, U]−E[M|A, C])] 1

E[exp(θu(U))|C]]

=E[(A−E(A|C))E[exp(θu(U))(βu(U)−E[βu(U)|A, C])] 1

E[exp(θu(U))|C]]

−E(A|C))

− )|C])] 1

(12)

A2. Other extensions and related literature

Mediation methods are well-developed for survival outcomes (Lange and Hansen, 2011; Tchetgen Tch- etgen, 2011; VanderWeele, 2011). Below, we show that the conditional natural indirect effect on the hazard difference scale remains identified under conditions analogous to those given in the prior sec- tions. Additionally, we establish that the conditional natural indirect effect on the hazard ratio scale is identified using a proportional hazards model provided the outcome remains rare throughout follow-up.

Therefore, existing statistical methods to estimate this quantity can directly be applied in the presence of uncontrolled exposure-outcome confounding under our conditions.

There are two closely related strands of work in previous literature on mediation methods robust to exposure-outcome uncontrolled confounding. Tchetgen Tchetgen and Phiri (2017) identified the NIE in the presence of unmeasured common cause of the exposure-outcome relation by replacing assumption M4 with an exclusion restriction, which states that the mediator potential outcome under no exposure is constant for all persons in the population. The exclusion restriction likely holds in the context of medication-mediated effects, where the exposure is typically disease status, mediator is medication taken for the disease, and outcome is an unintended effect of medication, so that healthy persons are usually excluded from receiving medication for the disease in question. In a separate strand of work, Fulcher et al. (2017) propose a new form of indirect effect, the population intervention indirect effect (PIIE).

This novel type of indirect effect captures the extent to which the effect of exposure is mediated by an intermediate variable under an intervention that sets the component of exposure directly influencing the outcome to its observed value. More formally,

P IIE(a, a) =E[Y(A, M(A))]−E[Y(M(A, a))] =E[Y]−E[Y(M(a))]

Unlike the NIE, the PIIE is nonparametrically identified under conditions M1-M3, therefore allowing for uncontrolled confounding of exposure-outcome relation without the need for an additional assumption such as no mediator-unmeasured confounding interaction. Additionally, unlikeE[Y(a, M(a)], the coun- terfactual mean E[Y(M(a))] is identified in the presence of exposure-outcome confounding. This result applies both conditional and unconditional on covariates.

A2.1 Hazard difference natural indirect effect proof (additive hazards model)

(13)

HDN IETa,Ma(t)−λT

a,Ma(t) Define the following condition,

M4000 P r(T > t|a, M =m, w, c)

P r(T > t|a, M = 0, w, c) = exp

− Z t

0

λT(u|a, m, w, c)−λT(u|a, m= 0, w, c)du =γ3(a, m, c, t)

equivalently,

M4000 λT(t|a, m, w, c)−λT(t|a, m= 0, w, c) =γ30(a, m, c, t) where

λT(t|a, m, w, c) =λ0(t) +λA(t)a+λM(t)m+λA,M(t)am

C(t)c+λA,C(t)ac+λA,C(t)mc+λA,M,C(t)amc +λW,C(t)wc+λW,A(t)wa+λW,A,C(t)wac

Note that M4000 encodes a no multiplicative interaction on the survival probability scale or a no additive interaction on the hazard difference scale, that is,

λT(t|a, m, c, w)−λT(t|a, m= 0, c, w) =γ30(a, m, c, t)

M(t)m+λA,M(t)am+λA,C(t)ac+λM,C(t)mc+λA,M,C(t)amc

We show that under assumptions M1-M3, M4000, and M5, the conditional HDN IE is identified. Under

(14)

Figure 2a, the true conditional HDN IE (i.e. W is observed) on the survival scale is as follows.

HDN IEtruth(a, a|c, w) = P r(Ta,Ma > t|C=c, W =w) P r(Ta,Ma > t|C=c, W =w)

= R

mP r(T > t|A=a, M =m, C =c, W =w)P r(M =m|A=a, C =c)dm R

mP r(T > t|A=a, M =m, C =c, W =w)P r(M =m|A=a, C=c)dm

= R

mexp(−Rt

0λT(t|a, m, c, w)du)P r(M =m|A=a, C=c)dm R

mexp(−Rt

0λT(t|a, m, c, w)du)P r(M =m|A=a, C =c)dm

= R

mexp(−Rt

0λT(u|a, m, c, w)du)P r(M =m|A=a, C =c)dm R

mexp(−Rt

0λT(u|a, m, c, w)du)P r(M =m|A=a, C =c)dm

exp(Rt

0 λT(u|a, m= 0, c, w)du) exp(Rt

0 λT(u|a, m= 0, c, w)du)

= R

mγ3(a, m, c, t)P r(M =m|A=a, C =c)dm R

mγ3(a, m, c, t)P r(M =m|A=a, C =c)dm

The naive HDN IE on the survival scale which does not use information onW is given by,

HDN IEnaive(a, a|c, w) = P r(Ta,Ma > t|C=c, W =w) P r(Ta,Ma > t|C=c)

= R

mP r(T > t|A=a, M =m, C =c)P r(M =m|A=a, C=c)dm R

mP r(T > t|A=a, M =m, C =c)P r(M =m|A=a, C =c)dm

= R

mexp(−Rt

0λT(t|a, m, c)du)P r(M =m|A=a, C=c)dm R

mexp(−Rt

0λT(t|a, m, c)du)P r(M =m|A=a, C =c)dm

= R

mexp(−Rt

0λT(u|a, m, c)du)P r(M =m|A=a, C =c)dm R

mexp(−Rt

0λT(u|a, m, c)du)P r(M =m|A=a, C =c)dm

exp(Rt

0 λT(u|a, m= 0, c)du) exp(Rt

0 λT(u|a, m= 0, c)du)

= R

mγ3(a, m, c, t)P r(M =m|A=a, C=c)dm R

mγ3(a, m, c, t)P r(M =m|A=a, C=c)dm

∗∗=HDN IEtruth(a, a|c, w)

**We want to show that γ3(a, m, c, t) can be identified without conditioning onW. The marginal model

(15)

is given by,

P r(T > t|A=a, M =m, C =c, W =w) = Z

w

P r(T > t|a, m, c, w)f(w|a, m, c)dw

= Z

w

P r(T > t|a, m, c, w)f(w|a, c)dw

= Z

w

exp

− Z t

0

λT(t|a, m, c, w)ds

f(w|a, c)dw

= Z

w

exp

− Z t

0

λ0(s) +λA(s)a+λM(s)m+λA,M(s)am +λC(s)c+λA,C(s)ac+λA,C(s)mc+λA,M,C(s)amc +λW,C(s)wc+λW,A(s)wa+λW,A(s)wacds

f(w|a, c)dw

= exp

− Z t

0

λ0(s) +λA(s)a+λM(s)m+λA,M(s)am +λC(s)c+λA,C(s)ac+λA,C(s)mc+λA,M,C(s)amcds

×E[exp

− Z t

0

λW,C(s)wc+λW,A(s)wa

W,A(s)wac ds |A=a, C=c]

So that the marginal survival probability ratio (i.e. W not observed) is equal to, P r(T > t|A=a, M =m, C =c, W =w)

P r(T > t|A=a, M = 0, C =c, W =w) =γ3(a, m, c, t) and the hazard difference natural indirect effect (i.e. W not observed) is equal to,

λT(t|a, m, c)−λT(t|a, m= 0, c) = d dt

log(P r(T > t|A=a, M =m, C =c))

−log(P r(T > t|A=a, M = 0, C =c))

30(a, m, c, t)

(16)

M5 exp(−Λt) = 1 (rare outcome)

Note that M40000 encodes a no multiplicative interaction on the hazard ratio scale, that is,

λT(t|a, m, c, w) =λ0(t) exp(θA(a) +θM(m) +θC(c) +θW(w)

A,M(a, m) +θA,C(a, c) +θM,C(m, c) +θA,W(a, w) +θW,C(w, c) +θA,M,C(a, m, c) +θA,W,C(a, w, c)) λT(t|a, m, c, w)/λT(t|a, m= 0, c, w) =γ4(a, m, c)

= exp(θM(m) +θA,M(a, m) +θM,C(m, c) +θA,M,C(a, m, c))

We show that under assumptions M1-M3, M40000, and M5, the conditionalHRN IE is identified.

Under Figure 2a, the true HRN IE is (i.e. W is observed),

HRN IEtruth(a, a|c) =λTaMa(t|a, m, c, w)/λT

aMa(t|a, m, c, w)

= fTaMa(t|a, m, c, w)ST

aMa(t|a, m, c, w) fT

aMa(t|a, m, c, w)STaMa(t|a, m, c, w)

= R

mfT(t|a, m, c, w)fM(m|a, c)dmR

mST(t|a, m, c, w)fM(m|a)dm R

mfT(t|a, m, c, w)fM(m|a, c)dmR

mST(t|a, m, c, w)fM(m|a, c) dm

= R

mλT(t|a, m, c, w)ST(t|a, m, c, w)fM(m|a, c)dmR

mST(t|a, m, c, w)fM(m|a, c)dm R

mλT(t|a, m, c, w)ST(t|a, m, c, w)fM(m|a, c)dmR

mST(t|a, m, c, w)fM(m|a, c)dm

= R

mλT(t|a, m, c, w)exp(−Λt)fM(m|a, c)R

mexp(−Λt)fM(m|a, c)dm R

mλT(t|a, m, c, w)exp(−Λt)fM(m|a, c)dmR

mexp(−Λt)fM(m|a, c)dm

M6

≈ R

mλT(t|a, m, c, w)fM(m|a, c)dmR

mfM(m|a, c)dm R

mλT(t|a, m, c, w)fM(m|a, c)dmR

mfM(m|a, c)dm

= R

mλT(t|a, m, c, w)fM(m|a, c)dm R

mλT(t|a, m, c, w)fM(m|a, c)dm

= R

mλT(t|a, m, c, w)fM(m|a, c)dm R

mλT(t|a, m, c, w)fM(m|a, c)dm

λT(t|a, m= 0, w) λT(t|a, m= 0, c, w)

= R

mγ4(a, m, c)fM(m|a, c)dm R

mγ4(a, m, c)fM(m|a, c)dm

(17)

The naive HRN IE which does not use information on W is given by,

HRnaiveN IE(a, a|c) =λTaMa(t)/λT

aMa(t)

= fTaMa(t|a, m, c)ST

aMa(t|a, m, c) fTaMa(t|a, m, c)STaMa(t|a, m, c)

= R

mfT(t|a, m, c)fM(m|a, c)dmR

mST(t|a, m, c)fM(m|a, c)dm R

mfT(t|a, m, c)fM(m|a, c)dmR

mST(t|a, m, c)fM(m|a, c)dm

= R

mλT(t|a, m, c)ST(t|a, m, c)fM(m|a, c)dmR

mST(t|a, m, c)fM(m|a, c)dm R

mλT(t|a, m, c)ST(t|a, m, c)fM(m|a, c)dmR

mST(t|a, m, c)fM(m|a, c)dm

= R

mλT(t|a, m, c)exp(−Λt)fM(m|a, c)dmR

mexp(−Λt)fM(m|a, c)dm R

mλT(t|a, m, c)exp(−Λt)fM(m|a, c)dmR

mexp(−Λt)fM(m|a, c)dm

M6

≈ R

mλT(t|a, m, c)fM(m|a, c)dmR

mfM(m|a, c)dm R

mλT(t|a, m, c)fM(m|a, c)dmR

mfM(m|a, c)dm

= R

mλT(t|a, m, c)fM(m|a, c)dm R

mλT(t|a, m, c)fM(m|a, c)dm

= R

mλT(t|a, m, c)fM(m|a, c)dm R

mλT(t|a, m, c)fM(m|a, c)dm

λT(t|a, m= 0, c) λT(t|a, m= 0, c)

∗∗= R

mγ4(a, m, c)fM(m|a, c)dm R

mγ4(a, m, c)fM(m|a, c)dm

=HRN IEtruth(a, a|c)

**We want to show thatγ4(a, m, c) can be identified without conditioning onW. The marginal model is given by,

λT(t|a, m, c)≈ft(t|a, m, c)

= Z

w

ft(t|a, m, w, c)f(w|a, m, c)dw

≈ Z

w

λT(t|a, m, w, c)f(w|a, m, c)dw

= Z

w

λT(t|a, m, w, c)f(w|a, c)dw

= Z

λ0exp θA(a) +θM(m) +θC(c) +θW(w)

(18)

So that the marginal ratio (i.e. W not observed) is equal to,

λT(t|a, m, c)/λT(t|a, m= 0, c) = exp(θM(m) +θA,M(a, m) +θM,C(m, c) +θA,M,C(a, m, c))

4(a, m, c)

A3. General Result 3 (incorporate interaction terms)

We have noted in the above proofs and main text that assumptions M30, M300, and M3000 can be made more general. We only show M30 and Result 3 here, but the same result applies,

M30. There exists an unmeasured variable U such that, (a)Y(a, m)⊥M(a)|C=c, U=u ∀a, a, c, u (b)A⊥U |C=c ∀ c

(c) There is no additiveM–(U, A) interaction in the model forE[Y|A, M, C, U]

E[Y|A, M =m, C, U]−E[Y|A, M = 0, C, U] =θmm+θmcmc and no additiveA–(U, C) interaction in the model forE[Y|A, M, C, U]

E[Y|A=a, M, C, U]−E[Y|A= 0, M, C, U] =θa(a) +θac(a, c)

for unknown functionsθa(.) that satisfies θa(0) = 0 andθac(.) that satisfiesθac(0, c) = 0.

(d) There is no additiveA–(U) interaction in the model forE[M|A, C, U]

E[M|A=a, C, U]−E[Y|A= 0, C, U] =βa(a) +βac(ac)

for unknown functionsβa(.) that satisfiesβa(0) = 0 andβac(.) that satisfies βac(0, c) = 0.

(e) var(M|A=a, C =c)−var(M|A=a0, C =c)6= 0 ∀ a, a0, c Under this more general assumption, Result 3 will become,

Result 3 Under assumptions M1, M2, and M30, the natural indirect effect is uniquely identified by

N IE(a, a) =θm

a(a)−βa(a)) +E[βac(a, C)−βac(a, C)]

(β (a)−β (a))E[C] +E[(β (a, c)−β (a, c))C]

(19)

Estimation of θm and θmcnow proceeds from Section 4.4 where equation (9) becomes:

0 =n−1

n

X

i=1

h(Ci)

Ai−E[A|Cˆ i] Mi−E[Mˆ |Ai, Ci] Yi−θˆmMi−θˆmcMiCi

A4. Asymptotic variance for NIE

For inference, the delta method or nonparametric bootstrap can be used to obtain standard errors and corresponding confidence interval for the natural indirect effect. Delta method estimate of standard error of N IE[ is given by the square root of,

V ar(d N IE) = ˆ[ θ2ms2ˆ

θm+ ˆβa2s2ˆ

βa

where sθˆ

m is the standard error for ˆθm obtained from the MR-GENIUS R package. And, sβˆ

a is the ro- bust sandwich estimator of standard error for ˆβa which appropriately accounts for heteroscedastic errors (White, 1980).

A5. Simulation study

A5.1 Simulation setup details

We simulate 2,000 data sets of sample size 1,000 under the following true data generating mechanism

• W ∼Normal(0,1) under DAG (a), (c), (d);

• U ∼Normal(0,1) under DAG (b), (c), (d);

• A|W ∼Bernoulli(p= expit(w));

• M|A, U ∼Normal(a+u,|0.5 + 0.5a|), while the observed mediator under DAG (d) isM =M+ where ∼Normal(0,1);

(20)

and variance of estimated NIE, the Monte Carlo mean of variance estimates, the proportion of bias, the mean squared error, and the coverage of the 95% confidence interval using both the delta method and bootstrap variance estimates.

A5.2 Simulation result under M-U and M-W interaction

Table A1: Operating characteristics of the standard OLS estimator, the GENIUS estimator, and the oracle estimator under directed acyclic graphs (DAGs) in Figure 2 (a)-(d), with additional M-U and M-W interaction in the outcome modelE(Y|A, M, C, U, W).

Figure 2

Method Bias (Var) Proportion Mean Mean Var 95% CI Bootstrap

DAG Bias (%) Squared error Estimate Coverage Coverage

AdditionalM-U interaction in the outcome model: violation ofM0 3 (c) (a)

Standard OLS -0.183 (0.013) -18.3% 0.046 0.010 0.534 0.555

GENIUS 0.410 (0.119) 41.0% 0.288 0.096 0.760 0.771

Oracle -0.006 (0.009) -0.6% 0.009 0.009 0.948 0.955

(c)

Standard OLS -0.156 (0.014) -15.6% 0.038 0.012 0.662 0.682

GENIUS 0.427 (0.139) 42.7% 0.321 0.114 0.802 0.811

Oracle 0.003 (0.010) 0.3% 0.010 0.009 0.950 0.960

(d)

Standard OLS -0.415 (0.011) -41.5% 0.184 0.009 0.028 0.044

GENIUS 0.474 (0.255) 47.4% 0.479 0.284 0.968 0.988

Oracle 0.001 (0.009) 0.1% 0.009 0.009 0.952 0.948

Additional M-W interaction in the outcome model: violation of M0 4 (c) (b)

Standard OLS 0.082 (0.013) 8.2% 0.020 0.011 0.870 0.888

GENIUS 0.484 (0.037) 48.4% 0.271 0.034 0.217 0.243

Oracle -0.006 (0.010) -0.6% 0.010 0.009 0.940 0.956

(c)

Standard OLS -0.304 (0.009) -30.4% 0.102 0.009 0.109 0.123

GENIUS 0.867 (0.165) 86.7% 0.916 0.188 0.325 0.349

Oracle 0.003 (0.010) 0.3% 0.010 0.009 0.948 0.959

(d)

Standard OLS -0.524 (0.008) -52.4% 0.282 0.006 0.000 0.000

GENIUS 0.932 (0.314) 93.2% 1.183 0.551 0.866 0.784

Oracle -0.000 (0.009) -0.0% 0.009 0.009 0.949 0.950

(21)

0.51.01.52.02.53.03.5

DAG (a) DAG (b) DAG (c) DAG (d)

Naive OLS GENIUS Oracle OLS

(a) AdditionalM-U interaction in the outcome modelE(Y|A, M, C, U, W): violation of M30 (c)

1234

DAG (a) DAG (b) DAG (c) DAG (d)

Naive OLS GENIUS Oracle OLS

(22)

References

I. R. Fulcher, I. Shpitser, S. Marealle, and E. J. Tchetgen Tchetgen. Robust inference on indirect causal effects. arXiv preprint arXiv:1711.03611, 2017.

T. Lange and J. V. Hansen. Direct and indirect effects in a survival context.Epidemiology, pages 575–581, 2011.

E. J. Tchetgen Tchetgen. On causal mediation analysis with a survival outcome. The international journal of biostatistics, 7(1):1–38, 2011.

E. J. Tchetgen Tchetgen and K. Phiri. Evaluation of medication-mediated effects in pharmacoepidemi- ology. Epidemiology (Cambridge, Mass.), 28(3):439, 2017.

T. J. VanderWeele. Causal mediation analysis with survival data. Epidemiology (Cambridge, Mass.), 22 (4):582, 2011.

H. White. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedas- ticity. Econometrica: Journal of the Econometric Society, pages 817–838, 1980.

Gambar

Figure A1: Directed Acyclic Graphs (DAGs) with indirect effects in red.
Table A1: Operating characteristics of the standard OLS estimator, the GENIUS estimator, and the oracle estimator under directed acyclic graphs (DAGs) in Figure 2 (a)-(d), with additional M -U and M -W interaction in the outcome model E(Y |A, M, C, U, W ).

Referensi

Dokumen terkait