Appendix: Estimation of natural indirect effects under unmeasured confounding and measurement error in the mediator


Alternate Figure 2

Figure A1: Directed Acyclic Graphs (DAGs) with indirect effects in red. Panels: (a) unmeasured confounding of A-Y (violation of M4), with nodes A, M, Y, W, C; (b) unmeasured confounding of M-Y (violation of M3), with nodes A, M, Y, U, C; (c) unmeasured confounding of A-Y and M-Y (violation of M3 and M4), with nodes A, M, Y, C, U, W; (d) measurement error in M and unmeasured confounding of A-Y and M-Y, with nodes A, M, Y, C, U, W, M∗.


A1. Proofs for all results

A1.1 Result 1

Under DAG (a), the true NIE (i.e. W is observed) is given by,

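\[
\begin{aligned}
\mathrm{NIE}_{\text{truth}}(a,a^*) &\overset{M1\text{-}M4'}{=} \int_c\int_m\int_w E[Y\mid m,a,c,w]\,\big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\,f_W(w\mid c)\,f_C(c)\,dw\,dm\,dc\\
&= \int_c\int_m \big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\left\{\int_w E[Y\mid m,a,c,w]\,f_W(w\mid c)\,dw\right\} f_C(c)\,dm\,dc\\
&= \int_c\int_m \big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\left\{\int_w \big(E[Y\mid m,a,c,w]-E[Y\mid M=0,a,c,w]\big)\,f_W(w\mid c)\,dw\right\} f_C(c)\,dm\,dc\\
&\overset{M4'(c)}{=} \int_c\int_m \big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\left\{\int_w \gamma_1(a,m,c)\,f_W(w\mid c)\,dw\right\} f_C(c)\,dm\,dc\\
&= \int_c\int_m \gamma_1(a,m,c)\,\big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\,f_C(c)\,dm\,dc
\end{aligned}
\]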

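The naive NIE, which does not use information on W and incorrectly assumes that M1-M4 hold, is given by,

\[
\begin{aligned}
\mathrm{NIE}_{\text{naive}}(a,a^*) &= \int_c\int_m E[Y\mid M=m,A=a,C=c]\,\big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\,f_C(c)\,dm\,dc\\
&= \int_c\int_m \big(E[Y\mid M=m,A=a,C=c]-E[Y\mid M=0,A=a,C=c]\big)\big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\,f_C(c)\,dm\,dc\\
&\overset{**}{=} \int_c\int_m \gamma_1(a,m,c)\,\big(f_M(m\mid a,c)-f_M(m\mid a^*,c)\big)\,f_C(c)\,dm\,dc\\
&= \mathrm{NIE}_{\text{truth}}(a,a^*)
\end{aligned}
\]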

(**) We want to show that γ₁(a, m, c) can be identified without conditioning on W. To do so, we first recognize that the no additive interaction assumption encodes the following,

\[
\begin{aligned}
E[Y\mid M=m,A=a,C=c,W=w] ={}& \theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_W(w)\\
&+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+\theta_{A,W}(a,w)+\theta_{W,C}(w,c)\\
&+\theta_{A,M,C}(a,m,c)+\theta_{A,W,C}(a,w,c)
\end{aligned}
\]

so that

\[
\begin{aligned}
\gamma_1(a,m,c) &= E[Y\mid M=m,A=a,C=c,W=w]-E[Y\mid M=0,A=a,C=c,W=w]\\
&= \theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)
\end{aligned}
\]

where θ_A(a), θ_M(m), and θ_W(w) are arbitrary functions of a, m, and w, respectively, with the property that θ_A(0) = θ_M(0) = θ_W(0) = 0. Likewise, θ_A,M(a, m) and θ_A,W(a, w) are arbitrary functions encoding the A-M and A-W interactions such that θ_A,M(0, 0) = θ_A,M(0, m) = θ_A,M(a, 0) = 0 and θ_A,W(0, 0) = θ_A,W(0, w) = θ_A,W(a, 0) = 0. The same convention extends to the other interaction terms.

The marginal model is given by,

\[
\begin{aligned}
E[Y\mid M=m,A=a,C=c] &= E\big[E[Y\mid M=m,A=a,C=c,W]\,\big|\,A=a,M=m,C=c\big]\\
&= \int_w E[Y\mid M=m,A=a,C=c,W=w]\,f_W(w\mid a,m,c)\,dw\\
&= \int_w E[Y\mid M=m,A=a,C=c,W=w]\,f_W(w\mid a,c)\,dw\\
&\overset{M4'}{=} \theta_A(a)+\theta_M(m)+\theta_C(c)+E[\theta_W(W)\mid A=a,C=c]\\
&\quad+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+E[\theta_{A,W}(a,W)\mid A=a,C=c]\\
&\quad+E[\theta_{W,C}(W,c)\mid A=a,C=c]+\theta_{A,M,C}(a,m,c)+E[\theta_{A,W,C}(a,W,c)\mid A=a,C=c]
\end{aligned}
\]

So that the marginal contrast (i.e. W not observed) is equal to,

\[
\begin{aligned}
E[Y\mid M=m,A=a,C=c]-E[Y\mid M=0,A=a,C=c] &= \theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\\
&= \gamma_1(a,m,c)
\end{aligned}
\]
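The collapsibility argument above can be checked numerically. The following is a minimal sketch, assuming a binary W, no covariates C, and hypothetical coefficient values chosen only for illustration; the outcome model contains an A-W interaction but no M-W interaction, so the marginal contrast in m should reproduce γ₁ exactly.

```python
# Hypothetical outcome model with an A-W interaction but no M-W interaction.
# All coefficient values are illustrative assumptions, not taken from the paper.
def outcome_mean(m, a, w):
    return 0.5 * a + 1.2 * m + 0.8 * w + 0.3 * a * m + 0.7 * a * w

def prob_w1_given_a(a):
    # P(W = 1 | A = a); W depends on A (A-Y confounding) but not on M given A.
    return 0.3 + 0.4 * a

def marginal_mean(m, a):
    # E[Y | m, a] = sum_w E[Y | m, a, w] P(W = w | a)
    p1 = prob_w1_given_a(a)
    return (1.0 - p1) * outcome_mean(m, a, 0) + p1 * outcome_mean(m, a, 1)

def gamma1(m, a):
    # Conditional contrast E[Y | m, a, w] - E[Y | 0, a, w]; free of w by construction.
    return 1.2 * m + 0.3 * a * m

max_gap = max(
    abs(marginal_mean(m, a) - marginal_mean(0.0, a) - gamma1(m, a))
    for a in (0, 1)
    for m in (0.5, 1.0, 2.5)
)
print(f"largest gap between marginal contrast and gamma1: {max_gap:.2e}")
```

Adding an M-W product term to `outcome_mean` breaks the equality, which is the situation the no-interaction assumption rules out.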

A1.2 Result 2

Under DAG (a), the true RR^{NIE} (i.e. W is observed) is,

\[
\begin{aligned}
RR^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c,w) &\overset{M1\text{-}M4''}{=} \frac{\int_m \Pr[Y=1\mid a,M=m,w,c]\,f_M(m\mid a,c)\,dm}{\int_m \Pr[Y=1\mid a,M=m,w,c]\,f_M(m\mid a^*,c)\,dm}\\
&= \frac{\int_m \Pr[Y=1\mid a,M=m,w,c]\,f_M(m\mid a,c)\,dm}{\int_m \Pr[Y=1\mid a,M=m,w,c]\,f_M(m\mid a^*,c)\,dm}\times\frac{\Pr[Y=1\mid a,M=0,w,c]}{\Pr[Y=1\mid a,M=0,w,c]}
\end{aligned}
\]

The naive RR^{NIE}, which does not use information on W and incorrectly assumes that M1-M4 hold, is given by,

\[
\begin{aligned}
RR^{\mathrm{NIE}}_{\text{naive}}(a,a^*\mid c) &= \frac{\int_m \Pr[Y=1\mid a,M=m,c]\,f_M(m\mid a,c)\,dm}{\int_m \Pr[Y=1\mid a,M=m,c]\,f_M(m\mid a^*,c)\,dm}\\
&= \frac{\int_m \Pr[Y=1\mid a,M=m,c]\,f_M(m\mid a,c)\,dm}{\int_m \Pr[Y=1\mid a,M=m,c]\,f_M(m\mid a^*,c)\,dm}\times\frac{\Pr[Y=1\mid a,M=0,c]}{\Pr[Y=1\mid a,M=0,c]}\\
&= \frac{\int_m \dfrac{\Pr[Y=1\mid a,M=m,c]}{\Pr[Y=1\mid a,M=0,c]}\,f_M(m\mid a,c)\,dm}{\int_m \dfrac{\Pr[Y=1\mid a,M=m,c]}{\Pr[Y=1\mid a,M=0,c]}\,f_M(m\mid a^*,c)\,dm}\\
&\overset{**}{=} \frac{\int_m \gamma_2(a,m,c)\,f_M(m\mid a,c)\,dm}{\int_m \gamma_2(a,m,c)\,f_M(m\mid a^*,c)\,dm}\\
&= RR^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c,w)
\end{aligned}
\]

(**) We want to show that γ₂(a, m, c) can be identified without conditioning on W. First note that M4′′ encodes a no M-W interaction assumption:

\[
\begin{aligned}
\Pr(Y=1\mid M=m,A=a,C=c,W=w) = \exp\big(&\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_W(w)\\
&+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+\theta_{A,W}(a,w)+\theta_{W,C}(w,c)\\
&+\theta_{A,M,C}(a,m,c)+\theta_{A,W,C}(a,w,c)\big)
\end{aligned}
\]

so that,

\[
\begin{aligned}
\gamma_2(a,m,c) &= \frac{\Pr[Y=1\mid M=m,A=a,C=c,W=w]}{\Pr[Y=1\mid M=0,A=a,C=c,W=w]}\\
&= \exp\big(\theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)
\end{aligned}
\]

The marginal model is given by,

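\[
\begin{aligned}
E[Y\mid M=m,A=a,C=c] &= E\big[E[Y\mid M=m,A=a,C=c,W]\,\big|\,A=a,M=m,C=c\big]\\
&= \int_w E[Y\mid M=m,A=a,W=w,C=c]\,f_W(w\mid a,m,c)\,dw\\
&= \int_w E[Y\mid M=m,A=a,W=w,C=c]\,f_W(w\mid a,c)\,dw\\
&= \int_w \Pr[Y=1\mid M=m,A=a,W=w,C=c]\,f_W(w\mid a,c)\,dw\\
&= \int_w \exp\big(\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_W(w)+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)\\
&\qquad\qquad+\theta_{A,W}(a,w)+\theta_{W,C}(w,c)+\theta_{A,M,C}(a,m,c)+\theta_{A,W,C}(a,w,c)\big)\,f_W(w\mid a,c)\,dw\\
&= \exp\big(\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)\\
&\quad\times E\big[\exp\big(\theta_W(W)+\theta_{A,W}(a,W)+\theta_{W,C}(W,c)+\theta_{A,W,C}(a,W,c)\big)\,\big|\,A=a,C=c\big]
\end{aligned}
\]

So that the marginal ratio (i.e. W not observed) is equal to,

\[
\begin{aligned}
\frac{E[Y\mid M=m,A=a,C=c]}{E[Y\mid M=0,A=a,C=c]} &= \exp\big(\theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)\\
&= \gamma_2(a,m,c)
\end{aligned}
\]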
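The same cancellation can be illustrated on the risk ratio scale. This is a small sketch under assumed values: W is binary, C is omitted, and the log-linear coefficients are hypothetical. Because the exponent has no M-W product term, the W-dependent factor is common to numerator and denominator and drops out of the marginal ratio.

```python
import math

# Hypothetical log-linear risk model with an A-W interaction but no M-W interaction.
def risk(m, a, w):
    return math.exp(-4.0 + 0.4 * a + 0.5 * m + 0.6 * w + 0.2 * a * m + 0.3 * a * w)

def prob_w1_given_a(a):
    # P(W = 1 | A = a); assumed not to depend on M given A.
    return 0.2 + 0.5 * a

def marginal_risk(m, a):
    # Pr[Y = 1 | m, a] = sum_w Pr[Y = 1 | m, a, w] P(W = w | a)
    p1 = prob_w1_given_a(a)
    return (1.0 - p1) * risk(m, a, 0) + p1 * risk(m, a, 1)

def gamma2(m, a):
    # Conditional risk ratio for M = m versus M = 0; free of w.
    return math.exp(0.5 * m + 0.2 * a * m)

ratios_match = all(
    math.isclose(marginal_risk(m, a) / marginal_risk(0.0, a), gamma2(m, a), rel_tol=1e-12)
    for a in (0, 1)
    for m in (0.5, 1.0, 2.0)
)
print("marginal risk ratio equals gamma2:", ratios_match)
```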

A1.3 Result 3

M3′(c) and (d) imply the following modeling assumptions:

\[
\begin{aligned}
E[Y\mid A,M,C,U] &= \theta_0+\theta_m M+\theta_a(A)+\theta_c^T(C)+\theta_u(U) &&(1)\\
E[M\mid A,C,U] &= \beta_0+\beta_a A+\beta_c(C)+\beta_u(U) &&(2)
\end{aligned}
\]

Note that A-C and M-C interactions are allowed. They are omitted from the above equations for simplicity, but the proof proceeds similarly.

Under DAG (b), the true NIE (i.e. U is observed) is given by,

\[
\begin{aligned}
\mathrm{NIE}_{\text{truth}}(a,a^*) &\overset{M1\text{-}M3'}{=} \int_{c,m,u} E[Y\mid m,a,c,u]\,\big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\,dm\,df_U(u\mid c)\,df_C(c)\\
&\overset{\text{eq1}}{=} \int_{c,m,u} \big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\big(\theta_m m+\theta_a(a)+\theta_c^T(c)+\theta_u(u)\big)\,dm\,df_U(u\mid c)\,df_C(c)\\
&= \theta_m\int_{c,u} \big(E[M\mid a,c,u]-E[M\mid a^*,c,u]\big)\,df_U(u\mid c)\,df_C(c)\\
&\overset{\text{eq2}}{=} \theta_m\beta_a(a-a^*)
\end{aligned}
\]

Now, it suffices to show that we can find unbiased estimators for θ_m and β_a. The former follows directly from GENIUS. Under the conditions given in assumption M3′,

\[
\begin{aligned}
\theta_m &= \frac{E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}Y]}{E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}M]}\\
&= \frac{E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}Y]}{\mathrm{var}(A)\,[\mathrm{var}(M\mid A=1,C)-\mathrm{var}(M\mid A=0,C)]}
\end{aligned}
\]

The latter follows from the fact that A⊥U|C=c ∀c,

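\[
\begin{aligned}
E[M\mid A=a,C=c] &= E\big[E[M\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]\\
&\overset{\text{eq2}}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid a,c)\,du\\
&\overset{M3'(b)}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid c)\,du\\
&= \beta_0+\beta_a a+\beta_c(c)+E[\beta_u(U)\mid C=c]\\
&= \beta_0^*+\beta_a a+\beta_c^*(c)
\end{aligned}
\]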

That is, β_a can be consistently estimated by the OLS estimator β̂_a from the model E[M|A = a, C = c] = β₀∗ + β_a a + β_c∗(c).
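As an illustration of Result 3, the product of the ratio formula for θ_m and the OLS slope β̂_a can be computed on simulated data. The sketch below is not the MR-GENIUS package; it assumes a binary exposure, no covariates C, a data generating process chosen for illustration (A independent of U, residual spread of M depending on A, all coefficients equal to 1 so the true NIE is 1), and hypothetical helper names `genius_theta_m` and `ols_coefs`.

```python
import numpy as np

def genius_theta_m(a, m, y):
    """Sample analogue of the ratio formula for theta_m (binary A, no covariates C)."""
    a_res = a - a.mean()                                   # A - E[A]
    m_fit = np.where(a == 1, m[a == 1].mean(), m[a == 0].mean())
    m_res = m - m_fit                                      # M - E[M | A]
    return np.mean(a_res * m_res * y) / np.mean(a_res * m_res * m)

def ols_coefs(columns, y):
    """OLS coefficients of y on an intercept plus the given columns."""
    design = np.column_stack([np.ones(len(y))] + list(columns))
    return np.linalg.lstsq(design, y, rcond=None)[0]

rng = np.random.default_rng(0)
n = 200_000
u = rng.normal(size=n)                                     # unmeasured M-Y confounder
a = rng.binomial(1, 0.5, size=n)                           # exposure, independent of U
m = a + u + rng.normal(scale=0.5 + 0.5 * a)                # heteroscedastic in A (assumed sd)
y = 1.0 * m + 1.0 * a + 1.0 * u + rng.normal(size=n)       # theta_m = 1, no M-U interaction

beta_a_hat = ols_coefs([a], m)[1]                          # OLS slope of M on A
nie_genius = genius_theta_m(a, m, y) * beta_a_hat
nie_naive = ols_coefs([m, a], y)[1] * beta_a_hat           # ignores U
print(f"GENIUS-type NIE: {nie_genius:.3f}   naive OLS NIE: {nie_naive:.3f}")
```

In this setup the ratio estimator recovers a value close to the true NIE of 1, while the regression of Y on M and A that ignores U is biased upward. The ratio is only defined because var(M | A) differs between exposure levels.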


A1.4 Result 4

M3′′(f) implies the following new modeling assumption for the outcome model:

\[
E[Y\mid A,M,C,U,W] = \theta_0+\theta_m M+\theta_a(A)+\theta_c^T(C)+\theta_u(U)+\theta_w(W) \qquad (3)
\]

This proof follows similarly to that of Result 3. Under DAG (c), the true NIE (i.e. U,W are observed) is given by,

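\[
\begin{aligned}
\mathrm{NIE}_{\text{truth}}(a,a^*) &\overset{M1\text{-}M3''}{=} \int_{m,u,w,c} E[Y\mid m,a,c,u,w]\,\big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\,dm\,df_W(w\mid c)\,df_U(u\mid c)\,df_C(c)\\
&= \int_{c,m,u} \big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\left\{\int_w E[Y\mid M=m,A=a,C=c,U=u,W=w]\,f_W(w\mid c)\,dw\right\}dm\,df_U(u\mid c)\,df_C(c)\\
&= \int_{c,m,u} \big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\left\{\int_w \big(E[Y\mid M=m,A=a,C=c,U=u,W=w]\right.\\
&\qquad\qquad\left.-\,E[Y\mid M=0,A=a,C=c,U=u,W=w]\big)\,f_W(w\mid c)\,dw\right\}dm\,df_U(u\mid c)\,df_C(c)\\
&\overset{\text{eq3}}{=} \theta_m\int_{c,m,u} m\,\big(f_M(m\mid a,c,u)-f_M(m\mid a^*,c,u)\big)\,dm\,df_U(u\mid c)\,df_C(c)\\
&\overset{\text{eq2}}{=} \theta_m\beta_a(a-a^*)
\end{aligned}
\]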

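Similar to the proof of Result 3, it suffices to show that we can find unbiased estimators for θ_m and β_a.

For the former, a slight update is made to the GENIUS proof,

\[
\begin{aligned}
E[(A-E[A\mid C])(M-E[M\mid A,C])Y] &= E\big[(A-E[A\mid C])(M-E[M\mid A,C])\,E[Y\mid A,M,C,U,W]\big]\\
&\overset{\text{eq3}}{=} E\big[(A-E[A\mid C])(M-E[M\mid A,C])\{\theta_0+\theta_m M+\theta_a(A)+\theta_c(C)+\theta_u(U)+\theta_w(W)\}\big]\\
&= E\big[(A-E[A\mid C])(M-E[M\mid A,C])\,\theta_m M\big]\\
\implies \theta_m &= \frac{E[(A-E[A\mid C])(M-E[M\mid A,C])Y]}{E[(A-E[A\mid C])(M-E[M\mid A,C])M]}
\end{aligned}
\]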

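The terms (4)-(7) are identical to the GENIUS proof. That is, (5)-(7) are identically zero. Here we show that (8) is identically zero:

\[
\begin{aligned}
(8) &= E\big[(A-E[A\mid C])(M-E[M\mid A,C])\,\theta_w(W)\big]\\
&= E\big[E[(A-E[A\mid C])(M-E[M\mid A,C])\,\theta_w(W)\mid A,C]\big]\\
&= E\big[(A-E[A\mid C])\,E[(M-E[M\mid A,C])\,\theta_w(W)\mid A,C]\big]\\
&= E\big[(A-E[A\mid C])\,\underbrace{E[M-E[M\mid A,C]\mid A,C]}_{=0}\,E[\theta_w(W)\mid A,C]\big]\\
&= 0
\end{aligned}
\]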

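Thus, we recover the GENIUS identifying formula for θ_m. For β_a, we use the fact that A ⊥ U | C,

\[
\begin{aligned}
E[M\mid A=a,C=c] &= E\big[E[M\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]\\
&\overset{\text{eq2}}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid a,c)\,du\\
&\overset{M3'''(b)}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid c)\,du\\
&= \beta_0+\beta_a a+\beta_c(c)+E[\beta_u(U)\mid C=c]\\
&= \beta_0^*+\beta_a a+\beta_c^*(c)
\end{aligned}
\]

That is, β_a can be consistently estimated by the OLS estimator β̂_a from the model E[M|A = a, C = c] = β₀∗ + β_a a + β_c∗(c).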

A1.5 Result 5

The proof follows from Result 4. Under DAG (d), the true NIE is given by,

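\[
\mathrm{NIE}_{\text{truth}}(a,a^*) = \theta_m\beta_a(a-a^*)
\]

where θ_m comes from the model in equation (3) and β_a comes from the model in (2). It suffices to show that we can find unbiased estimators for θ_m and β_a even in the presence of measurement error.

Writing M∗ = M + ε for the error-prone mediator, it follows from the proof of Result 4 that,

\[
\begin{aligned}
E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}Y] &\overset{(E1\text{-}3)}{=} E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}Y]+E[\{A-E[A\mid C]\}\{\varepsilon-E[\varepsilon\mid A,C]\}Y]\\
&= E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}Y]+E[\{A-E[A\mid C]\}Y]\cdot E\{\varepsilon-E[\varepsilon]\}\\
&= E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}Y]\\
&\overset{(9)}{=} E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}\theta_m M]\\
&= E[\{A-E[A\mid C]\}\{M-E[M\mid A,C]\}\theta_m M]+E[\{A-E[A\mid C]\}\theta_m M]\cdot E\{\varepsilon-E[\varepsilon]\}\\
&\overset{(E1\text{-}3)}{=} E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}\theta_m M]\\
&= E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}\theta_m M^*].
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\theta_m &= \frac{E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}Y]}{E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}M^*]}\\
&= \frac{E[\{A-E[A\mid C]\}\{M^*-E[M^*\mid A,C]\}Y]}{\mathrm{var}(A)\,[\mathrm{var}(M^*\mid A=1,C)-\mathrm{var}(M^*\mid A=0,C)]}
\end{aligned}
\]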

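We recover the MR GENIUS identifying formula for θ_m. For β_a, we use the fact that A ⊥ U | C,

\[
\begin{aligned}
E[M^*\mid A=a,C=c] &= E\big[E[M^*\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]\\
&\overset{E1}{=} E\big[E[M\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]+E[\varepsilon\mid A=a,C=c,U=u]\\
&\overset{E2}{=} E\big[E[M\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]+E[\varepsilon]\\
&\overset{E3}{=} E\big[E[M\mid A=a,C=c,U]\,\big|\,A=a,C=c\big]\\
&\overset{\text{eq2}}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid a,c)\,du\\
&\overset{M3'''(b)}{=} \int_u \big(\beta_0+\beta_a a+\beta_c(c)+\beta_u(u)\big)\,f(u\mid c)\,du\\
&= \beta_0+\beta_a a+\beta_c(c)+E[\beta_u(U)\mid C=c]\\
&= \beta_0^*+\beta_a a+\beta_c^*(c)
\end{aligned}
\]

That is, β_a can be consistently estimated by the OLS estimator β̂_a from the model E[M∗|A = a, C = c] = β₀∗ + β_a a + β_c∗(c).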
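The key step in this proof is that replacing M with the error-prone M∗ leaves both moments in the ratio for θ_m unchanged, and leaves the OLS slope for β_a unchanged. A rough numerical sketch of that claim follows, assuming classical mean-zero error independent of everything else, a binary exposure, no covariates C, and an illustrative data generating process; the helper name `genius_moments` is hypothetical.

```python
import numpy as np

def genius_moments(a, m, y):
    """Numerator and denominator of the ratio formula (binary A, no covariates C)."""
    a_res = a - a.mean()
    m_res = m - np.where(a == 1, m[a == 1].mean(), m[a == 0].mean())
    return np.mean(a_res * m_res * y), np.mean(a_res * m_res * m)

rng = np.random.default_rng(1)
n = 200_000
u = rng.normal(size=n)                                   # unmeasured M-Y confounder
a = rng.binomial(1, 0.5, size=n)                         # exposure, independent of U
m = a + u + rng.normal(scale=0.5 + 0.5 * a)              # true mediator
y = 1.0 * m + 1.0 * a + 1.0 * u + rng.normal(size=n)     # outcome depends on true M
m_star = m + rng.normal(size=n)                          # observed, error-prone mediator

num_true, den_true = genius_moments(a, m, y)
num_star, den_star = genius_moments(a, m_star, y)
theta_star = num_star / den_star                         # uses only observed M*
beta_true = np.polyfit(a, m, 1)[0]                       # OLS slope of M on A
beta_star = np.polyfit(a, m_star, 1)[0]                  # OLS slope of M* on A

print(f"numerator:   M {num_true:.3f}   M* {num_star:.3f}")
print(f"denominator: M {den_true:.3f}   M* {den_star:.3f}")
print(f"theta_m from M*: {theta_star:.3f}   beta_a from M*: {beta_star:.3f}")
```

Up to sampling noise, the two sets of moments agree, so the estimate built only from (A, M∗, Y) targets the same θ_m β_a.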


A1.6 Result 6

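We want to show,

\[
E\left[(A-E(A\mid C))(M-E(M\mid A,C))\left\{\frac{Y\exp(-\theta_m M)}{E[Y\exp(-\theta_m M)\mid A,C]}-1\right\}\right]=0
\]

The left-hand side equals

\[
\underbrace{E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{Y\exp(-\theta_m M)}{E[Y\exp(-\theta_m M)\mid A,C]}\right]}_{\text{Term 1}}-\underbrace{E\big[(A-E(A\mid C))(M-E(M\mid A,C))\big]}_{\text{Term 2}}
\]

\[
\begin{aligned}
\text{Term 1} &= E\left[E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{Y\exp(-\theta_m M)}{E[Y\exp(-\theta_m M)\mid A,C]}\,\Big|\,A,C,M,W,U\right]\right]\\
&= E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{E[Y\exp(-\theta_m M)\mid A,C,W,U]}{E[Y\exp(-\theta_m M)\mid A,C]}\right]\\
&= E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{\exp(\theta_a(A)+\theta_c(C)+\theta_u(U)+\theta_w(W))}{\exp(\theta_a(A)+\theta_c(C))\,E[\exp(\theta_u(U))\mid C]\,E[\exp(\theta_w(W))\mid A,C]}\right]\\
&= E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{\exp(\theta_u(U)+\theta_w(W))}{E[\exp(\theta_u(U))\mid C]\,E[\exp(\theta_w(W))\mid A,C]}\right]\\
&= E\left[E\left[(A-E(A\mid C))(M-E(M\mid A,C))\frac{\exp(\theta_u(U)+\theta_w(W))}{E[\exp(\theta_u(U))\mid C]\,E[\exp(\theta_w(W))\mid A,C]}\,\Big|\,A,C\right]\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E[\exp(\theta_u(U)+\theta_w(W))(M-E(M\mid A,C))\mid A,C]}{E[\exp(\theta_u(U))\mid C]\,E[\exp(\theta_w(W))\mid A,C]}\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E[\exp(\theta_u(U))(M-E(M\mid A,C))\mid A,C]\;E[\exp(\theta_w(W))\mid A,C]}{E[\exp(\theta_u(U))\mid C]\,E[\exp(\theta_w(W))\mid A,C]}\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E[\exp(\theta_u(U))(M-E(M\mid A,C))\mid A,C]}{E[\exp(\theta_u(U))\mid C]}\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E\big[E[\exp(\theta_u(U))(M-E(M\mid A,C))\mid U,A,C]\,\big|\,A,C\big]}{E[\exp(\theta_u(U))\mid C]}\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E\big[\exp(\theta_u(U))(E[M\mid A,C,U]-E[M\mid A,C])\,\big|\,A,C\big]}{E[\exp(\theta_u(U))\mid C]}\right]\\
&= E\left[(A-E(A\mid C))\,\frac{E\big[\exp(\theta_u(U))(\beta_u(U)-E[\beta_u(U)\mid A,C])\,\big|\,A,C\big]}{E[\exp(\theta_u(U))\mid C]}\right]
\end{aligned}
\]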

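\[
\begin{aligned}
&= E\left[(A-E(A\mid C))\,\frac{E\big[\exp(\theta_u(U))(\beta_u(U)-E[\beta_u(U)\mid C])\,\big|\,C\big]}{E[\exp(\theta_u(U))\mid C]}\right]\\
&= 0
\end{aligned}
\]

The second-to-last expression uses A ⊥ U | C. It is zero because the ratio depends on C only and E[A − E(A|C) | C] = 0. Term 2 is zero because E[M − E(M|A, C) | A, C] = 0.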

A2. Other extensions and related literature

Mediation methods are well developed for survival outcomes (Lange and Hansen, 2011; Tchetgen Tchetgen, 2011; VanderWeele, 2011). Below, we show that the conditional natural indirect effect on the hazard difference scale remains identified under conditions analogous to those given in the prior sections. Additionally, we establish that the conditional natural indirect effect on the hazard ratio scale is identified using a proportional hazards model provided the outcome remains rare throughout follow-up. Therefore, existing statistical methods to estimate this quantity can be applied directly in the presence of uncontrolled exposure-outcome confounding under our conditions.

There are two closely related strands of work in the previous literature on mediation methods robust to uncontrolled exposure-outcome confounding. Tchetgen Tchetgen and Phiri (2017) identified the NIE in the presence of an unmeasured common cause of the exposure-outcome relation by replacing assumption M4 with an exclusion restriction, which states that the mediator potential outcome under no exposure is constant for all persons in the population. The exclusion restriction likely holds in the context of medication-mediated effects, where the exposure is typically disease status, the mediator is medication taken for the disease, and the outcome is an unintended effect of the medication, so that healthy persons are usually excluded from receiving medication for the disease in question. In a separate strand of work, Fulcher et al. (2017) propose a new form of indirect effect, the population intervention indirect effect (PIIE). This novel type of indirect effect captures the extent to which the effect of exposure is mediated by an intermediate variable under an intervention that sets the component of exposure directly influencing the outcome to its observed value. More formally,

\[
\mathrm{PIIE}(a^*) = E[Y(A,M(A))]-E[Y(A,M(a^*))] = E[Y]-E[Y(M(a^*))]
\]

Unlike the NIE, the PIIE is nonparametrically identified under conditions M1-M3, therefore allowing for uncontrolled confounding of the exposure-outcome relation without the need for an additional assumption such as no mediator-unmeasured confounder interaction. Additionally, unlike E[Y(a∗, M(a))], the counterfactual mean E[Y(M(a∗))] is identified in the presence of exposure-outcome confounding. This result applies both conditional and unconditional on covariates.

A2.1 Hazard difference natural indirect effect proof (additive hazards model)

\[
HD^{\mathrm{NIE}} = \lambda_{T_{a,M_a}}(t)-\lambda_{T_{a,M_{a^*}}}(t)
\]

Define the following condition,

M4′′′:
\[
\frac{\Pr(T>t\mid a,M=m,w,c)}{\Pr(T>t\mid a,M=0,w,c)} = \exp\left\{-\int_0^t \big[\lambda_T(u\mid a,m,w,c)-\lambda_T(u\mid a,m=0,w,c)\big]\,du\right\} = \gamma_3(a,m,c,t)
\]

or equivalently,

M4′′′:
\[
\lambda_T(t\mid a,m,w,c)-\lambda_T(t\mid a,m=0,w,c) = \gamma_3'(a,m,c,t)
\]

where

\[
\begin{aligned}
\lambda_T(t\mid a,m,w,c) ={}& \lambda_0(t)+\lambda_A(t)a+\lambda_M(t)m+\lambda_{A,M}(t)am\\
&+\lambda_C(t)c+\lambda_{A,C}(t)ac+\lambda_{M,C}(t)mc+\lambda_{A,M,C}(t)amc\\
&+\lambda_{W,C}(t)wc+\lambda_{W,A}(t)wa+\lambda_{W,A,C}(t)wac
\end{aligned}
\]

Note that M4′′′ encodes no multiplicative interaction on the survival probability scale, or equivalently no additive interaction on the hazard difference scale, that is,

\[
\begin{aligned}
\lambda_T(t\mid a,m,c,w)-\lambda_T(t\mid a,m=0,c,w) &= \gamma_3'(a,m,c,t)\\
&= \lambda_M(t)m+\lambda_{A,M}(t)am+\lambda_{M,C}(t)mc+\lambda_{A,M,C}(t)amc
\end{aligned}
\]

We show that under assumptions M1-M3, M4′′′, and M5, the conditional HD^{NIE} is identified. Under Figure 2a, the true conditional HD^{NIE} (i.e. W is observed) on the survival scale is as follows.

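\[
\begin{aligned}
HD^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c,w) &= \frac{\Pr(T_{a,M_a}>t\mid C=c,W=w)}{\Pr(T_{a,M_{a^*}}>t\mid C=c,W=w)}\\
&= \frac{\int_m \Pr(T>t\mid A=a,M=m,C=c,W=w)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \Pr(T>t\mid A=a,M=m,C=c,W=w)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\\
&= \frac{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c,w)\,du\big)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c,w)\,du\big)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\\
&= \frac{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c,w)\,du\big)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c,w)\,du\big)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\times\frac{\exp\big(\int_0^t \lambda_T(u\mid a,m=0,c,w)\,du\big)}{\exp\big(\int_0^t \lambda_T(u\mid a,m=0,c,w)\,du\big)}\\
&= \frac{\int_m \gamma_3(a,m,c,t)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \gamma_3(a,m,c,t)\,\Pr(M=m\mid A=a^*,C=c)\,dm}
\end{aligned}
\]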

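The naive HD^{NIE} on the survival scale, which does not use information on W, is given by,

\[
\begin{aligned}
HD^{\mathrm{NIE}}_{\text{naive}}(a,a^*\mid c) &= \frac{\Pr(T_{a,M_a}>t\mid C=c)}{\Pr(T_{a,M_{a^*}}>t\mid C=c)}\\
&= \frac{\int_m \Pr(T>t\mid A=a,M=m,C=c)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \Pr(T>t\mid A=a,M=m,C=c)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\\
&= \frac{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c)\,du\big)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c)\,du\big)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\\
&= \frac{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c)\,du\big)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \exp\big(-\int_0^t \lambda_T(u\mid a,m,c)\,du\big)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\times\frac{\exp\big(\int_0^t \lambda_T(u\mid a,m=0,c)\,du\big)}{\exp\big(\int_0^t \lambda_T(u\mid a,m=0,c)\,du\big)}\\
&= \frac{\int_m \gamma_3(a,m,c,t)\,\Pr(M=m\mid A=a,C=c)\,dm}{\int_m \gamma_3(a,m,c,t)\,\Pr(M=m\mid A=a^*,C=c)\,dm}\\
&\overset{**}{=} HD^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c,w)
\end{aligned}
\]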

(**) We want to show that γ₃(a, m, c, t) can be identified without conditioning on W. The marginal model is given by,

\[
\begin{aligned}
\Pr(T>t\mid A=a,M=m,C=c) &= \int_w \Pr(T>t\mid a,m,c,w)\,f(w\mid a,m,c)\,dw\\
&= \int_w \Pr(T>t\mid a,m,c,w)\,f(w\mid a,c)\,dw\\
&= \int_w \exp\left\{-\int_0^t \lambda_T(s\mid a,m,c,w)\,ds\right\} f(w\mid a,c)\,dw\\
&= \int_w \exp\Big\{-\int_0^t \big[\lambda_0(s)+\lambda_A(s)a+\lambda_M(s)m+\lambda_{A,M}(s)am+\lambda_C(s)c+\lambda_{A,C}(s)ac+\lambda_{M,C}(s)mc\\
&\qquad\qquad+\lambda_{A,M,C}(s)amc+\lambda_{W,C}(s)wc+\lambda_{W,A}(s)wa+\lambda_{W,A,C}(s)wac\big]\,ds\Big\}\,f(w\mid a,c)\,dw\\
&= \exp\Big\{-\int_0^t \big[\lambda_0(s)+\lambda_A(s)a+\lambda_M(s)m+\lambda_{A,M}(s)am+\lambda_C(s)c+\lambda_{A,C}(s)ac+\lambda_{M,C}(s)mc+\lambda_{A,M,C}(s)amc\big]\,ds\Big\}\\
&\quad\times E\Big[\exp\Big\{-\int_0^t \big[\lambda_{W,C}(s)Wc+\lambda_{W,A}(s)Wa+\lambda_{W,A,C}(s)Wac\big]\,ds\Big\}\,\Big|\,A=a,C=c\Big]
\end{aligned}
\]

So that the marginal survival probability ratio (i.e. W not observed) is equal to,

\[
\frac{\Pr(T>t\mid A=a,M=m,C=c)}{\Pr(T>t\mid A=a,M=0,C=c)} = \gamma_3(a,m,c,t)
\]

and the marginal hazard difference (i.e. W not observed) is equal to,

\[
\begin{aligned}
\lambda_T(t\mid a,m,c)-\lambda_T(t\mid a,m=0,c) &= -\frac{d}{dt}\Big[\log \Pr(T>t\mid A=a,M=m,C=c)-\log \Pr(T>t\mid A=a,M=0,C=c)\Big]\\
&= \gamma_3'(a,m,c,t)
\end{aligned}
\]
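The factorization of the marginal survival function can be verified directly in a toy additive hazards model. The sketch below assumes time-constant hazard coefficients, a binary W, no covariates C, and hypothetical numeric values; it also includes a main effect of W, which the cancellation does not depend on.

```python
import math

# Hypothetical time-constant additive hazard with W-A terms but no M-W product term.
def hazard(a, m, w):
    return 0.05 + 0.02 * a + 0.03 * m + 0.01 * a * m + 0.02 * w + 0.04 * w * a

def survival(t, a, m, w):
    return math.exp(-hazard(a, m, w) * t)

def prob_w1_given_a(a):
    # P(W = 1 | A = a); assumed not to depend on M given A.
    return 0.3 + 0.4 * a

def marginal_survival(t, a, m):
    # Pr(T > t | a, m) = sum_w Pr(T > t | a, m, w) P(W = w | a)
    p1 = prob_w1_given_a(a)
    return (1.0 - p1) * survival(t, a, m, 0) + p1 * survival(t, a, m, 1)

def gamma3(t, a, m):
    # Conditional survival ratio for M = m versus M = 0; free of w.
    return math.exp(-(0.03 * m + 0.01 * a * m) * t)

def marginal_hazard_difference(t, a, m, step=1e-5):
    # -d/dt [log S(t | a, m) - log S(t | a, 0)] by central differences.
    def log_ratio(s):
        return math.log(marginal_survival(s, a, m) / marginal_survival(s, a, 0.0))
    return -(log_ratio(t + step) - log_ratio(t - step)) / (2.0 * step)

ratio_ok = all(
    math.isclose(marginal_survival(t, a, m) / marginal_survival(t, a, 0.0),
                 gamma3(t, a, m), rel_tol=1e-12)
    for t in (1.0, 5.0) for a in (0, 1) for m in (0.5, 2.0)
)
hd = marginal_hazard_difference(5.0, 1, 2.0)
print("survival ratio equals gamma3:", ratio_ok)
print(f"marginal hazard difference: {hd:.6f} (conditional value {0.03 * 2.0 + 0.01 * 2.0:.6f})")
```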

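M5. exp(−Λ_t) ≈ 1 (rare outcome)

Note that M4′′′′ encodes no multiplicative interaction on the hazard ratio scale, that is,

\[
\begin{aligned}
\lambda_T(t\mid a,m,c,w) = \lambda_0(t)\exp\big(&\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_W(w)\\
&+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+\theta_{A,W}(a,w)+\theta_{W,C}(w,c)\\
&+\theta_{A,M,C}(a,m,c)+\theta_{A,W,C}(a,w,c)\big)
\end{aligned}
\]

\[
\begin{aligned}
\frac{\lambda_T(t\mid a,m,c,w)}{\lambda_T(t\mid a,m=0,c,w)} &= \gamma_4(a,m,c)\\
&= \exp\big(\theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)
\end{aligned}
\]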

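We show that under assumptions M1-M3, M4′′′′, and M5, the conditional HR^{NIE} is identified.

Under Figure 2a, the true HR^{NIE} (i.e. W is observed) is,

\[
\begin{aligned}
HR^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c) &= \frac{\lambda_{T_{aM_a}}(t\mid a,m,c,w)}{\lambda_{T_{aM_{a^*}}}(t\mid a,m,c,w)}\\
&= \frac{f_{T_{aM_a}}(t\mid a,m,c,w)\,S_{T_{aM_{a^*}}}(t\mid a,m,c,w)}{f_{T_{aM_{a^*}}}(t\mid a,m,c,w)\,S_{T_{aM_a}}(t\mid a,m,c,w)}\\
&= \frac{\int_m f_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm\;\int_m S_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm}{\int_m f_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm\;\int_m S_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c,w)\,S_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm\;\int_m S_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c,w)\,S_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm\;\int_m S_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c,w)\exp(-\Lambda_t)\,f_M(m\mid a,c)\,dm\;\int_m \exp(-\Lambda_t)\,f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c,w)\exp(-\Lambda_t)\,f_M(m\mid a^*,c)\,dm\;\int_m \exp(-\Lambda_t)\,f_M(m\mid a,c)\,dm}\\
&\overset{M6}{\approx} \frac{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm\;\int_m f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm\;\int_m f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm}{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a,c)\,dm}{\int_m \lambda_T(t\mid a,m,c,w)\,f_M(m\mid a^*,c)\,dm}\times\frac{\lambda_T(t\mid a,m=0,c,w)}{\lambda_T(t\mid a,m=0,c,w)}\\
&= \frac{\int_m \gamma_4(a,m,c)\,f_M(m\mid a,c)\,dm}{\int_m \gamma_4(a,m,c)\,f_M(m\mid a^*,c)\,dm}
\end{aligned}
\]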

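The naive HR^{NIE}, which does not use information on W, is given by,

\[
\begin{aligned}
HR^{\mathrm{NIE}}_{\text{naive}}(a,a^*\mid c) &= \frac{\lambda_{T_{aM_a}}(t)}{\lambda_{T_{aM_{a^*}}}(t)}\\
&= \frac{f_{T_{aM_a}}(t\mid a,m,c)\,S_{T_{aM_{a^*}}}(t\mid a,m,c)}{f_{T_{aM_{a^*}}}(t\mid a,m,c)\,S_{T_{aM_a}}(t\mid a,m,c)}\\
&= \frac{\int_m f_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm\;\int_m S_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm}{\int_m f_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm\;\int_m S_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c)\,S_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm\;\int_m S_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c)\,S_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm\;\int_m S_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c)\exp(-\Lambda_t)\,f_M(m\mid a,c)\,dm\;\int_m \exp(-\Lambda_t)\,f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c)\exp(-\Lambda_t)\,f_M(m\mid a^*,c)\,dm\;\int_m \exp(-\Lambda_t)\,f_M(m\mid a,c)\,dm}\\
&\overset{M6}{\approx} \frac{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm\;\int_m f_M(m\mid a^*,c)\,dm}{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm\;\int_m f_M(m\mid a,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm}{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm}\\
&= \frac{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a,c)\,dm}{\int_m \lambda_T(t\mid a,m,c)\,f_M(m\mid a^*,c)\,dm}\times\frac{\lambda_T(t\mid a,m=0,c)}{\lambda_T(t\mid a,m=0,c)}\\
&\overset{**}{=} \frac{\int_m \gamma_4(a,m,c)\,f_M(m\mid a,c)\,dm}{\int_m \gamma_4(a,m,c)\,f_M(m\mid a^*,c)\,dm}\\
&= HR^{\mathrm{NIE}}_{\text{truth}}(a,a^*\mid c)
\end{aligned}
\]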

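(**) We want to show that γ₄(a, m, c) can be identified without conditioning on W. The marginal model is given by,

\[
\begin{aligned}
\lambda_T(t\mid a,m,c) &\approx f_T(t\mid a,m,c)\\
&= \int_w f_T(t\mid a,m,w,c)\,f(w\mid a,m,c)\,dw\\
&\approx \int_w \lambda_T(t\mid a,m,w,c)\,f(w\mid a,m,c)\,dw\\
&= \int_w \lambda_T(t\mid a,m,w,c)\,f(w\mid a,c)\,dw\\
&= \int_w \lambda_0(t)\exp\big(\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_W(w)+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)\\
&\qquad\qquad+\theta_{A,W}(a,w)+\theta_{W,C}(w,c)+\theta_{A,M,C}(a,m,c)+\theta_{A,W,C}(a,w,c)\big)\,f(w\mid a,c)\,dw\\
&= \lambda_0(t)\exp\big(\theta_A(a)+\theta_M(m)+\theta_C(c)+\theta_{A,M}(a,m)+\theta_{A,C}(a,c)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)\\
&\quad\times E\big[\exp\big(\theta_W(W)+\theta_{A,W}(a,W)+\theta_{W,C}(W,c)+\theta_{A,W,C}(a,W,c)\big)\,\big|\,A=a,C=c\big]
\end{aligned}
\]

So that the marginal ratio (i.e. W not observed) is equal to,

\[
\begin{aligned}
\frac{\lambda_T(t\mid a,m,c)}{\lambda_T(t\mid a,m=0,c)} &= \exp\big(\theta_M(m)+\theta_{A,M}(a,m)+\theta_{M,C}(m,c)+\theta_{A,M,C}(a,m,c)\big)\\
&= \gamma_4(a,m,c)
\end{aligned}
\]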
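Unlike the additive cases, the hazard ratio result is only approximate, and the quality of the approximation depends on how rare the outcome is. The following sketch, with a constant baseline hazard, a binary W, no covariates C, and hypothetical coefficients, computes the exact marginal hazard and compares the marginal hazard ratio with γ₄ for a rare and a common outcome.

```python
import math

# Hypothetical proportional hazards model with no M-W product term in the exponent.
def hazard(a, m, w, base):
    return base * math.exp(0.4 * a + 0.5 * m + 0.6 * w + 0.2 * a * m + 0.3 * a * w)

def prob_w1_given_a(a):
    # P(W = 1 | A = a); assumed not to depend on M given A.
    return 0.2 + 0.5 * a

def marginal_hazard(t, a, m, base):
    # Exact marginal hazard: sum_w lambda_w S_w(t) p_w / sum_w S_w(t) p_w
    weights = (1.0 - prob_w1_given_a(a), prob_w1_given_a(a))
    num = sum(p * hazard(a, m, w, base) * math.exp(-hazard(a, m, w, base) * t)
              for w, p in enumerate(weights))
    den = sum(p * math.exp(-hazard(a, m, w, base) * t)
              for w, p in enumerate(weights))
    return num / den

def gamma4(a, m):
    # Conditional hazard ratio for M = m versus M = 0; free of w.
    return math.exp(0.5 * m + 0.2 * a * m)

def relative_error(t, a, m, base):
    ratio = marginal_hazard(t, a, m, base) / marginal_hazard(t, a, 0.0, base)
    return abs(ratio / gamma4(a, m) - 1.0)

rare_error = relative_error(1.0, 1, 2.0, base=1e-4)     # rare outcome
common_error = relative_error(1.0, 1, 2.0, base=0.5)    # common outcome
print(f"relative error, rare outcome:   {rare_error:.2e}")
print(f"relative error, common outcome: {common_error:.2e}")
```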

A3. General Result 3 (incorporate interaction terms)

We have noted in the above proofs and in the main text that assumptions M3′, M3′′, and M3′′′ can be made more general. We show only the generalization of M3′ and Result 3 here, but the same result applies to the others.

M3′. There exists an unmeasured variable U such that,

(a) Y(a, m) ⊥ M(a∗) | C = c, U = u ∀ a, a∗, c, u

(b) A ⊥ U | C = c ∀ c

(c) There is no additive M–(U, A) interaction in the model for E[Y|A, M, C, U],

\[
E[Y\mid A,M=m,C,U]-E[Y\mid A,M=0,C,U] = \theta_m m+\theta_{mc}\,mc
\]

and no additive A–(U, C) interaction in the model for E[Y|A, M, C, U],

\[
E[Y\mid A=a,M,C,U]-E[Y\mid A=0,M,C,U] = \theta_a(a)+\theta_{ac}(a,c)
\]

for unknown functions θ_a(.) that satisfies θ_a(0) = 0 and θ_ac(.) that satisfies θ_ac(0, c) = 0.

(d) There is no additive A–U interaction in the model for E[M|A, C, U],

\[
E[M\mid A=a,C,U]-E[M\mid A=0,C,U] = \beta_a(a)+\beta_{ac}(a,c)
\]

for unknown functions β_a(.) that satisfies β_a(0) = 0 and β_ac(.) that satisfies β_ac(0, c) = 0.

(e) var(M|A = a, C = c) − var(M|A = a′, C = c) ≠ 0 ∀ a, a′, c

Under this more general assumption, Result 3 becomes,

Result 3. Under assumptions M1, M2, and M3′, the natural indirect effect is uniquely identified by

\[
\begin{aligned}
\mathrm{NIE}(a,a^*) ={}& \theta_m\Big\{\big(\beta_a(a)-\beta_a(a^*)\big)+E\big[\beta_{ac}(a,C)-\beta_{ac}(a^*,C)\big]\Big\}\\
&+\theta_{mc}\Big\{\big(\beta_a(a)-\beta_a(a^*)\big)E[C]+E\big[\big(\beta_{ac}(a,C)-\beta_{ac}(a^*,C)\big)C\big]\Big\}
\end{aligned}
\]

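Estimation of θ_m and θ_mc now proceeds from Section 4.4, where equation (9) becomes:

\[
0 = n^{-1}\sum_{i=1}^{n} h(C_i)\,\big\{A_i-\hat{E}[A\mid C_i]\big\}\big\{M_i-\hat{E}[M\mid A_i,C_i]\big\}\big\{Y_i-\hat{\theta}_m M_i-\hat{\theta}_{mc} M_i C_i\big\}
\]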

A4. Asymptotic variance for NIE

For inference, the delta method or the nonparametric bootstrap can be used to obtain standard errors and corresponding confidence intervals for the natural indirect effect. The delta method estimate of the standard error of the estimated NIE is given by the square root of,

\[
\widehat{\mathrm{Var}}(\widehat{\mathrm{NIE}}) = \hat{\theta}_m^2\, s^2_{\hat{\beta}_a}+\hat{\beta}_a^2\, s^2_{\hat{\theta}_m}
\]

where s_{θ̂_m} is the standard error for θ̂_m obtained from the MR-GENIUS R package, and s_{β̂_a} is the robust sandwich estimator of the standard error for β̂_a, which appropriately accounts for heteroscedastic errors (White, 1980).
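For a product of two estimates, the first-order delta method weights each squared standard error by the square of the other estimate. The sketch below implements that calculation under the assumption that the covariance between θ̂_m and β̂_a is negligible and that a − a∗ = 1; the function names and the input values are hypothetical and only illustrate the arithmetic.

```python
import math

def delta_method_se_product(theta_m, se_theta_m, beta_a, se_beta_a):
    """First-order delta-method SE of theta_m * beta_a, assuming zero covariance."""
    variance = theta_m ** 2 * se_beta_a ** 2 + beta_a ** 2 * se_theta_m ** 2
    return math.sqrt(variance)

def wald_interval(estimate, se, z=1.959964):
    """Normal-approximation 95% confidence interval."""
    return estimate - z * se, estimate + z * se

# Illustrative inputs, not results from the paper.
theta_m_hat, se_theta_m = 1.1, 0.20
beta_a_hat, se_beta_a = 0.9, 0.05

nie_hat = theta_m_hat * beta_a_hat
nie_se = delta_method_se_product(theta_m_hat, se_theta_m, beta_a_hat, se_beta_a)
lower, upper = wald_interval(nie_hat, nie_se)
print(f"NIE estimate {nie_hat:.3f}, SE {nie_se:.3f}, 95% CI ({lower:.3f}, {upper:.3f})")
```

When the two estimates come from the same sample and are correlated, a covariance term would need to be added, or the bootstrap used instead.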

A5. Simulation study

A5.1 Simulation setup details

We simulate 2,000 data sets of sample size 1,000 under the following true data generating mechanism:

• W ∼ Normal(0, 1) under DAG (a), (c), (d);

• U ∼ Normal(0, 1) under DAG (b), (c), (d);

• A | W ∼ Bernoulli(p = expit(w));

• M | A, U ∼ Normal(a + u, |0.5 + 0.5a|), while the observed mediator under DAG (d) is M∗ = M + ε where ε ∼ Normal(0, 1);

and variance of estimated NIE, the Monte Carlo mean of variance estimates, the proportion of bias, the mean squared error, and the coverage of the 95% confidence interval using both the delta method and bootstrap variance estimates.
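The exposure and mediator part of the listed mechanism can be sketched as follows. This is an illustration only: the outcome model is not reproduced because it does not appear in the list above, the second argument of Normal(·, ·) is assumed to be a standard deviation, W and U are set to zero in DAGs where they are absent (so that A has probability 0.5 without W), and the function name `simulate_exposure_mediator` is hypothetical.

```python
import numpy as np

def simulate_exposure_mediator(n, dag, rng):
    """Draw (W, U, A, M, M*) for one data set under DAG 'a', 'b', 'c', or 'd'."""
    has_w = dag in ("a", "c", "d")
    has_u = dag in ("b", "c", "d")
    w = rng.normal(size=n) if has_w else np.zeros(n)     # assumption: W = 0 when absent
    u = rng.normal(size=n) if has_u else np.zeros(n)     # assumption: U = 0 when absent
    p = 1.0 / (1.0 + np.exp(-w))                         # expit(w)
    a = rng.binomial(1, p)
    # Assumption: |0.5 + 0.5a| is treated as the standard deviation of M given (A, U).
    m = rng.normal(loc=a + u, scale=np.abs(0.5 + 0.5 * a))
    # Measurement error only under DAG (d).
    m_star = m + rng.normal(size=n) if dag == "d" else m
    return {"w": w, "u": u, "a": a, "m": m, "m_star": m_star}

rng = np.random.default_rng(2024)
data = simulate_exposure_mediator(20_000, "d", rng)
var_m_exposed = data["m"][data["a"] == 1].var()
var_m_unexposed = data["m"][data["a"] == 0].var()
print(f"var(M | A=1) = {var_m_exposed:.2f}, var(M | A=0) = {var_m_unexposed:.2f}")
```

The difference in mediator variance across exposure levels is the feature that the ratio formula for θ_m relies on.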

A5.2 Simulation result under M-U and M-W interaction

Table A1: Operating characteristics of the standard OLS estimator, the GENIUS estimator, and the oracle estimator under directed acyclic graphs (DAGs) in Figure 2 (a)-(d), with additional M-U and M-W interaction in the outcome model E(Y|A, M, C, U, W).

Figure 2 DAG   Method         Bias (Var)       Proportion bias (%)   Mean squared error   Mean var estimate   95% CI coverage   Bootstrap coverage

Additional M-U interaction in the outcome model: violation of M3′(c)
(a)            Standard OLS   -0.183 (0.013)   -18.3%                0.046                0.010               0.534             0.555
(a)            GENIUS          0.410 (0.119)    41.0%                0.288                0.096               0.760             0.771
(a)            Oracle         -0.006 (0.009)    -0.6%                0.009                0.009               0.948             0.955
(c)            Standard OLS   -0.156 (0.014)   -15.6%                0.038                0.012               0.662             0.682
(c)            GENIUS          0.427 (0.139)    42.7%                0.321                0.114               0.802             0.811
(c)            Oracle          0.003 (0.010)     0.3%                0.010                0.009               0.950             0.960
(d)            Standard OLS   -0.415 (0.011)   -41.5%                0.184                0.009               0.028             0.044
(d)            GENIUS          0.474 (0.255)    47.4%                0.479                0.284               0.968             0.988
(d)            Oracle          0.001 (0.009)     0.1%                0.009                0.009               0.952             0.948

Additional M-W interaction in the outcome model: violation of M4′(c)
(b)            Standard OLS    0.082 (0.013)     8.2%                0.020                0.011               0.870             0.888
(b)            GENIUS          0.484 (0.037)    48.4%                0.271                0.034               0.217             0.243
(b)            Oracle         -0.006 (0.010)    -0.6%                0.010                0.009               0.940             0.956
(c)            Standard OLS   -0.304 (0.009)   -30.4%                0.102                0.009               0.109             0.123
(c)            GENIUS          0.867 (0.165)    86.7%                0.916                0.188               0.325             0.349
(c)            Oracle          0.003 (0.010)     0.3%                0.010                0.009               0.948             0.959
(d)            Standard OLS   -0.524 (0.008)   -52.4%                0.282                0.006               0.000             0.000
(d)            GENIUS          0.932 (0.314)    93.2%                1.183                0.551               0.866             0.784
(d)            Oracle         -0.000 (0.009)    -0.0%                0.009                0.009               0.949             0.950

[Figure: two panels of box plots of the estimated NIE for the Naive OLS, GENIUS, and Oracle OLS estimators under DAG (a), DAG (b), DAG (c), and DAG (d). Panel (a): additional M-U interaction in the outcome model E(Y|A, M, C, U, W), a violation of M3′(c).]

References

I. R. Fulcher, I. Shpitser, S. Marealle, and E. J. Tchetgen Tchetgen. Robust inference on indirect causal effects. arXiv preprint arXiv:1711.03611, 2017.

T. Lange and J. V. Hansen. Direct and indirect effects in a survival context. Epidemiology, pages 575–581, 2011.

E. J. Tchetgen Tchetgen. On causal mediation analysis with a survival outcome. The international journal of biostatistics, 7(1):1–38, 2011.

E. J. Tchetgen Tchetgen and K. Phiri. Evaluation of medication-mediated effects in pharmacoepidemiology. Epidemiology (Cambridge, Mass.), 28(3):439, 2017.

T. J. VanderWeele. Causal mediation analysis with survival data. Epidemiology (Cambridge, Mass.), 22(4):582, 2011.

H. White. A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity. Econometrica: Journal of the Econometric Society, pages 817–838, 1980.