4.5 Optimal Control of Malware

The ultimate goal of this section is to determine the optimal distribution time $T_D^*$ such that the accumulated cost caused by the epidemic is minimized. Via optimal control theory [34], we aim to solve the following optimization problem.

$$
\begin{aligned}
\text{Minimize}\quad & J=\int_{T_0}^{T_f}\left\{[N I(t)]^{\beta}+\nu\,u^{2}(t)\right\}dt\\
\text{subject to}\quad & \dot{I}(t)=G_I(I(t),R(t),u(t)),\\
& \dot{R}(t)=G_R(I(t),R(t),u(t),\phi(t)),\\
& S(t)+I(t)+R(t)=1,\\
& S(t)\ge 0,\quad I(t)\ge 0,\quad R(t)\ge 0
\end{aligned}
\tag{4.25}
$$

where $\beta>0$ represents the severity of the epidemic, $T_0$ is the initial time, which is set to 0, and $T_f$ is the completion time, which is assumed to be free. $\nu$ is the coefficient representing the cost of control signal distribution with respect to the malware propagation process, and for simplicity it is normalized to $\nu=\frac{1}{2}$. If $\nu=0$, then the cost of control signal distribution is irrelevant to the malware propagation process. The performance measure $J$ represents the accumulated cost caused by the epidemic, and it takes a quadratic form in the control function $u(t)$, such that it is jointly convex in $I(t)$ and $u(t)$. The physical interpretation of $J$ is that it is proportional to the accumulated infected population, which relates to the number of nodes that have received the malware over time.

Moreover, when $\beta=1$, it accounts for the accumulated infected population from $T_0$ to $T_f$, which coincides with the performance measure in various networks of interest to us [19, 32, 55].

With Equation 4.25, we aim to find the optimal control signal distribution time $T_D^*$ such that $T_D^*=\arg\min_{T_D} J$. By Pontryagin's minimum principle [41], if $G_I(I(t),R(t),u(t))$ and $G_R(I(t),R(t),u(t),\phi(t))$ are jointly concave in $I(t)$, $R(t)$, $u(t)$, and $\phi(t)$, the optimal control function $u^*(t)$ can be obtained by minimizing the Hamiltonian (Lagrangian dual function) with costate variables $\Lambda_I(t)$ and $\Lambda_R(t)$, where

$$
H(I(t),R(t),u(t),\phi(t),\Lambda_I(t),\Lambda_R(t))=J(I(t),u(t))+\Lambda_I(t)\,G_I(I(t),R(t),u(t))+\Lambda_R(t)\,G_R(I(t),R(t),u(t),\phi(t))
$$

Here $J(I(t),u(t))$ denotes the integrand of the performance measure in Equation 4.25. The costate variables are updated by the costate equations

$$
\dot{\Lambda}_I(t)=-\frac{\partial H}{\partial I};\qquad \dot{\Lambda}_R(t)=-\frac{\partial H}{\partial R}
\tag{4.26}
$$

where $\dot{\Lambda}_I(t)\ge 0$ and $\dot{\Lambda}_R(t)\ge 0$, with boundary conditions $\Lambda_I(T_f)=\Lambda_R(T_f)=0$.

Note that during the update process, the negative state values are truncated to zero, such that the nonnegativity state constraints (S(t),I(t),R(t)≥0) are satisfied.

The solution from optimal control theory relies on the fact that there is no inherent restriction on the control function $u(t)$. However, when the control capability is tied to $T_D$, this solution only indicates the trends of the system outputs and may fail to be a feasible operation for control signal distribution. Despite this impracticality, the results obtained from Pontryagin's minimum principle provide a performance comparison for our proposed approach. To compensate for this shortcoming of optimal control theory, we adopt dynamic programming [4] to solve for the optimal control signal distribution time. By discretizing time into $M$ intervals of length $\Delta t=T_f/M$, we define the cost $C_m$ as a function of the infected population at the $m$th stage and the newly infected population between the $m$th and $(m+1)$th stages, $0\le m\le M-1$, where

$$
C_m=\left[N I(m\Delta t)+N\,G_I\big(I(m\Delta t),R(m\Delta t),u(m\Delta t)\big)\cdot\Delta t\right]^{\beta}=\left[N I((m+1)\Delta t)\right]^{\beta}
\tag{4.27}
$$

Let $V_m(I(m\Delta t),R(m\Delta t),u(m\Delta t))$ denote the accumulated cost from the $m$th stage, with terminal condition $V_M(I(M\Delta t),R(M\Delta t),u(M\Delta t))=0$ (i.e., the entire system is in its stable stage). The optimal distribution time can then be obtained by solving the optimality equation

$$
V_m=\min_{a_m\in\{0,1\}}\{C_m+V_{m+1}\},\qquad 0\le m\le M-1
\tag{4.28}
$$

where $a_m=1$ means that the control signal is distributed and the immunity mechanisms take effect from the $m$th stage. That is, $T_D^*=m\Delta t$ and $f(m\Delta t)=f(n\Delta t)$, $\forall n\ge m$. $V_0$ represents the minimum accumulated cost, which is equivalent to the performance measure $J$ in Equation 4.25. Equation 4.28 is equivalent to finding an optimal one-time switch from 0 to 1 among all possible one-time switch paths of the $M$ stages to minimize the accumulated cost, and it can be solved via the Bellman–Ford algorithm [4] with $O(2^M)$ complexity. In other words, by incorporating the malware propagation process and the time-dependent control capability, the optimal control signal distribution time can be obtained via dynamic programming in Equation 4.28 in real time to minimize the accumulated network cost.
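The one-time switch search behind Equation 4.28 can be illustrated with a minimal Python sketch. It is not the Bellman–Ford procedure cited above: it simply enumerates every candidate switch stage and simulates the resulting cost with forward Euler steps. The state equations used here (a plain susceptible-infected-recovered model with rate `growth`) and the `capability` function standing in for $f(T_D)$ are hypothetical placeholders, since the actual $G_I$, $G_R$, and $f$ are defined elsewhere in the chapter. The per-stage cost is a direct discretization of the integrand in Equation 4.25.

```python
def accumulated_cost(switch_stage, capability, M=200, Tf=20.0, N=2000,
                     beta=1.0, nu=0.5, growth=1.0, I0=None):
    """Discretized J when the control signal is switched on at `switch_stage`.

    switch_stage == M means the signal is never distributed.
    The dynamics below are an assumed stand-in for G_I and G_R.
    """
    dt = Tf / M
    I = 1.0 / N if I0 is None else I0
    R = 0.0
    u_on = capability(switch_stage * dt)  # f(T_D), frozen after the switch
    cost = 0.0
    for m in range(M):
        u = u_on if m >= switch_stage else 0.0
        S = max(0.0, 1.0 - I - R)
        dI = (growth * S * I - u * I) * dt  # hypothetical G_I
        dR = (u * I) * dt                   # hypothetical G_R
        # Truncate the states to [0, 1] so the state constraints hold.
        I = min(1.0, max(0.0, I + dI))
        R = min(1.0, max(0.0, R + dR))
        cost += ((N * I) ** beta + nu * u * u) * dt
    return cost


def optimal_switch(capability, M=200, **kw):
    """Return (best stage, cost) over all one-time switch paths."""
    best = min(range(M + 1),
               key=lambda m: accumulated_cost(m, capability, M=M, **kw))
    return best, accumulated_cost(best, capability, M=M, **kw)


if __name__ == "__main__":
    # Assumed capability: the signal gets stronger the later it is sent.
    stage, cost = optimal_switch(lambda t: min(1.0, 0.1 * t), M=200, Tf=20.0)
    print("T_D* =", stage * 20.0 / 200, "cost =", cost)
```

Each candidate costs $O(M)$ to simulate, so this enumeration takes $O(M^2)$ steps in total. Because the capability grows with time in this assumed setup, waiting buys a stronger signal at the price of a longer uncontrolled epidemic, which is the trade-off the optimal distribution time balances.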

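With the state equations, the corresponding Hamiltonian is obtained by plugging the parameters in Equations 4.13, 4.17, 4.20, 4.24, and 4.25 into Equation 4.26:

$$
\begin{aligned}
H={}&[N I(t)]^{\beta}+\frac{1}{2}u^{2}(t)\\
&+\Lambda_I(t)\left\{\lambda_{inf}(\eta_{inf}-1)S(t)I(t)+\frac{1}{N}\int_{0}^{t}\dot{I}_{inf}(\tau)\,\dot{W}(\tau,t-\tau)\,d\tau-u(t)I(t)\right\}\\
&+\Lambda_R(t)\left\{u(t)I(t)+\frac{1}{N}\int_{0}^{t}\dot{R}_{inf}(\tau)\,\dot{Q}(\tau,t-\tau)\,d\tau+\phi(t)(\eta_{inf}-1)S(t)R(t)\right\}
\end{aligned}
\tag{4.29}
$$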

from which the costate equations are $\dot{\Lambda}_I(t)=-\partial H/\partial I$ and $\dot{\Lambda}_R(t)=-\partial H/\partial R$.

With the switching function $\theta^*(t)=[\Lambda_I^*(t)-\Lambda_R^*(t)]\,I^*(t)$, the constrained optimal control function $u^*(t)$ that minimizes $J$ is the saturation function

$$
u^*(t)=
\begin{cases}
0, & \theta^*(t)\le 0,\\
\theta^*(t), & \theta^*(t)\in(0,1),\\
1, & \theta^*(t)\ge 1
\end{cases}
\tag{4.30}
$$
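The form of the switching function follows from the stationarity condition of the Hamiltonian in Equation 4.29: setting $\partial H/\partial u=u(t)-\Lambda_I(t)I(t)+\Lambda_R(t)I(t)=0$ gives $u(t)=[\Lambda_I(t)-\Lambda_R(t)]I(t)$, which is then clipped to the admissible range $[0,1]$. A minimal Python sketch of Equation 4.30 is shown below; the function names are illustrative and the costate values passed in are arbitrary numbers, not results from the chapter.

```python
def switching_function(lam_I, lam_R, I):
    """theta(t) = [Lambda_I(t) - Lambda_R(t)] * I(t)."""
    return (lam_I - lam_R) * I


def saturated_control(theta):
    """Equation 4.30: clip the unconstrained minimizer to [0, 1]."""
    return min(1.0, max(0.0, theta))


if __name__ == "__main__":
    # Arbitrary illustrative costate and state values.
    theta = switching_function(lam_I=3.0, lam_R=1.0, I=0.25)
    print(saturated_control(theta))
```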

Considering the time-dependent control capability, the optimal control signal distribution time $T_D^*$ can be obtained by solving the dynamic programming in Equation 4.28. Similarly, the saturation function in Equation 4.30 only provides an attainable lower bound on control of malware propagation with time-dependent control capability.

4.5.1 Early-stage analysis

With the approximation that $S(t)\approx 1$ at early stages and the initial condition $W(z,0)=1$, from Equation 4.10, we have the approximation of incremental spatial infection

$$
W(z,s)=\frac{\sigma\lambda_{pro}\eta_{pro}}{2}\,s+1
\tag{4.31}
$$

Moreover, we also have the approximation that $I(t)\approx I_{inf}(t)$, since at early stages $I_{inf}(t)\propto I(t)$, while $I_{pro}(t)\propto\sqrt{I(t)}$. That is, the malware propagates at a faster speed through infrastructure-based links than through proximity-based links [16, 51]. At some early stage $t'$,

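$$
S(t')=1-I_0-I_0\,u(t)\left[t'+\frac{\phi(t)(\eta_{inf}-1)}{2}\,t'^{2}\right]
$$

and we have the first-order ODE

$$
\dot{I}(t)=\left[\lambda_{inf}(\eta_{inf}-1)S(t')-u(t)\right]I(t)+\frac{1}{N}\int_{0}^{t}\dot{I}(\tau)\,\dot{W}(\tau,t-\tau)\,d\tau
\tag{4.32}
$$

Using the subgradient of $u(t)$ at $t=T_D$ to define the subderivative $\dot{u}(T_D)=0$, and differentiating both sides of Equation 4.32 with respect to $t$, we have the second-order ODE (neglecting the second-order term of $W(z,s)$)

$$
\ddot{I}(t)=\left[\lambda_{inf}(\eta_{inf}-1)S(t')+\sigma\lambda_{pro}\eta_{pro}N^{-1}-u(t)\right]\dot{I}(t)\triangleq\left[K_1-K_2\phi(t)-u(t)\right]\dot{I}(t)
\tag{4.33}
$$

where $K_1=\lambda_{inf}(\eta_{inf}-1)\left[1-I_0-I_0u(t)t'\right]+\sigma\lambda_{pro}\eta_{pro}N^{-1}$ and $K_2=I_0u(t)\frac{\eta_{inf}-1}{2}t'^{2}$.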

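With the initial values $I(0)=I_0$ and $\dot{I}(0)=\lambda_{inf}(\eta_{inf}-1)(1-I_0)I_0\triangleq K_3$, we obtain

$$
\begin{aligned}
I(t)&=\frac{K_3}{K_1-K_2\phi(t)-u(t)}\exp\{[K_1-K_2\phi(t)-u(t)]t\}+I_0-\frac{K_3}{K_1-K_2\phi(t)-u(t)}\\
&=\begin{cases}
\dfrac{K_3}{K_1}\exp\{K_1t\}+I_0-\dfrac{K_3}{K_1}, & t<T_D,\\[2ex]
\dfrac{K_3}{K_1-K_2\kappa-f(T_D)}\exp\{[K_1-K_2\kappa-f(T_D)]t\}+I_0-\dfrac{K_3}{K_1-K_2\kappa-f(T_D)}, & t\ge T_D
\end{cases}
\end{aligned}
\tag{4.34}
$$

The performance measure $J$ in Equation 4.25 can be evaluated as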

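$$
\begin{aligned}
J&=\int_{0}^{T_D}[N I(t)]^{\beta}\,dt+\int_{T_D}^{T_f}\left\{[N I(t)]^{\beta}+\frac{1}{2}f^{2}(T_D)\right\}dt\\
&=\frac{\left(\frac{NK_3}{K_1}\right)^{\beta}}{K_1\beta}\left(\exp\{K_1\beta T_D\}-1\right)+\left(I_0-\frac{K_3}{K_1}\right)T_D\\
&\quad+\frac{\left(\frac{NK_3}{K_1-K_2\kappa-f(T_D)}\right)^{\beta}}{[K_1-K_2\kappa-f(T_D)]\beta}\times\Big(\exp\{[K_1-K_2\kappa-f(T_D)]\beta T_f\}-\exp\{[K_1-K_2\kappa-f(T_D)]\beta T_D\}\Big)\\
&\quad+\left(I_0-\frac{K_3}{K_1-K_2\kappa-f(T_D)}+\frac{1}{2}f^{2}(T_D)\right)(T_f-T_D)
\end{aligned}
\tag{4.35}
$$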

For early-stage analysis, the optimal control signal distribution time can be obtained by $\tilde{T}_D^*=\arg\min_{T_D}J$.
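As a rough illustration of this minimization, the Python sketch below evaluates $J$ numerically from the piecewise solution in Equation 4.34 and picks the best $T_D$ from a grid of candidates. It integrates Equation 4.25 with the midpoint rule rather than relying on a closed form. The constants `K1`, `K2`, `K3`, the capability function `f`, the cap of $I(t)$ at 1, and all numeric values are assumptions made for the example, not values taken from the chapter.

```python
import math


def early_stage_infection(t, T_D, K1, K2, K3, I0, kappa, f):
    """I(t) from Equation 4.34, capped to [0, 1] (the cap is an assumption)."""
    rate = K1 if t < T_D else K1 - K2 * kappa - f(T_D)
    if abs(rate) < 1e-12:
        value = I0 + K3 * t  # limit of the closed form as rate -> 0
    else:
        exponent = min(rate * t, 700.0)  # guard against float overflow
        value = K3 / rate * math.exp(exponent) + I0 - K3 / rate
    return min(1.0, max(0.0, value))


def early_stage_cost(T_D, Tf, N, beta, K1, K2, K3, I0, kappa, f, steps=2000):
    """Midpoint-rule estimate of J with u(t) = f(T_D) for t >= T_D, nu = 1/2."""
    dt = Tf / steps
    total = 0.0
    for k in range(steps):
        t = (k + 0.5) * dt
        I = early_stage_infection(t, T_D, K1, K2, K3, I0, kappa, f)
        control = 0.5 * f(T_D) ** 2 if t >= T_D else 0.0
        total += ((N * I) ** beta + control) * dt
    return total


def best_distribution_time(candidates, **params):
    """Grid search for the T_D that minimizes the early-stage cost."""
    return min(candidates, key=lambda T_D: early_stage_cost(T_D, **params))


if __name__ == "__main__":
    params = dict(Tf=20.0, N=2000, beta=1.0, K1=0.25, K2=0.01, K3=1.2e-4,
                  I0=5e-4, kappa=0.1,
                  f=lambda T: min(1.0, 0.05 * T))  # hypothetical capability
    grid = [0.5 * k for k in range(41)]
    print(best_distribution_time(grid, **params))
```

A grid search is used because $f(T_D)$ enters both the decay rate and the control cost, so $J$ need not be convex in $T_D$ for an arbitrary capability function.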

4.5.2 Performance evaluation

The parameter setup of the simulation is the same as in the previous section.

When dynamic programming is applied to determine the optimal distribution time, severe epidemics (large $\beta$) favor early distribution to minimize the accumulated cost, as shown in Figure 4.14. Moreover, both optimal control and early-stage analysis suggest early distribution as the effectiveness of the signal ($\alpha$) increases, as shown in Figure 4.15. The relative difference between these two approaches is plotted in Figure 4.16. Compared with early-stage analysis, optimal control via dynamic programming prefers early distribution when $\alpha$ is small, while it prefers late distribution as $\alpha$ increases.

[Figure 4.14 plot: $T_D^*$ versus $\alpha$, with curves for $\beta=1$ and $\beta=2$ under self-healing and vaccine spreading.]

Figure 4.14: Optimal control signal distribution time via dynamic programming under different $(\alpha,\beta)$ configurations in IoT networks. $N=2000$, $L=50$, $I_0=1/N$, $\delta=1.1$, $\lambda_{inf}=\lambda_{pro}=0.05$, $\eta_{inf}=6$, $\eta_{pro}=3$, $\kappa=0.1$, $T_f=200$, $M=1000$, $t'=1$, and $c=10^{-3}$.

[Figure 4.15 plot: $\tilde{T}_D^*$ versus $\alpha$, with curves for $\beta=1$ and $\beta=2$ under self-healing and vaccine spreading.]

Figure 4.15: Optimal control signal distribution time via early-stage analysis under different $(\alpha,\beta)$ configurations in IoT networks. $N=2000$, $L=50$, $I_0=1/N$, $\delta=1.1$, $\lambda_{inf}=\lambda_{pro}=0.05$, $\eta_{inf}=6$, $\eta_{pro}=3$, $\kappa=0.1$, $T_f=200$, $t'=1$, and $c=10^{-3}$.

[Figure 4.16 plot: relative difference versus $\alpha$, with curves for $\beta=1$ and $\beta=2$ under self-healing and vaccine spreading.]

Figure 4.16: Relative difference of optimal control signal distribution time under different $(\alpha,\beta)$ configurations in IoT networks. $N=2000$, $L=50$, $I_0=1/N$, $\delta=1.1$, $\lambda_{inf}=\lambda_{pro}=0.05$, $\eta_{inf}=6$, $\eta_{pro}=3$, $\kappa=0.1$, $T_f=200$, $M=1000$, $t'=1$, and $c=10^{-3}$.

4.5.3 Summary

The contributions of this section are twofold. First, with the aid of epidemic modeling, we provide an analytically tractable parametric plug-in model for malware propagation control under time-dependent control capability, with the aim of determining the optimal control signal distribution time to minimize the accumulated network cost in real time via dynamic programming. Second, we demonstrate how to use our developed tools to control malware propagation in IoT networks. Compared with the self-healing scheme, we show that vaccine spreading further mitigates the accumulated cost when the immune nodes participate in forwarding the control signal. Consequently, this section provides novel mathematical tools for malware propagation with and without control over IoT networks.

From the document Security and Privacy in Internet of Things (pages 99-105)