Chapter III: Computational Electrohydrodynamic Lithography of Dielectric Films 44
3.3 EHL as a Constrained Inverse Problem and Its Optimal Control
Having given an overview of the physical principles, mathematical modeling and numerical simulation of EHL, we are now in a position to discuss electrostatic control of pattern formation in dielectric viscous thin liquid films, which essentially reduces to optimization, a well-established subject in mathematics and engineering.
Nevertheless, a solution to the inverse problem of EHL is not readily available. The challenge we face here is threefold: first, liquid films are not parameterized by a discrete set of parameters but by a continuous distribution of interface height; second, electrostatic shaping of liquid films is not a static but an evolving process; third, the evolution equation is nonlinear in both the film profile and the electrode topography. The relevant tools to overcome these difficulties are scattered across different contexts of optimization. The textbook by Tröltzsch (2010) discusses how adjoint equations arise during parameter identification of coefficients in differential equations describing steady continuum systems. A one-dimensional tutorial is presented in the monograph by Vogel (2002), which highlights the ill-posedness of these inverse problems with a strong focus on various regularization methods. Although the adjoint formalism presented there is applicable to time-dependent systems, the author admits that the numerical implementation can be much more problematic than in the steady-state case. The first volume of the book series by Bertsekas (2005) on dynamic programming elaborates on the continuous-time optimal control of deterministic discrete systems over a finite horizon (i.e. a finite span of time), which we extend to continuum fluid systems such as EHL. Finally, identifying the optimal electrode topography under the adjoint formalism reduces to a large-scale unconstrained nonlinear optimization problem, various solution methods for which are explained in the comprehensive book by Nocedal and Wright (2006) on numerical optimization. We refer interested readers to the monograph by Isakov (2006) for a rigorous mathematical treatment and recent developments in the theory of inverse problems for partial differential equations. We shall see how a combination of these tools, adopted from different perspectives of optimization, yields a drastic improvement of EHL pattern fidelity on a computational level.
State, control, constraint and objective
In this subsection we lay out the computational methodology for finding the optimal counter electrode pattern D(X). By optimal we mean that, under certain constraints (should there be any), the electrode pattern D(X) is expected to drive an evolving film state H(X, τ) from an initially uniform film of average thickness H0 as close as possible to a desired target shape Ĥ(X) at a user-specified termination time τ = τ_f.
On an abstract level, the optimization task of EHL can be reformulated as a special case of the finite horizon problem in dynamic programming (Bertsekas, 2005), which was popularized in the 1950s through the pioneering work of Bellman (1957) and has been widely applied to problems of optimal control (see Hocking (1991) for numerous applications).
In this framework, an admissible control D(X) is sought over a finite horizon [0, τ_f], which, together with its corresponding state trajectory {H(X, τ) | τ ∈ [0, τ_f]}, minimizes a terminal objective (or cost) functional of the form J[H(X, τ_f), D(X)]. Mathematically, we encode the two primary goals of electrohydrodynamic lithography, namely the fidelity of the final interface shape H(X, τ_f) to the target profile Ĥ(X) and the physical plausibility of the electrode topography D(X), into the objective functional
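\[
J[H(X, \tau_f), D(X)] = \int_\Omega \frac{1}{2} \left[ H(X, \tau_f) - \hat{H}(X) \right]^2 \mathrm{d}\Omega + R[D(X)], \tag{3.49}
\]
where Ĥ(X) is the target profile, τ_f is the termination time, and the functional R[D] is a regularization that penalizes nonphysical geometric features of the electrode pattern. For example, the H1-regularization (Heinkenschloss, 1998; Vogel, 2002)
\[
R[D(X)] = \gamma \int_\Omega \frac{1}{2} \left| \nabla_\parallel D \right|^2 \mathrm{d}\Omega \tag{3.50}
\]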
would suppress excessive high-frequency spatial oscillations and ensure a certain level of smoothness in D(X), depending on the numerical value of the penalty parameter γ > 0.
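As a minimal sketch of how (3.49) and (3.50) might be evaluated numerically, the snippet below discretizes the objective on a one-dimensional periodic grid. The function name `objective`, the grid size, the FFT-based derivative and the rectangle-rule quadrature are assumptions made purely for illustration; they are not taken from the solver described in this chapter.

```python
import numpy as np

def objective(H_final, H_target, D, length, gamma):
    """Discrete analogue of J in (3.49) with the H1 penalty (3.50).

    All arrays are samples on a uniform periodic 1D grid of the given length.
    """
    n = D.size
    dx = length / n
    k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)          # angular wavenumbers
    dD_dx = np.fft.ifft(1j * k * np.fft.fft(D)).real   # spectral derivative of D
    fidelity = 0.5 * np.sum((H_final - H_target) ** 2) * dx
    penalty = 0.5 * gamma * np.sum(dD_dx ** 2) * dx
    return fidelity + penalty

# Illustrative data (hypothetical values).
length = 2.0 * np.pi
x = np.arange(64) * length / 64
H_target = 1.0 + 0.2 * np.cos(2.0 * x)
D_smooth = 2.0 + 0.1 * np.sin(x)          # low-wavenumber electrode pattern
D_rough = 2.0 + 0.1 * np.sin(8.0 * x)     # same amplitude, eightfold wavenumber

# Perfect fidelity in both cases, so only the penalty term contributes.
J_smooth = objective(H_target, H_target, D_smooth, length, gamma=1e-2)
J_rough = objective(H_target, H_target, D_rough, length, gamma=1e-2)
print(J_smooth, J_rough)
```

For equal amplitudes the penalty scales with the square of the wavenumber, so the eightfold finer pattern is penalized 64 times more heavily; this is the mechanism by which γ discourages fine-scale oscillations in D(X).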
Our goal is to minimize the objective functional J defined in (3.49). This is not a free minimization: the final film shape H(X, τ = τ_f) is the last snapshot of the evolution equation (3.17), which is directly influenced by the electrode topography D(X). The optimization problem we are dealing with here is constrained and most likely non-convex: given an initial film thickness H0, a termination time τ_f, the dimensions of the periodic domain Ω and the desired target film profile Ĥ(X), find the optimal electrode topography
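\[
\begin{aligned}
D(X) = \operatorname*{arg\,min}_{D} \; & J[H(X, \tau_f), D(X)] \\
\text{subject to} \quad & \frac{\partial H}{\partial \tau} + N(H, D) = 0 \quad \text{for } 0 \le \tau \le \tau_f,
\end{aligned} \tag{3.51}
\]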
where N(H, D) is shorthand for the negative of the nonlinear operator on the right-hand side of equation (3.17). Most optimization algorithms are gradient-based methods which search for local minima of the objective. Unlike the explicit gradient of multivariable calculus, the gradient in optimization problem (3.51) is rather abstract: it can be interpreted as the total variation of the objective J with respect to infinitesimal changes in the control variable D(X) along the hypersurface implicitly defined by the constraint between the control D(X) and the state H(X, τ).
The type of control studied in this work falls into the open-loop category (Kirk, 2004): once the optimal topography is computed, the control action exerted by the electrode pattern D(X) is independent of the evolving film state H(X, τ). Modifying the electrode topography concurrently as the film deforms, i.e. D(X) → D(X, τ), would be difficult at the scale of micro- and nano-lithography. It is, however, experimentally feasible to alter the overall amplitude of the voltage difference across the thin gap in time, which is mathematically equivalent to replacing the effective electrostatic pressure Π(D, H) with A(τ)Π(D, H), where A(τ) is a function of time. When A(τ) is adjusted in response to the evolving film state, this type of control is called feedback or closed-loop (Kirk, 2004; Bertsekas, 2005); it is particularly useful when stochasticity in EHL is taken into account and is left for future study.
Optimal electrode topography acting on an initially flat film
Let the electrode topography D(X) be one of the optimal designs which at least locally minimizes the constrained objective functional J[H, D] defined in (3.49), and let H(X, τ) be the spatio-temporal film profile generated by D(X). In this subsection we derive a set of necessary conditions which the optimal topography D(X) and the optimal evolution trajectory of the film profile H(X, τ) must satisfy.
The derivation of the optimality conditions here closely follows the variational formalism in Tröltzsch (2010), which treats the adjoint state as a Lagrange multiplier. In order to enforce the dynamic evolution (3.17) on the film state H(X, τ), it is convenient to employ the method of augmented Lagrangian with the Lagrangian functional
\[
\mathcal{L}[H(X, \tau), D(X), \Lambda(X, \tau)] = J[H, D] - \int_0^{\tau_f} \int_\Omega \Lambda \left[ \frac{\partial H}{\partial \tau} + N(H, D) \right] \mathrm{d}\Omega \, \mathrm{d}\tau, \tag{3.52}
\]
where Λ(X, τ) is the spatio-temporal Lagrange multiplier which implicitly imposes the dynamic constraint between the time-varying film profile H(X, τ) and the electrode topography D(X), and the time integral runs up to the termination time τ_f. By definition the augmented Lagrangian ℒ[H, D, Λ] is unconstrained in all of its arguments, and its critical points coincide with those of the constrained objective J[H, D]. One way to characterize the optimal evolution trajectory H(X, τ), the optimal topography D(X) and the optimal multiplier Λ(X, τ) of the Lagrangian is to identify conditions under which the first variations δℒ of the Lagrangian ℒ[H, D, Λ], evaluated at the optimal solutions, vanish for all infinitesimal perturbations in its arguments.
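Let δH(X, τ), δD(X), δΛ(X, τ) be admissible infinitesimal variations to the optimal solutions. Then the first variations of the Lagrangian ℒ[H, D, Λ] are given by the Fréchet derivatives in the directions of these prescribed variations,
\[
\delta\mathcal{L}[H, D, \Lambda; \delta\Lambda] = -\int_0^{\tau_f} \int_\Omega \delta\Lambda \left[ \frac{\partial H}{\partial \tau} + N(H, D) \right] \mathrm{d}\Omega \, \mathrm{d}\tau, \tag{3.53}
\]
\[
\delta\mathcal{L}[H, D, \Lambda; \delta H] = \delta J[H, D; \delta H] - \int_0^{\tau_f} \int_\Omega \Lambda \left[ \frac{\partial \delta H}{\partial \tau} + L_H(H, D)\,\delta H \right] \mathrm{d}\Omega \, \mathrm{d}\tau, \tag{3.54}
\]
\[
\delta\mathcal{L}[H, D, \Lambda; \delta D] = \delta J[H, D; \delta D] - \int_0^{\tau_f} \int_\Omega \Lambda \, L_D(H, D)\,\delta D \, \mathrm{d}\Omega \, \mathrm{d}\tau, \tag{3.55}
\]
where δJ[H, D; δH] and δJ[H, D; δD] are the unconstrained (free) variations of the objective functional J with respect to its arguments. The two operators L_H and L_D in (3.54) and (3.55) are the linearizations of the nonlinear operator N(H, D) about the optimal solutions H and D, respectively,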
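\[
L_H(H, D)\,\delta H = \nabla_\parallel \cdot \left\{ M(H) \nabla_\parallel \left[ \nabla_\parallel^2 \delta H + \left. \frac{\partial \Pi}{\partial H} \right|_{H, D} \delta H \right] \right\} + \nabla_\parallel \cdot \left\{ \left. \frac{\mathrm{d}M}{\mathrm{d}H} \right|_{H} \delta H \, \nabla_\parallel \left[ \nabla_\parallel^2 H + \Pi(H, D) \right] \right\}, \tag{3.56}
\]
\[
L_D(H, D)\,\delta D = \nabla_\parallel \cdot \left\{ M(H) \nabla_\parallel \left[ \frac{\partial \Pi(H, D)}{\partial D} \delta D \right] \right\}. \tag{3.57}
\]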
The condition that the first variation δℒ in (3.53) vanishes for any spatio-temporal variation δΛ of the optimal multiplier Λ simply recovers the nonlinear constraint, i.e. the optimal trajectory H and topography D must fulfill the electrohydrodynamic thin film equation (3.17),
\[
\frac{\partial H}{\partial \tau} + N(H, D) = 0 \quad \text{for } 0 \le \tau \le \tau_f, \qquad H(X, 0) = H_0 \quad \text{at } \tau = 0. \tag{3.58}
\]
Equation (3.58) is called the state (forward) PDE.
The condition that δℒ in (3.54) vanishes for any spatio-temporal variation δH of the optimal evolution trajectory H is not yet explicit because the operators under the spatio-temporal integrals in (3.54) still act on the variation δH. We need to rearrange the integrals into equivalent forms such as ∫_0^{τ_f} ∫_Ω δH × (...) dΩ dτ so that the response to the variation δH can be explicitly identified. Performing integration by parts on (3.54) yields
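\[
\begin{aligned}
& \int_0^{\tau_f} \int_\Omega \Lambda \left[ \frac{\partial \delta H}{\partial \tau} + L_H(H, D)\,\delta H \right] \mathrm{d}\Omega \, \mathrm{d}\tau \\
&= \int_\Omega \int_0^{\tau_f} \Lambda \frac{\partial \delta H}{\partial \tau} \, \mathrm{d}\tau \, \mathrm{d}\Omega + \int_0^{\tau_f} \int_\Omega \Lambda \, L_H(H, D)\,\delta H \, \mathrm{d}\Omega \, \mathrm{d}\tau \\
&= \int_\Omega \delta H(X, \tau_f) \Lambda(X, \tau_f) \, \mathrm{d}\Omega - \int_\Omega \delta H(X, 0) \Lambda(X, 0) \, \mathrm{d}\Omega \\
&\quad - \int_\Omega \int_0^{\tau_f} \delta H \frac{\partial \Lambda}{\partial \tau} \, \mathrm{d}\tau \, \mathrm{d}\Omega + \int_0^{\tau_f} \int_\Omega \delta H \, L_H^\dagger(H, D) \Lambda \, \mathrm{d}\Omega \, \mathrm{d}\tau \\
&= \int_\Omega \delta H(X, \tau_f) \Lambda(X, \tau_f) \, \mathrm{d}\Omega + \int_0^{\tau_f} \int_\Omega \delta H \left[ -\frac{\partial \Lambda}{\partial \tau} + L_H^\dagger(H, D) \Lambda \right] \mathrm{d}\Omega \, \mathrm{d}\tau,
\end{aligned} \tag{3.59}
\]
where we drop the integral at τ = 0 because the initial condition is fixed, so that δH(X, 0) = 0. The linear operator L†_H(H, D) is the adjoint of L_H(H, D) defined in (3.56), the closed-form expression of which can be derived using Green's identity. We begin with the definition of the adjoint operator,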
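\[
\begin{aligned}
\int_\Omega A \, L_H(H, D) B \, \mathrm{d}\Omega &= \int_\Omega B \, L_H^\dagger(H, D) A \, \mathrm{d}\Omega \\
&= \int_\Omega B \left[ \nabla_\parallel^2 + \left. \frac{\partial \Pi}{\partial H} \right|_{H, D} \right] \nabla_\parallel \cdot \left[ M(H) \nabla_\parallel A \right] \mathrm{d}\Omega \\
&\quad - \int_\Omega B \left. \frac{\mathrm{d}M}{\mathrm{d}H} \right|_{H} \nabla_\parallel \left[ \nabla_\parallel^2 H + \Pi(H, D) \right] \cdot \nabla_\parallel A \, \mathrm{d}\Omega,
\end{aligned} \tag{3.60}
\]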
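where periodic boundary conditions on A, B, H and D eliminate all boundary terms arising from Green's identity. This yields the form of the adjoint operator
\[
L_H^\dagger(H, D) A = \left[ \nabla_\parallel^2 + \left. \frac{\partial \Pi}{\partial H} \right|_{H, D} \right] \nabla_\parallel \cdot \left[ M(H) \nabla_\parallel A \right] - \left. \frac{\mathrm{d}M}{\mathrm{d}H} \right|_{H} \nabla_\parallel \left[ \nabla_\parallel^2 H + \Pi(H, D) \right] \cdot \nabla_\parallel A. \tag{3.61}
\]
Recall from definition (3.49) of the objective functional J that its first unconstrained variation in the direction of δH is given by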
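\[
\delta J[H, D; \delta H] = \int_\Omega \delta H(X, \tau_f) \left[ H(X, \tau_f) - \hat{H}(X) \right] \mathrm{d}\Omega. \tag{3.62}
\]
With (3.59) and (3.62) in mind, the condition that δℒ vanishes for any spatio-temporal variation δH of the optimal trajectory H is equivalent to imposing another transient partial differential equation on the optimal multiplier Λ,
\[
-\frac{\partial \Lambda}{\partial \tau} + L_H^\dagger(H, D) \Lambda = 0 \quad \text{for } 0 \le \tau \le \tau_f, \qquad \Lambda(X, \tau_f) = H(X, \tau_f) - \hat{H}(X) \quad \text{at } \tau = \tau_f. \tag{3.63}
\]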
Equation (3.63) is called the adjoint (backward) PDE. The issue of ill-posedness and severe instability in backward parabolic PDEs is discussed in Isakov (2006). For instance, in the case where the forward model is simply a linear diffusion PDE, the operators N = L_H = L†_H = −∇_∥² all reduce to the negative Laplacian, and the adjoint PDE (3.63), if marched forward in time, is the backward diffusion equation, which is unstable at all high spatial frequencies.
In general it is advised to integrate the adjoint PDE backwards in time (Giles and Pierce, 2000; Stoll and Wathen, 2013) by relabeling the temporal variable τ′ = τ_f − τ, where τ_f is the termination time, so that the final-time condition at τ = τ_f becomes an initial condition at τ′ = 0.
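The role of the time reversal can be illustrated with the linear diffusion example above, for which each Fourier mode of the adjoint obeys dΛ_k/dτ = k²Λ_k. The following sketch propagates the same terminal data in both directions using the exact exponential factors. The grid size, the horizon `tau_f`, the terminal data and the size of the perturbation are hypothetical choices made only for this demonstration.

```python
import numpy as np

n, length, tau_f = 64, 2.0 * np.pi, 0.1     # hypothetical grid and horizon
x = np.arange(n) * length / n
k = 2.0 * np.pi * np.fft.fftfreq(n, d=length / n)

# Terminal data: a smooth misfit plus a tiny high-frequency perturbation.
terminal = np.cos(x) + 1e-12 * np.cos(20.0 * x)
terminal_hat = np.fft.fft(terminal)

# Adjoint of linear diffusion: -dLam/dtau - lap(Lam) = 0, i.e. dLam_k/dtau = k^2 Lam_k.
# In reversed time tau' = tau_f - tau this reads dLam_k/dtau' = -k^2 Lam_k,
# which is ordinary (damped) diffusion started from the terminal data.
lam_reversed = np.fft.ifft(terminal_hat * np.exp(-k**2 * tau_f)).real

# Treating the same data as an initial condition and marching forward in tau
# instead amplifies mode k by exp(k^2 tau_f).
lam_forward = np.fft.ifft(terminal_hat * np.exp(k**2 * tau_f)).real

print("max |terminal data| :", np.abs(terminal).max())
print("max |reversed march|:", np.abs(lam_reversed).max())
print("max |forward march| :", np.abs(lam_forward).max())
```

In the reversed variable the solution stays bounded by the terminal data, whereas the forward march is dominated by the amplified perturbation and by round-off at the highest wavenumbers.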
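Likewise, the condition under which the variation δℒ in (3.55) vanishes in all directions of δD requires us to first compute the adjoint operator of L_D(H, D), which has the analytic form
\[
\begin{aligned}
\int_\Omega A \, L_D(H, D) B \, \mathrm{d}\Omega &= \int_\Omega B \, L_D^\dagger(H, D) A \, \mathrm{d}\Omega
= \int_\Omega B \left. \frac{\partial \Pi}{\partial D} \right|_{H, D} \nabla_\parallel \cdot \left[ M(H) \nabla_\parallel A \right] \mathrm{d}\Omega, \\
L_D^\dagger(H, D) A &= \left. \frac{\partial \Pi}{\partial D} \right|_{H, D} \nabla_\parallel \cdot \left[ M(H) \nabla_\parallel A \right].
\end{aligned} \tag{3.64}
\]
The free variation of the objective functional J with respect to δD usually comes from the regularization R[D] alone. For instance, the H1-regularization (3.50) leads to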
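\[
\delta J[H, D; \delta D] = \delta R[D; \delta D] = \int_\Omega \gamma \nabla_\parallel \delta D \cdot \nabla_\parallel D \, \mathrm{d}\Omega = \int_\Omega \delta D \left( -\gamma \nabla_\parallel^2 D \right) \mathrm{d}\Omega. \tag{3.65}
\]
With (3.65) and (3.64) substituted into (3.55), the variation of the Lagrangian ℒ[H, D, Λ] in the directions of all δD evaluates to
\[
\begin{aligned}
\delta\mathcal{L}[H, D, \Lambda; \delta D] &= \int_\Omega \delta D \left( -\gamma \nabla_\parallel^2 D \right) \mathrm{d}\Omega - \int_0^{\tau_f} \int_\Omega \delta D \, L_D^\dagger(H, D) \Lambda \, \mathrm{d}\Omega \, \mathrm{d}\tau \\
&= \int_\Omega \delta D \left[ -\gamma \nabla_\parallel^2 D - \int_0^{\tau_f} L_D^\dagger(H, D) \Lambda \, \mathrm{d}\tau \right] \mathrm{d}\Omega,
\end{aligned} \tag{3.66}
\]
where we have exploited the fact that δD(X) is only a spatial variation to the optimal topography D(X) and therefore commutes with the time integral. The condition for δℒ to vanish for all variations δD(X) is now straightforward: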
\[
\left. \frac{\delta \mathcal{L}}{\delta D} \right|_{H, D, \Lambda} = -\gamma \nabla_\parallel^2 D - \int_0^{\tau_f} L_D^\dagger(H, D) \Lambda \, \mathrm{d}\tau = 0. \tag{3.67}
\]
Equation (3.67) is called the control PDE. We observe that the effect of the H1-regularization, i.e. the term multiplied by γ, is equivalent to Laplacian smoothing: a small amount of artificial diffusion, controlled by the size of γ, is introduced into the system so that the optimal topography D(X) is regularized at high spatial frequencies, ensuring physical plausibility in the optimal design of the electrode pattern.
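The adjoint identities (3.60) and (3.64) that underpin the adjoint and control PDEs can be checked numerically before they are used in an optimizer. The sketch below does so on a one-dimensional periodic grid with a Fourier spectral derivative. The mobility M(H) = H³, the pressure Π(H, D) = 1/(D − H)², the sample profiles and all function names are assumptions introduced only for this check and should not be read as the model of equation (3.17).

```python
import numpy as np

n, length = 65, 2.0 * np.pi      # odd n: the spectral derivative is exactly skew-symmetric
dx = length / n
x = np.arange(n) * dx
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)

def grad(f):
    return np.fft.ifft(1j * k * np.fft.fft(f)).real

def lap(f):
    return grad(grad(f))

# Hypothetical state, control and constitutive laws (assumptions).
H = 1.0 + 0.1 * np.cos(x)
D = 2.0 + 0.2 * np.sin(2.0 * x)
M, dM = H**3, 3.0 * H**2                           # assumed M(H) and dM/dH
Pi = 1.0 / (D - H)**2                              # assumed Pi(H, D)
Pi_H, Pi_D = 2.0 / (D - H)**3, -2.0 / (D - H)**3   # its partial derivatives
P_x = grad(lap(H) + Pi)                            # gradient of lap(H) + Pi

def L_H(dH):        # linearization (3.56)
    return grad(M * grad(lap(dH) + Pi_H * dH)) + grad(dM * dH * P_x)

def L_H_adj(A):     # adjoint operator (3.61)
    q = grad(M * grad(A))
    return lap(q) + Pi_H * q - dM * P_x * grad(A)

def L_D(dD):        # linearization (3.57)
    return grad(M * grad(Pi_D * dD))

def L_D_adj(A):     # adjoint operator (3.64)
    return Pi_D * grad(M * grad(A))

def inner(f, g):    # discrete L2 inner product over the periodic domain
    return np.sum(f * g) * dx

rng = np.random.default_rng(0)
modes = np.arange(1, 6)[:, None]

def random_field():
    """Smooth random periodic test function built from five Fourier modes."""
    a, b = rng.normal(size=(2, 5, 1))
    return np.sum(a * np.cos(modes * x) + b * np.sin(modes * x), axis=0)

A, B = random_field(), random_field()
gap_H = inner(A, L_H(B)) - inner(B, L_H_adj(A))
gap_D = inner(A, L_D(B)) - inner(B, L_D_adj(A))
print("adjoint gap for L_H:", gap_H)
print("adjoint gap for L_D:", gap_D)
```

Both gaps should be at the level of floating-point round-off; a gap comparable in size to the inner products themselves would indicate a sign or ordering error in the coded adjoint.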
[Figure 3.9: flow chart. Recoverable node labels: H(X, 0), H(X, τ), final film shape, target profile; Λ at the final time, Λ(X, τ), Λ(X, 0); δL/δD, δR/δD, δJ/δD; D(X); "Raw gradient"; "Regularization"; "Optimal" leading to "Stop"; "Suboptimal" leading to "Gradient update".]
Figure 3.9: Flow chart for computing the constrained variational derivative with respect to the electrode topography D(X): state PDE (blue) for H(X, τ), adjoint PDE (red) for Λ(X, τ) and control equation (green) for D(X).
In order to obtain the optimal topography D(X), one must simultaneously solve all three equations for the optimal solutions H(X, τ), D(X) and Λ(X, τ), which is a nontrivial task. In practice, the solutions to (3.58), (3.63) and (3.67) are usually acquired incrementally (Nocedal and Wright, 2006). In a typical iterative framework, we start with a good initial guess of the topography function D(X). The control equation (3.67) is relaxed, whereas the state equation (3.58) and the adjoint equation (3.63) are solved exactly for H(X, τ) and Λ(X, τ). We then compute the residual of the control equation (3.67) and use the gradient information δℒ/δD to help the objective functional J[H, D] descend. Once J has nearly reached a (local) minimum, we expect the iterates H, Λ and D to have converged to their (local) optimal solutions as well. This entire process is illustrated in the flow chart shown in figure 3.9.
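The iteration described above can be sketched on a toy problem for which the state and adjoint PDEs are solvable in closed form. The snippet assumes a constant mobility M = 1 and a linear pressure Π = D, so that N(H, D) = ∇²(∇²H + D); these are simplifications introduced here for illustration and not the EHL model (3.17). The function names, parameter values and the fixed-step steepest descent are likewise hypothetical and stand in for the more capable methods of Nocedal and Wright (2006).

```python
import numpy as np

n, length = 65, 2.0 * np.pi           # hypothetical periodic grid (odd n: no Nyquist mode)
dx = length / n
x = np.arange(n) * dx
k = 2.0 * np.pi * np.fft.fftfreq(n, d=dx)
tau_f, gamma, H0 = 1.0, 1e-3, 1.0     # hypothetical horizon, penalty, initial thickness
H_target = H0 + 0.2 * np.cos(2.0 * x)

# Toy model (assumption): mode k obeys dH_k/dtau = -k^4 H_k + k^2 D_k.
# Integrating exactly over [0, tau_f] gives H_k(tau_f) = G_k D_k for k != 0.
G = np.zeros(n)
nz = k != 0
G[nz] = (1.0 - np.exp(-k[nz]**4 * tau_f)) / k[nz]**2

def apply_G(f):
    return np.fft.ifft(G * np.fft.fft(f)).real

def solve_state(D):
    """Final film shape from the toy state PDE started at the uniform film H0."""
    return H0 + apply_G(D)

def objective(D):
    """Discrete objective: terminal misfit plus H1 penalty on D."""
    dD = np.fft.ifft(1j * k * np.fft.fft(D)).real
    misfit = solve_state(D) - H_target
    return 0.5 * dx * np.sum(misfit**2) + 0.5 * gamma * dx * np.sum(dD**2)

def gradient(D):
    """Residual of the control equation for the toy model."""
    lam_final = solve_state(D) - H_target    # terminal condition of the adjoint PDE
    # Adjoint modes decay backwards in time as exp(-k^4 (tau_f - tau)); with
    # L_D^dagger = lap, minus its time integral equals G applied to lam_final.
    raw = apply_G(lam_final)
    lap_D = np.fft.ifft(-k**2 * np.fft.fft(D)).real
    return -gamma * lap_D + raw

D = np.zeros(n)                        # initial guess of the topography
history = [objective(D)]
for _ in range(300):
    D = D - 0.5 * gradient(D)          # fixed-step steepest descent (assumed step size)
    history.append(objective(D))
print("initial J:", history[0], "final J:", history[-1])
```

Each pass through the loop performs one state solve, one adjoint solve and one gradient update, mirroring the cycle in figure 3.9. Because this toy objective is quadratic in D, the residual of the control equation can be compared against a central finite difference of `objective` as an additional consistency check.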