• Tidak ada hasil yang ditemukan

NCMs as Lyapunov Functions

Chapter 6: Learning-based Safe and Robust Control and Estimation

6.2 NCMs as Lyapunov Functions

where ๐‘ฅ = [๐‘ฅ1, ๐‘ฅ2, ๐‘ฅ3]โŠค, ๐œŽ = 10, ๐›ฝ = 8/3, and ๐œŒ = 28. It can be seen that the steady-state estimation errors of the NCM and CV-STEM are below the optimal steady-state upper bound of Theorem 4.5 (dotted line in the left-hand side of Fig. 6.2) while the EKF has a larger error compared to the other two. In this example, training data is sampled along100trajectories with uniformly distributed initial conditions (โˆ’10 โ‰ค ๐‘ฅ๐‘– โ‰ค 10, ๐‘– = 1,2,3) using Theorem 4.5, and then modeled by a DNN as in Definition 6.2. Additional implementation and network training details can be found in [1].

Example 6.2. Consider a robust feedback control problem of the planar spacecraft perturbed by deterministic disturbances, the unperturbed dynamics of which is given as follows [9]:

๏ฃฎ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฐ

๐‘š 0 0

0 ๐‘š 0

0 0 ๐ผ

๏ฃน

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃป

ยฅ ๐‘ฅ =

๏ฃฎ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฐ

cos๐œ™ sin๐œ™ 0

โˆ’sin๐œ™ cos๐œ™ 0

0 0 1

๏ฃน

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃป

๏ฃฎ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฐ

โˆ’1 โˆ’1 0 0 1 1 0 0

0 0 โˆ’1 โˆ’1 0 0 1 1

โˆ’โ„“ โ„“ โˆ’๐‘ ๐‘ โˆ’โ„“ โ„“ โˆ’๐‘ ๐‘

๏ฃน

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃป ๐‘ข

where ๐‘ฅ = [๐‘๐‘ฅ, ๐‘๐‘ฆ, ๐œ™,๐‘ยค๐‘ฅ,๐‘ยค๐‘ฆ,๐œ™ยค]๐‘‡ with ๐‘๐‘ฅ, ๐‘๐‘ฆ, and ๐œ™ being the horizontal coordi- nate, vertical coordinate, and yaw angle of the spacecraft, respectively, with the other variables defined as๐‘š =mass of spacecraft,๐ผ =yaw moment of inertia,๐‘ข = thruster force vector,โ„“ =half-depth of spacecraft, and๐‘=half-width of spacecraft (see Fig. 8 of [9] for details).

As shown in the right-hand side of Fig. 6.2, the NCM keeps their trajectories within the bounded error tube of Theorem 4.2 (shaded region) around the target trajectory (dotted line) avoiding collision with the circular obstacles, even under the presence of disturbances, whilst requiring much smaller computational cost than the CV- STEM. Data sampling and the NCM training are performed as in Example 6.1 [1].

Remark 6.3. We could also simultaneously synthesize a controller๐‘ขand the NCM, and Theorem 6.1 can provide formal robustness and stability guarantees even for such cases [8], [10]. In Chapter 7, we generalize the NCM to learn contraction theory-based robust control using a DNN only taking๐‘ฅ, ๐‘ก, and a vector containing local environment information as its inputs, to avoid the online computation of๐‘ฅ๐‘‘ for the sake of automatic guidance and control implementable in real-time.

0 20 40 timet

10 20 30 40

estimationerror||xโˆ’ห†x||

EKF CV-STEM NCM Mean of upper bounds

0 10 20

horizontal coordinate px

0 10 20

verticalcoordinatepy

start

goal LQR CV-STEM NCM Desired trajectory

Bounded error tube

Figure 6.2: Lorenz oscillator state estimation error smoothed using a 15-point moving average filter in Example 6.1 (left), and spacecraft positions (๐‘๐‘ฅ, ๐‘๐‘ฆ) on a planar field in Example 6.2 (right).

result different from Theorems 6.1 and 6.3. To this end, let us recall that designing optimal ๐‘˜ of ๐‘ข = ๐‘˜(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) reduces to designing optimal ๐พ(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) of ๐‘ข = ๐‘ข๐‘‘ โˆ’๐พ(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) (๐‘ฅโˆ’๐‘ฅ๐‘‘) due to Lemma 3.2 of Chapter 3. For such ๐‘ข, the virtual system of (6.1)/(6.2) and (6.3) without any perturbation, which has ๐‘ž = ๐‘ฅ and๐‘ž=๐‘ฅ๐‘‘as its particular solutions, is given as follows:

ยค

๐‘ž =(๐ด(๐œš, ๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) โˆ’๐ต(๐‘ฅ , ๐‘ก)๐พ(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก)) (๐‘žโˆ’๐‘ฅ๐‘‘) + ๐‘“(๐‘ฅ๐‘‘, ๐‘ก) +๐ต(๐‘ฅ๐‘‘, ๐‘ก)๐‘ข๐‘‘ (6.19) where ๐ดis the SDC matrix of (6.1) in Lemma 3.1, i.e., ๐ด(๐œš, ๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) (๐‘ฅโˆ’๐‘ฅ๐‘‘) = ๐‘“(๐‘ฅ , ๐‘ก) +๐ต(๐‘ฅ , ๐‘ก)๐‘ข๐‘‘โˆ’ ๐‘“(๐‘ฅ๐‘‘, ๐‘ก) โˆ’๐ต(๐‘ฅ๐‘‘, ๐‘ก)๐‘ข๐‘‘. Using the result of Theorem 2.3, one of the Lyapunov functions for the virtual system (6.19) with unknown๐พ can be given as the following transformed squared length integrated over๐‘ฅ of (6.1)/(6.2) and๐‘ฅ๐‘‘ of (6.3):

๐‘‰NCM=

โˆซ ๐‘ฅ

๐‘ฅ๐‘‘

๐›ฟ๐‘žโŠคM (๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก)๐›ฟ๐‘ž (6.20)

where M defines the NCM of Definition 6.2 modeling ๐‘€ of the CV-STEM con- traction metric in Theorem 4.2.

Theorem 6.4. Suppose(6.1)and(6.2)are controlled by๐‘ข =๐‘ข๐‘‘โˆ’๐พโˆ—(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก)e, wheree=๐‘ฅโˆ’๐‘ฅ๐‘‘, and that ๐พโˆ—is designed by the following convex optimization for given (๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก):

(๐พโˆ—, ๐‘โˆ—) =arg min

๐พโˆˆC๐พ, ๐‘โˆˆR

โˆฅ๐พโˆฅ2๐น +๐‘2 (6.21)

s.t.

โˆซ ๐‘ฅ ๐‘ฅ๐‘‘

๐›ฟ๐‘žโŠค( ยคM +2 sym(M๐ดโˆ’ M๐ต๐พ))๐›ฟ๐‘ž โ‰ค โˆ’2๐›ผ

โˆซ ๐‘ฅ ๐‘ฅ๐‘‘

๐›ฟ๐‘žโŠค(M +๐›ฝI)๐›ฟ๐‘ž+๐‘ (6.22)

where M defines the NCM of Definition 6.2 that models ๐‘€ of the CV-STEM con- traction metric in Theorem 4.2 as in(6.18),C๐พis a convex set containing admissible ๐พ,๐›ผ is the contraction rate, and ๐›ฝis as defined in theorem 3.1 (i.e., ๐›ฝ = 0for de- terministic systems and ๐›ฝ=๐›ผ๐‘  for stochastic systems). Then Theorems 2.4 and 2.5 still hold for the Lyapunov function defined in(6.20), yielding the following bounds:

โˆฅe(๐‘ก) โˆฅ2 โ‰ค ๐‘‰NCM(0) ๐‘š

๐‘’โˆ’2๐›ผ๐‘๐‘ก+

ยฏ ๐‘ ๐‘š + ๐‘‘ยฏ2๐‘๐‘š

๐›ผ๐‘‘๐‘š

2๐›ผ๐‘

(1โˆ’๐‘’โˆ’2๐›ผ๐‘๐‘ก) (6.23) E

โˆฅe(๐‘ก) โˆฅ2

โ‰ค E[๐‘‰NCM(0)]

๐‘š

๐‘’โˆ’2๐›ผ๐‘ก +๐ถ๐ถ๐‘š+๐‘ยฏ 2๐›ผ๐‘š

(6.24) P[โˆฅe(๐‘ก) โˆฅ โ‰ฅ ๐œ€] โ‰ค 1

๐œ€2

E[๐‘‰NCM(0)]

๐‘š

๐‘’โˆ’2๐›ผ๐‘ก+ ๐ถ๐ถ๐‘š+๐‘ยฏ 2๐›ผ๐‘š

(6.25) where ๐‘ยฏ = sup๐‘ฅ ,๐‘ฅ

๐‘‘,๐‘ข๐‘‘,๐‘ก ๐‘, ๐‘šI โชฏ M โชฏ ๐‘šI as in (2.26), ๐›ผ๐‘‘ โˆˆ R>0 is an arbitrary positive constant (see(6.26)) selected to satisfy๐›ผ๐‘ =๐›ผโˆ’๐›ผ๐‘‘/2 >0,๐ถ๐ถ =๐‘”ยฏ2

๐‘(2๐›ผ๐บโˆ’1+ 1), ๐›ผ๐บ โˆˆ R>0 is an arbitrary constant as in Theorem 2.5, ๐œ€ โˆˆ Rโ‰ฅ0, and the disturbance terms๐‘‘ยฏ๐‘and๐‘”ยฏ๐‘are given in(6.1)and(6.2), respectively. Note that the problem(6.21)is feasible due to the constraint relaxation variable๐‘.

Furthermore, if๐œ–โ„“ =0in(6.18),(6.21)with๐‘=0is always feasible, and the optimal feedback gain ๐พโˆ— minimizes its Frobenius norm under the contraction constraint (6.22).

Proof. Computing๐‘‰ยคNCM for the deterministic system (6.1) along with the virtual dynamics (6.19) and the stability condition (6.22) yields

๐‘‰ยคNCM โ‰ค โˆ’2๐›ผ๐‘‰NCM+๐‘‘ยฏ๐‘

โˆš ๐‘š

โˆš๏ธ

๐‘‰NCM+ ๐‘ยฏ โ‰ค โˆ’2(๐›ผโˆ’๐›ผ๐‘‘/2)๐‘‰NCM+๐‘‘ยฏ2

๐‘๐‘š/๐›ผ๐‘‘+๐‘ยฏ(6.26) where the relation 2๐‘Ž ๐‘ โ‰ค ๐›ผโˆ’1

๐‘‘ ๐‘Ž2+๐›ผ๐‘‘๐‘2, which holds for any๐‘Ž, ๐‘ โˆˆRand๐›ผ๐‘‘ โˆˆR>0

(๐‘Ž = ๐‘‘ยฏ๐‘

โˆš

๐‘š and๐‘ = โˆš

๐‘‰NCM in this case), is used to obtain the second inequality.

Since we can arbitrarily select ๐›ผ๐‘‘ to have ๐›ผ๐‘ = ๐›ผโˆ’ ๐›ผ๐‘‘/2 > 0, (6.26) gives the exponential bound (6.23) due to the comparison lemma of Lemma 2.1. Similarly, computingโ„’๐‘‰NCMfor the stochastic system (6.2) yields

โ„’๐‘‰NCM โ‰ค โˆ’2๐›ผ๐‘‰NCM+๐ถ๐‘๐‘š+๐‘ยฏ

which gives (6.24) and (6.25) due to Lemma 2.2. If๐œ–โ„“ =0 in (6.18),๐พ =๐‘…โˆ’1๐ตโŠค๐‘€ satisfies (6.22) with๐‘ =0 as we haveM = ๐‘€and๐‘€defines the contraction metric of Theorem 4.2, which implies that (6.21) with ๐‘ =0 is always feasible.

We could also use the CCM-based differential feedback formulation of Theorem 4.6 in Theorem 6.4 (see [14], [15]). To this end, define๐ธNCM(Riemannian energy) as follows:

๐ธ =

โˆซ 1 0

๐›พ๐œ‡(๐œ‡, ๐‘ก)โŠคM (๐›พ(๐œ‡, ๐‘ก), ๐‘ก)๐›พ๐œ‡(๐œ‡, ๐‘ก)๐‘‘๐œ‡ (6.27) where M defines the NCM of Definition 6.2 that models ๐‘€ of the CCM in The- orem 4.6, ๐›พ is the minimizing geodesic connecting ๐‘ฅ(๐‘ก) = ๐›พ(1, ๐‘ก) of (6.1) and ๐‘ฅ๐‘‘(๐‘ก) =๐›พ(0, ๐‘ก) of (6.3) (see Theorem 2.3), and๐›พ๐œ‡ =๐œ• ๐›พ/๐œ• ๐œ‡.

Theorem 6.5. Suppose๐‘ข = ๐‘ขโˆ— of (6.1) is designed by the following convex opti- mization for given (๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก):

(๐‘ขโˆ—, ๐‘โˆ—) =arg min

๐‘ขโˆˆC๐‘ข, ๐‘โˆˆR

โˆฅ๐‘ขโˆ’๐‘ข๐‘‘โˆฅ2+๐‘2 (6.28)

s.t.

๐œ• ๐ธ

๐œ• ๐‘ก

+2๐›พ๐œ‡(1, ๐‘ก)โŠคM (๐‘ฅ , ๐‘ก) (๐‘“(๐‘ฅ , ๐‘ก) +๐ต(๐‘ฅ , ๐‘ก)๐‘ข)

s.t. โˆ’2๐›พ๐œ‡(0, ๐‘ก)โŠคM (๐‘ฅ๐‘‘, ๐‘ก) (๐‘“(๐‘ฅ๐‘‘, ๐‘ก) +๐ต(๐‘ฅ๐‘‘, ๐‘ก)๐‘ข๐‘‘) โ‰ค โˆ’2๐›ผ ๐ธ+๐‘ (6.29) whereM defines the NCM of Definition 6.2 that models the CCM of Theorem 4.6, andC๐‘ขis a convex set containing admissible๐‘ข. Then the bound(6.23)of Theorem 6.4 holds.

Furthermore, if ๐œ–โ„“ = 0for the NCM learning error ๐œ–โ„“ in(6.18), then the problem (6.28) with ๐‘ = 0is always feasible, and ๐‘ขโˆ— minimizes the deviation of๐‘ข from ๐‘ข๐‘‘ under the contraction constraint(6.29).

Proof. Since the right-hand side of (6.29) represents ๐ธยค of (6.27) if ๐‘‘๐‘ = 0 in (6.1), the comparison lemma of Lemma 2.1 gives (6.23) as in Theorem 6.4. The rest follows from the fact that๐‘ข of Theorem 4.6 is a feasible solution of (6.28) if ๐‘ =0 [14], [15].

Remark 6.4. As discussed in Remark 4.2, Theorem 6.5 can also be formulated in a stochastic setting as in Theorem 6.4 using the technique discussed in [8].

Theorems 6.4 and 6.5 provide a perspective on the NCM stability different from Theorems 6.1 and 6.3 written in terms of the constraint relaxation variable ๐‘, by directly using the NCM as a contraction metric instead of the CV-STEM contraction metric or the CCM. Since the CLF problems are feasible with ๐‘ =0 when we have zero NCM modeling error, it is implied that the upper bound of ๐‘ (i.e., ยฏ๐‘ in (6.23)

and (6.24)) decreases as we achieve a smaller learning error๐œ–โ„“ using more training data for verifying (6.18) (see Remark 5.1). This intuition is also formalized later in [16]. See [14], [15] for how we compute the minimizing geodesic ๐›พ and its associated Riemannian energy๐ธ in Theorem 6.5.

Example 6.3. If C๐พ = R๐‘šร—๐‘› and C๐‘ข = R๐‘š in(6.21) and(6.28), respectively, they can both be expressed as the following quadratic optimization problem:

๐‘ฃโˆ— =arg min

๐‘ฃโˆˆRv

1

2๐‘ฃโŠค๐‘ฃ+๐‘(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก)โŠค๐‘ฃ

s.t. ๐œ‘0(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก) +๐œ‘1(๐‘ฅ , ๐‘ฅ๐‘‘, ๐‘ข๐‘‘, ๐‘ก)โŠค๐‘ฃ โ‰ค 0

by defining๐‘ โˆˆRv, ๐œ‘0 โˆˆR, ๐œ‘1 โˆˆRv, and๐‘ฃ โˆˆRvappropriately. Applying the KKT condition [17, pp. 243-244] to this problem, we can show that

๐‘ฃโˆ— =

๏ฃฑ๏ฃด

๏ฃด

๏ฃฒ

๏ฃด๏ฃด

๏ฃณ

โˆ’๐‘โˆ’ ๐œ‘0โˆ’๐œ‘

โŠค 1๐‘ ๐œ‘โŠค

1๐œ‘1

๐œ‘1 if๐œ‘0โˆ’๐œ‘โŠค

1๐‘ > 0

โˆ’๐‘ otherwise.

This implies that the controller uses ๐‘ข๐‘‘ with the feedback ( (๐œ‘0โˆ’๐œ‘โŠค

1๐‘)/๐œ‘โŠค

1๐œ‘1)๐œ‘1 only if๐‘ฅยค = ๐‘“(๐‘ฅ , ๐‘ก) +๐ต(๐‘ฅ , ๐‘ก)๐‘ขwith๐‘ข=๐‘ข๐‘‘is not contracting, i.e.,๐œ‘0โˆ’๐œ‘โŠค

1๐‘ >0.

Example 6.4. When ๐‘ž is parameterized linearly by ๐œ‡ in Theorem 6.4, i.e., ๐‘ž = ๐œ‡๐‘ฅ+ (1โˆ’๐œ‡)๐‘ฅ๐‘‘, we have๐‘‰NCM= (๐‘ฅโˆ’๐‘ฅ๐‘‘)โŠคM (๐‘ฅโˆ’๐‘ฅ๐‘‘)as in the standard Lyapunov function formulation [11]โ€“[13]. Also, a linear input constraint,๐‘ขmin โ‰ค ๐‘ข โ‰ค ๐‘ขmax can be implemented by using

C๐พ ={๐พ โˆˆR๐‘šร—๐‘›|๐‘ข๐‘‘โˆ’๐‘ขmax โ‰ค ๐พe โ‰ค๐‘ข๐‘‘โˆ’๐‘ขmin} C๐‘ข ={๐‘ข โˆˆR๐‘š|๐‘ขmin โ‰ค ๐‘ข โ‰ค ๐‘ขmax}

in(6.21)and(6.28), respectively.

Remark 6.5. For the CV-STEM and NCM state estimation in Theorems 4.5, 6.2, and 6.3, we could design a similar Lyapunov function-based estimation scheme if its contraction constraints of Theorem 4.5 do not explicitly depend on the actual state ๐‘ฅ. This can be achieved by using๐ด(๐œš๐ด,๐‘ฅ , ๐‘กห† )and๐ถ(๐œš๐ถ,๐‘ฅ , ๐‘กห† )instead of๐ด(๐œš๐ด, ๐‘ฅ ,๐‘ฅ , ๐‘กห† ) and๐ถ(๐œš๐ถ, ๐‘ฅ ,๐‘ฅ , ๐‘กห† ) in Theorem 4.5 as in Theorem 3.2 [18], leading to exponential boundedness of system trajectories as derived in [18]โ€“[20].