

Chapter 6: Learning-based Safe and Robust Control and Estimation

6.1 Stability of NCM-based Control and Estimation

The NCM exhibits superior incremental robustness and stability due to its internal contracting property.

Theorem 6.1. Let $\mathcal{M}$ define the NCM in Definition 6.2, and let $u^* = u_d - R^{-1} B^\top M (x - x_d)$ for $M$ and $R$ given in Theorem 4.2. Suppose that the systems (6.1) and (6.2) are controlled by $u_L$, which computes the CV-STEM controller $u^*$ of Theorem 4.2 replacing $M$ by $\mathcal{M}$, i.e.,

$$u_L = u_d - R(x, x_d, u_d, t)^{-1} B(x, t)^\top \mathcal{M}(x, x_d, u_d, t)\, e \qquad (6.7)$$

for $e = x - x_d$. Note that we could use the differential feedback control $u = u_d + \int_0^1 k \, d\mu$ of Theorem 4.6 for $u^*$ and $u_L$. Define $\Delta_L$ of Theorems 5.2 and 5.3 as follows:

$$\Delta_L(\varpi, t) = B(x, t) \left( u_L(x, x_d, u_d, t) - u^*(x, x_d, u_d, t) \right) \qquad (6.8)$$

where we use $q(0, t) = \xi_0(t) = x_d(t)$, $q(1, t) = \xi_1(t) = x(t)$, and $\varpi = [x^\top, x_d^\top, u_d^\top]^\top$ in the learning-based control formulation (5.1) and (5.2). If the NCM is learned to satisfy the learning error assumptions of Theorems 5.2 and 5.3 for $\Delta_L$ given by (6.8), then (5.11), (5.14), and (5.15) of Theorems 5.2 and 5.3 hold as follows:

โˆฅe(๐‘ก) โˆฅ โ‰ค ๐‘‰โ„“(0)

โˆš ๐‘š

๐‘’โˆ’๐›ผโ„“๐‘ก+ ๐œ–โ„“0+๐‘‘ยฏ๐‘ ๐›ผโ„“

โˆš๏ธ„

๐‘š ๐‘š

(1โˆ’๐‘’โˆ’๐›ผโ„“๐‘ก) (6.9)

E

โˆฅe(๐‘ก) โˆฅ2

โ‰ค E[๐‘‰๐‘ โ„“(0)]

๐‘š

๐‘’โˆ’2๐›ผโ„“๐‘ก+ ๐ถ๐ถ 2๐›ผโ„“

๐‘š ๐‘š

(6.10) P[โˆฅe(๐‘ก) โˆฅ โ‰ฅ ๐œ€] โ‰ค 1

๐œ€2

E[๐‘‰๐‘ โ„“(0)]

๐‘š

๐‘’โˆ’2๐›ผโ„“๐‘ก+ ๐ถ๐ถ 2๐›ผโ„“

๐‘š ๐‘š

(6.11) wheree =๐‘ฅโˆ’๐‘ฅ๐‘‘, ๐ถ๐ถ =๐‘”ยฏ2

๐‘(2๐›ผ๐บโˆ’1+1) +๐œ–2

โ„“0๐›ผโˆ’1

๐‘‘ , the disturbance upper bounds๐‘‘ยฏ๐‘ and๐‘”ยฏ๐‘ are given in(6.1)and(6.2), respectively, and the other variables are defined in Theorems 4.2, 5.2 and 5.3.

Proof. Let $g$ of the virtual systems given in Theorems 5.2 and 5.3 be $g = \varsigma$, where $\varsigma$ is given by (3.13) (i.e., $\varsigma = (A - B R^{-1} B^\top M)(q - x_d) + \dot{x}_d$). By definition of $\Delta_L$, this has $q = x$ and $q = x_d$ as its particular solutions (see Example 5.1). As defined in the proof of Theorem 3.1, we have $d = \mu d_c$ and $G = \mu G_c$ for these virtual systems, resulting in the upper bounds of disturbances in the learning-based control formulation (5.11) and (5.14) given as

$$\bar{d} = \bar{d}_c \text{ and } \bar{g} = \bar{g}_c. \qquad (6.12)$$

Since $M$ and $u^*$ are constructed to render $\dot{q} = \varsigma(q, \varpi, t)$ contracting as presented in Theorem 4.2, and we have $\|\Delta_L\| \le \epsilon_{\ell 0} + \epsilon_{\ell 1} \|\xi_1 - \xi_0\|$ due to the learning assumption (5.3), Theorem 5.2 implies the deterministic bound (6.9), and Theorem 5.3 implies the stochastic bounds (6.10) and (6.11). Note that using the differential feedback control of Theorem 4.6 as $u^*$ also gives the bounds (6.9)–(6.11) following the same argument [8].

Due to the estimation and control duality in differential dynamics observed in Chapter 4, we have a similar result to Theorem 6.1 for the NCM-based state estimator.

Theorem 6.2. Let $\mathcal{M}$ define the NCM in Definition 6.2, and let $L(\hat{x}, t) = M \bar{C}^\top R^{-1}$ for $M$, $\bar{C}$, and $R$ given in Theorem 4.5. Suppose that the systems (6.4) and (6.5) are estimated with an estimator gain $L_L$, which computes $L$ of the CV-STEM estimator $\dot{\hat{x}} = f(\hat{x}, t) + L(\hat{x}, t)(y - h(\hat{x}, t))$ in Theorem 4.5 replacing $M$ by $\mathcal{M}$, i.e.,

$$\dot{\hat{x}} = f(\hat{x}, t) + L_L(\hat{x}, t) (y - h(\hat{x}, t)) \qquad (6.13)$$
$$\phantom{\dot{\hat{x}}} = f(\hat{x}, t) + \mathcal{M}(\hat{x}, t) \bar{C}(\varrho_c, \hat{x}, t)^\top R(\hat{x}, t)^{-1} (y - h(\hat{x}, t))$$

Define $\Delta_L$ of Theorems 5.2 and 5.3 as follows:

$$\Delta_L(\varpi, t) = \left( L_L(\hat{x}, t) - L(\hat{x}, t) \right) \left( h(x, t) - h(\hat{x}, t) \right) \qquad (6.14)$$

where we use $q(0, t) = \xi_0(t) = x(t)$, $q(1, t) = \xi_1(t) = \hat{x}(t)$, and $\varpi = [x^\top, \hat{x}^\top]^\top$ in the learning-based control formulation (5.1) and (5.2). If the NCM is learned to satisfy the learning error assumptions of Theorems 5.2 and 5.3 for $\Delta_L$ given by (6.14), then (5.11), (5.14), and (5.15) of Theorems 5.2 and 5.3 hold as follows:

โˆฅe(๐‘ก) โˆฅ โ‰ค

โˆš

๐‘š๐‘‰โ„“(0)๐‘’โˆ’๐›ผโ„“๐‘ก+ ๐‘‘ยฏ๐‘Ž

โˆš๏ธƒ

๐‘š

๐‘š +๐‘‘ยฏ๐‘๐‘š ๐›ผโ„“

(1โˆ’๐‘’โˆ’๐›ผโ„“๐‘ก) (6.15) E

โˆฅe(๐‘ก) โˆฅ2

โ‰ค ๐‘šE[๐‘‰๐‘ โ„“(0)]๐‘’โˆ’2๐›ผโ„“๐‘ก+ ๐ถ๐ธ 2๐›ผโ„“

๐‘š ๐‘š

(6.16) P[โˆฅe(๐‘ก) โˆฅ โ‰ฅ ๐œ€] โ‰ค 1

๐œ€2

๐‘šE[๐‘‰๐‘ โ„“(0)]๐‘’โˆ’2๐›ผโ„“๐‘ก+ ๐ถ๐ธ 2๐›ผโ„“

๐‘š ๐‘š

(6.17) wheree =๐‘ฅห†โˆ’๐‘ฅ,๐‘‘ยฏ๐‘Ž =๐œ–โ„“0+๐‘‘ยฏ๐‘’0,๐‘‘ยฏ๐‘= ๐œŒยฏ๐‘ยฏ๐‘‘ยฏ๐‘’1,๐ถ๐ธ =(๐‘”ยฏ2

๐‘’0+๐œŒยฏ2๐‘ยฏ2๐‘”ยฏ2

๐‘’1๐‘š2) (2๐›ผ๐บโˆ’1+1) + ๐œ–2

โ„“0๐›ผโˆ’1

๐‘‘ , the disturbance upper bounds ๐‘‘ยฏ๐‘’0 ๐‘‘ยฏ๐‘’1, ๐‘”ยฏ๐‘’0, and๐‘”ยฏ๐‘’1 are given in(6.4) and (6.5), and the other variables are defined in Theorems 4.5, 5.2 and 5.3.

Proof. Let $g$ of the virtual systems given in Theorems 5.2 and 5.3 be $g = \varsigma$, where $\varsigma$ is given by (4.10) (i.e., $\varsigma = (A - L C)(q - x) + f(x, t)$). By definition of $\Delta_L$, this has $q = \hat{x}$ and $q = x$ as its particular solutions (see Example 5.2). As defined in the proof of Theorem 4.3, we have

$$d = (1 - \mu) d_{e0} + \mu L_L d_{e1} \text{ and } G = [(1 - \mu) G_{e0},\ \mu L_L G_{e1}]$$

for these virtual systems, resulting in the upper bounds of external disturbances in the learning-based control formulation (5.11) and (5.14) given as

$$\bar{d} = \bar{d}_{e0} + \bar{\rho}\, \bar{c}\, \bar{d}_{e1} \sqrt{\overline{m}\, \underline{m}} \text{ and } \bar{g} = \sqrt{\bar{g}_{e0}^2 + \bar{\rho}^2 \bar{c}^2 \bar{g}_{e1}^2 \overline{m}^2}.$$

Since $W = M^{-1}$ is constructed to render $\dot{q} = \varsigma(q, \varpi, t)$ contracting as presented in Theorem 4.5, and we have $\|\Delta_L\| \le \epsilon_{\ell 0} + \epsilon_{\ell 1} \|\xi_1 - \xi_0\|$ due to the learning assumption (5.3), Theorem 5.2 implies the bound (6.15), and Theorem 5.3 implies the bounds (6.16) and (6.17).

Remark 6.2. As discussed in Examples 5.1 and 5.2 and in Remark 5.1, we have $\epsilon_{\ell 1} = 0$ for the learning error $\epsilon_{\ell 1}$ of (5.3) as long as $B$ and $h$ are bounded in a compact set, by the definition of $\Delta_L$ in (6.8) and (6.14). If $h$ and $u^*$ are Lipschitz with respect to $x$ in the compact set, we have nonzero $\epsilon_{\ell 1}$ with $\epsilon_{\ell 0} = 0$ in (5.3) (see Theorem 6.3).

Using Theorems 6.1 and 6.2, we can relate the learning error of the matrix that defines the NCM, $\|\mathcal{M} - M\|$, to the error bounds (6.9)–(6.11) and (6.15)–(6.17), if $M$ is given by the CV-STEM of Theorems 4.2 and 4.5.

Theorem 6.3. Let $\mathcal{M}$ define the NCM in Definition 6.2, and let $\mathcal{S}_s \subseteq \mathbb{R}^n$, $\mathcal{S}_u \subseteq \mathbb{R}^m$, and $\mathcal{S}_t \subseteq \mathbb{R}_{\ge 0}$ be some compact sets. Suppose that the systems (6.1)–(6.5) are controlled by (6.7) and estimated by (6.13), respectively, as in Theorems 6.1 and 6.2. Suppose also that $\exists \bar{b}, \bar{c}, \bar{\rho} \in \mathbb{R}_{\ge 0}$ s.t. $\|B(x, t)\| \le \bar{b}$, $\|C(\varrho_c, x, \hat{x}, t)\| \le \bar{c}$, and $\|R^{-1}\| \le \bar{\rho}$, $\forall x, \hat{x} \in \mathcal{S}_s$ and $t \in \mathcal{S}_t$, for $B$ in (6.1), $C$ in Theorem 4.5, and $R$ in (6.7) and (6.13). If we have for all $x, x_d, \hat{x} \in \mathcal{S}_s$, $u_d \in \mathcal{S}_u$, and $t \in \mathcal{S}_t$ that

$$\|\mathcal{M}(x, x_d, u_d, t) - M(x, x_d, u_d, t)\| \le \epsilon_\ell \quad \text{for control} \qquad (6.18)$$
$$\|\mathcal{M}(\hat{x}, t) - M(\hat{x}, t)\| \le \epsilon_\ell \quad \text{for estimation}$$

where $M(x, x_d, u_d, t)$ and $M(\hat{x}, t)$ (or $M(x, t)$ and $M(\hat{x}, t)$, see Theorem 3.2) are the CV-STEM solutions of Theorems 4.2 and 4.5 with $\exists \epsilon_\ell \in \mathbb{R}_{\ge 0}$ being the learning error, then the bounds (6.9)–(6.11) of Theorem 6.1 and (6.15)–(6.17) of Theorem 6.2 hold with $\epsilon_{\ell 0} = 0$ and $\epsilon_{\ell 1} = \bar{\rho} \bar{b}^2 \epsilon_\ell$ for control, and $\epsilon_{\ell 0} = 0$ and $\epsilon_{\ell 1} = \bar{\rho} \bar{c}^2 \epsilon_\ell$ for estimation, as long as $\epsilon_\ell$ is sufficiently small to satisfy the conditions (5.10) and (5.13) of Theorems 5.2 and 5.3 for deterministic and stochastic systems, respectively.

Proof. For $\Delta_L$ defined in (6.8) and (6.14) with the controllers and estimators given as $u^* = u_d - R^{-1} B^\top M (x - x_d)$, $u_L = u_d - R^{-1} B^\top \mathcal{M} (x - x_d)$, $L = M \bar{C}^\top R^{-1}$, and $L_L = \mathcal{M} \bar{C}^\top R^{-1}$, we have the following upper bounds:

$$\|\Delta_L\| \le \begin{cases} \bar{\rho}\, \bar{b}^2 \epsilon_\ell \|x - x_d\| & \text{(controller)} \\ \bar{\rho}\, \bar{c}^2 \epsilon_\ell \|\hat{x} - x\| & \text{(estimator)} \end{cases}$$

where the relation $\|h(x, t) - h(\hat{x}, t)\| \le \bar{c} \|\hat{x} - x\|$, which follows from the equality $h(x, t) - h(\hat{x}, t) = C(\varrho_c, x, \hat{x}, t)(x - \hat{x})$ (see Lemma 3.1), is used to obtain the second inequality. This implies that we have $\epsilon_{\ell 0} = 0$ and $\epsilon_{\ell 1} = \bar{\rho} \bar{b}^2 \epsilon_\ell$ for control, and $\epsilon_{\ell 0} = 0$ and $\epsilon_{\ell 1} = \bar{\rho} \bar{c}^2 \epsilon_\ell$ for estimation in the learning error of (5.3). The rest follows from Theorems 6.1 and 6.2.
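The controller case of this bound can be checked numerically. The sketch below builds random stand-ins for $B$, $R$, $M$, and a perturbed $\mathcal{M}$ with $\|\mathcal{M} - M\| = \epsilon_\ell$, and verifies $\|\Delta_L\| \le \bar{\rho}\bar{b}^2\epsilon_\ell\|x - x_d\|$; all data are synthetic and only the inequality structure comes from the proof:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random placeholder system data (spectral norms give b_bar, rho_bar).
n, m_dim = 4, 2
B = rng.standard_normal((n, m_dim))
Rm = np.eye(m_dim) * 2.0
M = np.eye(n)

eps_l = 1e-2
E = rng.standard_normal((n, n))
E = eps_l * E / np.linalg.norm(E, 2)     # so that ||M_ncm - M||_2 = eps_l
M_ncm = M + E                            # perturbed "learned" metric

b_bar = np.linalg.norm(B, 2)
rho_bar = np.linalg.norm(np.linalg.inv(Rm), 2)

e = rng.standard_normal(n)               # e = x - x_d
Delta_L = B @ np.linalg.inv(Rm) @ B.T @ (M - M_ncm) @ e
lhs = np.linalg.norm(Delta_L)
rhs = rho_bar * b_bar**2 * eps_l * np.linalg.norm(e)
print(lhs <= rhs)   # prints True: submultiplicativity of the spectral norm
```

The inequality holds for any draw, since it only uses $\|XY\| \le \|X\|\|Y\|$ for spectral norms.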

Example 6.1. The left-hand side of Fig. 6.2 shows the state estimation errors of the following Lorenz oscillator perturbed by process noise $d_0$ and measurement noise $d_1$ [1]:

$$\dot{x} = [\sigma(x_2 - x_1),\ x_1(\rho - x_3) - x_2,\ x_1 x_2 - \beta x_3]^\top + d_0(x, t)$$
$$y = [1, 0, 0]\, x + d_1(x, t)$$

where $x = [x_1, x_2, x_3]^\top$, $\sigma = 10$, $\beta = 8/3$, and $\rho = 28$. It can be seen that the steady-state estimation errors of the NCM and CV-STEM are below the optimal steady-state upper bound of Theorem 4.5 (dotted line in the left-hand side of Fig. 6.2), while the EKF has a larger error compared to the other two. In this example, training data is sampled along 100 trajectories with uniformly distributed initial conditions ($-10 \le x_i \le 10$, $i = 1, 2, 3$) using Theorem 4.5, and then modeled by a DNN as in Definition 6.2. Additional implementation and network training details can be found in [1].

Example 6.2. Consider a robust feedback control problem of the planar spacecraft perturbed by deterministic disturbances, the unperturbed dynamics of which is given as follows [9]:

$$\begin{bmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & I \end{bmatrix} \begin{bmatrix} \ddot{p}_x \\ \ddot{p}_y \\ \ddot{\phi} \end{bmatrix} = \begin{bmatrix} \cos\phi & \sin\phi & 0 \\ -\sin\phi & \cos\phi & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & -1 & 0 & 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & -1 & -1 & 0 & 0 & 1 & 1 \\ -\ell & \ell & -b & b & -\ell & \ell & -b & b \end{bmatrix} u$$

where $x = [p_x, p_y, \phi, \dot{p}_x, \dot{p}_y, \dot{\phi}]^\top$ with $p_x$, $p_y$, and $\phi$ being the horizontal coordinate, vertical coordinate, and yaw angle of the spacecraft, respectively, and with the other variables defined as $m$ = mass of spacecraft, $I$ = yaw moment of inertia, $u$ = thruster force vector, $\ell$ = half-depth of spacecraft, and $b$ = half-width of spacecraft (see Fig. 8 of [9] for details).

As shown in the right-hand side of Fig. 6.2, the NCM keeps the spacecraft trajectories within the bounded error tube of Theorem 4.2 (shaded region) around the target trajectory (dotted line), avoiding collision with the circular obstacles even under the presence of disturbances, whilst requiring a much smaller computational cost than the CV-STEM. Data sampling and the NCM training are performed as in Example 6.1 [1].

Remark 6.3. We could also simultaneously synthesize a controller $u$ and the NCM, and Theorem 6.1 can provide formal robustness and stability guarantees even for such cases [8], [10]. In Chapter 7, we generalize the NCM to learn contraction theory-based robust control using a DNN that takes only $x$, $t$, and a vector containing local environment information as its inputs, avoiding the online computation of $x_d$ for the sake of automatic guidance and control implementable in real time.