Chapter 6: Learning-based Safe and Robust Control and Estimation
6.2 NCMs as Lyapunov Functions
where ๐ฅ = [๐ฅ1, ๐ฅ2, ๐ฅ3]โค, ๐ = 10, ๐ฝ = 8/3, and ๐ = 28. It can be seen that the steady-state estimation errors of the NCM and CV-STEM are below the optimal steady-state upper bound of Theorem 4.5 (dotted line in the left-hand side of Fig. 6.2) while the EKF has a larger error compared to the other two. In this example, training data is sampled along100trajectories with uniformly distributed initial conditions (โ10 โค ๐ฅ๐ โค 10, ๐ = 1,2,3) using Theorem 4.5, and then modeled by a DNN as in Definition 6.2. Additional implementation and network training details can be found in [1].
Example 6.2. Consider a robust feedback control problem of the planar spacecraft perturbed by deterministic disturbances, the unperturbed dynamics of which is given as follows [9]:
๏ฃฎ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฐ
๐ 0 0
0 ๐ 0
0 0 ๐ผ
๏ฃน
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃป
ยฅ ๐ฅ =
๏ฃฎ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฐ
cos๐ sin๐ 0
โsin๐ cos๐ 0
0 0 1
๏ฃน
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃป
๏ฃฎ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฏ
๏ฃฐ
โ1 โ1 0 0 1 1 0 0
0 0 โ1 โ1 0 0 1 1
โโ โ โ๐ ๐ โโ โ โ๐ ๐
๏ฃน
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃบ
๏ฃป ๐ข
where ๐ฅ = [๐๐ฅ, ๐๐ฆ, ๐,๐ยค๐ฅ,๐ยค๐ฆ,๐ยค]๐ with ๐๐ฅ, ๐๐ฆ, and ๐ being the horizontal coordi- nate, vertical coordinate, and yaw angle of the spacecraft, respectively, with the other variables defined as๐ =mass of spacecraft,๐ผ =yaw moment of inertia,๐ข = thruster force vector,โ =half-depth of spacecraft, and๐=half-width of spacecraft (see Fig. 8 of [9] for details).
As shown in the right-hand side of Fig. 6.2, the NCM keeps their trajectories within the bounded error tube of Theorem 4.2 (shaded region) around the target trajectory (dotted line) avoiding collision with the circular obstacles, even under the presence of disturbances, whilst requiring much smaller computational cost than the CV- STEM. Data sampling and the NCM training are performed as in Example 6.1 [1].
Remark 6.3. We could also simultaneously synthesize a controller๐ขand the NCM, and Theorem 6.1 can provide formal robustness and stability guarantees even for such cases [8], [10]. In Chapter 7, we generalize the NCM to learn contraction theory-based robust control using a DNN only taking๐ฅ, ๐ก, and a vector containing local environment information as its inputs, to avoid the online computation of๐ฅ๐ for the sake of automatic guidance and control implementable in real-time.
0 20 40 timet
10 20 30 40
estimationerror||xโหx||
EKF CV-STEM NCM Mean of upper bounds
0 10 20
horizontal coordinate px
0 10 20
verticalcoordinatepy
start
goal LQR CV-STEM NCM Desired trajectory
Bounded error tube
Figure 6.2: Lorenz oscillator state estimation error smoothed using a 15-point moving average filter in Example 6.1 (left), and spacecraft positions (๐๐ฅ, ๐๐ฆ) on a planar field in Example 6.2 (right).
result different from Theorems 6.1 and 6.3. To this end, let us recall that designing optimal ๐ of ๐ข = ๐(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) reduces to designing optimal ๐พ(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) of ๐ข = ๐ข๐ โ๐พ(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) (๐ฅโ๐ฅ๐) due to Lemma 3.2 of Chapter 3. For such ๐ข, the virtual system of (6.1)/(6.2) and (6.3) without any perturbation, which has ๐ = ๐ฅ and๐=๐ฅ๐as its particular solutions, is given as follows:
ยค
๐ =(๐ด(๐, ๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) โ๐ต(๐ฅ , ๐ก)๐พ(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก)) (๐โ๐ฅ๐) + ๐(๐ฅ๐, ๐ก) +๐ต(๐ฅ๐, ๐ก)๐ข๐ (6.19) where ๐ดis the SDC matrix of (6.1) in Lemma 3.1, i.e., ๐ด(๐, ๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) (๐ฅโ๐ฅ๐) = ๐(๐ฅ , ๐ก) +๐ต(๐ฅ , ๐ก)๐ข๐โ ๐(๐ฅ๐, ๐ก) โ๐ต(๐ฅ๐, ๐ก)๐ข๐. Using the result of Theorem 2.3, one of the Lyapunov functions for the virtual system (6.19) with unknown๐พ can be given as the following transformed squared length integrated over๐ฅ of (6.1)/(6.2) and๐ฅ๐ of (6.3):
๐NCM=
โซ ๐ฅ
๐ฅ๐
๐ฟ๐โคM (๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก)๐ฟ๐ (6.20)
where M defines the NCM of Definition 6.2 modeling ๐ of the CV-STEM con- traction metric in Theorem 4.2.
Theorem 6.4. Suppose(6.1)and(6.2)are controlled by๐ข =๐ข๐โ๐พโ(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก)e, wheree=๐ฅโ๐ฅ๐, and that ๐พโis designed by the following convex optimization for given (๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก):
(๐พโ, ๐โ) =arg min
๐พโC๐พ, ๐โR
โฅ๐พโฅ2๐น +๐2 (6.21)
s.t.
โซ ๐ฅ ๐ฅ๐
๐ฟ๐โค( ยคM +2 sym(M๐ดโ M๐ต๐พ))๐ฟ๐ โค โ2๐ผ
โซ ๐ฅ ๐ฅ๐
๐ฟ๐โค(M +๐ฝI)๐ฟ๐+๐ (6.22)
where M defines the NCM of Definition 6.2 that models ๐ of the CV-STEM con- traction metric in Theorem 4.2 as in(6.18),C๐พis a convex set containing admissible ๐พ,๐ผ is the contraction rate, and ๐ฝis as defined in theorem 3.1 (i.e., ๐ฝ = 0for de- terministic systems and ๐ฝ=๐ผ๐ for stochastic systems). Then Theorems 2.4 and 2.5 still hold for the Lyapunov function defined in(6.20), yielding the following bounds:
โฅe(๐ก) โฅ2 โค ๐NCM(0) ๐
๐โ2๐ผ๐๐ก+
ยฏ ๐ ๐ + ๐ยฏ2๐๐
๐ผ๐๐
2๐ผ๐
(1โ๐โ2๐ผ๐๐ก) (6.23) E
โฅe(๐ก) โฅ2
โค E[๐NCM(0)]
๐
๐โ2๐ผ๐ก +๐ถ๐ถ๐+๐ยฏ 2๐ผ๐
(6.24) P[โฅe(๐ก) โฅ โฅ ๐] โค 1
๐2
E[๐NCM(0)]
๐
๐โ2๐ผ๐ก+ ๐ถ๐ถ๐+๐ยฏ 2๐ผ๐
(6.25) where ๐ยฏ = sup๐ฅ ,๐ฅ
๐,๐ข๐,๐ก ๐, ๐I โชฏ M โชฏ ๐I as in (2.26), ๐ผ๐ โ R>0 is an arbitrary positive constant (see(6.26)) selected to satisfy๐ผ๐ =๐ผโ๐ผ๐/2 >0,๐ถ๐ถ =๐ยฏ2
๐(2๐ผ๐บโ1+ 1), ๐ผ๐บ โ R>0 is an arbitrary constant as in Theorem 2.5, ๐ โ Rโฅ0, and the disturbance terms๐ยฏ๐and๐ยฏ๐are given in(6.1)and(6.2), respectively. Note that the problem(6.21)is feasible due to the constraint relaxation variable๐.
Furthermore, if๐โ =0in(6.18),(6.21)with๐=0is always feasible, and the optimal feedback gain ๐พโ minimizes its Frobenius norm under the contraction constraint (6.22).
Proof. Computing๐ยคNCM for the deterministic system (6.1) along with the virtual dynamics (6.19) and the stability condition (6.22) yields
๐ยคNCM โค โ2๐ผ๐NCM+๐ยฏ๐
โ ๐
โ๏ธ
๐NCM+ ๐ยฏ โค โ2(๐ผโ๐ผ๐/2)๐NCM+๐ยฏ2
๐๐/๐ผ๐+๐ยฏ(6.26) where the relation 2๐ ๐ โค ๐ผโ1
๐ ๐2+๐ผ๐๐2, which holds for any๐, ๐ โRand๐ผ๐ โR>0
(๐ = ๐ยฏ๐
โ
๐ and๐ = โ
๐NCM in this case), is used to obtain the second inequality.
Since we can arbitrarily select ๐ผ๐ to have ๐ผ๐ = ๐ผโ ๐ผ๐/2 > 0, (6.26) gives the exponential bound (6.23) due to the comparison lemma of Lemma 2.1. Similarly, computingโ๐NCMfor the stochastic system (6.2) yields
โ๐NCM โค โ2๐ผ๐NCM+๐ถ๐๐+๐ยฏ
which gives (6.24) and (6.25) due to Lemma 2.2. If๐โ =0 in (6.18),๐พ =๐ โ1๐ตโค๐ satisfies (6.22) with๐ =0 as we haveM = ๐and๐defines the contraction metric of Theorem 4.2, which implies that (6.21) with ๐ =0 is always feasible.
We could also use the CCM-based differential feedback formulation of Theorem 4.6 in Theorem 6.4 (see [14], [15]). To this end, define๐ธNCM(Riemannian energy) as follows:
๐ธ =
โซ 1 0
๐พ๐(๐, ๐ก)โคM (๐พ(๐, ๐ก), ๐ก)๐พ๐(๐, ๐ก)๐๐ (6.27) where M defines the NCM of Definition 6.2 that models ๐ of the CCM in The- orem 4.6, ๐พ is the minimizing geodesic connecting ๐ฅ(๐ก) = ๐พ(1, ๐ก) of (6.1) and ๐ฅ๐(๐ก) =๐พ(0, ๐ก) of (6.3) (see Theorem 2.3), and๐พ๐ =๐ ๐พ/๐ ๐.
Theorem 6.5. Suppose๐ข = ๐ขโ of (6.1) is designed by the following convex opti- mization for given (๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก):
(๐ขโ, ๐โ) =arg min
๐ขโC๐ข, ๐โR
โฅ๐ขโ๐ข๐โฅ2+๐2 (6.28)
s.t.
๐ ๐ธ
๐ ๐ก
+2๐พ๐(1, ๐ก)โคM (๐ฅ , ๐ก) (๐(๐ฅ , ๐ก) +๐ต(๐ฅ , ๐ก)๐ข)
s.t. โ2๐พ๐(0, ๐ก)โคM (๐ฅ๐, ๐ก) (๐(๐ฅ๐, ๐ก) +๐ต(๐ฅ๐, ๐ก)๐ข๐) โค โ2๐ผ ๐ธ+๐ (6.29) whereM defines the NCM of Definition 6.2 that models the CCM of Theorem 4.6, andC๐ขis a convex set containing admissible๐ข. Then the bound(6.23)of Theorem 6.4 holds.
Furthermore, if ๐โ = 0for the NCM learning error ๐โ in(6.18), then the problem (6.28) with ๐ = 0is always feasible, and ๐ขโ minimizes the deviation of๐ข from ๐ข๐ under the contraction constraint(6.29).
Proof. Since the right-hand side of (6.29) represents ๐ธยค of (6.27) if ๐๐ = 0 in (6.1), the comparison lemma of Lemma 2.1 gives (6.23) as in Theorem 6.4. The rest follows from the fact that๐ข of Theorem 4.6 is a feasible solution of (6.28) if ๐ =0 [14], [15].
Remark 6.4. As discussed in Remark 4.2, Theorem 6.5 can also be formulated in a stochastic setting as in Theorem 6.4 using the technique discussed in [8].
Theorems 6.4 and 6.5 provide a perspective on the NCM stability different from Theorems 6.1 and 6.3 written in terms of the constraint relaxation variable ๐, by directly using the NCM as a contraction metric instead of the CV-STEM contraction metric or the CCM. Since the CLF problems are feasible with ๐ =0 when we have zero NCM modeling error, it is implied that the upper bound of ๐ (i.e., ยฏ๐ in (6.23)
and (6.24)) decreases as we achieve a smaller learning error๐โ using more training data for verifying (6.18) (see Remark 5.1). This intuition is also formalized later in [16]. See [14], [15] for how we compute the minimizing geodesic ๐พ and its associated Riemannian energy๐ธ in Theorem 6.5.
Example 6.3. If C๐พ = R๐ร๐ and C๐ข = R๐ in(6.21) and(6.28), respectively, they can both be expressed as the following quadratic optimization problem:
๐ฃโ =arg min
๐ฃโRv
1
2๐ฃโค๐ฃ+๐(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก)โค๐ฃ
s.t. ๐0(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก) +๐1(๐ฅ , ๐ฅ๐, ๐ข๐, ๐ก)โค๐ฃ โค 0
by defining๐ โRv, ๐0 โR, ๐1 โRv, and๐ฃ โRvappropriately. Applying the KKT condition [17, pp. 243-244] to this problem, we can show that
๐ฃโ =
๏ฃฑ๏ฃด
๏ฃด
๏ฃฒ
๏ฃด๏ฃด
๏ฃณ
โ๐โ ๐0โ๐
โค 1๐ ๐โค
1๐1
๐1 if๐0โ๐โค
1๐ > 0
โ๐ otherwise.
This implies that the controller uses ๐ข๐ with the feedback ( (๐0โ๐โค
1๐)/๐โค
1๐1)๐1 only if๐ฅยค = ๐(๐ฅ , ๐ก) +๐ต(๐ฅ , ๐ก)๐ขwith๐ข=๐ข๐is not contracting, i.e.,๐0โ๐โค
1๐ >0.
Example 6.4. When ๐ is parameterized linearly by ๐ in Theorem 6.4, i.e., ๐ = ๐๐ฅ+ (1โ๐)๐ฅ๐, we have๐NCM= (๐ฅโ๐ฅ๐)โคM (๐ฅโ๐ฅ๐)as in the standard Lyapunov function formulation [11]โ[13]. Also, a linear input constraint,๐ขmin โค ๐ข โค ๐ขmax can be implemented by using
C๐พ ={๐พ โR๐ร๐|๐ข๐โ๐ขmax โค ๐พe โค๐ข๐โ๐ขmin} C๐ข ={๐ข โR๐|๐ขmin โค ๐ข โค ๐ขmax}
in(6.21)and(6.28), respectively.
Remark 6.5. For the CV-STEM and NCM state estimation in Theorems 4.5, 6.2, and 6.3, we could design a similar Lyapunov function-based estimation scheme if its contraction constraints of Theorem 4.5 do not explicitly depend on the actual state ๐ฅ. This can be achieved by using๐ด(๐๐ด,๐ฅ , ๐กห )and๐ถ(๐๐ถ,๐ฅ , ๐กห )instead of๐ด(๐๐ด, ๐ฅ ,๐ฅ , ๐กห ) and๐ถ(๐๐ถ, ๐ฅ ,๐ฅ , ๐กห ) in Theorem 4.5 as in Theorem 3.2 [18], leading to exponential boundedness of system trajectories as derived in [18]โ[20].