Chapter 2: Contraction Theory
2.1 Fundamentals
Consider the following smooth non-autonomous (i.e., time-varying) nonlinear sys- tem:
Β€
π₯(π‘) = π(π₯(π‘), π‘) (2.1)
where π‘ β Rβ₯0 is time, π₯ : Rβ₯0 β¦β Rπ the system state, and π : RπΓRβ₯0 β¦β Rπ is a smooth function. Note that the smoothness of π(π₯ , π‘) guarantees the existence and uniqueness of the solution to (2.1) for a givenπ₯(0) =π₯0at least locally [9, pp.
88-95].
Definition 2.1. A differential displacement,πΏπ₯, is defined as an infinitesimal dis- placement at a fixed time as used in the calculus of variation [10, p. 107], and(2.1) yields the following differential dynamics:
πΏπ₯Β€(π‘) = π π
π π₯
(π₯(π‘), π‘)πΏπ₯(π‘) (2.2)
where π(π₯(π‘), π‘) is given in(2.1).
Let us first present a special case of the comparison lemma [9, pp. 102-103, pp.
350-353] to be used extensively throughout this thesis.
πΏπ§
Two neighboring
πΏ αΆπ§
trajectories
π½
π‘
Lyapunov Function Incremental Stability
π‘
Figure 2.1: Illustration of contraction theory, where π is a differential Lyapunov functionπ = πΏπ₯β€π(π₯ , π‘)πΏπ₯, πΏ π§ = Ξ(π₯ , π‘)πΏπ₯, and π(π₯ , π‘) = Ξ(π₯ , π‘)Ξ(π₯ , π‘)β€ β» 0 defines a contraction metric (see Theorem 2.1).
Lemma 2.1. Suppose that a continuously differentiable function π£ β Rβ₯0 β¦β R satisfies the following differential inequality:
Β€
π£(π‘) β€ βπΎ π£(π‘) +π, π£(0) =π£0, βπ‘ βRβ₯0
whereπΎ βR>0,π βR, andπ£0βR. Then we have π£(π‘) β€ π£0πβπΎπ‘ + π
πΎ
(1βπβπΎπ‘), βπ‘ βRβ₯0.
Proof. See [9, pp. 659-660].
2.1.I Contraction Theory and Contraction Metric
In Lyapunov theory, nonlinear stability of (2.1) is studied by constructing a Lyapunov functionπ(π₯ , π‘), one example of which isπ =π₯β€π(π₯ , π‘)π₯. However, findingπ(π₯ , π‘) for general nonlinear systems is challenging asπ(π₯ , π‘)can be any scalar function of π₯ (e.g., a candidateπ(π₯ , π‘) can be obtained by solving a PDE [9, p. 211]). In con- trast, as summarized in Table 1.1, contraction theory uses a differential Lyapunov function that is always a quadratic function ofπΏπ₯, i.e.,π(π₯ , πΏπ₯ , π‘) = πΏπ₯β€π(π₯ , π‘)πΏπ₯, thereby characterizing a necessary and sufficient condition for incremental exponen- tial convergence of the multiple nonlinear system trajectories to one single trajectory.
Thus, the problem of findingπ for stability analysis boils down to finding a finite- dimensional positive-definite matrix π, as illustrated in Figure 2.1 [1]. These properties to be derived in Theorem 2.1, which hold both for autonomous (i.e.
time-invariant) and non-autonomous systems, epitomize significant methodological simplifications of stability analysis in contraction theory.
Theorem 2.1. If there exists a uniformly positive definite matrix given asπ(π₯ , π‘)= Ξ(π₯ , π‘)β€Ξ(π₯ , π‘) β» 0, βπ₯ , π‘, whereΞ(π₯ , π‘) defines a smooth coordinate transforma-
tion of πΏπ₯, i.e., πΏ π§ = Ξ(π₯ , π‘)πΏπ₯, s.t. either of the following equivalent conditions holds forβπΌβR>0,βπ₯ , π‘:
πmax(πΉ(π₯ , π‘)) =πmax ΞΒ€ +Ξπ π
π π₯
Ξβ1
β€ βπΌ (2.3)
πΒ€ +π
π π
π π₯ + π π
π π₯
β€
π βͺ― β2πΌ π (2.4)
where the arguments (π₯ , π‘) of π(π₯ , π‘) and Ξ(π₯ , π‘) are omitted for notational sim- plicity, then all the solution trajectories of (2.1) converge to a single trajectory exponentially fast regardless of their initial conditions (i.e., contracting, see Defini- tion 2.3), with an exponential convergence rateπΌ. The converse also holds.
Proof. The proof of this theorem can be found in [1], but here we emphasize the use of the comparison lemma given in Lemma 2.1. Taking the time-derivative of a differential Lyapunov function ofπΏπ₯(orπΏ π§),π =πΏ π§β€πΏ π§ =πΏπ₯β€π(π₯ , π‘)πΏπ₯, using the differential dynamics (2.2), we have
πΒ€(π₯ , πΏπ₯ , π‘)=2πΏ π§β€πΉ πΏ π§=πΏπ₯β€
πΒ€ + π π
π π₯
β€
π+π
π π
π π₯
πΏπ₯
β€ β2πΌπΏ π§β€πΏ π§ =β2πΌπΏπ₯β€π(π₯ , π‘)πΏπ₯
where the conditions (2.3) and (2.4) are used with the generalized Jacobian πΉ in (2.3) obtained from πΏπ§Β€ = ΞΒ€πΏπ₯ +ΞπΏπ₯Β€ = πΉ πΏ π§. We get πβ₯πΏ π§β₯/π π‘ β€ βπΌβ₯πΏ π§β₯ by πβ₯πΏ π§β₯2/π π‘ = 2β₯πΏ π§β₯πβ₯πΏ π§β₯/π π‘, which then yields β₯πΏ π§(π‘) β₯ β€ β₯πΏ π§0β₯πβπΌπ‘ by the comparison lemma of Lemma 2.1. Hence, any infinitesimal length β₯πΏ π§(π‘) β₯ and
β₯πΏπ₯(π‘) β₯, as well as πΏ π§ andπΏπ₯, tend to zero exponentially fast. By path integration (see Definition 2.2 and Theorem 2.3), this immediately implies that the length of any finite path converges exponentially to zero from any initial conditions.
Conversely, consider an exponentially convergent system, which implies the follow- ing forβπ½ >0 andβπ β₯ 1:
β₯πΏπ₯(π‘) β₯2 β€ βπβ₯πΏπ₯(0) β₯2πβ2π½π‘ (2.5)
and define a matrix-valued functionΞ(π₯(π‘), π‘) β RπΓπ(not necessarilyΞβ» 0) as Ξ =Β€ β2π½ΞβΞπ π
π π₯
β π π
π π₯
β€
Ξ, Ξ(π₯(0),0) =πI. (2.6)
Note that, forπ =πΏπ₯β€ΞπΏπ₯, (2.6) givesπΒ€ =β2π½π, resulting inπ =βπβ₯πΏπ₯(0) β₯2πβ2π½π‘. Substituting this into (2.5) yields
β₯πΏπ₯(π‘) β₯2 =πΏπ₯(π‘)β€πΏπ₯(π‘) β€π =πΏπ₯(π‘)β€Ξ(π₯(π‘), π‘)πΏπ₯(π‘) (2.7)
which indeed implies that Ξ βͺ° I β» 0 as (2.7) holds for any πΏπ₯. Thus, Ξsatisfies the contraction condition (2.4), i.e., Ξ defines a contraction metric (see Defini- tion 2.3) [1].
Remark 2.1. Since π of Theorem 2.1 is positive definite, i.e.,π£β€π π£ = β₯Ξπ£β₯2 β₯ 0, βπ£ β Rπ, and β₯Ξπ£β₯2 = 0if and only if π£ = 0, the equationΞπ£ = 0only has a trivial solution π£ = 0. This implies thatΞ is always non-singular (i.e., Ξ(π₯ , π‘)β1 always exists).
Definition 2.2. Letπ0(π‘) andπ1(π‘) denote some solution trajectories of (2.1). We say that(2.1) is incrementally exponentially stable ifβπΆ , πΌ > 0 s.t. the following holds [11]:
β₯π1(π‘) βπ0(π‘) β₯ β€πΆ πβπΌπ‘β₯π0(0) βπ1(0) β₯
for any π0(π‘) andπ1(π‘). Note that, since we have β₯π1(π‘) βπ0(π‘) β₯ = β₯β« π1
π0
πΏπ₯β₯ (see Theorem 2.3), Theorem 2.1 implies incremental stability of the system(2.1).
Definition 2.3. The system(2.1)satisfying the conditions in Theorem 2.1 is said to be contracting, and a uniformly positive definite matrixπthat satisfies(2.4)defines a contraction metric. As to be discussed in Theorem 2.3 of Sec. 2.2, a contracting system is incrementally exponentially stable in the sense of Definition 2.2.
Example 2.1. One of the distinct features of contraction theory in Theorem 2.1 is incremental stability with exponential convergence. Consider the example given in [12]:
π π π‘
"
π₯1 π₯2
#
=
"
β1 π₯1
βπ₯1 β1
# "
π₯1 π₯2
#
. (2.8)
A Lyapunov functionπ = β₯π₯β₯2/2 for(2.8), where π₯ = [π₯1, π₯2]β€, yieldsπΒ€ β€ β2π. Thus,(2.8)is exponentially stable with respect to π₯ =0. The differential dynamics of(2.8)is given as
π π π‘
"
πΏπ₯1 πΏπ₯2
#
=
"
β1+π₯2 π₯1
β2π₯1 β1
# "
πΏπ₯1 πΏπ₯2
#
, (2.9)
and the contraction condition(2.4)for(2.9)can no longer be proven byπ =β₯πΏπ₯β₯2/2, due to the lack of the skew-symmetric property of (2.8) in (2.9). This difficulty illustrates the difference between Lyapunov theory and contraction theory, where the former considers stability of (2.8) with respect to the equilibrium point, while the latter analyzes exponential convergence of any couple of trajectories in (2.8) with respect to each other (i.e., incremental stability in Definition 2.2) [1], [12].
Example 2.2. Contraction defined by Theorem 2.1 guarantees incremental stability of their solution trajectories but does not require the existence of stable fixed points.
Let us consider the following system forπ₯ :Rβ₯0β¦βR:
Β€
π₯ =βπ₯+ππ‘. (2.10)
Using the constraint(2.4) of Theorem 2.1, we can easily verify that(2.10) is con- tracting as π = I defines its contraction metric with the contraction rate πΌ = 1.
However, since(2.10)hasπ₯(π‘)=ππ‘/2+ (π₯(0) β1/2)πβπ‘as its unique solution, it is not stable with respect to any fixed point.
Example 2.3. Consider a Linear Time-Invariant (LTI) system, π₯Β€ = π(π₯) = π΄π₯. Lyapunov theory states that the origin is globally exponentially stable if and only if there exists a constant positive-definite matrixπ βRπΓπs.t. [13, pp. 67-68]
βπ > 0 s.t. π π΄+π΄β€π βͺ― βπI. (2.11)
Now, let π = β₯πβ₯. SinceβI βͺ― βπ/π, (2.11)implies that π π΄+ π΄β€π β€ β(π/π)π, which shows that π = π with πΌ = π/(2π) satisfies (2.4) due to the relation
π π/π π₯ = π΄.
The contraction condition(2.4) can thus be viewed as a generalization of the Lya- punov stability condition(2.11) (see the generalized Krasovskiiβs theorem [14, pp.
83-86]) in a nonlinear non-autonomous system setting, expressed in the differen- tial formulation that permits a non-constant metric and pure differential coordi- nate change [1]. Furthermore, if π(π₯ , π‘) = π΄(π‘)π₯ and π = I, (2.4) results in π΄(π‘) +π΄(π‘)β€ βͺ― β2πΌI(i.e., all the eigenvalues of the symmetric matrix π΄(π‘) +π΄(π‘)β€ remain strictly in the left-half complex plane), which is a known sufficient condition for stability of Linear Time-Varying (LTV) systems [14, pp. 114-115].
Example 2.4. Most of the learning-based techniques involving neural networks are based on optimizing their hyperparameters by gradient descent [15]. Contraction theory provides a generalized view on the analysis of such continuous-time gradient- based optimization algorithms [16].
Let us consider a twice differentiable scalar output function π : RπΓ R β¦β R, a matrix-valued functionπ :Rπ β¦βRπΓπwithπ(π₯) β»0, βπ₯ βRπ, and the following natural gradient system [17]:
Β€
π₯ =β(π₯ , π‘) =βπ(π₯)β1βπ₯π(π₯ , π‘). (2.12)
Then, π is geodesicallyπΌ-strongly convex for eachπ‘in the metric defined by π(π₯) (i.e., π»(π₯) βͺ° πΌ π(π₯) with π»(π₯) being the Riemannian Hessian matrix of π with respect to π [18]), if and only if (2.12) is contracting with rate πΌ in the metric defined byπ as in(2.4)of Theorem 2.1, where π΄ =π β/π π₯. More specifically, the Riemannian Hessian verifiesπ»(π₯) =β( Β€π +π π΄+ π΄β€π)/2. See [16] for details.
Remark 2.2. Theorem 2.1 can be applied to other vector norms ofβ₯πΏ π§β₯πwith, e.g., π =1orπ =β[1]. It can also be shown that for a contracting autonomous system of the formπ₯Β€ = π(π₯), all trajectories converge to an equilibrium point exponentially fast.
2.1.II Partial Contraction
Although satisfying the condition (2.4) of Theorem 2.1 guarantees exponential convergence of any couple of trajectories in (2.1), proving their incremental stability with respect to a subset of these trajectories possessing a specific property could be sufficient for some cases [6], [8], [19], [20], leading to the concept of partial contraction [3].
Theorem 2.2. Consider the following nonlinear system with the stateπ₯ βRβ₯0β¦βRπ and the auxiliary or virtual system with the stateπ βRβ₯0 β¦βRπ:
Β€
π₯(π‘) =g(π₯(π‘), π₯(π‘), π‘) (2.13)
Β€
π(π‘) =g(π(π‘), π₯(π‘), π‘) (2.14)
where g : Rπ Γ Rπ Γ Rβ₯0 β Rπ is a smooth function. Suppose that (2.14) is contracting with respect to π. If a particular solution of(2.14) verifies a smooth specific property, then all trajectories of(2.13)verify this property exponentially.
Proof. The theorem statement follows from Theorem 2.1 and the fact thatπ =π₯and a trajectory with the specific property are particular solutions of (2.14) (see [3] for details).
The importance of this theorem lies in the fact that we can analyze contraction of some specific parts of the system (2.13) while treating the rest as a function of the time-varying parameterπ₯(π‘). Strictly speaking, the system (2.13) is said to be partially contracting, but we will not distinguish partial contraction from contraction of Definition 2.3 in this thesis for simplicity. Instead, we will use the variable π when referring to partial contraction of Theorem 2.2.
Examples of a trajectory with a specific property include the target trajectoryπ₯π of feedback control in (3.3), or the trajectory of actual dynamicsπ₯ for state estimation in (4.3) and (4.4) to be discussed in the subsequent chapters. Note that contraction can be regarded as a particular case of partial contraction.
Example 2.5. Let us illustrate the role of partial contraction using the following nonlinear system [12]:
Β€
π₯ =βπ·(π₯)π₯+π’, π₯Β€π =βπ·(π₯π)π₯π
whereπ₯ is the system state,π₯πis the target state,π’is the control input designed as π’ =βπΎ(π₯) (π₯βπ₯π) + (π·(π₯) βπ·(π₯π))π₯π, andπ·(π₯) +πΎ(π₯) β»0. We could define a virtual system, which hasπ =π₯andπ =π₯π as its particular solutions, as follows:
Β€
π =β(π·(π₯) +πΎ(π₯)) (πβπ₯π) βπ·(π₯π)π₯π. (2.15) Since we haveπΏπΒ€=β(π·(π₯) +πΎ(π₯))πΏπandπ·(π₯) +πΎ(π₯) β»0,(2.15)is contracting withπ =Iin(2.4). However, if we consider the following virtual system:
Β€
π =β(π·(π) +πΎ(π)) (πβπ₯π) βπ·(π₯π)π₯π (2.16) which also has π = π₯ and π = π₯π as its particular solutions, proving contraction is no longer straightforward because of the terms π π·/π ππ and π πΎ/π ππ in the differential dynamics of (2.16). This is due to the fact that, in contrast to (2.15), (2.16) has particular solutions nonlinear in π in addition to π = π₯ and π = π₯π, and the condition(2.4) becomes more involved for(2.16) as it is for guaranteeing exponential convergence of any couple of these particular solution trajectories [12].
Example 2.6. As one of the key applications of partial contraction given in Theo- rem 2.2, let us consider the following closed-loop Lagrangian system [14, p. 392]:
H (q) Β₯q+ C (q,q) €€ q+ G (q) =π’(q,qΒ€, π‘) (2.17) π’(q,qΒ€, π‘) =H (q) Β₯qπ + C (q,q) €€ qπ + G (q) β K (π‘) ( Β€qβ Β€qπ) (2.18) whereq,qΒ€ βRπ,qΒ€π =qΒ€π(π‘) βΞ(π‘) (qβqπ(π‘)),H :Rπ β¦βRπΓπ,C :RπΓRπβ¦β RπΓπ, G : Rπ β¦β Rπ, K : Rβ₯0 β¦β RπΓπ, Ξ : Rβ₯0 β¦β RπΓπ, and (qπ,qΒ€π) is the target trajectory of the state (q,q). Note thatΒ€ K,Ξβ» 0are control gain matrices (design parameters), andH βΒ€ 2Cis skew-symmetric withH β»0by construction.
By comparing with (2.17)and(2.18), we define the following virtual observer-like system ofπ(notq) that hasπ =qΒ€ andπ =qΒ€π as its particular solutions:
H (q) Β€π+ C (q,q)Β€ π+ K (π‘) (πβ Β€q) + G (q)=π’(q,qΒ€, π‘) (2.19)
which givesH (q)πΏπΒ€+ (C (q,q) + K (Β€ π‘))πΏπ =0. We thus have that π
π π‘
(πΏπβ€H (q)πΏπ) =πΏπβ€( Β€H β2C β2K)πΏπ =β2πΏπβ€KπΏπ (2.20) where the skew-symmetry ofH βΒ€ 2C is used to obtain the second equality. Since K β» 0, (2.20) indicates that the virtual system (2.19)is partially contracting in π withH defining its contraction metric. Contraction of the full state (q,q)Β€ will be discussed in Example 2.7.
Note that if we treat the arguments(q,q)Β€ ofH andCalso as the virtual stateπ, we end up having additional terms such asπH /π ππ, which makes proving contraction analytically more involved as in Example 2.5.
As can be seen from Examples 2.5 and 2.6, the role of partial contraction in theo- rem 2.2 is to provide some insight on stability even for cases where it is difficult to prove contraction for all solution trajectories as in Theorem 2.1. Although finding a contraction metric analytically for general nonlinear systems is challenging, we will see in Chapter 3 and Chapter 4 that the convex nature of the contraction condition (2.4) helps us find it numerically.