Chapter 6: Neural Fly for Fault Tolerance
6.3 Methods
Control allocation through online optimization
First, we introduce two common choices for the control allocation matrix and one prior approach, and then propose a novel control allocation algorithm that maximizes control authority. Because these methods do not explicitly consider the time-varying nature of the system, we will simply denote the control actuation matrix of interest as $B$.
Moore-Penrose Pseudoinverse Allocation
A natural choice for the control allocation matrix, $A$, is the Moore-Penrose right pseudoinverse, given by
$$A = B^\dagger = B^\top\left(BB^\top\right)^{-1}. \tag{6.9}$$
Note that for controllable, overactuated systems, i.e., the type of system we are considering, $B$ is a wide, full-row-rank matrix, so $(BB^\top)^{-1}$ is well defined. This choice of control allocation matrix yields the minimum-norm control input for any desired torque command, that is,
$$A_\text{pinv} = \operatorname*{argmin}_{A}\ \max_{\|\tau\|_2 = 1}\ \|A\tau\|_2 \quad\text{s.t. } BA = I. \tag{6.10}$$
However, the minimum-norm solution does not account for actual power usage or control saturation. Thus, it is often not the best choice.
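As a concrete illustration, the sketch below computes the pseudoinverse allocation (6.9) for a placeholder actuation matrix; the 8-motor geometry and all numerical values in `B0` are illustrative assumptions, not our vehicle's parameters.

```python
import numpy as np

# Placeholder actuation matrix for an idealized over-actuated vehicle
# (8 motors; rows: thrust, roll, pitch, yaw). Values are illustrative only.
B0 = np.vstack([np.ones(8),
                0.5 * np.array([1, 1, -1, -1, 1, 1, -1, -1]),
                0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
                0.2 * np.array([1, -1, 1, -1, -1, 1, -1, 1])])

# Moore-Penrose right pseudoinverse allocation, Eq. (6.9).
A_pinv = B0.T @ np.linalg.inv(B0 @ B0.T)
assert np.allclose(B0 @ A_pinv, np.eye(4))        # B A = I

# Minimum-norm motor command for a desired wrench (thrust, roll, pitch, yaw).
tau_d = np.array([0.6, 0.0, 0.1, 0.0])
print(np.round(A_pinv @ tau_d, 3))
```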
Maximum Control Authority Allocation
For a symmetric multirotor, we can design a control allocation matrix that maximizes control authority by choosing thrust and torque factors that independently create the maximum thrust and moments. A multirotor is symmetric when $B\operatorname{sign}(B^\top)$ is diagonal. The maximum torque along the $i$th axis is produced by $u_{\max,i} = \max\!\left(\operatorname{sign}(B^\top)_{(\cdot,i)},\ 0\right)$, where $(B^\top)_{(\cdot,i)}$ is the $i$th column of $B^\top$ and the $\max$ is taken elementwise. Thus, the allocation matrix that yields maximum control authority along each control axis, independently, is
$$A_\text{mca} = \operatorname{sign}(B^\top). \tag{6.11}$$
On most multirotors, and on every symmetric multirotor, this allocation scheme will not work under a single motor failure. For example, consider a perfectly sensed motor failure, such that $B = B_0 H(t)$, where $B_0$ represents the nominal, symmetric system. Any single motor failure will cause $(B_0 H)\operatorname{sign}\!\left((B_0 H)^\top\right)$ to become non-diagonal. This leads to cross-coupling between the control axes and significantly degraded tracking performance. Thus, for fault-tolerant control, we must consider more sophisticated allocation algorithms.
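The sketch below illustrates this failure mode numerically. It reuses the same placeholder `B0`, whose values are illustrative and chosen so that the symmetry condition holds exactly under nominal conditions.

```python
import numpy as np

# Same placeholder actuation matrix as in the previous sketch; it is chosen so
# that B0 @ sign(B0^T) is diagonal, i.e., the idealized vehicle is "symmetric".
B0 = np.vstack([np.ones(8),
                0.5 * np.array([1, 1, -1, -1, 1, 1, -1, -1]),
                0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
                0.2 * np.array([1, -1, 1, -1, -1, 1, -1, 1])])

A_mca = np.sign(B0.T)                              # Eq. (6.11)
print(np.round(B0 @ A_mca, 2))                     # diagonal: no cross-coupling

# Perfectly sensed failure of motor 0: H zeroes the corresponding column of B0.
H = np.diag([0.0] + [1.0] * 7)
B = B0 @ H
print(np.round(B @ np.sign(B.T), 2))               # off-diagonal entries appear
```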
Kim's Control Allocation
[5] proposes the following allocation algorithm:
$$A_\text{kim} = \operatorname*{argmin}_{A}\ \|A\|_F + \frac{1}{n}\sum_{i=1}^{n}\left|A_{(i,1)} - \operatorname{mean}\!\left(A_{(\cdot,1)}\right)\right| \quad\text{s.t. } BA = I,\quad A_{(\cdot,1)} \ge 0 \tag{6.12}$$
The first term in the cost function, $\|A\|_F$, is the Frobenius norm of $A$, which is used as a surrogate for the control effort. The second term distributes the thrust among the motors as evenly as possible. The constraints ensure that the solution is a valid control allocation matrix for $B$ and that the thrust factors, $A_{(\cdot,1)}$, are non-negative.
However, we find that under an outboard motor failure for our system in Sec. 6.4, some thrust factors are 0 while the corresponding torque factors are non-zero. Thus, infinitesimally small torque commands can cause the control to saturate.
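A minimal convex-optimization sketch of (6.12), assuming the thrust factors sit in the first column of $A$ and using cvxpy purely for illustration; the function name, solver choice, and the placeholder `B0` are assumptions rather than the original implementation.

```python
import cvxpy as cp
import numpy as np

def kim_allocation(B):
    """Sketch of the allocation in Eq. (6.12): Frobenius-norm surrogate for
    control effort plus a term spreading the thrust factors evenly."""
    n = B.shape[1]
    A = cp.Variable((n, 4))
    thrust = A[:, 0]                      # thrust factors (first column of A)
    spread = cp.sum(cp.abs(thrust - cp.sum(thrust) / n)) / n
    objective = cp.Minimize(cp.norm(A, "fro") + spread)
    constraints = [B @ A == np.eye(4), thrust >= 0]
    cp.Problem(objective, constraints).solve()
    return A.value

# Usage with the placeholder B0 from the earlier sketches.
B0 = np.vstack([np.ones(8),
                0.5 * np.array([1, 1, -1, -1, 1, 1, -1, -1]),
                0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
                0.2 * np.array([1, -1, 1, -1, -1, 1, -1, 1])])
A_kim = kim_allocation(B0)
print(np.round(B0 @ A_kim, 3))            # should recover the 4x4 identity
```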
Proposed Allocation Algorithm
Due to the limitations of prior approaches, we propose the following allocation algorithm. This method directly maximizes the control authority at a nominal operating point, where the thrust equals the (scaled) weight of the vehicle, $m$. Furthermore, this formulation is not only convex but also a linear program, so it can be solved efficiently using, for example, the solvers in [25], [26].
The thrust for a given set of motor speeds is given by $B_{0(1,\cdot)}u$. Thus, to achieve the maximum thrust with no torque, that is, $\tau_d = [1; 0; 0; 0]$, we must have $u_d = A\tau_d = A_{(\cdot,1)}$. Similarly, to achieve the maximum torque along the $i$th axis while producing $m$ thrust, we must have $u_d = A\tau_d = mA_{(\cdot,1)} + A_{(\cdot,i)}$. Since the vehicle is asymmetric, we must consider both the positive and negative torque along each axis. Accounting for actuation limits, this leads to
$$
\begin{aligned}
\bar{A}_\text{NFF} = \operatorname*{argmax}_{A}\quad & B_{(1,\cdot)}A_{(\cdot,1)} + \sum_{i=2}^{4} B_{(i,\cdot)}\left(m A_{(\cdot,1)} + A_{(\cdot,i)}\right) - \sum_{i=2}^{4} B_{(i,\cdot)}\left(m A_{(\cdot,1)} - A_{(\cdot,i)}\right)\\
\text{s.t.}\quad & B_{(2:4,\cdot)}A_{(\cdot,1)} = 0, \qquad 0 \le A_{(\cdot,1)} \le 1,\\
& B_{((1,3,4),\cdot)}\left(m A_{(\cdot,1)} + A_{(\cdot,2)}\right) = [m, 0, 0]^\top,\\
& B_{((1,2,4),\cdot)}\left(m A_{(\cdot,1)} + A_{(\cdot,3)}\right) = [m, 0, 0]^\top,\\
& B_{((1,2,3),\cdot)}\left(m A_{(\cdot,1)} + A_{(\cdot,4)}\right) = [m, 0, 0]^\top,\\
& 0 \le m A_{(\cdot,1)} + A_{(\cdot,i)} \le 1, \qquad 0 \le m A_{(\cdot,1)} - A_{(\cdot,i)} \le 1, \qquad i = 2, 3, 4
\end{aligned}
\tag{6.13}
$$
For failure scenarios, $B\bar{A}_\text{NFF} \neq I$ due to the reduced control authority; however, $B\bar{A}_\text{NFF}$ is diagonal. We therefore simply rescale $\bar{A}_\text{NFF}$ to ensure that the final control allocation matrix satisfies $BA_\text{NFF} = I$:
$$A_\text{NFF} = \bar{A}_\text{NFF}\left(B\bar{A}_\text{NFF}\right)^{-1}. \tag{6.14}$$
Under nominal conditions, this exactly reproduces the solution from (6.11). Furthermore, under a single motor failure, this algorithm maintains maximum control authority while preserving the nominal performance characteristics of the system.
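The sketch below sets up the linear program (6.13) and the rescaling (6.14) in cvxpy under the same placeholder `B0`; the hover thrust `m`, the failed-motor scenario, and all names are illustrative assumptions rather than the flight implementation.

```python
import cvxpy as cp
import numpy as np

def nff_allocation(B, m):
    """Sketch of Eqs. (6.13)-(6.14): maximize control authority about the
    hover operating point (thrust = m), subject to actuator limits [0, 1]."""
    n = B.shape[1]
    A = cp.Variable((n, 4))
    t = A[:, 0]                                   # thrust column
    obj = B[0, :] @ t
    cons = [B[1:4, :] @ t == 0, t >= 0, t <= 1]
    for i in range(1, 4):
        u_pos = m * t + A[:, i]                   # max +torque command, axis i
        u_neg = m * t - A[:, i]                   # max -torque command, axis i
        obj = obj + B[i, :] @ u_pos - B[i, :] @ u_neg
        rows = [r for r in range(4) if r != i]    # thrust row and off-axis torque rows
        cons += [B[rows, :] @ u_pos == np.array([m, 0.0, 0.0]),
                 u_pos >= 0, u_pos <= 1, u_neg >= 0, u_neg <= 1]
    cp.Problem(cp.Maximize(obj), cons).solve()
    A_bar = A.value
    return A_bar @ np.linalg.inv(B @ A_bar)       # rescaling, Eq. (6.14)

# Usage with the placeholder B0 from the earlier sketches and a failed motor 0.
B0 = np.vstack([np.ones(8),
                0.5 * np.array([1, 1, -1, -1, 1, 1, -1, -1]),
                0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
                0.2 * np.array([1, -1, 1, -1, -1, 1, -1, 1])])
B_failed = B0 @ np.diag([0.0] + [1.0] * 7)
A_nff = nff_allocation(B_failed, m=0.4)           # m: illustrative hover thrust
print(np.round(B_failed @ A_nff, 3))  # approx. identity when the LP is feasible
```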
Motor Efficiency Adaptation as an Extension of Learned Dynamics
Consider the following learning architectures and their associated control laws; each is a model for the error in the nominal model, (6.5):
$$\hat{f}_\text{NF}(x, u) = \phi(x, u)\hat{a}, \qquad u_\text{NF} = (B_0)^{-1}\left(-g(x) - K\tilde{x} - \hat{f}_\text{NF}\right) \tag{6.15}$$
$$\hat{f}_\text{B}(x, u) = (\hat{B} - B_0)u, \qquad u_\text{B} = \hat{B}^{-1}\left(-g(x) - K\tilde{x}\right) \tag{6.16}$$
$$\hat{f}_\text{eff}(x, u) = B_0(\hat{H} - I)u, \qquad u_\text{eff} = (B_0\hat{H})^{-1}\left(-g(x) - K\tilde{x}\right) \tag{6.17}$$
$$\hat{f}_\text{NFF}(x, u) = B_0(\hat{H} - I)u + \phi(x, u)\hat{a}, \qquad u_\text{NFF} = (B_0\hat{H})^{-1}\left(-g(x) - K\tilde{x} - \phi(x, u)\hat{a}\right) \tag{6.18}$$
$\hat{f}_\text{NF}$ is the learned dynamics model from [17], $\hat{f}_\text{B}$ is full actuation matrix adaptation, $\hat{f}_\text{eff}$ is motor efficiency adaptation, and $\hat{f}_\text{NFF}$ is the proposed method, which combines motor efficiency adaptation and learned dynamics. In the next subsection, we discuss online adaptation of the full control actuation matrix, (6.16), and some challenges of this approach. Then we continue our analysis only for (6.18), since (6.17) and (6.15) are special cases of (6.18).
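For intuition on how (6.18) is evaluated in the loop, the sketch below assembles $u_\text{NFF}$ from an efficiency estimate and a learned-basis output. The basis output `phi_xu`, the gain `K`, and all numerical values are placeholders, and a pseudoinverse stands in for the chapter's allocation-based right inverse.

```python
import numpy as np

def u_nff(B0, eta_hat, phi_xu, a_hat, g_x, K, x_tilde):
    """Sketch of the control law in Eq. (6.18): the efficiency estimate enters
    through the effective actuation matrix B0 @ diag(eta_hat), and the learned
    residual phi(x, u) @ a_hat is cancelled in the desired wrench."""
    wrench = -g_x - K @ x_tilde - phi_xu @ a_hat   # desired generalized force
    B_eff = B0 @ np.diag(eta_hat)                  # B0 * H_hat
    # Any right inverse of B_eff works here; the pseudoinverse is one choice,
    # the allocation of Eqs. (6.13)-(6.14) is another.
    return np.linalg.pinv(B_eff) @ wrench

# Tiny usage with placeholder values (4-dim wrench, 8 motors, 2 learned features).
B0 = np.vstack([np.ones(8),
                0.5 * np.array([1, 1, -1, -1, 1, 1, -1, -1]),
                0.5 * np.array([1, -1, -1, 1, 1, -1, -1, 1]),
                0.2 * np.array([1, -1, 1, -1, -1, 1, -1, 1])])
eta_hat = np.array([0.2, 1, 1, 1, 1, 1, 1, 1])     # motor 0 degraded to 20%
u = u_nff(B0, eta_hat, phi_xu=np.zeros((4, 2)), a_hat=np.zeros(2),
          g_x=np.array([-0.5, 0, 0, 0]), K=2.0 * np.eye(4),
          x_tilde=np.array([0.1, 0.0, 0.0, 0.0]))
print(np.round(u, 3))
```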
Full Actuation Matrix Adaptation
To simplify the notation, define $\bar{B} = \hat{B} - B_0$. Consider the continuous-time cost function
$$J(\bar{B}) = \int_0^t e^{-(t-\tau)/\lambda_1}\left\| y - \bar{B}u \right\|^2\mathrm{d}\tau + \lambda_2\left\|\bar{B}\right\|_F^2 \tag{6.19}$$
$$= \int_0^t e^{-(t-\tau)/\lambda_1}\operatorname{tr}\!\left[(y - \bar{B}u)(y - \bar{B}u)^\top\right]\mathrm{d}\tau + \lambda_2\operatorname{tr}\!\left[\bar{B}\bar{B}^\top\right] \tag{6.20}$$
$$= \int_0^t e^{-(t-\tau)/\lambda_1}\operatorname{tr}\!\left[y y^\top - 2\bar{B}u y^\top + \bar{B}u u^\top\bar{B}^\top\right]\mathrm{d}\tau + \lambda_2\operatorname{tr}\!\left[\bar{B}\bar{B}^\top\right] \tag{6.21}$$
Since this cost is convex and quadratic in $\bar{B}$, we can find the solution by looking for the critical point. We use the notation $\left[\frac{\partial J}{\partial\bar{B}}\right]_{ij} = \frac{\partial J}{\partial\bar{B}_{ij}}$:
$$\frac{\partial J}{\partial\bar{B}} = \int_0^t e^{-(t-\tau)/\lambda_1}\left(-2yu^\top + 2\bar{B}uu^\top\right)\mathrm{d}\tau + 2\lambda_2\bar{B} \tag{6.22}$$
Setting this derivative to zero yields
$$\bar{B}\left(\lambda_2 I + \int_0^t e^{-(t-\tau)/\lambda_1}uu^\top\mathrm{d}\tau\right) = \int_0^t e^{-(t-\tau)/\lambda_1}yu^\top\mathrm{d}\tau \tag{6.23}$$
$$\bar{B} = \left(\int_0^t e^{-(t-\tau)/\lambda_1}yu^\top\mathrm{d}\tau\right)\underbrace{\left(\lambda_2 I + \int_0^t e^{-(t-\tau)/\lambda_1}uu^\top\mathrm{d}\tau\right)^{-1}}_{P} \tag{6.24}$$
Now we can derive a recursive update law for $\bar{B}$. Starting with $P$,
$$\dot{P} = -P\frac{\mathrm{d}P^{-1}}{\mathrm{d}t}P \tag{6.25}$$
$$= -P\left(uu^\top - \frac{1}{\lambda_1}\int_0^t e^{-(t-\tau)/\lambda_1}uu^\top\mathrm{d}\tau\right)P \tag{6.26}$$
$$= -P\left(uu^\top - \frac{1}{\lambda_1}\left(P^{-1} - \lambda_2 I\right)\right)P \tag{6.27}$$
$$\dot{P} = \frac{1}{\lambda_1}P - P\left(\frac{\lambda_2}{\lambda_1}I + uu^\top\right)P. \tag{6.28}$$
Then we can compute $\dot{\bar{B}}$:
$$\dot{\bar{B}} = \left(yu^\top - \frac{1}{\lambda_1}\int_0^t e^{-(t-\tau)/\lambda_1}yu^\top\mathrm{d}\tau\right)P + \left(\int_0^t e^{-(t-\tau)/\lambda_1}yu^\top\mathrm{d}\tau\right)\left(\frac{1}{\lambda_1}P - P\left(\frac{\lambda_2}{\lambda_1}I + uu^\top\right)P\right) \tag{6.29}$$
$$= yu^\top P - \bar{B}\left(uu^\top + \frac{\lambda_2}{\lambda_1}I\right)P \tag{6.30}$$
$$\dot{\bar{B}} = -\left(\bar{B}u - y\right)u^\top P - \frac{\lambda_2}{\lambda_1}\bar{B}P \tag{6.31}$$
As we will see later, it is useful to consider a composite adaptation law, that is, an adaptation law that depends on both $\bar{B}u - y$ and $s$, given by
$$\dot{\bar{B}} = -\left(\bar{B}u - y\right)u^\top P - \frac{\lambda_2}{\lambda_1}\bar{B}P + s u^\top P. \tag{6.32}$$
The closed-loop dynamics are given by
$$M(q)\ddot{q} + C(q, \dot{q})\dot{q} + g(q) = Bu \tag{6.33}$$
$$u = \hat{B}^\dagger\left(M(q)\ddot{q}_r + C(q, \dot{q})\dot{q}_r + g(q) - Ks\right) \tag{6.34}$$
$$\tilde{B} = \bar{B} + B_0 - B = \hat{B} - B \tag{6.35}$$
$$M\dot{s} + (C + K)s = -\tilde{B}\hat{B}^\dagger\left(M\ddot{q}_r + C\dot{q}_r + g(q) - Ks\right) = -\tilde{B}u \tag{6.36}$$
Take the following Lyapunov function
$$\mathcal{V} = s^\top M(q)s + \left\|\tilde{B}\right\|_{F, P^{-1}}^2 \tag{6.37}$$
$$\left\|\tilde{B}\right\|_{F, P^{-1}}^2 \triangleq \operatorname{tr}\!\left(\tilde{B}P^{-1}\tilde{B}^\top\right) \tag{6.38}$$
$$\mathcal{V}(s, \tilde{B}) = s^\top M(q)s + \operatorname{tr}\!\left(\tilde{B}P^{-1}\tilde{B}^\top\right) \tag{6.39}$$
then, using $\dot{\tilde{B}} = \dot{\bar{B}}$ and $\bar{B}u - y = \tilde{B}u$,
$$\dot{\mathcal{V}} = 2s^\top M\dot{s} + s^\top\dot{M}s + 2\operatorname{tr}\!\left(\tilde{B}P^{-1}\dot{\tilde{B}}^\top\right) + \operatorname{tr}\!\left(\tilde{B}\frac{\mathrm{d}P^{-1}}{\mathrm{d}t}\tilde{B}^\top\right) \tag{6.40}$$
$$= 2s^\top\left(-(C + K)s - \tilde{B}u\right) + s^\top\dot{M}s + 2\operatorname{tr}\!\left(\tilde{B}P^{-1}\left(-\tilde{B}uu^\top P - \tfrac{\lambda_2}{\lambda_1}\bar{B}P + su^\top P\right)^{\!\top}\right) + \operatorname{tr}\!\left(\tilde{B}\left(uu^\top - \tfrac{1}{\lambda_1}\left(P^{-1} - \lambda_2 I\right)\right)\tilde{B}^\top\right) \tag{6.41}$$
$$= -2s^\top Ks - 2s^\top\tilde{B}u + 2\operatorname{tr}\!\left(\tilde{B}us^\top\right) + 2\operatorname{tr}\!\left(\tilde{B}\left(-uu^\top\tilde{B}^\top - \tfrac{\lambda_2}{\lambda_1}\tilde{B}^\top - \tfrac{\lambda_2}{\lambda_1}(B - B_0)^\top\right)\right) + \operatorname{tr}\!\left(\tilde{B}uu^\top\tilde{B}^\top - \tfrac{1}{\lambda_1}\tilde{B}P^{-1}\tilde{B}^\top + \tfrac{\lambda_2}{\lambda_1}\tilde{B}\tilde{B}^\top\right) \tag{6.42}$$
$$= -2s^\top Ks - 2\tfrac{\lambda_2}{\lambda_1}\operatorname{tr}\!\left(\tilde{B}(B - B_0)^\top\right) + \operatorname{tr}\!\left(-\tilde{B}uu^\top\tilde{B}^\top - \tfrac{1}{\lambda_1}\tilde{B}P^{-1}\tilde{B}^\top - \tfrac{\lambda_2}{\lambda_1}\tilde{B}\tilde{B}^\top\right) \tag{6.43}$$
$$\dot{\mathcal{V}} = -2s^\top Ks - \operatorname{tr}\!\left(\tilde{B}\left(uu^\top + \tfrac{1}{\lambda_1}P^{-1} + \tfrac{\lambda_2}{\lambda_1}I\right)\tilde{B}^\top\right) - 2\tfrac{\lambda_2}{\lambda_1}\operatorname{tr}\!\left(\tilde{B}(B - B_0)^\top\right) \tag{6.44}$$
Here the skew-symmetry of $\dot{M} - 2C$ eliminates the $s^\top\dot{M}s$ and $-2s^\top Cs$ terms, the composite term $2\operatorname{tr}(\tilde{B}us^\top)$ cancels $-2s^\top\tilde{B}u$, and $\bar{B} = \tilde{B} + (B - B_0)$ splits the regularization term.
Lemma 6.3.1. For matrices $B \in \mathbb{R}^{n\times m}$, $C \in \mathbb{R}^{m\times m}$, and $D \in \mathbb{R}^{m\times m}$, if $D \succ C$ and $\operatorname{rank}(B) = n$, then $\operatorname{tr}\!\left(B(D - C)B^\top\right) > 0$.

Proof. For any $x \in \mathbb{R}^n$ with $x \neq 0$, $\operatorname{rank}(B) = n$ implies $B^\top x \neq 0$. When $D - C \succ 0$, we then have $x^\top B(D - C)B^\top x > 0$, and thus $B(D - C)B^\top \succ 0$. Since the trace of a matrix equals the sum of its eigenvalues, and all eigenvalues of a positive definite matrix are positive, $\operatorname{tr}\!\left(B(D - C)B^\top\right) > 0$.
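A quick numerical spot-check of Lemma 6.3.1 on randomly generated matrices (illustrative only; the matrices are arbitrary assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 4, 6
B = rng.standard_normal((n, m))                  # full row rank almost surely
C = rng.standard_normal((m, m))
G = rng.standard_normal((m, m))
D = C + G @ G.T + 1e-3 * np.eye(m)               # so that D - C > 0
print(np.trace(B @ (D - C) @ B.T) > 0)           # True: Lemma 6.3.1

B_deficient = np.zeros((n, m))                   # rank(B) < n breaks the argument
print(np.trace(B_deficient @ (D - C) @ B_deficient.T) > 0)   # False
```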
Define $\alpha > 0$ as the exponential convergence rate of the system such that
$$uu^\top + \frac{1}{\lambda_1}P^{-1} + \frac{\lambda_2}{\lambda_1}I \succ 2\alpha P^{-1} \quad\text{and} \tag{6.45}$$
$$K \succ \alpha M. \tag{6.46}$$
Aside. We can slightly tighten the convergence bound since $D - C \succ 0$ is sufficient but not necessary for $\operatorname{tr}(B(D - C)B^\top) > 0$. In particular, (6.45) can be loosened to
$$\sum_i\operatorname{eig}_i\!\left(uu^\top + \frac{1}{\lambda_1}P^{-1} + \frac{\lambda_2}{\lambda_1}I - 2\alpha P^{-1}\right) > 0 \tag{6.47}$$
where $\operatorname{eig}_i$ is the $i$th eigenvalue of the matrix.
Define $D = \frac{\lambda_2}{\lambda_1}\left\|P^{1/2}(B - B_0)^\top\right\|_F$. Then
$$\dot{\mathcal{V}} \le -2\alpha\mathcal{V} + 2\sqrt{\mathcal{V}}\,D. \tag{6.48}$$
Consider the related system $\mathcal{W}$ where $\mathcal{W} = \sqrt{\mathcal{V}}$ and $2\mathcal{W}\dot{\mathcal{W}} = \dot{\mathcal{V}}$. Then, from (6.48),
$$2\mathcal{W}\dot{\mathcal{W}} \le -2\alpha\mathcal{W}^2 + 2\mathcal{W}D \tag{6.49}$$
$$\dot{\mathcal{W}} \le -\alpha\mathcal{W} + D. \tag{6.50}$$
Consider another related system, $w(t)$, defined by $\dot{w}(t) = -\alpha w(t) + D(t)$ and $w(0) = \mathcal{W}(0)$. The solution to $w(t)$ is
$$w(t) = e^{-\alpha t}w(0) + \int_0^t e^{-\alpha(t-\tau)}D(\tau)\,\mathrm{d}\tau, \tag{6.51}$$
which can be bounded by
$$w(t) \le e^{-\alpha t}\left(w(0) - \frac{\sup_t D(t)}{\alpha}\right) + \frac{\sup_t D(t)}{\alpha}. \tag{6.52}$$
By the Comparison Lemma [27],
$$\sqrt{\mathcal{V}} = \mathcal{W} \le w(t), \tag{6.53}$$
thus $\sqrt{\mathcal{V}}$, and with it the tracking error, exponentially converges to the ball
$$\sqrt{\mathcal{V}} \le \frac{\sup_t D}{\alpha}. \tag{6.54}$$
While this shows stability of the system, convergence can be slow. This is an inherent limitation of directly adapting all parameters of the control actuation matrix. Furthermore, this method can be sensitive to noise or to a non-zero residual term $f_\text{res}$, which we have not considered here. In the next section, we will focus on adaptation of the efficiency factors, which enables faster adaptation and therefore faster convergence.
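For concreteness, a discrete-time sketch of the recursive update (6.28) and the composite law (6.32), using simple Euler integration; the function signature, shapes, and step size are assumptions for illustration, not the implementation used in our experiments.

```python
import numpy as np

def full_B_adaptation_step(B_bar, P, u, y, s, lam1, lam2, dt):
    """One Euler step of Eqs. (6.28) and (6.32).

    B_bar : current estimate of B - B0           (4 x n)
    P     : gain matrix from Eq. (6.24)          (n x n)
    u     : motor command                        (n,)
    y     : measured model-error residual        (4,)
    s     : composite tracking error             (4,)
    """
    u = u.reshape(-1, 1)
    y = y.reshape(-1, 1)
    s = s.reshape(-1, 1)
    P_dot = P / lam1 - P @ (lam2 / lam1 * np.eye(len(P)) + u @ u.T) @ P
    B_dot = -(B_bar @ u - y) @ u.T @ P - lam2 / lam1 * B_bar @ P + s @ u.T @ P
    return B_bar + dt * B_dot, P + dt * P_dot
```

Starting from $\bar{B} = 0$ and $P = \lambda_2^{-1}I$, which is consistent with (6.24) at $t = 0$, repeated calls integrate the continuous-time laws forward in time.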
Kalman Filter Based Adaptation and ℓ2-Regularized Least Squares
With some simple rearrangement, we can write a Kalman-filter-based composite adaptation law following [17]. To see this, note that the model in (6.18) can be written as
$$\hat{f}_\text{NFF} = \begin{bmatrix} B_0\Phi & \phi \end{bmatrix}\begin{bmatrix} \hat{\eta} - 1 \\ \hat{a} \end{bmatrix}, \tag{6.55}$$
$$\text{where } \hat{\eta} = \operatorname{diag}(\hat{H}), \text{ and} \tag{6.56}$$
$$\Phi = \operatorname{diag}(u). \tag{6.57}$$
Then, the Kalman-filter-based adaptation law is given by
$$\begin{bmatrix}\dot{\hat\eta}\\ \dot{\hat a}\end{bmatrix} = -\lambda_s\begin{bmatrix}\hat\eta - 1\\ \hat a\end{bmatrix} + P\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top} R^{-1}\left(y - \begin{bmatrix}B_0\Phi & \phi\end{bmatrix}\begin{bmatrix}\hat\eta - 1\\ \hat a\end{bmatrix}\right) + P\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top}\tilde{x} \tag{6.58}$$
$$\dot{P} = -2\lambda_s P + Q - P\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top} R^{-1}\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}P \tag{6.59}$$
A similar ℓ2-regularized least squares with exponential forgetting formulation can also be derived, which takes the form
$$\begin{bmatrix}\dot{\hat\eta}\\ \dot{\hat a}\end{bmatrix} = -P\Lambda\begin{bmatrix}\hat\eta - 1\\ \hat a\end{bmatrix} + P\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top} R^{-1}\left(y - \begin{bmatrix}B_0\Phi & \phi\end{bmatrix}\begin{bmatrix}\hat\eta - 1\\ \hat a\end{bmatrix}\right) + P\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top}\tilde{x} \tag{6.60}$$
$$\dot{P} = \lambda_s P - P\left(\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top}\begin{bmatrix}B_0\Phi & \phi\end{bmatrix} + \Lambda\right)P \tag{6.61}$$
where $\Lambda$ is a diagonal positive definite matrix that controls the regularization cost in the least squares problem. Note that the closed-form solution for $P$ is given by
$$P \equiv \left(\int_0^t e^{-\lambda_s(t-\tau)}\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}^{\!\top}\begin{bmatrix}B_0\Phi & \phi\end{bmatrix}\mathrm{d}\tau + \Lambda\right)^{-1}. \tag{6.62}$$
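A minimal discrete-time sketch of the Kalman-filter-based composite law (6.58)-(6.59), with the stacked regressor $[B_0\Phi\;\; \phi]$; the gains, noise matrices, and Euler discretization are illustrative assumptions.

```python
import numpy as np

def kf_adaptation_step(theta, P, B0, u, phi_xu, y, x_tilde,
                       lam_s, Q, R_inv, dt):
    """One Euler step of Eqs. (6.58)-(6.59).

    theta : stacked parameters [eta_hat - 1, a_hat]
    P     : parameter covariance-like gain
    """
    Phi = np.diag(u)                               # Eq. (6.57)
    W = np.hstack([B0 @ Phi, phi_xu])              # stacked regressor [B0*Phi, phi]
    innov = y - W @ theta                          # prediction error
    theta_dot = -lam_s * theta + P @ W.T @ R_inv @ innov + P @ W.T @ x_tilde
    P_dot = -2.0 * lam_s * P + Q - P @ W.T @ R_inv @ W @ P
    return theta + dt * theta_dot, P + dt * P_dot
```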
The proof of stability largely follows that of [17] once the closed-loop dynamics have been sufficiently rearranged, as we do below in (6.85). There are two added complexities in the proof: the disturbance term becomes a function of $\phi$ and $a$, and uniform boundedness of $P$ now depends on uniform boundedness of $\phi$. We omit the proof here and address these challenges for the ℓ1-regularized adaptation law in the next section; the proof for the ℓ2-regularized and Kalman filter adaptation laws follows exactly the same form, except for the form of the regularization term and the $P$ update equation.
The ℓ2-regularized and Kalman-filter-based methods are not necessarily able to correctly identify the underlying faults, but the estimated efficiency vector is sufficient to stabilize and re-balance the system. Both behaviors stem from a lack of persistent excitation. Because we are considering an over-actuated system, and because we control the design of the control allocation matrix, the control allocation can be perturbed to obtain persistent excitation without affecting tracking performance. This would require constantly updating the allocation scheme to excite different modes in the system, while still satisfying the key allocation constraint, (6.6).
Instead, we will consider an alternate regularization method in the following section, which encourages sparse failure identification.
Sparse Failure Identification
In this section, we consider an update policy similar to the ℓ2-regularized adaptive update law in the last section, except that we use an ℓ1-regularized update policy. This is a common regularization term for sparse parameter estimation, because it encourages sparse solutions without requiring a hard constraint on the number of non-zero parameters or iteration through many combinations of non-zero parameters.
Discrete Update Law
Consider the following least squares loss function.
$$J_k(\hat\eta, \hat a) = \sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}\left\|y_i - \hat{f}_{\text{NFF},i}\right\|_2^2 + \gamma_\eta\left\|\hat\eta - 1\right\|_1 + \gamma_a\left\|\hat a\right\|_2^2 \tag{6.63}$$
First, simplify the loss function by moving $\hat\eta$ and $\hat a$ outside the summation and defining $\bar\eta = \hat\eta - 1$:
$$J_k(\hat\eta, \hat a) = \sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}\left(y_i - \hat{f}_{\text{NFF},i}\right)^{\!\top}\left(y_i - \hat{f}_{\text{NFF},i}\right) + \gamma_\eta\left\|\bar\eta\right\|_1 + \gamma_a\left\|\hat a\right\|_2^2 \tag{6.64}$$
$$= \sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}\left(y_i^\top y_i - 2y_i^\top\hat{f}_{\text{NFF},i} + \hat{f}_{\text{NFF},i}^\top\hat{f}_{\text{NFF},i}\right) + \gamma_\eta\left\|\bar\eta\right\|_1 + \gamma_a\left\|\hat a\right\|_2^2 \tag{6.65}$$
$$= \left(\sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}y_i^\top y_i\right) - 2\left(\sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}y_i^\top\begin{bmatrix}B_0\Phi_i & \phi_i\end{bmatrix}\right)\begin{bmatrix}\bar\eta\\ \hat a\end{bmatrix} + \begin{bmatrix}\bar\eta\\ \hat a\end{bmatrix}^{\!\top}\left(\sum_{i=0}^{k} e^{-\lambda_s(t_k - t_i)}\begin{bmatrix}\Phi_i B_0^\top B_0\Phi_i & \Phi_i B_0^\top\phi_i\\ \phi_i^\top B_0\Phi_i & \phi_i^\top\phi_i\end{bmatrix}\right)\begin{bmatrix}\bar\eta\\ \hat a\end{bmatrix} + \gamma_\eta\left\|\bar\eta\right\|_1 + \gamma_a\left\|\hat a\right\|_2^2 \tag{6.66}$$
This is a convex function of $\bar\eta$ and $\hat a$, so we can easily solve for the optimal $\bar\eta$ and $\hat a$ using standard numerical solvers. During online computation, we can quickly incorporate a new measurement by scaling the old summation terms by $\exp(-\lambda_s(t_k - t_{k-1}))$ and adding the $k$th term. Then we can solve for the optimal $\bar\eta$ and $\hat a$ using a numerical solver as needed. Note that we can also derive a recursive solution for the optimal estimates, with a simplified update step to quickly incorporate new measurements and a step that computes the optimal $\hat a$ given the most recently computed $\bar\eta$, which requires solving a linear system of equations.
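A cvxpy sketch of the batch problem (6.63) under the stated model $y_i \approx B_0\Phi_i\bar\eta + \phi_i\hat a$; the function name, arguments, and weights are illustrative assumptions, and in practice the weighted sums would be kept as running statistics as described above.

```python
import cvxpy as cp
import numpy as np

def sparse_fault_id(B0, us, phis, ys, ts, lam_s, gamma_eta, gamma_a):
    """Sketch of Eq. (6.63): exponentially weighted least squares with an
    l1 penalty on eta_hat - 1 (sparse fault hypothesis) and an l2 penalty
    on the learned coefficients a_hat."""
    n, k = B0.shape[1], phis[0].shape[1]
    eta_bar = cp.Variable(n)                       # eta_hat - 1
    a_hat = cp.Variable(k)
    loss = 0
    for u_i, phi_i, y_i, t_i in zip(us, phis, ys, ts):
        w = np.exp(-lam_s * (ts[-1] - t_i))        # forgetting weight
        pred = B0 @ cp.multiply(u_i, eta_bar) + phi_i @ a_hat
        loss += w * cp.sum_squares(y_i - pred)
    loss += gamma_eta * cp.norm1(eta_bar) + gamma_a * cp.sum_squares(a_hat)
    cp.Problem(cp.Minimize(loss)).solve()
    return 1.0 + eta_bar.value, a_hat.value        # (eta_hat, a_hat)
```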
Continuous Update Law
The analogous continuous-time adaptation law for $\hat\eta$ can be found from the following cost function:
$$J(\bar\eta) = \int_0^t e^{-\lambda_s(t-\tau)}\left\|y - B_0\Phi(\tau)\bar\eta\right\|^2\mathrm{d}\tau + 2\gamma\left\|\bar\eta\right\|_1 \tag{6.67}$$
$$= \int_0^t e^{-\lambda_s(t-\tau)}\left(y^\top y - 2y^\top B_0\Phi\bar\eta + \bar\eta^\top\Phi B_0^\top B_0\Phi\bar\eta\right)\mathrm{d}\tau + 2\gamma\left\|\bar\eta\right\|_1 \tag{6.68}$$
Note that we have dropped back to the simpler case of (6.17), though the full continuous-time stability analysis for (6.18) follows similarly under the assumption of Lipschitz boundedness of $\phi$.
First, approximate the $\ell_1$ norm such that
$$\left\|\bar\eta\right\|_1 \approx \sum_i\sqrt{\bar\eta_i^2 + \epsilon} = \left\|\bar\eta\right\|_{1,\epsilon} \quad\text{and}\quad \lim_{\epsilon\to 0}\sum_i\sqrt{\bar\eta_i^2 + \epsilon} = \left\|\bar\eta\right\|_1. \tag{6.69}$$
Then the cost function in (6.68) is approximated by $J(\bar\eta, \epsilon)$ such that $\lim_{\epsilon\to 0}J(\bar\eta, \epsilon) = J(\bar\eta)$, where $J(\bar\eta, \epsilon)$ is given by
$$J(\bar\eta) \approx J(\bar\eta, \epsilon) = \int_0^t e^{-\lambda_s(t-\tau)}\left(y^\top y - 2y^\top B_0\Phi\bar\eta + \bar\eta^\top\Phi B_0^\top B_0\Phi\bar\eta\right)\mathrm{d}\tau + 2\gamma\sum_i\sqrt{\bar\eta_i^2 + \epsilon} \tag{6.70}$$
Since this cost function is convex in $\bar\eta$, the minimum value is obtained when $\frac{\partial J}{\partial\bar\eta} = 0$, as follows.
$$\frac{\partial J}{\partial\bar\eta} = \int_0^t e^{-\lambda_s(t-\tau)}\left(-2\Phi B_0^\top y + 2\Phi B_0^\top B_0\Phi\bar\eta\right)\mathrm{d}\tau + 2\gamma\left[\frac{\bar\eta_1}{\sqrt{\bar\eta_1^2 + \epsilon}},\ \frac{\bar\eta_2}{\sqrt{\bar\eta_2^2 + \epsilon}},\ \cdots\right]^{\!\top} \tag{6.71}$$
$$\frac{\partial J}{\partial\bar\eta} = 0 \;\Longrightarrow\; \left(\gamma\operatorname{diag}\!\left(\left[\frac{1}{\sqrt{\bar\eta_1^2 + \epsilon}},\ \frac{1}{\sqrt{\bar\eta_2^2 + \epsilon}},\ \cdots\right]\right) + \int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top B_0\Phi(\tau)\,\mathrm{d}\tau\right)\bar\eta = \int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top y\,\mathrm{d}\tau \tag{6.72}$$
$$\bar\eta = \underbrace{\left(\gamma\operatorname{diag}\!\left(\left[\frac{1}{\sqrt{\bar\eta_1^2 + \epsilon}},\ \frac{1}{\sqrt{\bar\eta_2^2 + \epsilon}},\ \cdots\right]\right) + \int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top B_0\Phi(\tau)\,\mathrm{d}\tau\right)^{\!-1}}_{P}\int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top y\,\mathrm{d}\tau \tag{6.73}$$
Note that if $\bar\eta_i = 0$ for some $i$ (as is typical in $\ell_1$-norm minimization), then $\frac{1}{\sqrt{\bar\eta_i^2 + \epsilon}} = \frac{1}{\sqrt{\epsilon}}$. Otherwise, $\lim_{\epsilon\to 0}\frac{1}{\sqrt{\bar\eta_i^2 + \epsilon}} = \left|\frac{1}{\bar\eta_i}\right|$.
Now we can derive a recursive update law for $\bar\eta$. Starting with $P$:
$$\begin{aligned}
\dot{P} &= -P\frac{\mathrm{d}P^{-1}}{\mathrm{d}t}P\\
&= -P\left(\gamma\operatorname{diag}\!\left(\left[\frac{-\bar\eta_1\dot{\bar\eta}_1}{(\bar\eta_1^2 + \epsilon)^{3/2}},\ \cdots\right]\right) + \Phi(t)B_0^\top B_0\Phi(t) - \lambda_s\int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top B_0\Phi(\tau)\,\mathrm{d}\tau\right)P\\
&= -P\left(\gamma\operatorname{diag}\!\left(\left[\frac{-\bar\eta_1\dot{\bar\eta}_1}{(\bar\eta_1^2 + \epsilon)^{3/2}},\ \cdots\right]\right) + \Phi(t)B_0^\top B_0\Phi(t) - \lambda_s\left(P^{-1} - \gamma\operatorname{diag}\!\left(\left[\frac{1}{\sqrt{\bar\eta_1^2 + \epsilon}},\ \cdots\right]\right)\right)\right)P
\end{aligned}\tag{6.74}$$
$$\dot{P} = \lambda_s P - P\left(\gamma\,\Delta(\bar\eta) + \Phi(t)B_0^\top B_0\Phi(t)\right)P, \tag{6.75}$$
where, for brevity, we collect the diagonal term as
$$\Delta(\bar\eta) \triangleq \operatorname{diag}\!\left(\left[\frac{-\bar\eta_1\dot{\bar\eta}_1 + \lambda_s\bar\eta_1^2 + \lambda_s\epsilon}{(\bar\eta_1^2 + \epsilon)^{3/2}},\ \frac{-\bar\eta_2\dot{\bar\eta}_2 + \lambda_s\bar\eta_2^2 + \lambda_s\epsilon}{(\bar\eta_2^2 + \epsilon)^{3/2}},\ \cdots\right]\right).$$
Then we can compute $\dot{\bar\eta}$:
$$\begin{aligned}
\dot{\bar\eta} &= P\left(\Phi(t)B_0^\top y(t) - \lambda_s\int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top y(\tau)\,\mathrm{d}\tau\right) + \left(\lambda_s P - P\left(\gamma\Delta(\bar\eta) + \Phi(t)B_0^\top B_0\Phi(t)\right)P\right)\int_0^t e^{-\lambda_s(t-\tau)}\Phi(\tau)B_0^\top y(\tau)\,\mathrm{d}\tau &&(6.76)\\
&= P\Phi(t)B_0^\top y - \lambda_s P P^{-1}\bar\eta + \left(\lambda_s P - P\left(\gamma\Delta(\bar\eta) + \Phi(t)B_0^\top B_0\Phi(t)\right)P\right)P^{-1}\bar\eta &&(6.77)\\
&= P\Phi(t)B_0^\top y - \lambda_s\bar\eta + \lambda_s\bar\eta - P\left(\gamma\Delta(\bar\eta) + \Phi(t)B_0^\top B_0\Phi(t)\right)\bar\eta &&(6.78)\\
\dot{\bar\eta} &= P\Phi(t)B_0^\top\left(y - B_0\Phi(t)\bar\eta\right) - \gamma P\,\Delta(\bar\eta)\,\bar\eta &&(6.79)
\end{aligned}$$
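For intuition, a discrete-time sketch of (6.75) and (6.79) using Euler integration; feeding the previous step's $\dot{\bar\eta}$ into $\Delta(\bar\eta)$ is one simple (assumed) way to handle the implicit dependence, and all constants are placeholders.

```python
import numpy as np

def l1_adaptation_step(eta_bar, eta_bar_dot, P, B0, u, y,
                       lam_s, gamma, eps, dt):
    """One Euler step of Eqs. (6.75) and (6.79). The diagonal term Delta uses
    the eta_bar_dot from the previous step."""
    Phi = np.diag(u)
    denom = (eta_bar**2 + eps) ** 1.5
    delta = np.diag((lam_s * eta_bar**2 + lam_s * eps
                     - eta_bar * eta_bar_dot) / denom)
    G = Phi @ B0.T @ B0 @ Phi
    P_dot = lam_s * P - P @ (gamma * delta + G) @ P           # Eq. (6.75)
    eta_dot = P @ Phi @ B0.T @ (y - B0 @ Phi @ eta_bar) \
              - gamma * P @ delta @ eta_bar                    # Eq. (6.79)
    return eta_bar + dt * eta_dot, eta_dot, P + dt * P_dot
```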
Stability Analysis
Again, for the continuous stability analysis we focus on (6.17); however, the analysis for (6.18) follows similarly. Assume that we design a control allocation matrix $A$ such that $B_0\hat{H}A = I$. The closed-loop dynamics are
$$\begin{aligned}
\dot{x} &= f_n(x) + B u_\text{eff} + d(x, u, t) &&(6.80)\\
&= f_n(x) + B_0 H u + d(t) &&(6.81)\\
&= f_n(x) + B_0 H (B_0\hat{H})^{-1}\left(-K(x - x_d) + \dot{x}_d - f_n(x)\right) + d(t) &&(6.82)\\
&= f_n(x) + \left[B_0\hat{H}(B_0\hat{H})^{-1} + B_0(H - \hat{H})(B_0\hat{H})^{-1}\right]\left(-K(x - x_d) + \dot{x}_d - f_n(x)\right) + d(t) &&(6.83)\\
&= -K(x - x_d) + \dot{x}_d - B_0\tilde{H}u + d(t) &&(6.84)\\
\dot{\tilde{x}} &= -K\tilde{x} - B_0\Phi\tilde{\eta} + d(t) &&(6.85)
\end{aligned}$$
where $\tilde{H} = \hat{H} - H$, $\tilde{\eta} = \hat{\eta} - \eta$, $\tilde{x} = x - x_d$, and $d(x, u, t)$ has been lumped into $d(t)$.
As we will see, it is useful to consider a composite adaptation law, a modification of the adaptation law from (6.79) that depends on both $B_0\Phi\bar\eta - y$ and $\tilde{x}$, given by
$$\dot{\bar\eta} = P\Phi B_0^\top\left(y - B_0\Phi\bar\eta\right) - \gamma P\,\Delta(\bar\eta)\,\bar\eta + P\Phi B_0^\top\tilde{x}. \tag{6.86}$$
Stability can be shown with the following Lyapunov function:
$$\mathcal{V} = \tilde{x}^\top\tilde{x} + \tilde\eta^\top P^{-1}\tilde\eta = \begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}^{\!\top}\mathcal{M}\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}, \tag{6.87}$$
$$\text{where}\quad \mathcal{M} = \begin{bmatrix}I & 0\\ 0 & P^{-1}\end{bmatrix}. \tag{6.88}$$
The derivative is computed as follows:
Using $\dot{\tilde\eta} = \dot{\bar\eta} - \dot\eta$, substituting (6.85), (6.86), and (6.75), and noting that $y - B_0\Phi\bar\eta = d - B_0\Phi\tilde\eta$,
$$\begin{aligned}
\dot{\mathcal{V}} &= 2\tilde{x}^\top\dot{\tilde{x}} + 2\tilde\eta^\top P^{-1}\dot{\tilde\eta} + \tilde\eta^\top\frac{\mathrm{d}P^{-1}}{\mathrm{d}t}\tilde\eta &&(6.89)\\
&= -2\tilde{x}^\top K\tilde{x} - 2\tilde{x}^\top B_0\Phi\tilde\eta + 2\tilde{x}^\top d + 2\tilde\eta^\top P^{-1}\left(P\Phi B_0^\top(y - B_0\Phi\bar\eta) - \gamma P\Delta(\bar\eta)\bar\eta + P\Phi B_0^\top\tilde{x} - \dot\eta\right)\\
&\quad + \tilde\eta^\top\left(-\lambda_s P^{-1} + \gamma\Delta(\bar\eta) + \Phi B_0^\top B_0\Phi\right)\tilde\eta &&(6.90)\\
&= -2\tilde{x}^\top K\tilde{x} - 2\tilde{x}^\top B_0\Phi\tilde\eta + 2\tilde{x}^\top d + 2\tilde\eta^\top\Phi B_0^\top\left(d - B_0\Phi\tilde\eta\right) - 2\gamma\tilde\eta^\top\Delta(\bar\eta)\bar\eta + 2\tilde\eta^\top\Phi B_0^\top\tilde{x}\\
&\quad - 2\tilde\eta^\top P^{-1}\dot\eta - \lambda_s\tilde\eta^\top P^{-1}\tilde\eta + \gamma\tilde\eta^\top\Delta(\bar\eta)\tilde\eta + \tilde\eta^\top\Phi B_0^\top B_0\Phi\tilde\eta &&(6.91)\\
&= -2\tilde{x}^\top K\tilde{x} + 2\tilde{x}^\top d + 2\tilde\eta^\top\Phi B_0^\top d - \tilde\eta^\top\Phi B_0^\top B_0\Phi\tilde\eta - 2\gamma\tilde\eta^\top\Delta(\bar\eta)\left(\tilde\eta + \eta - 1\right) + \gamma\tilde\eta^\top\Delta(\bar\eta)\tilde\eta\\
&\quad - 2\tilde\eta^\top P^{-1}\dot\eta - \lambda_s\tilde\eta^\top P^{-1}\tilde\eta &&(6.92)\\
&= -2\tilde{x}^\top K\tilde{x} + 2\tilde{x}^\top d - \tilde\eta^\top\left(\Phi B_0^\top B_0\Phi + \gamma\Delta(\bar\eta) + \lambda_s P^{-1}\right)\tilde\eta + 2\tilde\eta^\top\left(\Phi B_0^\top d - \gamma\Delta(\bar\eta)(\eta - 1) - P^{-1}\dot\eta\right) &&(6.93)\\
&= -\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}^{\!\top}\begin{bmatrix}2K & 0\\ 0 & \Phi B_0^\top B_0\Phi + \gamma\Delta(\bar\eta) + \lambda_s P^{-1}\end{bmatrix}\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix} + 2\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}^{\!\top}\begin{bmatrix}d\\ \Phi B_0^\top d - \gamma\Delta(\bar\eta)(\eta - 1) - P^{-1}\dot\eta\end{bmatrix} &&(6.94)
\end{aligned}$$
Here $\pm 2\tilde{x}^\top B_0\Phi\tilde\eta$ cancel, and the regularization cross term uses $\bar\eta = \tilde\eta + \eta - 1$.
Following from the definition of $P^{-1}$, $P^{-1}$ is bounded and uniformly positive definite. Thus, there exists some $\alpha > 0$ such that
$$-\begin{bmatrix}2K & 0\\ 0 & \Phi B_0^\top B_0\Phi + \gamma\Delta(\bar\eta) + \lambda_s P^{-1}\end{bmatrix} \preceq -2\alpha\begin{bmatrix}I & 0\\ 0 & P^{-1}\end{bmatrix}. \tag{6.95}$$
Note that when $P^{-1}$ is symmetric, uniformly bounded, and uniformly positive definite, $P^{-1/2}$ and $P^{1/2}$ exist and are symmetric, uniformly positive definite, and uniformly bounded. Using (6.94), (6.95), and the Cauchy-Schwarz inequality, $\dot{\mathcal{V}}$ can be bounded as follows:
$$\dot{\mathcal{V}} \le -2\alpha\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}^{\!\top}\begin{bmatrix}I & 0\\ 0 & P^{-1}\end{bmatrix}\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix} + 2\left\|\begin{bmatrix}I & 0\\ 0 & P^{-1/2}\end{bmatrix}\begin{bmatrix}\tilde{x}\\ \tilde\eta\end{bmatrix}\right\|\cdot\left\|\begin{bmatrix}d\\ P^{1/2}\Phi B_0^\top d - \gamma P^{1/2}\Delta(\bar\eta)(\eta - 1) - P^{-1/2}\dot\eta\end{bmatrix}\right\| \tag{6.96}$$
$$= -2\alpha\mathcal{V} + 2\sqrt{\mathcal{V}}\,D \tag{6.97}$$
$$\text{where}\quad D = \left\|\begin{bmatrix}d\\ P^{1/2}\Phi B_0^\top d - \gamma P^{1/2}\Delta(\bar\eta)(\eta - 1) - P^{-1/2}\dot\eta\end{bmatrix}\right\|.$$
Consider the related system $\mathcal{W}$ where $\mathcal{W} = \sqrt{\mathcal{V}}$ and $2\mathcal{W}\dot{\mathcal{W}} = \dot{\mathcal{V}}$. Then, from (6.97),
$$2\mathcal{W}\dot{\mathcal{W}} \le -2\alpha\mathcal{W}^2 + 2\mathcal{W}D \tag{6.98}$$
$$\dot{\mathcal{W}} \le -\alpha\mathcal{W} + D. \tag{6.99}$$
Consider another related system, $w(t)$, defined by $\dot{w}(t) = -\alpha w(t) + D(t)$ and $w(0) = \mathcal{W}(0)$. The solution to $w(t)$ is
$$w(t) = e^{-\alpha t}w(0) + \int_0^t e^{-\alpha(t-\tau)}D(\tau)\,\mathrm{d}\tau, \tag{6.100}$$
which can be bounded by
$$w(t) \le e^{-\alpha t}\left(w(0) - \frac{\sup_t D(t)}{\alpha}\right) + \frac{\sup_t D(t)}{\alpha}. \tag{6.101}$$
By the Comparison Lemma [27],
$$\sqrt{\mathcal{V}} = \mathcal{W} \le w(t), \tag{6.102}$$
thus $\sqrt{\mathcal{V}}$, and also $\|\tilde{x}\|$, exponentially converges to the ball
$$\|\tilde{x}\| \le \sqrt{\mathcal{V}} \le \frac{\sup_t D}{\alpha}. \tag{6.103}$$
Seemingly, the proof is complete at this point. However, we have not yet shown that $D$ is bounded. By assumption, $d$, $\eta$, and $\dot\eta$ are uniformly bounded, and $B_0$, $K$, $\lambda_s$, and $\gamma$ are constants. $D$ is uniformly bounded if $P^{1/2}$, $P^{-1/2}$, $\bar\eta$, and $\Phi$ are uniformly bounded and continuous.

$P^{1/2}$ and $P^{-1/2}$ are uniformly bounded if $P^{-1}$ is uniformly positive definite and uniformly bounded, respectively. Uniform positive definiteness of $P^{-1}$ is guaranteed by uniform positive definiteness of the regularization term, and uniform boundedness of $P^{-1}$ is guaranteed by uniform boundedness of $\Phi$ and of the regularization term.

$\bar\eta$ is uniformly bounded if $P$ and $y$ are uniformly bounded.

$\Phi$ is uniformly bounded if $\tilde{x}$ is uniformly bounded and the allocation $(B_0\hat{H})^{-1}$ is uniformly bounded. For the case of (6.18), $d$ is also a function of $a$, leading to the additional condition that $a$ be bounded, which can be guaranteed if $\phi$ is Lipschitz bounded.

While precise conditions for uniform boundedness of $(B_0\hat{H})^{-1}$ are difficult to write out, it is clear that $\bar\eta \to 0$ as $\gamma \to \infty$. We also observe that $P^{1/2}$ scales with $\gamma$ and $\bar\eta$, so for small $\gamma$, $D$ will be dominated by the term $P^{1/2}\Phi B_0^\top d$. For sufficiently large $\gamma$, $(B_0\hat{H})^{-1}$ is bounded. Lastly, for very large $\gamma$, no adaptation will occur, and the system will maintain the baseline performance. Thus, there is an inherent design trade-off between the degree of regularization and the nominal modeling errors not captured by the efficiency adaptation model, in the case of (6.17), or the learning representation error, in the case of (6.18).

Figure 6.1: The test aircraft vehicle design. (Left) Picture of the vehicle. (Right) Schematic of the implemented system. This figure was provided by Joshua Cho.
6.4 Experimental Validation