• Tidak ada hasil yang ditemukan

Non-existence results

Dalam dokumen Regret-Optimal Control (Halaman 51-54)

Chapter IV: Regret-Optimal Measurement-Feedback Control

4.1 Non-existence results

Non-existence of a controller with bounded competitive ratio

In this section we establish that in unstable systems there is no controller with bounded competitive ratio. The key observation is that the competitive ratio bound implies that the online controller must incur zero cost on every disturbance sequences on which the optimal noncausal controller incurs zero cost. In particular, in order to have a bounded competitive ratio, a controller must always set the control 𝑒 to be zero when𝑀 =0. In the measurement-feedback setting the controller is unable to directly observe𝑀, and hence must set𝑒 =0 at all times. The only setting under which this β€œzero controller” can be competitive is when the system is stable.

Theorem 10. Fix𝛾 > 0and suppose 𝐴is unstable. There does not exist a causal measurement-feedback controllerπœ‹ such that

𝐽(πœ‹, 𝑀 , 𝑣) 𝐽(πœ‹

0, 𝑀) < 𝛾2 (4.2)

for all driving disturbances 𝑀 and and all measurement disturbances 𝑣. If 𝐴 is stable, then the only competitive controller is the β€œzero controller” which always sets𝑒 =0. This controller has competitive ratio1+ βˆ₯𝐹βˆ₯2.

Proof. We restrict our attention to controllers πœ‹ which set πœ‹(𝑦) = 𝐾 𝑦 for some causal operator𝐾. Condition (4.2) can be rewritten as

π‘‡βˆ—

𝐾𝑇𝐾 β‰Ί 𝛾2π‘‡βˆ—

𝐾0

𝑇𝐾

0. (4.3)

The (2, 2) block of π‘‡βˆ—

𝐾𝑇𝐾 is π‘„βˆ—(𝐼 + 𝐹 πΉβˆ—)𝑄. This operator is clearly positive- semidefinite, and zero only when𝐾 =0. If𝐾 β‰  0 then condition (4.3) clearly cannot hold because the (2, 2) block ofπ‘‡βˆ—

𝐾0

𝑇𝐾

0 is zero. If𝐾 =0, then the competitive ratio is simply

sup

π‘€βˆˆβ„“

2

βˆ₯𝐺 𝑀βˆ₯2

βˆ₯Ξ”βˆ’1𝐺 𝑀βˆ₯2,

whereΞ”is defined in Theorem 2. Let𝑀

2= Ξ”βˆ’1𝐺 𝑀. The competitive ratio becomes sup

𝑀2βˆˆβ„“2

βˆ₯Δ𝑀

2βˆ₯2

βˆ₯𝑀

2βˆ₯2 = sup

𝑀2βˆˆβ„“2

βˆ₯Ξ”βˆ—π‘€

2βˆ₯2

βˆ₯𝑀

2βˆ₯2

= sup

𝑀2βˆˆβ„“

2

π‘€βˆ—

2(𝐼+𝐹 πΉβˆ—)𝑀

2

βˆ₯𝑀

2βˆ₯2

= 1+ sup

𝑀2βˆˆβ„“2

π‘€βˆ—

2𝐹 πΉβˆ—π‘€

2

βˆ₯𝑀

2βˆ₯2

= 1+ βˆ₯𝐹βˆ₯2.

It is easy to check that𝐹 is bounded if and only if 𝐴is stable. β–‘ Non-existence of a controller with regret bounded by the pathlength of the driving disturbance and the measurement disturbance

In this section we establish that there is no controller whose regret bounded by the joint pathlength of the driving disturbance and the measurement disturbance;

this non-existence result holds for all linear systems, stable or unstable. The key observation is that the regret bound implies that the online controller must incur zero cost whenever𝑀and𝑣 are constant. In the measurement-feedback setting the controller is unable to directly observe 𝑀 and 𝑣, and hence must set 𝑒 = 0 at all times. However, there exist choices of𝑀 and𝑣 with zero pathlength such that this

β€œzero controller” incurs positive regret, contradicting the pathlength bound.

Theorem 11. Fix𝛾 >0. There does not exist a causal controllerπœ‹ such that 𝐽(πœ‹, 𝑀 , 𝑣) βˆ’π½(πœ‹

0, 𝑀) < 𝛾2

βˆ₯𝐷𝑝𝑀βˆ₯2

2+ βˆ₯π·π‘Ÿπ‘£βˆ₯2

2

(4.4) for all driving disturbances𝑀and and all measurement disturbances𝑣.

Proof. We restrict our attention to controllers πœ‹ which set πœ‹(𝑦) = 𝐾 𝑦 for some causal operator𝐾. We argue by way of contradiction; suppose there is some causal 𝐾 satisfying (4.4). Let us first assume 𝐾 β‰  0. Let 𝑀 = 0 and let 𝑣 be nonzero constant sequence such that (πΌβˆ’π» 𝐾)βˆ’1𝑣is orthogonal to the nullspace of𝐾. The left-hand side of (4.4) is π‘£βˆ—π‘„βˆ—(𝐼 + 𝐹 πΉβˆ—)𝑄 𝑣, which is strictly positive due to our assumption on 𝑣. The right-hand side of (4.4) is clearly zero for all choices of𝛾, therefore condition (4.4) cannot hold. Now assume that𝐾 =0. The left-hand side of (4.4) is

βˆ₯𝐺 𝑀βˆ₯2βˆ’ βˆ₯Ξ”βˆ’1𝐺 𝑀βˆ₯2=π‘€βˆ—πΊβˆ—(πΌβˆ’ (𝐼+𝐹 πΉβˆ—)βˆ’1)𝐺 𝑀

=π‘€βˆ—πΊβˆ—πΉ(𝐼+πΉβˆ—πΉ)βˆ’1πΉβˆ—πΊ 𝑀 ,

whereΞ”is defined in Theorem 2. Set𝑀to be a nonzero constant sequence which is orthogonal to the nullspace ofπΉβˆ—πΊ; such a choice always exists since 𝐹 and𝐺 are strictly causal. Set𝑣 =0. Then the left-hand side of (4.4) is strictly positive, but the

right-hand side is zero for all𝛾. β–‘

Non-existence of a controller with regret bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance

In this section we establish that in unstable systems there is no controller whose regret is bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance. The key observation is that the regret bound implies that the online controller must incur zero cost whenever 𝑀 = 0 and 𝑣 is constant. In the measurement-feedback setting the controller is unable to directly observe𝑀and 𝑣, and hence must set𝑒 = 0 at all times. The only setting under which this β€œzero controller” can have regret bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance is when the system is stable.

Theorem 12. Fix𝛾 > 0and suppose 𝐴is unstable. There does not exist a causal controllerπœ‹ such that

𝐽(πœ‹, 𝑀 , 𝑣) βˆ’π½(πœ‹

0, 𝑀) < 𝛾2

βˆ₯𝑀βˆ₯2

2+ βˆ₯π·π‘Ÿπ‘£βˆ₯2

2

(4.5) for all driving disturbances 𝑀 and and all measurement disturbances 𝑣. If 𝐴 is stable, then the only controller which satisfies (4.5) for any value of 𝛾 is the

β€œzero controller” which always sets 𝑒 = 0. This controller satisfies (4.5) with 𝛾 = βˆ₯Ξ”βˆ’βˆ—

2 πΉβˆ—πΊβˆ₯, where Ξ”2 is the unique causal and causally invertible operator such thatΞ”βˆ—

2Ξ”2 =𝐼+πΉβˆ—πΉ.

Proof. We restrict our attention to controllers πœ‹ which set πœ‹(𝑦) = 𝐾 𝑦 for some causal operator𝐾. We argue by way of contradiction; suppose there is some causal 𝐾 satisfying (4.5). Let us first assume 𝐾 β‰  0. Let 𝑀 = 0 and let 𝑣 be nonzero constant sequence such that (πΌβˆ’π» 𝐾)βˆ’1𝑣is orthogonal to the nullspace of𝐾. The left-hand side of (4.5) is π‘£βˆ—π‘„βˆ—(𝐼 + 𝐹 πΉβˆ—)𝑄 𝑣, which is strictly positive due to our assumption on 𝑣. The right-hand side of (4.5) is clearly zero for all choices of𝛾, therefore condition (4.5) cannot hold. Now assume that𝐾 =0. The left-hand side of (4.5) is

βˆ₯𝐺 𝑀βˆ₯2βˆ’ βˆ₯Ξ”βˆ’1𝐺 𝑀βˆ₯2=π‘€βˆ—πΊβˆ—(πΌβˆ’ (𝐼+𝐹 πΉβˆ—)βˆ’1)𝐺 𝑀

=π‘€βˆ—πΊβˆ—πΉ(𝐼+πΉβˆ—πΉ)βˆ’1πΉβˆ—πΊ 𝑀 ,

whereΞ”is defined in Theorem 2. The smallest possible value of𝛾2such that (4.5) holds is given by

sup

π‘€βˆˆβ„“2

π‘€βˆ—πΊβˆ—πΉ(𝐼+πΉβˆ—πΉ)βˆ’1πΉβˆ—πΊ 𝑀

βˆ₯𝑀βˆ₯2 = βˆ₯Ξ”βˆ’βˆ—

2 πΉβˆ—πΊβˆ₯2,

where Ξ”2 is the unique causal and causally invertible operator such that Ξ”βˆ—

2Ξ”2 = 𝐼+πΉβˆ—πΉ. We note that the operatorΞ”βˆ’βˆ—

2 πΉβˆ—πΊis bounded if and only if𝐴is stable. β–‘

Dalam dokumen Regret-Optimal Control (Halaman 51-54)