Chapter IV: Regret-Optimal Measurement-Feedback Control
4.1 Non-existence results
Non-existence of a controller with bounded competitive ratio
In this section we establish that in unstable systems there is no controller with bounded competitive ratio. The key observation is that the competitive ratio bound implies that the online controller must incur zero cost on every disturbance sequences on which the optimal noncausal controller incurs zero cost. In particular, in order to have a bounded competitive ratio, a controller must always set the control π’ to be zero whenπ€ =0. In the measurement-feedback setting the controller is unable to directly observeπ€, and hence must setπ’ =0 at all times. The only setting under which this βzero controllerβ can be competitive is when the system is stable.
Theorem 10. FixπΎ > 0and suppose π΄is unstable. There does not exist a causal measurement-feedback controllerπ such that
π½(π, π€ , π£) π½(π
0, π€) < πΎ2 (4.2)
for all driving disturbances π€ and and all measurement disturbances π£. If π΄ is stable, then the only competitive controller is the βzero controllerβ which always setsπ’ =0. This controller has competitive ratio1+ β₯πΉβ₯2.
Proof. We restrict our attention to controllers π which set π(π¦) = πΎ π¦ for some causal operatorπΎ. Condition (4.2) can be rewritten as
πβ
πΎππΎ βΊ πΎ2πβ
πΎ0
ππΎ
0. (4.3)
The (2, 2) block of πβ
πΎππΎ is πβ(πΌ + πΉ πΉβ)π. This operator is clearly positive- semidefinite, and zero only whenπΎ =0. IfπΎ β 0 then condition (4.3) clearly cannot hold because the (2, 2) block ofπβ
πΎ0
ππΎ
0 is zero. IfπΎ =0, then the competitive ratio is simply
sup
π€ββ
2
β₯πΊ π€β₯2
β₯Ξβ1πΊ π€β₯2,
whereΞis defined in Theorem 2. Letπ€
2= Ξβ1πΊ π€. The competitive ratio becomes sup
π€2ββ2
β₯Ξπ€
2β₯2
β₯π€
2β₯2 = sup
π€2ββ2
β₯Ξβπ€
2β₯2
β₯π€
2β₯2
= sup
π€2ββ
2
π€β
2(πΌ+πΉ πΉβ)π€
2
β₯π€
2β₯2
= 1+ sup
π€2ββ2
π€β
2πΉ πΉβπ€
2
β₯π€
2β₯2
= 1+ β₯πΉβ₯2.
It is easy to check thatπΉ is bounded if and only if π΄is stable. β‘ Non-existence of a controller with regret bounded by the pathlength of the driving disturbance and the measurement disturbance
In this section we establish that there is no controller whose regret bounded by the joint pathlength of the driving disturbance and the measurement disturbance;
this non-existence result holds for all linear systems, stable or unstable. The key observation is that the regret bound implies that the online controller must incur zero cost wheneverπ€andπ£ are constant. In the measurement-feedback setting the controller is unable to directly observe π€ and π£, and hence must set π’ = 0 at all times. However, there exist choices ofπ€ andπ£ with zero pathlength such that this
βzero controllerβ incurs positive regret, contradicting the pathlength bound.
Theorem 11. FixπΎ >0. There does not exist a causal controllerπ such that π½(π, π€ , π£) βπ½(π
0, π€) < πΎ2
β₯π·ππ€β₯2
2+ β₯π·ππ£β₯2
2
(4.4) for all driving disturbancesπ€and and all measurement disturbancesπ£.
Proof. We restrict our attention to controllers π which set π(π¦) = πΎ π¦ for some causal operatorπΎ. We argue by way of contradiction; suppose there is some causal πΎ satisfying (4.4). Let us first assume πΎ β 0. Let π€ = 0 and let π£ be nonzero constant sequence such that (πΌβπ» πΎ)β1π£is orthogonal to the nullspace ofπΎ. The left-hand side of (4.4) is π£βπβ(πΌ + πΉ πΉβ)π π£, which is strictly positive due to our assumption on π£. The right-hand side of (4.4) is clearly zero for all choices ofπΎ, therefore condition (4.4) cannot hold. Now assume thatπΎ =0. The left-hand side of (4.4) is
β₯πΊ π€β₯2β β₯Ξβ1πΊ π€β₯2=π€βπΊβ(πΌβ (πΌ+πΉ πΉβ)β1)πΊ π€
=π€βπΊβπΉ(πΌ+πΉβπΉ)β1πΉβπΊ π€ ,
whereΞis defined in Theorem 2. Setπ€to be a nonzero constant sequence which is orthogonal to the nullspace ofπΉβπΊ; such a choice always exists since πΉ andπΊ are strictly causal. Setπ£ =0. Then the left-hand side of (4.4) is strictly positive, but the
right-hand side is zero for allπΎ. β‘
Non-existence of a controller with regret bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance
In this section we establish that in unstable systems there is no controller whose regret is bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance. The key observation is that the regret bound implies that the online controller must incur zero cost whenever π€ = 0 and π£ is constant. In the measurement-feedback setting the controller is unable to directly observeπ€and π£, and hence must setπ’ = 0 at all times. The only setting under which this βzero controllerβ can have regret bounded by the energy of the driving disturbance and the pathlength of the measurement disturbance is when the system is stable.
Theorem 12. FixπΎ > 0and suppose π΄is unstable. There does not exist a causal controllerπ such that
π½(π, π€ , π£) βπ½(π
0, π€) < πΎ2
β₯π€β₯2
2+ β₯π·ππ£β₯2
2
(4.5) for all driving disturbances π€ and and all measurement disturbances π£. If π΄ is stable, then the only controller which satisfies (4.5) for any value of πΎ is the
βzero controllerβ which always sets π’ = 0. This controller satisfies (4.5) with πΎ = β₯Ξββ
2 πΉβπΊβ₯, where Ξ2 is the unique causal and causally invertible operator such thatΞβ
2Ξ2 =πΌ+πΉβπΉ.
Proof. We restrict our attention to controllers π which set π(π¦) = πΎ π¦ for some causal operatorπΎ. We argue by way of contradiction; suppose there is some causal πΎ satisfying (4.5). Let us first assume πΎ β 0. Let π€ = 0 and let π£ be nonzero constant sequence such that (πΌβπ» πΎ)β1π£is orthogonal to the nullspace ofπΎ. The left-hand side of (4.5) is π£βπβ(πΌ + πΉ πΉβ)π π£, which is strictly positive due to our assumption on π£. The right-hand side of (4.5) is clearly zero for all choices ofπΎ, therefore condition (4.5) cannot hold. Now assume thatπΎ =0. The left-hand side of (4.5) is
β₯πΊ π€β₯2β β₯Ξβ1πΊ π€β₯2=π€βπΊβ(πΌβ (πΌ+πΉ πΉβ)β1)πΊ π€
=π€βπΊβπΉ(πΌ+πΉβπΉ)β1πΉβπΊ π€ ,
whereΞis defined in Theorem 2. The smallest possible value ofπΎ2such that (4.5) holds is given by
sup
π€ββ2
π€βπΊβπΉ(πΌ+πΉβπΉ)β1πΉβπΊ π€
β₯π€β₯2 = β₯Ξββ
2 πΉβπΊβ₯2,
where Ξ2 is the unique causal and causally invertible operator such that Ξβ
2Ξ2 = πΌ+πΉβπΉ. We note that the operatorΞββ
2 πΉβπΊis bounded if and only ifπ΄is stable. β‘