LEARNING AND CONTROL IN LINEAR TIME-VARYING SYSTEMS

4.1 Related Work and Background

This chapter builds on the design of linear time-invariant (LTI) controllers to provide a new approach for the design of stable controllers for linear time-varying (LTV) systems. As such, we describe related work on both LTI and LTV systems below.

In the study of the control of LTI systems, the linear quadratic regulator (LQR) problem has been studied in detail. In the classical setting where the underlying system is known, the optimal control law is a linear feedback controller obtained by solving Riccati equations [28]. Alternatively, the optimal control problem can be posed as a semi-definite program (SDP) [283], which is the approach we build on in the current study.

Recently, there has been growing interest in online control of these linear systems when the underlying dynamics are unknown. Most of these works study the problem from a regret minimization perspective, e.g., [2, 71, 162, 166]. However, these methods have so far only been applied to LTI systems with time-varying costs and disturbances. Extensions to LTV dynamics, which are the focus of this chapter, are not known.

As in the case of LTI systems, the optimal controller for an LTV system whose sequence of system parameters is known can be obtained by solving backward Riccati equations [28].
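To make the backward Riccati recursion concrete, the following is a minimal sketch for a known finite-horizon LTV system x_{t+1} = A_t x_t + B_t u_t with quadratic cost. The function names, the choice of terminal cost equal to Q, and the random test system are illustrative assumptions, not taken from the chapter.

```python
import numpy as np

def ltv_lqr_gains(A_seq, B_seq, Q, R):
    """Backward Riccati recursion for x_{t+1} = A_t x_t + B_t u_t.

    Returns feedback gains K_0..K_{T-1} (control law u_t = -K_t x_t)
    and cost-to-go matrices P_0..P_T.
    """
    T = len(A_seq)
    P = [None] * (T + 1)
    K = [None] * T
    P[T] = Q  # assumption: terminal cost equals the stage cost Q
    for t in reversed(range(T)):
        A, B = A_seq[t], B_seq[t]
        S = R + B.T @ P[t + 1] @ B
        K[t] = np.linalg.solve(S, B.T @ P[t + 1] @ A)
        P[t] = Q + A.T @ P[t + 1] @ (A - B @ K[t])
    return K, P

def rollout_cost(x0, A_seq, B_seq, Q, R, K):
    """Total quadratic cost of applying u_t = -K_t x_t from x0."""
    x, cost = x0, 0.0
    for A, B, K_t in zip(A_seq, B_seq, K):
        u = -K_t @ x
        cost += x @ Q @ x + u @ R @ u
        x = A @ x + B @ u
    return cost + x @ Q @ x  # add terminal cost

# Hypothetical example: an arbitrary sequence of systems, known in advance.
rng = np.random.default_rng(0)
T, n, m = 20, 3, 2
A_seq = [0.5 * rng.normal(size=(n, n)) for _ in range(T)]
B_seq = [rng.normal(size=(n, m)) for _ in range(T)]
Q, R = np.eye(n), np.eye(m)
K, P = ltv_lqr_gains(A_seq, B_seq, Q, R)
```

In this sketch, the cost incurred by the computed gains from an initial state x0 should match x0^⊤ P_0 x0. The recursion runs backward from the final time step, so it needs the whole sequence (A_t, B_t) up front.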

However, in the online case when the sequence of systems is unknown, the design of controllers is challenging. Several lines of work in adaptive control and model-predictive control (MPC) have addressed this problem. In adaptive control of LTV systems, the underlying systems are unknown and the results generally assume slow and bounded or fixed systematic variation of dynamics with bounded disturbances [193, 199, 212]. In MPC of LTV systems, a finite-horizon sequence of systems (predictions) is known and the system is again assumed to be slowly varying or open-loop stable, e.g., [78, 304]. Different from prior works, in this chapter we consider the online problem and make no assumptions about how the system varies over time.

As in the LTI setting, the study of regret minimization in LTV systems has recently received attention. The work of Gradu et al. [98] is most closely related to the current study: it studies the adaptive regret of online control in LTV systems with bounded cost. Note that when the cost is bounded, a finite regret need not guarantee stability. In contrast, we use a quadratic (unbounded) cost and we can guarantee stability.

Randomization and asynchrony are crucial to many computational tasks that involve a large number of agents working cooperatively with each other [77, 135]. They allow speed-ups and cost reductions in many artificial and biological processes by removing the synchronization time, relaxing the communication bottlenecks, minimizing the cost of cooperation, and increasing efficiency. For example, large-scale control systems with multiple sensors adopt random asynchronous updates from their sensors to save power and because synchronization is difficult [110].

Similarly, asynchrony and randomization are central elements in the dynamical systems of biological neural networks [246, 273]. In various studies, e.g., [40, 75], researchers have found that the synchrony/asynchrony balance phenomenon is ubiquitous in cortical networks. They show the existence of a delicate equilibrium between synchrony and asynchrony of neural firings in many cognitive tasks and regions of the brain, such as visual, auditory, and memory maintenance, in order to obtain stable dynamics during computations. Any disturbance to this natural equilibrium may result in neurological disorders [274].

In modeling stochastic or varying dynamics like random asynchronous LTI systems, there has been a strong interest in switching linear systems/Markov jump systems [63, 210, 254, 257, 300, 301], in which state variables evolve according to a randomly selected model among all possible models. Although randomized linear models can be studied under this framework, the number of possible models becomes exponential in the number of state variables, making this approach prohibitive for large-scale systems. Moreover, the connections between the nodes in randomized systems are mostly fixed without any switching between different systems. Having these connections on or off is the main cause of randomization within the system.

Thus, in these dynamical systems, the underlying system is time-invariant while the active interaction within the system is time-varying. Prior works that adopt switching linear systems fail to capture the nature of random asynchronous LTI systems.
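As a rough illustration of why the switching-systems viewpoint scales poorly, the sketch below simulates one possible random asynchronous update rule. The rule itself is an assumption made for this example: each state coordinate is refreshed with probability p and otherwise holds its previous value. The names `async_step`, `effective_matrix`, and `all_modes` are hypothetical. Under this assumed rule, the fixed matrix A is masked by a random on/off pattern, and enumerating every pattern as a separate switching mode gives 2^n matrices for n states.

```python
import itertools
import numpy as np

def async_step(A, x, p, rng):
    """One random asynchronous update: each state index is refreshed with probability p."""
    mask = rng.random(x.shape[0]) < p  # True where the coordinate is updated
    return np.where(mask, A @ x, x), mask

def effective_matrix(A, mask):
    """Time-varying matrix A_k with x_{k+1} = A_k x_k for a given update mask."""
    D = np.diag(np.asarray(mask, dtype=float))
    return D @ A + (np.eye(A.shape[0]) - D)

def all_modes(A):
    """Every switching mode induced by the 2**n possible update masks."""
    n = A.shape[0]
    return [effective_matrix(A, m) for m in itertools.product([False, True], repeat=n)]

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 4))   # the underlying system stays fixed
x = rng.normal(size=4)
x_next, mask = async_step(A, x, p=0.5, rng=rng)
modes = all_modes(A)
print(len(modes))  # 2**4 = 16
```

The underlying A never changes here; only the mask does, which is the sense in which the system is time-invariant while the active interactions are time-varying.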

In addition to the switching systems viewpoint, the statistical behavior of LTV systems can also be studied from the product of random matrices perspective [21, 74, 100, 104, 131, 219]. However, these frameworks usually come with additional constraints on the state transition matrix, e.g., Hartfiel [104] requires it to be element-wise nonnegative and Avron et al. [21] require the state transition matrix to be positive definite. Similarly, approaches based on the joint spectral radius are too restrictive to reveal the effect of randomization [131].
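The reason products of matrices matter is that the state of an LTV system x_{t+1} = A_t x_t is the initial state multiplied by the product of the transition matrices. A standard textbook-style example (not taken from the chapter) shows that each factor being stable on its own says little about the product:

```python
import numpy as np

def spectral_radius(M):
    """Largest absolute value of the eigenvalues of M."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))

# Two nilpotent matrices: each has spectral radius 0.
A1 = np.array([[0.0, 2.0], [0.0, 0.0]])
A2 = np.array([[0.0, 0.0], [2.0, 0.0]])

# Alternating between them gives x_{t+2} = A1 @ A2 @ x_t.
product = A1 @ A2  # equals [[4, 0], [0, 0]]

x = np.array([1.0, 0.0])
for t in range(6):
    x = (A2 if t % 2 == 0 else A1) @ x  # apply A2, then A1, and repeat
print(x)  # [64. 0.]: the first coordinate grows by a factor of 4 every two steps
```

Here both matrices are individually stable, yet the alternating sequence diverges. This is why the stability of a time-varying system cannot be read off from the spectral radius of each A_t alone.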

In modeling the dynamics of a system, the underlying system is usually unknown and only a sequence of inputs and outputs is available. This raises the system identification problem, which aims to recover the parameters that govern the dynamics from the collected data. Classical and recent system identification methods mainly focus on linear dynamical systems (LDS) and consider stable synchronous LTI systems or switching linear systems [160, 189, 214, 233]. For switching linear systems, the system identification methods require knowledge of the order of the switched systems; otherwise, they become computationally intractable and sample inefficient due to exponential dimension dependency [170]. Thus, they have limited applicability to large-scale practical random asynchronous LTI systems. This highlights the necessity of a careful and systematic approach to deriving stability conditions and a system identification framework for random asynchronous LTI systems.
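For orientation, the simplest instance of the system identification problem can be sketched as an ordinary least-squares fit. The sketch assumes a stable synchronous LTI system x_{t+1} = A x_t + B u_t + w_t whose state is fully observed, which is a simplification of the input-output setting described above. The function names and the example system are hypothetical.

```python
import numpy as np

def simulate(A, B, T, noise_std, rng):
    """Roll out x_{t+1} = A x_t + B u_t + w_t with Gaussian inputs and noise."""
    n, m = B.shape
    X = np.zeros((T + 1, n))
    U = rng.normal(size=(T, m))
    for t in range(T):
        X[t + 1] = A @ X[t] + B @ U[t] + noise_std * rng.normal(size=n)
    return X, U

def least_squares_id(X, U):
    """Estimate (A, B) by regressing x_{t+1} on [x_t, u_t]."""
    Z = np.hstack([X[:-1], U])                          # regressors, one row per step
    Theta, *_ = np.linalg.lstsq(Z, X[1:], rcond=None)   # X[1:] ≈ Z @ Theta
    n = X.shape[1]
    return Theta[:n].T, Theta[n:].T                     # Theta stacks A^T over B^T

rng = np.random.default_rng(0)
A_true = np.array([[0.9, 0.2], [0.0, 0.8]])
B_true = np.array([[0.0], [1.0]])
X, U = simulate(A_true, B_true, T=2000, noise_std=0.01, rng=rng)
A_hat, B_hat = least_squares_id(X, U)
```

This estimator treats the data as coming from a single fixed model. Under the switching-systems view, data from a randomized system would instead have to be attributed to one of exponentially many modes.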

Notation. We denote the Euclidean norm of a vector 𝑥 as ∥𝑥∥. For a matrix 𝐴, ∥𝐴∥ is its spectral norm, 𝐴^⊤ is its transpose, Tr(𝐴) is its trace, and 𝜌(𝐴) denotes its spectral radius, i.e., the largest absolute value of its eigenvalues. 𝛿(𝑡) denotes the unit impulse function. The Kronecker product is denoted as ⊗ and ⊙ denotes the Hadamard product. N(𝜇, Σ) denotes the normal distribution with mean 𝜇 and covariance Σ. 𝐴 ≻ 𝐵 and 𝐴 ⪰ 𝐵 denote that 𝐴 − 𝐵 is positive definite and positive semi-definite, respectively. 𝐴 • 𝐵 denotes the element-wise inner product of 𝐴 and 𝐵, i.e., Tr(𝐴^⊤𝐵). I^𝑑 denotes the 𝑑 × 𝑑 identity matrix.
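As a quick reference, the matrix operations in this notation can be mapped to NumPy calls roughly as follows. The example matrices are arbitrary and the variable names are illustrative.

```python
import numpy as np

A = np.array([[1.0, 2.0], [3.0, 4.0]])
B = np.array([[0.0, 1.0], [1.0, 0.0]])

spectral_norm = np.linalg.norm(A, 2)                       # ∥A∥, largest singular value
spectral_radius = np.max(np.abs(np.linalg.eigvals(A)))     # ρ(A)
kron = np.kron(A, B)                                       # A ⊗ B, shape (4, 4)
hadamard = A * B                                           # A ⊙ B, element-wise product
inner = np.trace(A.T @ B)                                  # A • B = Tr(A^⊤ B)

# A - B is positive definite iff all eigenvalues of its symmetric part are > 0
# (for symmetric A and B this is the usual A ≻ B).
sym = ((A - B) + (A - B).T) / 2
is_pd = bool(np.all(np.linalg.eigvalsh(sym) > 0))
```

The inner product Tr(𝐴^⊤𝐵) equals the sum of the entries of the Hadamard product, and the spectral radius never exceeds the spectral norm.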

From the document Learning and Control of Dynamical Systems (pages 110-113)