• Tidak ada hasil yang ditemukan

Neural-Rendezvous: Learning-based Robust Guidance and Control

Chapter 9: Interstellar Object Exploration

9.4 Neural-Rendezvous: Learning-based Robust Guidance and Control

This section finally presents Neural-Rendezvous, a deep learning-based terminal G&C approach to autonomously encounter ISOs, thereby solving the problem 9.1 of Sec. 9.1 that arise from the large state uncertainty and high-velocity challenges.

It will be shown that the SN-DNN min-norm control (9.17) of Lemma 9.2 verifies a formal exponential bound on expected spacecraft delivery error, which provides valuable information in determining whether we should use the SN-DNN terminal guidance policy or enhance it with the SN-DNN min-norm control, depending on the size of the state uncertainty.

9.4.I Robustness and Stability Guarantee

The assumptions introduced in Sec. 9.3.III allow bounding the mean squared distance between the spacecraft relative position of (9.1) controlled by (9.17) and the desired position ๐‘๐‘‘(๐‘ก)given in (9.14), even under the presence of the state uncertainty (see Fig. 9.4). We remark that the additional notations in the following theorem are summarized in Table 9.4, where the others are consistent with the ones in Table 9.3.

Let us remark that the use of our incremental Lyapunov-like function, along with the supermartingale analysis, allows easier handling of the aforementioned local assumptions in estimation and control.

Theorem 9.1. Suppose that Assumptions 9.1 and 9.2 hold, and that the spacecraft relative dynamics with respect to the ISO, given in(9.1), is controlled by๐‘ข=๐‘ขโˆ—

โ„“. If the estimated states at time๐‘ก =๐‘ก๐‘ satisfyล“ห†๐‘  โˆˆ Ciso(๐‘…ยฏiso) and๐‘ฅห†๐‘  โˆˆ Csc(๐‘…ยฏsc), then the spacecraft delivery error is explicitly bounded as follows with probability at least 1โˆ’๐œ€ctrl:

E[โˆฅ๐‘(๐‘ก๐‘“) โˆ’๐œŒโˆฅ |ล“(ห† ๐‘ก๐‘ ) =ล“ห†๐‘ โˆฉ๐‘ฅห†(๐‘ก๐‘ ) =๐‘ฅห†๐‘ ]

โ‰ค sup

(ล“๐‘ ,๐‘ฅ๐‘ )โˆˆDest

๐‘’โˆ’๐œ†(๐‘ก๐‘“โˆ’๐‘ก๐‘ )โˆฅ๐‘๐‘ โˆ’ ๐‘๐‘‘(๐‘ก๐‘ ) โˆฅ + ๐‘’โˆ’๐œ†๐‘ก๐‘“๐ฟ๐‘˜ ๐‘šsc(๐‘ก๐‘“)

โˆซ ๐‘ก๐‘“

๐‘ก๐‘ 

๐œ›๐œ(๐‘๐‘ , ๐‘ก๐‘ )๐‘‘ ๐œ

+

๏ฃฑ๏ฃด

๏ฃด๏ฃด

๏ฃฒ

๏ฃด๏ฃด

๏ฃด

๏ฃณ

๐‘’โˆ’๐œ†(๐‘ก๐‘“โˆ’๐‘ก๐‘ )โˆ’๐‘’โˆ’๐›ผ(๐‘ก๐‘“โˆ’๐‘ก๐‘ ) ๐›ผโˆ’๐œ†

๐‘ฃ(๐‘ฅ๐‘ ,๐‘ก๐‘ )

โˆš

๐‘šsc(๐‘ก๐‘“) if๐›ผ โ‰ ๐œ† (๐‘ก๐‘“ โˆ’๐‘ก๐‘ )๐‘’โˆ’๐œ†(๐‘ก๐‘“โˆ’๐‘ก๐‘ )โˆš๐‘ฃ(๐‘ฅ๐‘ ,๐‘ก๐‘ )

๐‘šsc(๐‘ก๐‘“) if๐›ผ =๐œ†

(9.27)

where๐œ€ctrl โˆˆRโ‰ฅ0is the probability to be given in(9.42),๐œ›๐‘ก(๐‘๐‘ , ๐‘ก๐‘ )is a time-varying function defined as๐œ›๐‘ก(๐‘๐‘ , ๐‘ก๐‘ ) =๐‘’(๐œ†โˆ’๐›ผ)๐‘ก

โˆซ๐‘ก ๐‘ก๐‘ 

๐‘’๐›ผ๐œ๐œ๐œ(๐‘๐‘ , ๐‘ก๐‘ )๐‘‘ ๐œ. Note that(9.27)yields a bound onP

โˆฅ๐‘(๐‘ก๐‘“) โˆ’ ๐œŒโˆฅ โ‰ค ๐‘‘

ล“ห†(๐‘ก๐‘ ) =ล“ห†๐‘ โˆฉ๐‘ฅห†(๐‘ก๐‘ ) =๐‘ฅห†๐‘ 

as in Theorem 2.5 of [7], where๐‘‘ โˆˆRโ‰ฅ0is any given distance of interest, using Markovโ€™s inequality.

Table 9.4: Notations in Theorem 9.1.

Notation Description

Dest Set defined asDest={(ล“๐‘ , ๐‘ฅ๐‘ ) โˆˆR๐‘›ร—R๐‘›|โˆš๏ธ

โˆฅล“๐‘ โˆ’ล“ห†๐‘ โˆฅ2+ โˆฅ๐‘ฅ๐‘ โˆ’๐‘ฅห†๐‘ โˆฅ2 โ‰ค๐‘๐‘’ ๐ฟ๐‘˜ Lipschitz constant of๐‘˜in Assumption (9.1)

๐‘šsc(๐‘ก) Spacecraft mass of (9.2) satisfying๐‘šsc(๐‘ก) โˆˆ [๐‘šsc(๐‘ก๐‘ ), ๐‘šsc(๐‘ก๐‘“)]due to the Tsiolkovsky rocket equation

ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘  True and estimated ISO and S/C state at time๐‘ก=๐‘ก๐‘ , respectively

๐‘๐‘‘(๐‘ก) Desired S/C relative position trajectory, i.e.,๐‘ฅ๐‘‘(๐‘ก)=[๐‘๐‘‘(๐‘ก)โŠค,๐‘ยค๐‘‘(๐‘ก)โŠค]โŠคfor ๐‘ฅ๐‘‘of (9.14)

๐‘๐‘ , ๐‘ห†๐‘  True and estimated S/C relative position at time๐‘ก=๐‘ก๐‘ , i.e.,๐‘ฅ๐‘ =[๐‘โŠค๐‘ ,๐‘ยคโŠค๐‘ ]โŠค and ห†๐‘ฅ๐‘ =[๐‘ห†โŠค๐‘ ,๐‘ยคห†โŠค๐‘ ]โŠค

ยฏ

๐‘…iso, ๐‘…ยฏsc Constants defined as ยฏ๐‘…iso=๐‘Ÿisoโˆ’2๐‘๐‘’and ยฏ๐‘…sc=๐‘Ÿscโˆ’2๐‘๐‘’for๐‘๐‘’of Assump- tion 9.1 and๐‘Ÿisoand๐‘Ÿscof Assumption 9.2

๐‘ขโˆ—

โ„“ SN-DNN min-norm control policy (9.17) of Lemma 9.2 ๐‘ฃ(๐‘ฅ , ๐‘ก) Non-negative function given as ๐‘ฃ(๐‘ฅ , ๐‘ก) = โˆš๏ธ

๐‘‰( ยค๐‘, ๐œš๐‘‘(๐‘, ๐‘ก), ๐‘ก) =

โˆš๏ธ

๐‘šsc(๐‘ก) โˆฅฮ›(๐‘โˆ’๐‘๐‘‘(๐‘ก)) + ยค๐‘โˆ’ ยค๐‘๐‘‘(๐‘ก) โˆฅfor๐‘‰ of (9.22)

๐‘๐‘  Tuple of the true and estimated ISO and S/C state at time ๐‘ก = ๐‘ก๐‘ , i.e., ๐‘๐‘ =(ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘ )

๐›ผ Positive constant of (9.21)

๐œ† Minimum eigenvalue ofฮ›defined in (9.21) ๐œŒ Desired terminal S/C relative position of (9.4)

Proof. The dynamics (9.16) controlled by (9.17) can be rewritten as

ยฅ

๐‘ =๐’ป(๐‘ฅ ,ล“, ๐‘ก) +โ„ฌ(๐‘ฅ ,ล“, ๐‘ก)๐‘ขโˆ—

โ„“(๐‘ฅ ,ล“, ๐‘ก;๐œƒnn) +โ„ฌ(๐‘ฅ ,ล“ห†, ๐‘ก)๐‘ขหœ (9.28) where หœ๐‘ข =๐‘ขโˆ—

โ„“(๐‘ฅ ,ห† ล“ห†, ๐‘ก;๐œƒnn) โˆ’๐‘ขโˆ—

โ„“(๐‘ฅ ,ล“, ๐‘ก;๐œƒnn). Since the third term of (9.28) is subject to stochastic disturbance due to the state uncertainty of Assumption 9.1, we consider the weak infinitesimal operator ๐’œ given in [30, p. 9], instead of taking the time derivative, for analyzing the time evolution of the non-negative function๐‘ฃdefined as ๐‘ฃ(๐‘ฅ , ๐‘ก) =โˆš๏ธ

๐‘‰( ยค๐‘, ๐œš๐‘‘(๐‘, ๐‘ก), ๐‘ก) =โˆš๏ธ

๐‘šsc(๐‘ก) โˆฅฮ›(๐‘โˆ’ ๐‘๐‘‘(๐‘ก)) + ยค๐‘โˆ’ ยค๐‘๐‘‘(๐‘ก) โˆฅfor๐‘‰ of (9.22).

To this end, let us compute the following time increment of๐‘ฃ evaluated at the true state and time (๐‘ฅ ,ล“, ๐‘ก):

ฮ”๐‘ฃ=๐‘ฃ(๐‘ฅ(๐‘ก+ฮ”๐‘ก), ๐‘ก+ฮ”๐‘ก) โˆ’๐‘ฃ(๐‘ฅ , ๐‘ก)

= 1 2๐‘ฃ(๐‘ฅ , ๐‘ก)

๐œ•๐‘‰

๐œ•๐‘ยค

ยฅ

๐‘(๐‘ก) + ๐œ•๐‘‰

๐œ• ๐œš๐‘‘

ยฅ

๐œš๐‘‘(๐‘(๐‘ก), ๐‘ก) + ๐œ•๐‘‰

๐œ• ๐‘ก

ฮ”๐‘ก+ O ฮ”๐‘ก2

(9.29)

whereฮ”๐‘ก โˆˆ Rโ‰ฅ0 and the arguments ( ยค๐‘(๐‘ก), ๐œš๐‘‘(๐‘, ๐‘ก), ๐‘ก) of the partial derivatives of ๐‘‰ are omitted for notational convenience. Dropping the argument ๐‘ก for the state variables for simplicity, the dynamics decomposition (9.28) gives

๐œ•๐‘‰

๐œ•๐‘ยค

ยฅ ๐‘+ ๐œ•๐‘‰

๐œ• ๐œš๐‘‘

ยฅ

๐œš๐‘‘(๐‘, ๐‘ก) + ๐œ•๐‘‰

๐œ• ๐‘ก

โ‰ค ๐œ•๐‘‰

๐œ•๐‘ยค

(๐’ป(๐‘ฅ ,ล“, ๐‘ก) +โ„ฌ(๐‘ฅ ,ล“, ๐‘ก)๐‘ขโˆ—(๐‘ฅ ,ห† ล“ห†, ๐‘ก;๐œƒnn)) + ๐œ•๐‘‰

๐œ• ๐œš๐‘‘

( ยฅ๐‘๐‘‘โˆ’ฮ›( ยค๐‘โˆ’ ยค๐‘๐‘‘))

โ‰ค โˆ’2๐›ผ๐‘‰( ยค๐‘, ๐œš๐‘‘(๐‘, ๐‘ก), ๐‘ก) +2๐œ•๐‘‰

๐œ•๐‘ยค

โ„ฌ(๐‘ฅ ,ล“, ๐‘ก)๐‘ขหœ=โˆ’2๐›ผ๐‘ฃ(๐‘ฅ , ๐‘ก)2+2๐‘ฃ(๐‘ฅ , ๐‘ก) โˆฅ๐‘ขหœโˆฅ

โˆš๏ธ

๐‘šsc(๐‘ก) (9.30) where the first inequality follows from the fact that the spacecraft mass ๐‘šsc(๐‘ก), described by the Tsiolkovsky rocket equation, is a decreasing function and thus

๐œ•๐‘‰/๐œ• ๐‘ก โ‰ค 0, and the second inequality follows from the stability condition (9.21) evaluated at (๐‘ฅ ,ล“, ๐‘ก), which is guaranteed to be feasible due to Lemma 9.2. Since ๐‘ขโˆ—is assumed to be Lipschitz as in (9.25) of Assumption 9.2, we get the following relation for any๐‘ฅ๐‘ ,๐‘ฅห†๐‘  โˆˆ Csc(๐‘Ÿsc), ล“๐‘ ,ล“ห†๐‘  โˆˆ Ciso(๐‘Ÿiso), and๐‘ก๐‘  โˆˆ [0, ๐‘ก๐‘“], by substitut- ing (9.30) into (9.29):

๐’œ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) = lim

ฮ”๐‘กโ†“0

E๐‘๐‘  [ฮ”๐‘ฃ] ฮ”๐‘ก

โ‰ค โˆ’๐›ผ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) + ๐ฟ๐‘˜

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“)

โˆš๏ธโˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2 (9.31) whereE๐‘๐‘  [ยท] =E[ ยท|๐‘ฅ(๐‘ก๐‘ )=๐‘ฅ๐‘ ,๐‘ฅห†(๐‘ก๐‘ ) =๐‘ฅห†๐‘ ,ล“(๐‘ก๐‘ ) =ล“๐‘ ,ล“ห†(๐‘ก๐‘ ) =ล“ห†๐‘ ] for ๐‘๐‘  given as ๐‘๐‘  = (๐‘ฅ๐‘ ,๐‘ฅห†๐‘ ,ล“๐‘ ,ล“ห†๐‘ ), and the relation ๐‘šsc(๐‘ก) โ‰ฅ ๐‘šsc(๐‘ก๐‘“) by the Tsiolkovsky rocket equation is also used to obtain the inequality. Applying Dynkinโ€™s formula [30, p.

10] to (9.31) gives the following with probability at least 1โˆ’ ๐œ€est due to (9.24) of Assumption 9.1:

E๐‘๐‘  [โˆฅฮ›(๐‘(๐‘ก) โˆ’๐‘๐‘‘(๐‘ก)) + ยค๐‘(๐‘ก) โˆ’ ยค๐‘๐‘‘(๐‘ก) โˆฅ] (9.32)

โ‰ค ๐‘’โˆ’๐›ผ(๐‘กโˆ’๐‘ก๐‘ )๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ )

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) + ๐‘’โˆ’๐›ผ๐‘ก๐ฟ๐‘˜ ๐‘šsc(๐‘ก๐‘“)

โˆซ ๐‘ก

๐‘ก๐‘ 

๐‘’๐›ผ๐œ๐œ๐œ(๐‘๐‘ , ๐‘ก๐‘ )๐‘‘ ๐œ, โˆ€๐‘ก โˆˆ [๐‘ก๐‘ ,๐‘กยฏ๐‘’]

where ๐œ๐œ(๐‘๐‘ , ๐‘ก๐‘ ) is as given in (9.24) and ยฏ๐‘ก๐‘’ is defined as ยฏ๐‘ก๐‘’ = min{๐‘ก๐‘’, ๐‘ก๐‘“}, with๐‘ก๐‘’ being the first exit time that any of the states leave their respective set, i.e.,

๐‘ก๐‘’ =inf{๐‘ก โ‰ฅ ๐‘ก๐‘  :ล“(๐‘ก) โˆ‰Ciso(๐‘Ÿiso) โˆชล“ห†(๐‘ก) โˆ‰Ciso(๐‘Ÿiso) (9.33) ๐‘ก๐‘’ =inf{๐‘ก โ‰ฅ ๐‘ก๐‘  :โˆช๐‘ฅ(๐‘ก) โˆ‰Csc(๐‘Ÿsc) โˆช๐‘ฅห†(๐‘ก) โˆ‰Csc(๐‘Ÿsc)}

given that the event of (9.23) in Assumption 9.1 has occurred. Since we further have that๐’œโˆฅ๐‘โˆ’๐‘๐‘‘โˆฅ = ๐‘‘ ๐‘ก๐‘‘โˆฅ๐‘โˆ’๐‘๐‘‘โˆฅ โ‰ค โˆฅ๐œšหœโˆฅ โˆ’๐œ†(๐‘โˆ’๐‘๐‘‘)for หœ๐œš= ฮ›(๐‘โˆ’๐‘๐‘‘) + ยค๐‘โˆ’ ยค๐‘๐‘‘

due to the hierarchical structure of หœ๐œš, utilizing Dynkinโ€™s formula one more time and then substituting (9.32) result in

E๐‘๐‘  [โˆฅ๐‘(๐‘ก) โˆ’๐‘๐‘‘(๐‘ก) โˆฅ] โ‰ค๐‘’โˆ’๐œ†๐‘ก

โˆซ ๐‘ก

๐‘ก๐‘ 

๐‘’(๐œ†โˆ’๐›ผ)๐œ+๐›ผ๐‘ก๐‘ ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ )

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) + ๐ฟ๐‘˜

๐‘šsc(๐‘ก๐‘“)๐œ›๐œ(๐‘๐‘ , ๐‘ก๐‘ )๐‘‘ ๐œ +๐‘’โˆ’๐œ†(๐‘กโˆ’๐‘ก๐‘ )โˆฅ๐‘๐‘ โˆ’ ๐‘๐‘‘(0) โˆฅ, โˆ€๐‘ก โˆˆ [0,๐‘กยฏ๐‘’] (9.34) where๐‘ฅ๐‘  = [๐‘โŠค

๐‘ ,๐‘ยคโŠค

๐‘ ]โŠค.

What is left to show is that we can achieve๐‘ก๐‘’ > ๐‘ก๐‘“ with a finite probability for the first exit time๐‘ก๐‘’of (9.33). For this purpose, we need the following lemma.

Lemma 9.3. Suppose that the events of Assumption 9.1 has occurred, i.e., the event of(9.23)has occurred and the2-norm estimation error satisfies

โˆš๏ธโˆฅล“ห†(๐‘ก) โˆ’ล“(๐‘ก) โˆฅ2+ โˆฅ๐‘ฅห†(๐‘ก) โˆ’๐‘ฅ(๐‘ก) โˆฅ2 < ๐‘๐‘’, โˆ€๐‘ก โˆˆ [๐‘ก๐‘ , ๐‘ก๐‘“]. If Assumption 9.2 holds and if we have

๐’œ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) โ‰ค โˆ’๐›ผ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) + ๐ฟ๐‘˜

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“)

โˆš๏ธโˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2 (9.35) for any๐‘ฅ๐‘ ,๐‘ฅห†๐‘  โˆˆ Csc(๐‘Ÿsc), ล“๐‘ ,ล“ห†๐‘  โˆˆ Ciso(๐‘Ÿiso), and ๐‘ก๐‘  โˆˆ [0, ๐‘ก๐‘“] as in (9.31), then we get the following probabilistic bound forโˆƒ๐œ€exit โˆˆRโ‰ฅ0:

๐‘stay โ‰ฅ 1โˆ’๐œ€exit =

๏ฃฑ๏ฃด

๏ฃด๏ฃด

๏ฃฒ

๏ฃด๏ฃด

๏ฃด

๏ฃณ

1โˆ’ ๐ธ๐‘ 

ยฏ ๐‘ฃ

๐‘’โˆ’

๐ป(๐‘ก ๐‘“)

ยฏ

๐‘ฃ if๐‘ฃยฏ โ‰ฅ sup๐‘กโˆˆ [๐‘ก๐‘  ,๐‘ก๐‘“]โ„Ž(๐‘ก)

ยฏ ๐›ผ

1โˆ’ ยฏ

๐›ผ ๐ธ๐‘ +(๐‘’๐ปยฏ(๐‘ก๐‘“)โˆ’1)sup๐‘กโˆˆ [๐‘ก๐‘  ,๐‘ก๐‘“]โ„Ž(๐‘ก)

ยฏ ๐›ผ๐‘ฃ ๐‘’ยฏ

ยฏ

๐ป(๐‘ก๐‘“) otherwise

(9.36)

๐‘stay=P๐‘๐‘ 

๏ฃฎ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฏ

๏ฃฐ ร™

๐‘กโˆˆ[๐‘ก๐‘ ,๐‘ก๐‘“]

๐‘ฅ(๐‘ก) โˆˆ Csc(๐‘Ÿยฏsc)

๏ฃน

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃบ

๏ฃป

where ๐‘๐‘  = (ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘ ) denotes the states at time ๐‘ก = ๐‘ก๐‘ , which satisfy ล“๐‘  โˆˆ Ciso(๐‘Ÿยฏiso),ล“ห†๐‘  โˆˆ Ciso(๐‘…ยฏiso), ๐‘ฅ๐‘  โˆˆ Csc(๐‘Ÿยฏsc), and๐‘ฅห†๐‘  โˆˆ Csc(๐‘…ยฏsc) for๐‘Ÿยฏiso =๐‘Ÿiso โˆ’๐‘๐‘’ โ‰ฅ 0,

ยฏ

๐‘…iso = ๐‘Ÿiso โˆ’ 2๐‘๐‘’ โ‰ฅ 0, ๐‘Ÿยฏsc = ๐‘Ÿsc โˆ’ ๐‘๐‘’ โ‰ฅ 0, and ๐‘…ยฏiso = ๐‘Ÿsc โˆ’ 2๐‘๐‘’ โ‰ฅ 0, ๐ธ๐‘  = ๐‘ฃ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) +๐›ฟ๐‘โˆฅ๐‘๐‘ โˆ’๐‘๐‘‘(๐‘ก๐‘ ) โˆฅ,๐ป(๐‘ก) =โˆซ๐‘ก

๐‘ก๐‘ 

โ„Ž(๐œ)๐‘‘ ๐œ,๐ปยฏ(๐‘ก) =2๐ป(๐‘ก)๐›ผยฏ/sup๐‘กโˆˆ[๐‘ก๐‘ ,๐‘ก๐‘“]โ„Ž(๐‘ก), and โ„Ž(๐‘ก) = ๐ฟ๐‘˜๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ )/โˆš๏ธ

๐‘šsc(๐‘ก๐‘“). The suitable choices of ๐‘ฃ ,ยฏ ๐›ผ, ๐›ฟยฏ ๐‘ โˆˆ R>0 are to be defined below.

Proof of Lemma 9.3. Let us define a non-negative and continuous function ๐ธ(๐‘ฅ , ๐‘ก) as

๐ธ(๐‘ฅ , ๐‘ก) =๐‘ฃ(๐‘ฅ , ๐‘ก) +๐›ฟ๐‘โˆฅ๐‘โˆ’ ๐‘๐‘‘(๐‘ก) โˆฅ (9.37)

where ๐‘ฅ = [๐‘โŠค,๐‘ยคโŠค]โŠค and ๐›ฟ๐‘ โˆˆ R>0. Note that ๐ธ(๐‘ฅ , ๐‘ก) of (9.37) is 0 only when ๐‘ฅ = ๐‘ฅ๐‘‘(๐‘ก). Since we have ๐’œโˆฅ๐‘ โˆ’ ๐‘๐‘‘(๐‘ก) โˆฅ โ‰ค ๐‘ฃ(๐‘ฅ , ๐‘ก)/โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) โˆ’๐œ†โˆฅ๐‘ โˆ’ ๐‘๐‘‘(๐‘ก) โˆฅ, designing๐›ฟ๐‘to have๐›ผโˆ’๐›ฟ๐‘/โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) >0, the relation (9.35) gives ๐’œ๐ธ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) โ‰ค โˆ’๐›ผ ๐ธยฏ (๐‘ฅ๐‘ , ๐‘ก๐‘ ) + ๐ฟ๐‘˜

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“)

โˆš๏ธโˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2 (9.38) for any๐‘ฅ๐‘ ,๐‘ฅห†๐‘  โˆˆ Csc(๐‘Ÿsc),ล“๐‘ ,ล“ห†๐‘  โˆˆ Ciso(๐‘Ÿiso), and๐‘ก๐‘  โˆˆ [0, ๐‘ก๐‘“], where

ยฏ

๐›ผ=min

๐›ผโˆ’๐›ฟ๐‘/

โˆš๏ธƒ

๐‘šsc(๐‘ก๐‘“), ๐œ†

>0.

Let us define another non-negative and continuous function๐‘Š(๐‘ฅ , ๐‘ก) as ๐‘Š(๐‘ฅ , ๐‘ก) =๐ธ(๐‘ฅ , ๐‘ก)๐‘’๐›พ ๐ป(๐‘ก) + ๐‘’๐›พ ๐ป(๐‘ก๐‘“) โˆ’๐‘’๐›พ ๐ป(๐‘ก)

๐›พ

where ๐›พ is a positive constant that satisfies 2 ยฏ๐›ผ โ‰ฅ ๐›พsup๐‘กโˆˆ[0,๐‘ก๐‘“] โ„Ž(๐‘ก). If ยฏ๐‘ฃ ,๐‘คยฏ โˆˆ R>0

is selected to have๐‘Š(๐‘ฅ , ๐‘ก) < ๐‘คยฏ โ‡’ ๐ธ(๐‘ฅ , ๐‘ก) < ๐‘ฃยฏ โ‡’ ๐‘ฅ โˆˆ Csc(๐‘Ÿยฏsc) = ร

๐‘กโˆˆ[0,๐‘ก๐‘“]{๐‘  โˆˆ R๐‘›| โˆฅ๐‘ โˆ’๐‘ฅ๐‘‘(๐‘ก) โˆฅ < ๐‘Ÿยฏsc} โŠ‚ R๐‘›, which is always possible since ๐ธ(๐‘ฅ , ๐‘ก) of (9.37) is 0 only when๐‘ฅ =๐‘ฅ๐‘‘(๐‘ก), then we can utilize (9.38) to compute๐’œ๐‘Š as follows for any ๐‘ฅ๐‘  = [๐‘โŠค

๐‘ ,๐‘ยคโŠค

๐‘ ]โŠค โˆˆ Csc(๐‘Ÿยฏsc)and๐‘ก๐‘  โˆˆ [0, ๐‘ก๐‘“]that satisfies๐‘Š(๐‘ฅ๐‘ , ๐‘ก๐‘ ) < ๐‘คยฏ:

๐’œ๐‘Š(๐‘ฅ๐‘ , ๐‘ก๐‘ ) = (๐›พ โ„Ž(๐‘ก๐‘ )๐ธ(๐‘ฅ๐‘ , ๐‘ก๐‘ ) +๐’œ๐ธ(๐‘ฅ๐‘ , ๐‘ก๐‘ ))๐‘’๐›พ ๐ป(๐‘ก๐‘ ) โˆ’โ„Ž(๐‘ก๐‘ )๐‘’๐›พ ๐ป(๐‘ก๐‘ ) (9.39)

โ‰ค (๐’ฝ(ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘ ) โˆ’โ„Ž(๐‘ก๐‘ ))๐‘’๐›พ ๐ป(๐‘ก๐‘ ) where๐’ฝ(ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘ ) =๐ฟ๐‘˜

โˆš๏ธƒ

( โˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2)/๐‘šsc(๐‘ก๐‘“)and the inequal- ity follows from the relation ยฏ๐›ผ โ‰ฅ ๐›พsup๐‘กโˆˆ[0,๐‘ก๐‘“] โ„Ž(๐‘ก). Applying Dynkinโ€™s formula [30, p. 10] to (9.39), it can be verified that๐‘Š(๐‘ฅ(๐‘ก), ๐‘ก)is a non-negative supermartingale due to the fact thatE๐‘๐‘  [๐’ฝ(ล“(๐‘ก),ล“(ห† ๐‘ก), ๐‘ฅ(๐‘ก),๐‘ฅห†(๐‘ก))] โ‰ค ๐ฟ๐‘˜๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ )/โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) =โ„Ž(๐‘ก) by definition of ๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ ) in (9.24) of Assumption 9.1, which gives the following due to Villeโ€™s maximal inequality [30, p. 26]:

P๐‘๐‘ 

"

sup

๐‘กโˆˆ[๐‘ก๐‘ ,๐‘ก๐‘“]

๐‘Š(๐‘ฅ(๐‘ก), ๐‘ก) โ‰ฅ๐‘คยฏ

#

โ‰ค ๐ธ๐‘ +๐›พโˆ’1(๐‘’๐›พ ๐ป(๐‘ก๐‘“) โˆ’1)

ยฏ ๐‘ค

(9.40) as long as we have ล“(๐‘ก),ล“ห†(๐‘ก) โˆˆ Ciso(๐‘Ÿiso) and ห†๐‘ฅ(๐‘ก) โˆˆ Csc(๐‘Ÿsc) for โˆ€๐‘ก โˆˆ [๐‘ก๐‘ , ๐‘ก๐‘“], where๐‘๐‘  =(ล“๐‘ ,ล“ห†๐‘ , ๐‘ฅ๐‘ ,๐‘ฅห†๐‘ )is as defined in (9.36). For๐‘๐‘  satisfyingล“๐‘  โˆˆ Ciso(๐‘Ÿยฏiso),

ห†

ล“๐‘  โˆˆ Ciso(๐‘…ยฏiso), ๐‘ฅ๐‘  โˆˆ Csc(๐‘Ÿยฏsc), and ห†๐‘ฅ๐‘  โˆˆ Csc(๐‘…ยฏsc), the occurrence of the events of Assumption 9.1 and the forward invariance condition of (9.26) of Assumption 9.2 ensure that if we have ๐‘ฅ(๐‘ก) โˆˆ Csc(๐‘Ÿยฏsc) for โˆ€๐‘ก โˆˆ [๐‘ก๐‘ , ๐‘ก๐‘“], we get ล“(๐‘ก) โˆˆ Ciso(๐‘Ÿยฏiso),

ห†

ล“(๐‘ก) โˆˆ Ciso(๐‘Ÿiso), and ห†๐‘ฅ(๐‘ก) โˆˆ Csc(๐‘Ÿsc) for โˆ€๐‘ก โˆˆ [๐‘ก๐‘ , ๐‘ก๐‘“]. Therefore, selecting the constants๐›พ and ยฏ๐‘ค in the inequality (9.40) as in [30, pp. 82-83] yields (9.36) due to the relation๐‘Š(๐‘ฅ , ๐‘ก) < ๐‘คยฏ โ‡’ ๐ธ(๐‘ฅ , ๐‘ก) < ๐‘ฃยฏ โ‡’๐‘ฅ โˆˆ Ciso(๐‘Ÿยฏiso).

Now, let us select ๐‘๐‘’ of Assumption 9.1 as ๐‘๐‘’ = ๐‘˜๐‘’E

๐œ๐‘ก๐‘ (๐‘0,0)

, where ๐‘0 is the tuple of the true and estimated ISO and spacecraft state at time ๐‘ก = 0 and ๐‘˜๐‘’ โˆˆ R>0. Using Markovโ€™s inequality along with the bounds (9.23) and (9.24) of Assumption 9.1, we get the following probability bound:

P

hโˆš๏ธโˆฅล“ห†๐‘  โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2 < ๐‘˜๐‘’E

๐œ๐‘ก๐‘ (๐‘0,0)i

โ‰ฅ (1โˆ’๐œ€est)

1โˆ’ 1 ๐‘˜๐‘’

. (9.41) Since we have ห†ล“๐‘  โˆˆ Ciso(๐‘…ยฏiso) and ห†๐‘ฅ๐‘  โˆˆ Csc(๐‘…ยฏsc) by the theorem assumption, the occurrence of the event in (9.41) implies ล“๐‘  โˆˆ Ciso(๐‘Ÿยฏiso) and ๐‘ฅ๐‘  โˆˆ Csc(๐‘Ÿยฏsc) for

ยฏ

๐‘Ÿiso = ๐‘Ÿiso โˆ’ ๐‘˜๐‘’E

๐œ๐‘ก๐‘ (๐‘0,0)

โ‰ฅ 0, ยฏ๐‘Ÿsc = ๐‘Ÿsc โˆ’ ๐‘˜๐‘’E

๐œ๐‘ก๐‘ (๐‘0,0)

โ‰ฅ 0, and then ล“(๐‘ก) โˆˆ Ciso(๐‘Ÿยฏiso), โˆ€๐‘ก โˆˆ [๐‘ก๐‘ , ๐‘ก๐‘“] due to the forward invariance condition (9.26) in Assumption 9.2. Therefore, using ๐œ€err of Assumption 9.1 along with the result of Lemma 9.3, we haveโˆƒ๐œ€ctrl โˆˆRโ‰ฅ0s.t.

P[ A |ล“(ห† ๐‘ก๐‘ ) =ล“ห†๐‘  โˆฉ๐‘ฅห†(๐‘ก๐‘ ) =๐‘ฅห†๐‘ ] โ‰ฅ 1โˆ’๐œ€ctrl = (1โˆ’๐œ€exit) (1โˆ’๐œ€est) (1โˆ’๐œ€err)

1โˆ’ 1 ๐‘˜๐‘’

(9.42) whereA =ร‘

๐‘กโˆˆ[๐‘ก๐‘ ,๐‘ก๐‘“]ล“(๐‘ก) โˆˆ Ciso(๐‘Ÿยฏiso) โˆฉล“(ห† ๐‘ก) โˆˆ Ciso(๐‘Ÿiso) โˆฉ๐‘ฅ(๐‘ก) โˆˆ Csc(๐‘Ÿยฏsc) โˆฉ๐‘ฅห†(๐‘ก) โˆˆ Csc(๐‘Ÿsc) โˆฉ E withE being the event of (9.23) in Assumption 9.1. The desired rela- tion (9.27) follows by evaluating the first term of the integral in (9.34), and by ob- serving that ifAof (9.42) occurs, we have๐‘ก๐‘’ > ๐‘ก๐‘“ andโˆš๏ธ

โˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2<

๐‘˜๐‘’E

๐œ๐‘ก๐‘ (๐‘0,0)

due to (9.41).

Note that if Eiso = Esc = R๐‘› in Assumption 9.1 and ๐‘˜ is globally Lipschitz in Assumption 9.2, the estimation bound (9.24) and the Lipschitz bound (9.25) always hold without the second condition of Assumption 9.1. This indicates that the bound (9.27) holds as long as (ล“๐‘ , ๐‘ฅ๐‘ ) โˆˆ Dest for given ห†ล“๐‘ ,๐‘ฅห†๐‘  โˆˆ R๐‘›, which occurs with probability at least 1โˆ’๐‘˜โˆ’1

๐‘’ due to (9.41).

As derived in Theorem 9.1, the SN-DNN min-norm control policy (9.17) of Lemma 9.2 enhances the SN-DNN terminal guidance policy of Lemma 9.1 by providing the explicit spacecraft delivery error bound (9.27), which holds even un- der the presence of the state uncertainty. This bound is valuable in modulating the learning and control parameters to achieve a verifiable performance guaran- tee consistent with a mission-specific performance requirement as to be seen in Sec. 9.4.II.

As illustrated in Fig. 9.6, the expected state tracking error bound (9.27) of Theo- rem 9.1 decreases exponentially in time if ๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ ) is a non-increasing function

: S/C : ISO

๐‘ฅ(๐‘ก๐‘ ) ล“(๐‘ก๐‘ )

: expected tracking error bound

๐œŒ

เทœ

๐‘ฅ ๐‘ก , ล“เท ๐‘ก

เทœ

๐‘ฅ ๐‘ก , ล“เท ๐‘ก

เทœ

๐‘ฅ ๐‘ก , ล“เท ๐‘ก

Figure 9.6: Illustration of the expected position tracking error bound (9.27) of Theorem 9.1, where๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ ) is assumed to be a non-increasing function in time.

in ๐‘ก. The following example demonstrates how we compute the bound (9.27) in practice.

Example 9.1. Suppose that the estimation error is upper-bounded by a function that exponentially decreases in time, i.e., we have

๐œ๐‘ก(๐‘๐‘ , ๐‘ก๐‘ ) =๐‘’โˆ’๐›ฝ(๐‘กโˆ’๐‘ก๐‘ )

โˆš๏ธโˆฅล“ห†๐‘ โˆ’ล“๐‘ โˆฅ2+ โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘ โˆฅ2+๐‘

for (9.24)in Assumption 9.1, where ๐‘ โˆˆ Rโ‰ฅ0 and ๐›ฝ โˆˆ R>0. Assuming that ๐›ผ โ‰  ๐›ฝ, ๐›ฝโ‰  ๐œ†, and๐œ†โ‰ ๐›ผfor simplicity, we get

RHS of(9.27) โ‰ค ๐‘’โˆ’๐œ†๐‘กยฏ๐‘“( โˆฅ๐‘ห†๐‘ โˆ’ ๐‘๐‘‘(๐‘ก๐‘ ) โˆฅ +๐‘๐‘’) + ๐‘’โˆ’๐œ†๐‘กยฏ๐‘“ โˆ’๐‘’โˆ’๐›ผ๐‘กยฏ๐‘“

๐›ผโˆ’๐œ†

(๐œ†ยฏ+1) ( โˆฅ๐‘ฅห†๐‘ โˆ’๐‘ฅ๐‘‘(๐‘ก๐‘ ) โˆฅ +๐‘๐‘’)

โˆš๏ธ

๐‘šsc(๐‘ก๐‘“) + ๐ฟ๐‘˜

๐‘šsc(๐‘ก๐‘“)

( ๐‘๐‘’ ๐›ผโˆ’ ๐›ฝ

๐‘’โˆ’๐œ†ยฏ๐‘ก๐‘“ โˆ’๐‘’โˆ’๐›ฝยฏ๐‘ก๐‘“ ๐›ฝโˆ’๐œ†

โˆ’ ๐‘’โˆ’๐œ†ยฏ๐‘ก๐‘“ โˆ’๐‘’โˆ’๐›ผยฏ๐‘ก๐‘“ ๐›ผโˆ’๐œ†

+ ๐‘ ๐›ผ

1โˆ’๐‘’โˆ’๐œ†๐‘กยฏ๐‘“ ๐œ†

โˆ’ ๐‘’โˆ’๐œ†๐‘กยฏ๐‘“ โˆ’๐‘’โˆ’๐›ผ๐‘กยฏ๐‘“ ๐›ผโˆ’๐œ†

(9.43) where๐‘กยฏ๐‘“ =๐‘ก๐‘“ โˆ’๐‘ก๐‘ . Note that the bound(9.43)can be computed explicitly for given

ห†

๐‘ฅ๐‘  andล“ห†๐‘  at time๐‘ก =๐‘ก๐‘  as long asล“ห†๐‘  โˆˆ Ciso(๐‘…ยฏiso) and๐‘ฅห†๐‘  โˆˆ Csc(๐‘…ยฏsc). Its dominant term in(9.43)for large๐‘กยฏ๐‘“ is๐ฟ๐‘˜๐‘/(๐‘šsc(๐‘ก๐‘“)๐›ผ๐œ†)due to the following relation:

lim

๐‘ก๐‘“โ†’โˆžRHS of(9.43)= ๐ฟ๐‘˜๐‘ ๐‘šsc(๐‘ก๐‘“)๐›ผ๐œ†

.

The bound 9.27 of Theorem 9.1 can be used as a tool to determine whether we should use the SN-DNN terminal guidance policy of Lemma 9.1 or the SN-DNN min-norm control policy (9.17) of Lemma 9.2, based on the trade-off between them as to be seen in the following.

๐‘ขโˆ— ๐‘ขโ„“

๐‘ข

ล“(๐‘ก)

On-board Navigation

LAUNCH

2. CRUISE G&C 3. โ€œONLINEโ€ TERMINAL G&C โ€“ NEURAL-RENDEZVOUS : S/C

: ISO

๐‘ฆ ๐‘ก ๐‘ฅ ๐‘ก

SN-DNN Guidance ๐‘ขโ„“

On-board Navigation ๐œŒ SN-DNN

Min-norm Control ๐‘ขโˆ—

เทœ

๐‘ฅ ๐‘ก เทล“ ๐‘ก ๐‘ฅ ๐‘กเทœ , ล“เท ๐‘ก ๐‘ฅ ๐‘กเทœ , ล“เท ๐‘ก

๐‘ขโˆ—

Ground-based Navigation

Spectrally-Normalized Deep Neural Network

(SN-DNN) { าง๐‘ฅ๐‘–}๐‘–=1๐‘๐‘‘

{ เดฅล“๐‘–}๐‘–=1๐‘๐‘‘ { าง๐‘ก๐‘–}๐‘–=1๐‘๐‘‘ {๐›ฅ าง๐‘ก๐‘–}๐‘–=1๐‘๐‘‘ { เดค๐œŒ๐‘–}๐‘–=1๐‘๐‘‘

Nonlinear Optimization- based Motion Planner

(MPC)

{๐‘ขmpc( าง๐‘ฅ๐‘–, เดฅล“๐‘–, าง๐‘ก๐‘–)}๐‘–=1๐‘๐‘‘ {๐œ‘mpcาง๐‘ก๐‘–+๐›ฅ าง๐‘ก๐‘–( าง๐‘ฅ๐‘–, เดฅล“๐‘–, าง๐‘ก๐‘–)}๐‘–=1๐‘๐‘‘

{๐‘ขโ„“( าง๐‘ฅ๐‘–, เดฅล“๐‘–, าง๐‘ก๐‘–)}๐‘–=1๐‘๐‘‘ target control input learned control input target state trajectory learned state trajectory

1. โ€œOFFLINEโ€ SN-DNN TRAINING FOR FASTER ONLINE COMPUTATION

{๐œ‘โ„“าง๐‘ก๐‘–+๐›ฅ าง๐‘ก๐‘–( าง๐‘ฅ๐‘–, เดฅล“๐‘–, าง๐‘ก๐‘–)}๐‘–=1๐‘๐‘‘ SN-DNN Loss Functions

MISSION DESIGN PHASE Input data for SN-DNN training Target data and SN-DNN outputs

Figure 9.7: Mission timeline for encountering an ISO, where ๐‘ฆ(๐‘ก) is the state measurement as in Fig. 9.2 and the other notation follows that of Lemma 9.1 and Theorem 9.1. Terminal G&C (Neural-Rendezvous) is performed using the SN- DNN guidance and min-norm control policies online as in Algorithm 1, thereby enabling fast response autonomous operations under the large ISO state uncertainty and high-velocity challenges. Note that the SN-DNN is trained offline.

9.4.II Neural-Rendezvous

We summarize the pros and cons of the aforementioned terminal G&C techniques.

โ€ข The SN-DNN terminal guidance policy of Lemma 9.1 can be implemented in real-time and possesses an optimality gap as in (9.11), resulting in the near-

Algorithm 1:Neural-Rendezvous Inputs :๐‘ขโ„“of Lemma 9.1 and๐‘ขโˆ—

โ„“ of Lemma 9.2 Outputs:Control input๐‘ขof (9.1) for๐‘ก โˆˆ [0, ๐‘ก๐‘“] ๐‘ก0โ†current time

ฮ”๐‘กโ†control time interval ๐‘ก๐‘˜, ๐‘ก๐‘‘, ๐‘กintโ†current timeโˆ’๐‘ก0 flagA, flagBโ†0

while๐‘ก๐‘˜ < ๐‘ก๐‘“ do

Obtain ห†ล“(๐‘ก๐‘˜)and ห†๐‘ฅ(๐‘ก๐‘˜)using navigation technique ifflagA=0then

๐‘ขโ†๐‘ขโ„“(๐‘ฅห†(๐‘ก๐‘˜),ล“(ห† ๐‘ก๐‘˜), ๐‘ก๐‘˜) else

ifflagB=1then ๐‘ขโ†๐‘ขโˆ—

โ„“(๐‘ฅห†(๐‘ก๐‘˜),ล“(ห† ๐‘ก๐‘˜), ๐‘ก๐‘˜) else

ComputeRHS of(9.27) in Theorem 9.1 with๐‘ก๐‘ =๐‘ก๐‘˜ ifRHS of(9.27)> ๐‘ก โ„Ž๐‘Ÿ ๐‘’ ๐‘  โ„Ž๐‘œ๐‘™ ๐‘‘then

๐‘ขโ†๐‘ขโ„“(๐‘ฅห†(๐‘ก๐‘˜),ล“(ห† ๐‘ก๐‘˜), ๐‘ก๐‘˜) else

๐‘ขโ†๐‘ขโˆ—

โ„“(๐‘ฅห†(๐‘ก๐‘˜),ล“(ห† ๐‘ก๐‘˜), ๐‘ก๐‘˜) flagBโ†1

Apply๐‘ขto (9.1) for๐‘กโˆˆ [๐‘ก๐‘˜, ๐‘ก๐‘˜+ฮ”๐‘ก] whilecurrent timeโˆ’๐‘ก0 < ๐‘ก๐‘˜+ฮ”๐‘กdo

ifflagB=0then

Integrate dynamics to get๐‘ฅ๐‘‘andล“๐‘‘of (9.14) form๐‘กint ifintegration is completethen

Update๐‘ฅ๐‘‘andล“๐‘‘

๐‘ก๐‘‘, ๐‘กintโ†๐‘ก๐‘˜+ฮ”๐‘ก flagAโ†1 else

๐‘กintโ†time when S/C stopped integration ๐‘ก๐‘˜ โ†๐‘ก๐‘˜+ฮ”๐‘ก

optimal guarantee in terms of dynamic regret as discussed below (9.5) and in Lemma 9.1. However, obtaining any quantitative bound on the spacecraft delivery error with respect to the desired relative position, to either flyby or impact the ISO, is difficult in general [7].

โ€ข In contrast, the SN-DNN min-norm control policy (9.17) of Lemma 9.2, which solves the problem 9.1 of Sec. 9.1, can also be implemented in real-time and provides an explicit upper-bound on the spacecraft delivery error that decreases in time as proven in Theorem (9.1). However, the desired trajectory it tracks is subject to the large state uncertainty initially for small๐‘ก๐‘ , resulting in a large optimality gap.

Based on these observations, it is ideal to utilize the SN-DNN terminal guidance policy without any feedback control initially for large state uncertainty, to avoid having a large optimality gap, then activate the SN-DNN min-norm control once the verifiable spacecraft delivery error of (9.27) of Theorem (9.1) becomes smaller than a mission-specific threshold value for the desired trajectory (9.14), which is to be updated at๐‘ก๐‘‘ โˆˆ [0, ๐‘ก๐‘“] along the way as discussed in Remark 9.6. The pseudo- code for the proposed learning-based approach for encountering the ISO is given in Algorithm 1, and its mission timeline is shown in Fig. 9.7.

Again, the role of offline learning in Neural-Rendesvous, with the size of the neural network selected to be small enough for real-time implementation, is to deal with the limited computational resources of the spacecraft in solving any nonlinear opti- mization with non-negligible online computational load, including the one of MPC with receding horizons. This also enables utilizing Neural-Rendezvous in more challenging and highly nonlinear G&C scenarios as to be seen in Sec. 10.2, where solving even a single optimization online could become unrealistic.

Remark 9.7. We use ๐‘กint in Algorithm 1 to account for the fact that computing๐‘ฅ๐‘‘ could take more than ฮ”๐‘ก for small ฮ”๐‘ก and ๐‘ก๐‘˜ as pointed out in Remark 9.6, and thus the spacecraft is sometimes required to compute it over multiple time steps.

It can be seen that the SN-DNN terminal guidance policy ๐‘ขโ„“ is also useful when the spacecraft does not possess ๐‘ฅ๐‘‘ yet, i.e., when flagA = 0. Also, the impact of discretization introduced in Algorithm 1 will be demonstrated in Sec. 10.1.III, where the detailed discussion of its connection to continuous-time stochastic systems can be found in [32].