Untitled

(1)

INFORMATION

Volume 19, Number l

,

pp.17‑23

ISSN 1343‑4500

ｩ2016

International Information Institute

P o s i t i o n a l Variance of the Hand based on Linear Quadratic Regulator i n

Two‑Link Upper‑Arm Reaching Movements

Yoshiaki Taniai*1, Tomohide Naniwa1, Munadi2

1. Graduate School of Engineering, University of Fukui, Fukui 910‑8507, Japan, 2. Department of Mechanical Engineering, Diponegoro University, Semarang 50275,

Indonesia

*E‑mail: ta凶[email protected].・s.u-fuk凶.ac.jp

Abstract

In upper‑arm reaching movements, positional variances of the optiｭ mal trajectories based on optimal feedback control have been reported to exhibit a peak around the middie of the movement duration in a oneｭ link arm model and in a mass point model. However, the variance in a two‑link arm model has not been examined. In thi司 study, we show that the positional variance of the hand of a twcトlinkarm based on a l凶e紅

quadratic regulator exhibits a peak around the middle of the movement duration and this property agrees with that of the measured trajectories of humans

Keywords : two-li叫王 upper-armreaching movement, optimal feedback control, linear quadratic regulator, positional variance

1 Introduction

In upper‑arm reaching movements of humans, the trajectories of the hand are slight curves 叩d the註 speed profiles are bell司shaped curves [l]. F\凶hermore,

the positional variances of the hand exhibit a peak around the middle of the

mo問mentduration in iterative one‑link upper‑arm reaching movements [2

]

and in iterative twかlinkupper‑arm reaching mo刊ments[3][4].

Todorov and Li (2005) proposed the iterative linear quadratic Gaussian method (iLQG) as an optimal feedback control for upper‑arm reaching _mo刊－

ments [5]. The results show that the optimal trajectories of the hand are slight curves and that 出eoptimal speed profiles are bell‑shaped curves. Furthermore, the optimal trajectories based on optimal feedback control have been reported to exhibit a reduction of the positional variance around the middle of the moveｭ ment duration in a m部spoint model [4] or an inertial point model [6]. However, the characteristic of the positional variance of the hand in a _two-li叫Earm model has not been examined. In this study, we examined the positional variance of the hand based on a linear quadratic regulator (LQR) in the twiかlinkupper‑arm

r巴aching movements.

ウt

1i

(2)

YOSHIAKI T ANIAI, TOMOHIDE NANIW A AND MUN ADI

y

Otargd(O. 0.55)

:l'

Figure 1: Task of a twcトlink upper‑arm reaching movement in the horizontal plane.

2 乱1ethod

We _obtain巴dthe optimal control signals by using an LQR for a nonlinear muscuｭ loskeletal system after its linearization and then computed the trajectories from the optimal control signals and signal‑dependent noise. Finally, the positional variance of the hand was compuもed from the trajectories.

2 . 1 Task

Figure 1 shows the task of an upper‑arm reaching movement. The start position Ps

=

(xs, Ys) is (0, 0.3) [m), and the target position Pe

=

(xe, Ye) is (0, 0.55) [m]. The movement duration is 0.35 s.

2 . 2 M u s c u l o s k e l e t a l system

The musculoskeletal system is a twcトlink six‑muscle arm model. The dynamics of the system g(x, u) are as follows [7]:

where the state vector x denotes [q, q,

f j T ,

u = [u1, u2, ｷｷｷ , u6JT represents the vector of control

_sign叫s,

q

=

[81, 82jT indicates the joint angles,

f =

[f1,

h , ｷｷｷ ,

f5]T refers to the muscle forces, r

=

[T1, T2]T denotes the joint torques that are related to the muscle forces, the matrix of time constants T is diag[T1,T2 ，・・・， T5],and fc = lJc1, fc2 ，・·• , f c6JT represents the intrinsic muscle forces.

M(q)

[ I ＋叫 2 1 '

(2)

ら＋M2L1S2 cos 82 12

I

「山（201 引いin82] _I _.

(3) M2L1S281 sin 82

I

C(q,q)

円円U

1ょ

(3)

POSITIONAL VARIANCE OF THE HAND BASED ON LINEAR QUADRATIC REGULATOR

Table 1: Parameters of thβtwo-link arm model Parameter Link 1 Link 2

Mi [kg] 1.59 1.44 Li [m] 0.30 0.35 Ii [kgm2] 0.18 0.21 Si [cm] 6.78 7.99

Thepararr叫ersof the two‑link arm model are given in Table 1 [8]. The dynamics of the muscle forces have the low‑pass filter property in th叫 the muscle force changes smoothly [9], and the time constant T; is set 回 0.2 s.

The intrinsic muscle forces

f

^care determined as follows:

fc(u)

=u+Di+K(l 一九）

(4) where l

=

[li, ^l²^, ^,^l⁶^J^T^denotes the vector of muscle lengths, le represents the vector of muscle natural leng七hs, the matrix of viscosity coefficients D is diag[d1, _{d2γ ・・} , d6], and 七hematrix of the elasticity _{coe伍cien七s}K is diag[k1, k2, ｷｷｷ , k6]ｷ The

coefficier山 ki

and di are set as 1.0ﾗ103

_N/ _{m 阻d}

2.5ﾗ103 Ns/m, re‑

spectively. The muscle length vector l is described 出

l = Gq

+

l。（ 5)

＝［一α1α1

⁰

0 ーα3

³¹

I

^T

̲

(6) 0 0 －α2α2 一α4α41

where G represents the matrix of the moment arms，向 denotesthe moment arm (Table 2), and lo indicates the muscle length for q

=

0. Then, the following equation is derived 仕omEqs. (4) and (5):

fc(u) =u+DGq+KG(q ーも） (7) where qe denotes the non‑singular angle for the natural length le. The nonｭ singular angle qe is assumed to be the joint angle for the target position Peｷ On the other hand, the joint torque T is described 出

T

=

^‑GTf. (8) Then, the alternative dynamics of the musculoskeletal system are derived by substituting Eqs. (4) and (8) into Eq. (1).

,J I q I I

. i J

I

ニ liJ.I

=g(x,it)=

I

^M‑1(q)(^‑^C(q,q)^‑GTf)

I

(9) dt

l J t l r ‑ 1 { f+u+DGq+KG(q ーもnJ

2 . 3 LQR c o n t r o l l e r

We obtained the control signals by using an LQR [7]. The linearized model of the musculoskeletal system around a non‑singular point (x＊，ザ） is described 槌

-

U

B +

-

一一 A Z

m

-

d

一必

(10)

ハ可U1i

(4)

YOSHIAKI TANIAI. TOMOHIDE NANIWA AND MUNADI

Table 2: Moment arms of muscles. Parameter

Moment arm [cm]

α1

4.0

α2

2.5

α3

3.5

α4

3.0

where

x

= x ‑_x_ぺ _U=U-Uぺ

u z

Qu一znOヌσ一一A U 2 ⁹一UAυ一円。一一B ) －Eム^可^Eム(

The non‑singular point is _{set 剖（}_［qe, 01xsJI', 0). The evaluation function J is

J=xT悦戸（t1)+ fot1 内定＋内同

⁽¹²⁾

where t f denotes the mo刊mentduration, Q f = diag(2α × 107

,

2α

×

107, 5

ﾗ

106, 5ﾗ106, 10, 10, 10, 10, 10, 10), Q = 0 , and R = (3

ﾗ

diag(l, 1, 1, 1, 1, _{1 ）}_， _αis set as 0.1, 0.25, and 1.0, and f3 is set 出 1.0, 5.0, and 10.0. Then, the optimal control signal u takes the following form:

u

=

‑ R‑1 BT P(t)x. (13) The matrix _{P （も）} is obtained 仕om the following Ricatti differential equation running backward in time from t = t 1:

P(t)

=

‑AT P(t) ‑P(t)A

+

^P(t)BR¹^BTP(t) ‑Q, (14) P(t1)

=

Q1. (15)

2.4 Positional variance

The signal‑dependent noise w(t) _{w出 added}to the optimal control signal u(t). The applied noise _{w出 the}Gaussian noise, in which the mean _{w回 zero} and the variance _W出_γl_uil2, where _γ_{W出 set}as 1.0.

The positional variance of the hand in the horizontal plane w回 estimated

using a Monte Carlo simulation with 1000 repetitions. The variance σ；11 W槌 defined as the sum of the varianceσ2 _$勺and the variance σ；， in accordance with [3].

The state vector of the musculoskeletal system and the matrix P of the Ricatti differential equation were upda七ed by using the fourth‑order Rungeｭ Kutta method. The 凶tial values of the state vector x(O) were set 酪［q(O),

q(O), f(O)jI', where the inverse kinematics _{W出 used}for determini時 theinitial angles q(O) for the start position p8, q(O)=[O, OJI', and

f

(0) = [O, 0, 0, 0, 0, O]T. The time step was set as 1 ms.

3 Result

Figure 2 (a) shows the optimal trajectory of the hand position based on the LQR. The optimal trajectory of the hand position exhibits an almost straight line. Figure 2 (b) shows the hand speed profile. The speed profile _{exh出回 a}

ハUつ中

(5)

0.2 0.3 0 0.1 0.2 time [s] time [s]

( c) ( d)

Figure 2：・ Optimal trajectory based on the LQR. (a) The hand trajectory, (b) 0.6

0.4 E

>. 0.2

A斗－

hu

今ムハり

円XA rlιAりm ] (

今ノ匂

n u

AU寸0

サ

r「 x10 6

E 8 6

<

:::

司

話 4>

司

ｧ

2

-

ー事 o

同 O 0.1

主 1.5

.f'

.にA2 l

<U

>

.3 0.5

ロ<U 0.0

<

:::

国 0.1 0.2 time [s] (b)

0.3

5 _，、^{，，．‘・}

，、

,

，、

, ,

AU

［

g

守山］

UE『』o－

、

、，

. . ‑

ζJ

0.3

もhehand speed profile, ( c) the positional variances of the hand (The solid line, the dashed line, and the dotted line show the variances in the li.orizontal plane, the x direction, and they direction, respectively), (d) the angular torques (The solid line and the dashed line show the torques of the shoulder and of the elbow, respectively) .

N~ ^X10‑‑tj E 戸一－

8

6

時同

話 4

>

「 X10‑‑6 E F一一一

]

8 6

"

'

^司

旨 4

>

ｧ

2f

I 、、；＇ j J :

²

~ O' ~ 『←一言。

色 0 0.1 0.2 0.3 0.. 0 0.1 0.2 0.3 time [s] time [s]

(a) (b)

Figure 3: Positional variances of the hand in the horizontal plane. (a) The variances for α ＝ 0.1, 0.25, 1.0 and

f 3 =

1.0 ar官 representedby a dotted line, a dashed line, and a solid line, respectively. (b) The varianc四 for α ＝ 1.0 and

f 3

=

1.0, 5.0, 10.0 are represented by a solid line, a dashed line, and a dotted line, respectively.

司1ム円ノω

(6)

YOSHIAKI T ANIAI. TOMOHIDE NANIW A AND MUN ADI

bell句shaped curve with a peak around the middle of the movement duration. Figure 2 ( c) shows the positional variances of the hand. The variance in the horizontal plane exhibits a peak around the middle of the movement duration. Figure 2 ( d) shows the angular torques, which change smoothly.

Figure 3 shows the positional variances of the hand in the horizontal plane. The weightsαand βdenote the weights of the end‑position consもraintand of the energy cost, respectively. When αincreases, the positional variance at the end of the reaching movement decreases because the positional accuracy required by the reaching movement increases. On the other hand, when fJ is large (i.e., the rate of the energy cost in the cost function is large), the pos出onalvariance increases because the control signal is suppressed.

4 Con c l us i o n s

In this study, we examined the positional variance of the hand on the basis of the LQR in the twかlink upper‑arm reaching movement. The result revealed that the positional variance exhibited a peak around the middle of the movement duration. Therefore, optimal trajectories based on the optimal feedback can show the characteristics of a human reaching movement, such as the trajectory, the speed profile, and the positional variance.

It has _b巴enreported that the positional variance of the hand decreases with a decrease in the target size in a one‑link reaching movement of a human [2]. We examined the positional variances for the weights of the cost function. The

resul七 reveal巴dthat the optimal trajectory based on the LQR can represent the positional variance for the target size. The central nervous system might select the weights for the target size.

We adapted the LQR for the nonlinear twか link musculoskeletal system, and the optimal control signal w出 corrupted by the signal‑dependent noise. Although the iLQG was used in the previous studies [4][5), we used the LQR to easily examine the characteristics of the LQR and could _repre田ntthe _char乱cter

istics of the human reaching movement. In the future, we might have to ex乱mine

the characteristics of the optimal trajectoηby minimizing the expected value of the cost function in the twcトlinkarm model.

References

[1] Morasso P., Spatial control of arm movements, Experimental Brain Rか

search, 42(1981), 223‑227.

[2] Osu R., Kamimura N., Iwasaki H., Nakano E., Harris C. M., Wada Y. and Kawato M., Optimal impedance control for task achievement in the presence of signal‑dependent no問， Journalof Neurophysiology, 92(2004)>

1199‑1215.

[3] Morishige K., Osu R., Miyamoto H. and Kawato M., The sources of variｭ ability in the time course of reaching movements, in Brain‑Inspired IT II, International Congress Series 1291, edited by Ishii K., Natsume K., Hanazawa A., Elsevier, 105‑108, 2006.

‑22‑

(7)

[4] Liu D. and Todorov E., Evidence for the flexible sensorimotor _s七rategies predicted by optimal feedback control, Journal of Neuroscience, 27(2007), 9354‑9368.

[5] Todorov E. and Li W., A generalized iterative LQG method for locallyｭ optimal feedback control of constrained nonlinear stochastic systems, Pr，か

ceedings of American Control Conference, 1(2005), 300‑306.

[6] Rigoux L. and Guigon E., A model of reward－姐d effort‑based optimal decision maki時 andmotor control, PLoS Computational Biology, 8(2012), el002716.

[7] Tomi N., Gouko M. and Ito K., Combined mechanisms of internal model control and impedance control under force fields, in Artificial Neural Netｭ works ‑ICANN2009, Lecture Notes in Computer Science 5768, edited by Alippi C., Polycarpou M., Panayiotou C., and Ellinas G., Springer Berlin Heidelberg, 628‑637, 2009.

[8] Kambara H., Kim K., Shin D., Sato M. and Koike Y., Learning and generaｭ tion of goal-directed 紅m reachi時仕omscratch, Neural Networks, 22(2009), 348‑361.

[9] Mannard A. and Stein R. B., Determination of the frequency response of isometric solues muscle in the cat using random nerve stimulation, Journal of Physiology, 229(1973), 275‑296.

円ぺU円／U】