
Models of Motivation for Particle Swarm Optimization with Application to Task Allocation in Multi-Agent Systems

Author: Hardhienata, Medria
Publication Date: 2015
DOI: https://doi.org/10.26190/unsworks/18374
License: https://creativecommons.org/licenses/by-nc-nd/3.0/au/
Link to license to see what you are allowed to do with this resource.

Downloaded from http://hdl.handle.net/1959.4/54884 in https://unsworks.unsw.edu.au on 2025-02-19


Models of Motivation for Particle Swarm Optimization with Application to Task Allocation in Multi-Agent Systems

Medria Kusuma Dewi Hardhienata

A thesis submitted in fulfilment of the requirements of the degree of

Doctor of Philosophy

University of New South Wales

August, 2014


Copyright Statement

I hereby grant the University of New South Wales or its agents the right to archive and to make available my thesis or dissertation in whole or part in the University libraries in all forms of media, now or hereafter known, subject to the provisions of the Copyright Act 1968. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertation. I also authorise University Microfilms to use the 350 word abstract of my thesis in Dissertation Abstracts International (this is applicable to doctoral theses only). I have either used no substantial portions of copyright material in my thesis or I have obtained permission to use copyright material; where permission has not been granted I have applied/will apply for a partial restriction of the digital copy of my thesis or dissertation.

Signed:

Medria Kusuma Dewi Hardhienata Date: 09-09-2015


Authenticity Statement

I certify that the Library deposit digital copy is a direct equivalent of the final officially approved version of my thesis. No emendation of content has occurred and if there are any minor variations in formatting, they are the result of the conversion to digital format.

Signed:

Medria Kusuma Dewi Hardhienata Date: 09-09-2015


Originality Statement

I hereby declare that this submission is my own work and to the best of my knowledge it contains no materials previously published or written by another person, or substantial proportions of material which have been accepted for the award of any other degree or diploma at UNSW or any other educational institution, except where due acknowledgment is made in the thesis. Any contribution made to the research by others, with whom I have worked at UNSW or elsewhere, is explicitly acknowledged in the thesis. I also declare that the intellectual content of this thesis is the product of my own work, except to the extent that assistance from others in the project's design and conception or in style, presentation and linguistic expression is acknowledged.

Signed:

Medria Kusuma Dewi Hardhienata Date: 09-09-2015


Acknowledgements

I still remember the days when my mother bought my brother and me books about the universe and science when we were young. Inspired by the scientific innovations of that time, I dreamt of becoming a researcher. Hence, in this thesis I would like to acknowledge the people who have helped me to make my dream come true.

First and foremost, I would like to thank my supervisor, Dr. Kathryn Merrick, for her guidance, support, and patience throughout my study at UNSW Canberra.

I would also like to thank Kathryn for her continuous encouragement.

I would like to express my sincere gratitude to my co-supervisor, A/Prof. Valeri Ougrinovski, whose guidance has helped me to enhance my research skills. His valuable feedback not only taught me about academic matters but also served as life lessons.

I deeply appreciate the assistance from the Research Student Unit staff, Elvira Berra, regarding enrolment, thesis, scholarship, and visa issues. I would also like to thank the staff at the UNSW Canberra ALL Unit, Debbie, Maya, Anne, Maria, and Fiona, for their help in improving my writing skills.

My sincere gratitude goes to the people I met at the IEEE conferences in Brisbane and Florida. I would like to thank Prof. Andries P. Engelbrecht and A/Prof. Xiaodong Li for their valuable research input. Thank you also to Prof. Russell Eberhart, one of the founders of Particle Swarm Optimization, whose talk at SSCI 2014 about human swarms became my inspiration.

Finishing this PhD would not have been possible without the support of my friends. Thank you to Irman, Sheila, Erandi, Suranjith, Adnan, Umran, Theam, Shir Li, Mai, Val, Amin, Shen, Lucy, Bin Zhang, Bing Wang, Rakib, and to all my friends who have supported me.

I dedicate this thesis to my family, to whom I am indebted. I appreciate my husband, M. Saad Nurul Ishlah, for his support, company, and patience during my road from failures to success. I would like to specially mention my newborn son, Dzaky Andrenova Ishlah, who brought countless happiness and joy during the last period of my study. I also want to thank my parents, Sri Setyaningsih and Soewarto Hardhienata, and my brother, Hendradi Hardhienata, for their constant support and encouragement.

Last but not least, I acknowledge the PhD scholarship received from the UNSW Canberra research scheme, which provided me with financial assistance.

Above all, I would like to thank God for His wonderful creation of the universe and for His blessings to accomplish this endeavour.


Abstract

Computational models of motivation have been explored as a means for artificial agents to identify, prioritize and select the goals they will pursue autonomously.

This thesis aims to extend current research on models of motivation by investigating the role of motivation in solving optimization problems, with application to task allocation as an example. To achieve this aim, a novel approach that incorporates computational models of motivation into Particle Swarm Optimization (PSO) is proposed. Each particle acts as a self-motivated agent. Three models of motivation, namely achievement, affiliation, and power motivation, are explored. The use of models of motivation enables PSO agents to select the optima that they will pursue autonomously, which has led us to introduce a new class of PSO algorithms, the Motivated Particle Swarm Optimization (MPSO) algorithms. The proposed MPSO algorithms are tested on a range of task allocation problems with different initialization points, varying numbers of tasks, and diverse parameter settings. To evaluate the effectiveness of the algorithms, new behavioral and performance metrics are introduced. Furthermore, this study provides analysis of parameter selection for the new algorithms and evaluates the effectiveness of swarms with different compositions of motivated agents.

Simulation results confirm that agents with different motive profiles exhibit different behavioral characteristics, and that these characteristics have a positive impact on performance. The proposed approach is also shown to improve the performance of existing PSO approaches without motivation, particularly when there is only a small number of agents and when the agents are initialized from a single point, which is the case in many realistic situations.

This thesis makes contributions in three research areas. The first contribution is within the field of motivated learning, as it extends the use of computational models of motivation to the optimization domain. The second contribution is in the area of PSO, where a new approach that incorporates motivation into PSO settings is introduced to enhance the performance of existing PSO approaches. This thesis also contributes to the domain of task allocation by introducing a new decision-making mechanism that permits agents to select a task according to their own motivations.


List of Publications

1. M. K. Hardhienata, K. E. Merrick, and V. Ugrinovskii. Task allocation in multi-agent systems using models of motivation and leadership. In IEEE Congress on Evolutionary Computation (CEC-2012), Jun. 2012, pp. 1-8.

2. M. K. Hardhienata, V. Ugrinovskii, and K. E. Merrick. Task allocation under communication constraints using motivated particle swarm optimization. In IEEE Congress on Evolutionary Computation (CEC-2014), Jul. 2014, pp. 3135-3142.

3. M. K. Hardhienata, K. E. Merrick, and V. Ugrinovskii. Effective motive profiles and swarm compositions for motivated particle swarm optimisation applied to task allocation. In IEEE Symposium on Computational Intelligence for Human-like Intelligence (CIHLI-2014), Dec. 2014, pp. 1-8.


Contents

Copyright Statement i

Authenticity Statement iii

Originality Statement v

Acknowledgements vii

Abstract ix

List of Publications xi

1 Introduction 1

1.1 Research Objectives . . . 5

1.2 Research Contributions . . . 6

1.3 Methodology . . . 9

1.4 Thesis Overview . . . 10

2 Background and Related Work 13
2.1 Models of Motivation . . . 14

2.1.1 Models of Motivation from a Psychological Perspective . . . . 14

2.1.2 Models of Motivation in Artificial Intelligence Systems . . . . 15

2.2 Models of Leadership for Artificial Intelligence Agents . . . 28

2.2.1 General Models of Leadership . . . 28

2.2.2 Leadership via Motivation . . . 29

2.3 Function Optimization . . . 31

2.4 Particle Swarm Optimization . . . 34

2.4.1 The Basic Particle Swarm Optimization Algorithm . . . 34

2.4.2 Communication Topologies . . . 36

2.4.3 Performance Metrics for Particle Swarm Optimization . . . 38

2.4.4 Particle Swarm Optimization Variants . . . 40

2.4.5 Applications of Particle Swarm Optimization . . . 45

2.5 Task Allocation Problem . . . 45

2.6 Conclusion . . . 46

3 Motivated Particle Swarm Optimization 49
3.1 Development of Models of Motivation and Incentive in Particle Swarm Optimization . . . 50
3.1.1 Affiliation-Motivated Agents in Particle Swarm Optimization . . . 50
3.1.2 Power-Motivated Agents in Particle Swarm Optimization . . . 50


3.1.4 Leadership-Motivated Agents in Particle Swarm Optimization 51

3.1.5 Incentive in Particle Swarm Optimization . . . 51

3.2 Motivated Particle Swarm Optimization using a Ring Topology . . . 54

3.2.1 The Proposed Algorithm . . . 54

3.2.2 The Incentive Function . . . 59

3.2.3 The Motivation Function . . . 62

3.3 Motivated-Guaranteed Convergence Particle Swarm Optimization . . 65

3.3.1 The Proposed Algorithm . . . 65

3.3.2 The Incentive Function . . . 70

3.3.3 The Motivation Function . . . 72

3.4 A Generic Scheme for Application of Motivated Particle Swarm Optimization in Task Allocation . . . 75

3.4.1 Motivated Particle Swarm Optimization approaches to Task Allocation Problems . . . 76

3.5 Conclusion . . . 78

4 New Behavioral and Performance Metrics 81
4.1 New Behavioral and Performance Metrics in the Optimization Domain . . . 83
4.1.1 Behavioral Metrics . . . 83

4.1.2 Performance Metrics . . . 85

4.2 Existing Metrics for Optimization-Based Task Allocation . . . 89

4.3 New Behavioral and Performance Metrics in the Task Allocation Domain . . . 91

4.3.1 Behavioral Metrics . . . 92

4.3.2 Performance Metrics . . . 93

4.4 Conclusion . . . 95

5 Benchmark Application 1: Motivated Particle Swarm Optimization Applied to a Task Allocation Problem 97
5.1 Task Allocation Problem as a Multimodal Optimization Problem . . . 99

5.2 Numerical Experiments . . . 100

5.2.1 Numerical Experiment 1: Identifying different characteristics of agents . . . 102

5.2.2 Numerical Experiment 2: Evaluating the Performance of the Motivated Particle Swarm Optimization Algorithm . . . 107

5.2.3 Numerical Experiment 3: Sensitivity Analysis of the Motivated Particle Swarm Optimization Algorithm to Changes in the Parameter Values . . . 112

5.3 Conclusion . . . 121

6 Benchmark Application 2: Motivated Particle Swarm Optimization Applied to a Multi-Agent Search Scenario 123
6.1 Task Allocation Problem based on the Multi-Agent Search Scenario . . . 125
6.2 Numerical Experiments . . . 127

6.2.1 Numerical Experiment 1: Sensitivity of the Motivated-Guaranteed Convergence Particle Swarm Optimization Algorithm to Changes in the Communication Range . . . 128


6.2.2 Numerical Experiment 2: Evaluating the Performance of the Motivated-Guaranteed Convergence Particle Swarm Optimization Algorithm . . . 131
6.3 Conclusion . . . 137

7 Effective Motive Profiles and Swarm Compositions for Motivated Particle Swarm Optimization 139
7.1 Numerical Experiments . . . 142
7.1.1 Numerical Experiment 1: Sensitivity Analysis of the Agents' Motivation Curves . . . 145
7.1.2 Numerical Experiment 2: Evaluating the Effectiveness of Swarms with Different Compositions of Motivated Agents . . . 152
7.1.3 Numerical Experiment 3: Development of Universal Motive Profiles and Swarm Compositions for the Motivated Particle Swarm Optimization Algorithm . . . 160
7.2 Conclusion . . . 167

8 Conclusion 171
8.1 Summary of Research Contributions . . . 172
8.1.1 Models of Motivation in the Optimization Domain with Application to Task Allocation . . . 172
8.1.2 Two Motivated Particle Swarm Optimization Algorithms . . . 173
8.1.3 New Performance Metrics . . . 174
8.1.4 Application of Motivated Particle Swarm Optimization to Solving Task Allocation Problems . . . 174
8.1.5 Effective Motive Profiles and Swarm Compositions for Motivated Particle Swarm Optimization . . . 176
8.2 Limitations and Future Work . . . 177
8.2.1 Development of Models of Motivation in Particle Swarm Optimization settings . . . 178
8.2.2 Further Development of the Motivated Particle Swarm Optimization Algorithms . . . 179
8.2.3 Additional Metrics for Motivated Particle Swarm Optimization . . . 181
8.2.4 Comparison between Motivated Particle Swarm Optimization Approach Applied to Task Allocation and other Task Allocation approaches . . . 181
8.3 Concluding Remarks . . . 182

A Motivation Constants 183

B Sensitivity Analysis of λ 187

References 189


List of Figures

2.1 Motivational curves for (a) an individual motivated to approach success (M_s = 2, M_f = 1) and (b) an individual motivated to avoid failure (M_s = 1, M_f = 2). In both figures (a) and (b): tendency to approach success (thin solid line), tendency to avoid failure (dashed line), resultant tendency (thick solid line) [1] . . . 21
2.2 Sigmoid representations of (a) motivation to approach success (S_ach = 1, M^+_ach = 0.25, M^-_ach = 0.75, ρ^+ = 20) and (b) motivation to avoid failure (S_ach = 1, M^+_ach = 0.75, M^-_ach = 0.25, ρ^+ = 20) [1] . . . 23
2.3 (a) Affiliation motivation as the sum of curves for hope of affiliation and fear of conflict (S_aff = 1, M^+_aff = 0.3, M^-_aff = 0.1, ρ^+_aff = 20, ρ^-_aff = 20); (b) power motivation as the sum of curves for approaching and avoiding power (S_pow = 1, M^+_pow = 0.6, M^-_pow = 0.1, ρ^+_pow = 20, ρ^-_pow = 20) [1] . . . 26
2.4 (a) Affiliation motivation, (b) achievement motivation, (c) power motivation. The configurations for these motive profiles can be found in Appendices A.1-A.3 . . . 28
2.5 A simplified leadership motive profile. The configuration for this profile can be found in Appendix A.4 . . . 31
2.6 The Star topology . . . 37
2.7 The Ring topology . . . 37
2.8 The Nearest Neighbors topology . . . 38
3.1 The algorithm for dealing with the case where a particle moves outside the search space [2] . . . 56
3.2 The directed ring topology . . . 57
3.3 The graph of the I(a, D) function for all possible values of a, in the case M = 30 and D ∈ [0, 1] . . . 61
3.4 Affiliation motive profile (solid line) and power motive profile (dashed line). The values of S_aff, S_ach, S_pow and ρ used to achieve these profiles are listed in Table 3.1 . . . 63
3.5 Agent s_m can communicate with agent s_i, but it cannot communicate with s_j . . . 67
3.6 … M = 30 and D ∈ [0, 1] . . . 72
3.7 Three different motive profiles that are used in the Motivated-Guaranteed Convergence Particle Swarm Optimization algorithm. The configurations for these profiles are listed in Table 3.2 . . . 75
4.1 Agent s_m is said to be visiting optimum n because the distance between its current position and the optimum is less than a certain threshold, ε . . . 84
4.2 An example of how to visualize the average dwell time frequency distribution for three different types of agents. In this example, the graph has 10 different class intervals of average dwell times on its x-axis . . . 85
5.1 The position of the tasks used in the simulations for N = 1, 2, ..., 9 . . . 102
5.2 Average dwell time frequency distribution of homogeneous agents in the benchmark PSO algorithm (white color) using a ring neighborhood topology, power-motivated agents (black color), and affiliation-motivated agents (gray color). This graph has 25 different class intervals of average dwell times on its x-axis . . . 105
5.3 Success rates for the PSO algorithm using Ring (15) (circular-line) and the MPSO algorithm using MPSO (AFF 9 + POW 6) (asterisk-line) in the average number of discovered tasks . . . 109
5.4 Success rates for the PSO algorithm using Ring (100) (circular-line) and the MPSO algorithm using MPSO (AFF 80 + POW 20) (asterisk-line) in the average number of discovered tasks . . . 109
5.5 Success rates for the PSO algorithm using Ring (15) (circular-line) and the MPSO algorithm using MPSO (AFF 9 + POW 6) (asterisk-line) in the average number of tasks to which the agents are allocated . . . 110
5.6 Success rates for the PSO algorithm using Ring (100) (circular-line) and the MPSO algorithm using MPSO (AFF 80 + POW 20) (asterisk-line) in the average number of tasks to which the agents are allocated . . . 110
5.7 Effect of changes in c_1^(1) on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 116
5.8 Effect of changes in c_2^(1) on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 117
5.9 Effect of changes in c_1^(2) on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 117
5.10 Effect of changes in c_2^(2) on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 118
5.11 Effect of changes in h on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 119
5.12 Effect of changes in τ on the average number of tasks to which the agents are allocated using MPSO (AFF 9 + POW 6) . . . 120
6.1 The location of the tasks in the simulations . . . 126
6.2 The effect of different communication ranges on the average number of tasks to which the agents are allocated for M-GCPSO (P1 12 + P2 12 + P3 6) (black), GCPSO (30) (gray), and MPSO (AFF 24 + POW 6) (white) in the case where the agents are initialized from: (a) random points, (b) a single point . . . 130
6.3 Allocation success rates for the M-GCPSO (P1 12 + P2 12 + P3 6), GCPSO (30), and MPSO (AFF 24 + POW 6) algorithms when the agents are initialized from: (a) random points, (b) a single point . . . 131
6.4 Comparison of the Motivated-Guaranteed Convergence PSO (M-GCPSO), Guaranteed Convergence PSO (GCPSO), and Motivated PSO (MPSO) algorithms over time for N = 9. Average number of discovered tasks when agents are initialized from: (a) random points, (c) a single point. Average number of tasks to which the agents are allocated when agents are initialized from: (b) random points, (d) a single point . . . 134
7.1 The motivation curves of (a) affiliation, (b) achievement, (c) power and (d) leadership motive profiles based on [1]. The baseline configurations for these profiles are listed in Table 7.2 . . . 144
7.2 Effects of changes in S_aff of the affiliation-motivated agents on the shape of the motivation curve. This figure shows examples of affiliation motive profiles with (a) S_aff = 0.5, (b) S_aff = 1, (c) S_aff = 1.5, (d) S_aff = 2. In all profiles, S_ach = S_pow = 1 . . . 146
7.3 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the affiliation-motivated agents in the case where the agents are initialized from random positions . . . 148
7.4 Allocation success rates obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the affiliation-motivated agents in the case where the agents are initialized from random positions . . . 149
7.5 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the affiliation-motivated agents in the case where the agents are initialized from a single point . . . 150
7.6 Allocation success rates obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the affiliation-motivated agents when the agents are initialized from a single point . . . 151
7.7 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the achievement-motivated agents in the case where the agents are initialized from random positions . . . 152
7.8 Allocation success rates obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the achievement-motivated agents in the case where the agents are initialized from random positions . . . 153
7.9 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the achievement-motivated agents in the case where the agents are initialized from a single point . . . 154
7.10 Allocation success rates obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the achievement-motivated agents in the case where the agents are initialized from a single point . . . 155
7.11 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the power-motivated agents in the case where the agents are initialized from random positions . . . 156
7.12 Allocation success rate obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the power-motivated agents in the case where the agents are initialized from random positions . . . 157
7.13 The effects of changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the power-motivated agents in the case where the agents are initialized from a single point . . . 158
7.14 Allocation success rate obtained by changing the value of the relative motivation strengths, (a) S_aff, (b) S_ach, (c) S_pow, of the power-motivated agents in the case where the agents are initialized from a single point . . . 159
7.15 Average number of tasks allocated by swarms with different compositions where the agents are initialized from random points . . . 160
7.16 Average number of tasks allocated by swarms with different compositions where the agents are initialized from a single point . . . 161
7.17 (a) Average number of tasks discovered and (b) average number of tasks allocated between MPSO (AFF' 9 + POW 6) (gray bar) and MPSO (AFF 9 + LEAD 6) (white bar) with random and single-point initialization . . . 165
7.18 (a) Average number of tasks discovered and (b) average number of tasks allocated between M-GCPSO (P1 12 + P2 12 + P3 6) (gray bar) and M-GCPSO (AFF 12 + LEAD 18) (white bar) with random and single-point initialization . . . 167


List of Tables

3.1 The constants for affiliation and power motive profiles used in the MPSO algorithm with a ring topology . . . 63
3.2 The constant values of Profile 1, Profile 2, and Profile 3 used in the Motivated-Guaranteed Convergence Particle Swarm Optimization algorithm . . . 73
3.3 A general comparison between the features of the Motivated PSO (MPSO) with a ring topology and the Motivated Guaranteed PSO (M-GCPSO) algorithms . . . 73
5.1 The location of the tasks for benchmark application 1 . . . 101
5.2 Average number of tasks visited by the affiliation-motivated, power-motivated, and homogeneous agents in Simulation 1 and Simulation 2 . . . 104
5.3 Performance comparison between the PSO algorithm using a ring topology and the MPSO algorithm on a test function with 9 tasks (Numerical Experiment 2, Simulations 1-4) . . . 111
5.4 The effect of different swarm compositions on the average number of tasks to which the agents are allocated . . . 118
6.1 Parameters derived from the multi-robot search scenario in [3] . . . 127
6.2 Performance comparison of the Motivated-Guaranteed Convergence PSO (M-GCPSO), Guaranteed Convergence PSO (GCPSO), and the Motivated PSO (MPSO) algorithms in the case of nine tasks (Numerical Experiment 2, Simulations 1-2) . . . 132
6.3 Discovery and allocation success rates of the Motivated-Guaranteed Convergence PSO (M-GCPSO), Guaranteed Convergence PSO (GCPSO), and Motivated PSO (MPSO) algorithms (Numerical Experiment 2, Simulations 1-2) . . . 133
7.1 Simulation setup for Numerical Experiment 3 . . . 143
7.2 The baseline configuration and suggested values for affiliation, achievement, power and leadership motive profiles [1] . . . 144
7.4 Discovery and allocation success rates for different swarm compositions . . . 163
7.5 Discovery and allocation success rates for MPSO (AFF' 9 + POW 6) and MPSO (AFF 9 + LEAD 6) . . . 166
7.6 Discovery and allocation success rates for M-GCPSO (P1 12 + P2 12 + P3 6) and M-GCPSO (AFF 12 + POW 18) . . . 168
A.1 Constants of affiliation motivation and suggested values [1] . . . 183
A.2 Constants of achievement motivation and suggested values [1] . . . 184
A.3 Constants of power motivation and suggested values [1] . . . 184
A.4 Constants of the simplified leadership motive profile . . . 185
B.1 Effect of changes in λ on the allocation success rates using M-GCPSO (P1 12 + P2 12 + P3 6) . . . 187


Chapter 1

Introduction

Models of motivation have recently been explored as a means for artificial agents to autonomously select goals in settings such as reinforcement learning [4] and active learning [5]. Early studies have demonstrated that models of motivation permit artificial agents to evolve and adapt in new environments, learn new tasks which have not been previously specified, and select the goals they will pursue autonomously.

These models have been shown to be particularly useful in the area of developmental robotics [6] and artificial life [1].

In recent research, three computational models of motivation, namely achievement, affiliation, and power models, have been introduced which allow artificial agents to exhibit different risk-taking behaviors in goal selection [1]. These models were originally inspired by psychological motivation theories [7], [8], which propose that individuals with different motive profiles have different preferences for certain kinds of incentives. As a result, they select goals differently. To validate these three models of motivation in artificial agents, the agents were tested in a series of simulations such as a ring-toss game, a roulette game, and a prisoner's dilemma game [1]. The simulation results demonstrated that agents endowed with models of motivation were able to exhibit goal-selection behaviors similar to those observed in humans with corresponding motives in certain constrained scenarios.


Research on multi-agent systems has also made significant progress in building computational models of leadership. For example, these models include leader-follower models employed for robotic patrolling in environments with arbitrary topologies [9] and the use of leaders for reducing global costs in market-based multi-robot coordination [10].

Although computational models of motivation and leadership have been widely studied, only a few studies have explored the use of motivation and leadership to support decision-making mechanisms within the optimization and task allocation domains. This thesis explores the role of motivation within an optimization framework, with task allocation as a benchmark application. The present study also investigates the role of leadership, as a by-product of motivation, in solving multi-agent task allocation problems. For this purpose, a particle swarm optimization (PSO) approach [11] is employed.

PSO is a population-based stochastic optimization approach where a swarm of particles (agents) is used to iteratively find the best solution to an optimization problem. The approach was first introduced by Kennedy and Eberhart in 1995 [11] and has since attracted broad attention due to its effectiveness and simplicity of implementation. Originally, the PSO approach was inspired by how a flock of birds creates collective behaviors to find a source of food. To find the position of the food, each bird follows another bird that is closest to the food. By mimicking and simplifying the behaviors of these social animals, the idea of PSO was to represent the flock of birds as a swarm of particles that can work together to find the best solution (optimum) of an optimization problem. At each time step, the particles iteratively refine the solution by following the particle that has found the best solution so far. This will eventually lead the particles to the position of the best solution found by the population. In the context of optimization, this allows the particles to move towards the position of the optimum.

In the basic PSO algorithm, the dynamics of each particle are determined by the position and velocity of the particle. The position of each particle is adjusted based on its previous velocity, the memory of the best position the particle has found so far, and the best position found by its neighbors. Using these three velocity components, the particle updates its position at each iteration in order to find the best solution in the search space.
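For reference, a standard formulation of these update rules, written here in common PSO notation rather than the exact symbols used later in this thesis, is

\[
v_i(t+1) = w\,v_i(t) + c_1 r_1 \left(p_i - x_i(t)\right) + c_2 r_2 \left(p_g - x_i(t)\right), \qquad x_i(t+1) = x_i(t) + v_i(t+1),
\]

where $x_i$ and $v_i$ denote the position and velocity of particle $i$, $p_i$ is the best position the particle has found so far, $p_g$ is the best position found by its neighborhood, $w$ is an inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1$, $r_2$ are random numbers drawn uniformly from $[0, 1]$.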

Compared to other optimization algorithms, such as the Genetic Algorithm (GA) [12], the advantages of the PSO algorithm are its relatively low computational complexity and the small number of parameters that need to be tuned. This allows the PSO algorithm to be applied to a wide range of problems, including in the domains of multi-agent search [3], [13] and task allocation [14], [15].

Within the context of task allocation, the PSO particles are often associated with agents equipped with sensors, such as robots or underwater and aerial vehicles, while the optima are associated with the positions of the tasks. It is assumed that the tasks broadcast signals that can be sensed by the agents; examples of such signals include chemical substances, electromagnetic waves, and acoustic waves. The strength of the signal sensed by an agent is assumed to decay continuously with distance from the task and to peak when the agent is at the task location. Mathematically, this is equivalent to having a fitness function: the agents behave as if they take measurements of a fitness function, which allows the task allocation problem to be cast as an optimization problem.
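As a concrete illustration of this equivalence, the minimal sketch below (not taken from the thesis; the exponential decay model and all names are hypothetical) builds a fitness function in which each task contributes a signal peak that decays with distance, producing one optimum per task:

import numpy as np

def signal_fitness(position, task_positions, decay=2.0):
    # Illustrative fitness: strongest signal sensed at `position`.
    # Each task emits a signal whose strength decays exponentially with
    # Euclidean distance and peaks (value 1.0) at the task location, so the
    # resulting landscape is multimodal, with one optimum per task.
    position = np.asarray(position, dtype=float)
    tasks = np.asarray(task_positions, dtype=float)
    distances = np.linalg.norm(tasks - position, axis=1)
    return float(np.exp(-decay * distances).max())

# Example: two tasks in a two-dimensional search space.
tasks = [(2.0, 3.0), (8.0, 1.0)]
print(signal_fitness((2.0, 3.0), tasks))  # 1.0: the agent sits exactly on a task
print(signal_fitness((5.0, 5.0), tasks))  # a lower value between the two tasks

Maximizing a function of this kind over agent positions is then precisely the multimodal optimization problem discussed below.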

Despite all its advantages, the basic PSO algorithm has several drawbacks that can limit its application. One aspect to consider is that the basic PSO algorithm was originally designed to solve optimization problems with only a single solution in the search space (unimodal problems) [16]. Thus, all particles eventually converge to a single attractor after several time steps. The basic PSO algorithm, therefore, cannot be employed directly to solve optimization problems with more than one optimum (multimodal problems). In some cases, a system of equations may have multiple solutions, all of which need to be located; examples include linear or non-linear programming problems [16]. The basic PSO algorithm therefore requires further modification when dealing with multimodal optimization problems. Furthermore, many real-world applications require more than one solution to be discovered. One example is the task allocation problem, where a group of agents needs to discover multiple tasks.

Another limitation of the basic PSO algorithm manifests in situations where the solution is sought using a small population of particles [16]. The difficulty in this situation is that the PSO particles may concentrate within a small region of the search space and potentially find fewer optima. In the context of task allocation, such a problem may also arise when only a small number of agents are available to search for tasks. In such a situation, employing a small number of PSO agents may result in only a small number of tasks being discovered.

Furthermore, the basic PSO algorithm has no mechanism to control the distribution of the particles among the optima. In some cases, this results in a dense concentration of particles at one optimum [17]. Such a problem often occurs when a relatively small number of particles is used to solve a function which has many optima [17].

It is also the case that many applications require the particles (agents) to be initialized from a single point. An example of such a problem in the context of task allocation is the case where the agents enter a building through a single entrance to search for targets [18]. To address this situation, the starting configuration of the agents was pre-computed and the agents were assigned to move to their starting positions [18]. However, when the geometry inside the building is unknown, this complicates the process of distributing the agents around the search space. Therefore, developing an algorithm with the ability to initialize the agents from a single departure point, particularly when the geometry inside the building is unknown, is essential. Most existing PSO algorithms assume that initially the particles are randomly positioned over the search space [19], [16]. Applying the basic PSO algorithm in this situation therefore requires modification.

In order to address the problems mentioned above, this thesis presents a novel approach that incorporates computational models of motivation into PSO. This has led us to introduce a new class of PSO algorithms, namely the Motivated Particle Swarm Optimization (MPSO) algorithms. This thesis focuses on task allocation as a benchmark problem for studying MPSO.

Relevant work on solving task allocation problems in the area of Swarm Intelligence has been presented in [20], [14]. Other studies have also been conducted in the area of Economic Systems [21], [10]. Different from these existing approaches, a new approach to the task allocation problem that combines three computational models of motivation with the PSO algorithm is developed in this study. Such a combination permits the agents to autonomously select which task to pursue based on their own motivations.

In the remainder of this chapter, the research objectives, the contributions and their significance, the research methodology, and the thesis overview are presented.

1.1 Research Objectives

The main objective of this research is to extend current research on models of motivation by investigating the role of motivation in solving optimization problems, with application to task allocation. In particular, this research aims to improve the performance of Particle Swarm Optimization in a task allocation domain by endowing agents with models of motivation. This research is focused on achieving the following objectives:

1. Develop Motivated PSO algorithms that embed models of motivation in agents for optimization.

2. Develop incentive-based models of motivation for PSO.

3. Quantify the relationship between models of motivation and agents’ behaviors.

4. Develop new metrics for evaluating the behavior of the agents and the performance of the Motivated PSO algorithms in optimization and task allocation domains.

5. Demonstrate and evaluate Motivated PSO as it applies to task allocation problems.

6. Identify the most effective motive profiles and swarm compositions for the Motivated PSO algorithms with application to task allocation.

1.2 Research Contributions

This thesis makes the following contributions:

1. One of the main contributions of this thesis is the introduction of a novel approach that incorporates computational models of motivation into PSO. In the new approach, each particle acts as a self-motivated agent, which allows the agent to select the goals it will pursue autonomously. In particular, two new algorithms that extend existing PSO algorithms for solving multimodal optimization problems are presented in this thesis. These two algorithms are the MPSO algorithm, which uses the PSO algorithm with a ring topology as its foundation, and the Motivated-Guaranteed Convergence PSO (M-GCPSO) algorithm, which is constructed using the Guaranteed Convergence PSO (GCPSO) as a baseline (Chapter 3). This study focuses on task allocation as a benchmark problem for studying MPSO. In the task allocation problem, an agent often needs to consider multiple extrema as candidate tasks to undertake.

To deal with this problem, the idea of the MPSO algorithm is to construct a mechanism to consider potential neighborhood best positions and to calculate those positions based on the agent's motivation. Results of the simulations indicate that the proposed algorithms improve the performance of the basic PSO algorithms without motivation, particularly in the case when there is only a small number of agents available and when the agents are initialized from a single departure point (Chapters 5-6).

2. The introduction of new incentive functions that can be executed iteratively in PSO settings is another key contribution of this thesis (Chapter 3). In this study, the incentive function is defined as a nonlinear function that is sensitive to risk-taking behavior. Specifically, two kinds of incentive function are introduced in this study. The first incentive function is designed for the MPSO algorithm with a ring topology, which is built under the assumptions that there is unrestricted communication between the agents and no restriction on the agents' velocity. The second incentive function is designed for the M-GCPSO algorithm, which is applied to a multi-agent search scenario. In contrast to the MPSO algorithm with a ring topology, the agents in the M-GCPSO algorithm can only communicate within a predefined communication range. Furthermore, these agents have limits on their velocity. The construction of the incentive function plays a major role in the MPSO algorithms as this function allows the agent to select one of the potential neighborhood best positions of an optimum to pursue based on the assigned incentive value and the agent's motive profile. To prevent agents from concentrating in a small area, the novel aspect of the proposed approach is to assign a sufficiently high incentive value to a randomly generated position in the search space. Such a mechanism encourages some particles to perform broader exploration. Furthermore, a small incentive value is assigned to optima with a large number of agents. In the case where the particles are initialized from a single point, such a mechanism encourages the particles to move away from the initialization point to provide a more uniform distribution of agents in the search space. A minimal illustrative sketch of this incentive idea is given after this list.

3. In this study, two sets of metrics are introduced to quantify the behavior of the motivated agents, which cannot be measured using existing PSO metrics (Chapter 4). The first set of metrics is used to evaluate the behavior of the agents in terms of the number of optima visited and the average dwell time of each type of agent. The second category is used to evaluate the performance of the MPSO algorithms. These performance metrics include the average number of optima discovered, the average number of optima on which the agents converge, the success rate, and the distribution of the agents among the optima.

4. To demonstrate the application of the MPSO algorithms, a benchmark problem of task allocation is considered in this thesis. The thesis develops a new approach to solving this problem, in which agents can select tasks autonomously based on the agents' own motivations. This new approach provides an alternative to existing task allocation approaches such as economics-based [21], [10], [22] and swarm intelligence [19], [13], [23], [14] approaches. Unlike most existing task allocation approaches, where negotiation between the agents is required to choose the tasks, the decision-making mechanism in the proposed approach is driven by the agents' own motivations. The use of motivation, specifically the three models of motivation, is a new approach within the task allocation domain as it removes the need for explicit negotiation between agents.

5. Another contribution of this thesis is the identification of the most effective profiles and swarm compositions for the MPSO algorithms (Chapter 7). For this purpose, this study examines the behavior of agents with four distinct motive profiles: affiliation, achievement, power and a new hybrid leadership motive profile. Using task allocation problems as benchmark problems, results of the numerical experiments show that affiliation-motivated agents tend to perform local search and allocate themselves more quickly to tasks (optima).

In contrast, power-motivated agents tend to facilitate exploration to discover new tasks. These two types of agents perform better in the presence of achievement-motivated agents, informing the design of the leadership motive profile, which demonstrates good performance in the two task allocation settings studied in this thesis.
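The following sketch illustrates the incentive idea from contribution 2 above. It is a hypothetical Python fragment, not the thesis's incentive function: the function name, the linear crowding rule, and the 0.9 exploration value are illustrative assumptions. Crowded optima receive a low incentive, while one randomly generated position receives a high incentive to encourage exploration.

import random

def assign_incentives(optima_agent_counts, total_agents, search_space_bounds,
                      exploration_incentive=0.9):
    # Illustrative incentive assignment over candidate positions.
    # `optima_agent_counts` maps each known optimum (a tuple position) to the
    # number of agents currently near it. Crowded optima receive a small
    # incentive; a randomly generated point receives a high incentive so that
    # some agents are encouraged to explore elsewhere in the search space.
    incentives = {}
    for position, count in optima_agent_counts.items():
        # Fewer agents at an optimum -> larger incentive (simple linear rule).
        incentives[position] = 1.0 - count / float(total_agents)

    random_position = tuple(random.uniform(low, high)
                            for (low, high) in search_space_bounds)
    incentives[random_position] = exploration_incentive
    return incentives

# Example: two known optima in a [0, 10] x [0, 10] space, 15 agents in total.
counts = {(2.0, 3.0): 10, (8.0, 1.0): 2}
print(assign_incentives(counts, total_agents=15,
                        search_space_bounds=[(0.0, 10.0), (0.0, 10.0)]))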

1.3 Methodology

The research methodology used in this thesis comprises five stages. In the first stage, a new approach is developed by incorporating models of motivation, based on psychological motivation theories, into PSO algorithms. In this approach, we modify the PSO algorithms to include motive profiles for the agents. The new algorithms are referred to as the Motivated PSO (MPSO) algorithms. The parameters used to construct models of motivation in the MPSO algorithms are selected based on psychological motivation theory [1]. At the preliminary stage, the agents were embedded with motive profiles that were constructed based on initial hypotheses about the behavior and performance of agents with well-known motives. The basic PSO parameters in the MPSO algorithms were set according to common PSO parameter values that have been empirically found to provide good performance [16]. As a means to incorporate models of motivation into the PSO algorithms, novel incentive functions were developed.

In the second stage, two sets of metrics are introduced to determine the behavior and the performance of the proposed algorithms. The first category of metrics introduced in this thesis is behavioral metrics. These metrics are used to quantify whether agents with different motive profiles exhibit different characteristics that can improve the performance of the algorithms. The second category of metrics evaluates the performance of the algorithms. These metrics are first presented in the general context of optimization. The definitions of these metrics are then adopted and restated in the specific context of task allocation.
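As a small illustration of the behavioral-metric idea, the sketch below uses the ε-threshold notion of "visiting" an optimum that is defined in Chapter 4; the function and variable names themselves are hypothetical, not the thesis's.

import numpy as np

def optima_visited(agent_trajectory, optima, epsilon=0.5):
    # Count how many distinct optima an agent visits over its trajectory.
    # An agent is considered to be visiting an optimum when the distance
    # between its current position and that optimum is below `epsilon`.
    trajectory = np.asarray(agent_trajectory, dtype=float)  # shape (T, dims)
    optima = np.asarray(optima, dtype=float)                # shape (K, dims)
    visited = set()
    for position in trajectory:
        distances = np.linalg.norm(optima - position, axis=1)
        visited.update(np.flatnonzero(distances < epsilon).tolist())
    return len(visited)

# Example: a short two-dimensional trajectory passing near one of two optima.
trajectory = [(0.0, 0.0), (1.8, 2.9), (2.1, 3.1)]
print(optima_visited(trajectory, [(2.0, 3.0), (8.0, 1.0)]))  # prints 1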

To demonstrate the advantages of using MPSO, the third stage of the research methodology is to test the performance of the algorithms in solving task allocation problems. A number of numerical experiments are conducted to test the algorithms on task allocation problems with different task configurations. To evaluate the effectiveness of the new algorithms, the MPSO algorithms are compared with benchmark PSO algorithms that do not employ motivated agents.

In the fourth stage, the results from each experiment are collected and analyzed statistically. Conclusions about the performance and behavior of the agents are drawn based on the statistical analysis of the numerical experimental results.

In order to further identify the most effective motive profiles and swarm compositions for the MPSO algorithms for solving task allocation problems, the final stage of this research is to examine the relationship between motivation and the behavior of MPSO agents through a series of numerical experiments. After the most effective swarm compositions in these experiments are identified, the final experiment is designed to compare the performance of the MPSO algorithms when using the new swarm compositions to the initial swarm compositions used in the first stage of the research methodology. This is performed to investigate whether the new swarm compositions can be used as universal motive profiles for the MPSO algorithms.

1.4 Thesis Overview

This thesis is organized in eight chapters. Chapter 2 discusses the background, theories, and related approaches underpinning the study presented in this thesis.

Chapter 3 introduces the proposed MPSO approach, which is the main contribution of this study. Two metric categories that are designed to identify different agent behaviors and the performance of the proposed algorithms are presented in Chapter 4. Chapters 5 and 6 describe a number of numerical experiments that are performed to test the ability of the proposed algorithms in solving benchmark task allocation problems. A further investigation to identify the most effective motive profiles and swarm compositions for the MPSO algorithm in solving task allocation problems is presented in Chapter 7. This thesis concludes in Chapter 8, which summarizes the main contributions and limitations of the study. Furthermore, potential future research directions are also presented in this chapter.


Chapter 2

Background and Related Work

In this chapter, the background and related work that are important to understand the study conducted in this thesis are presented. This chapter begins with a brief introduction to motivation in psychology and how it has further inspired the development of computational models of motivation and leadership in artificial intelligence systems. The development of computational models of motivation in several domains is discussed and the applications where models of motivation may be useful in other domains are identified. This chapter also identifies that motivation has not been considered within an optimization framework in the previous literature. In order to investigate which models of motivation are suitable for use within an optimization framework, existing models of motivation are reviewed. Furthermore, this chapter also describes the development and limitations of current PSO algorithms which are used as a basic foundation for the proposed approach and discusses how motivation can be used to improve the performance of existing PSO algorithms. Moreover, a problem of task allocation to which the proposed approach will be applied and tested is presented. These concepts are the main components that have motivated the work presented in this thesis.


2.1 Models of Motivation

Over the past few decades, the theories of human motivation have been widely studied across many disciplines, including the fields of psychology [8] and neuroscience [24]. This in turn has shed new light on the development of computational models of motivation within the domain of artificial intelligence systems. To understand how theories of human motivation have inspired the development of computational models of motivation in artificial intelligence systems, this section first presents an overview of motivation from a psychological perspective. It is then followed by a discussion of computational models of motivation in artificial intelligence systems which draw on these psychological motivation studies.

2.1.1 Models of Motivation from a Psychological Perspective

From a psychological perspective, the term 'motivation' is often defined as the cause of action in natural systems [25]. In particular, 'to be motivated' has been defined as being 'moved' to do something [26]. The concept of motivation appears in a wide range of psychological studies including the study of human behaviors [27] and the psychology of learning [28]. One common line that connects various studies on motivation is that they are designed to seek the reasons for actions, to investigate how different motivations influence certain actions, and to understand how motivation influences the activation, control, and persistence of goal-oriented behavior [8]. In particular, the study of human motivation has contributed to a better understanding of how different strengths and types of motivation determine individuals' underlying attitudes and the goals they will pursue [26].

One type of human motivation that has been studied in motivational research and has been shown to be useful in artificial intelligence systems is intrinsic motivation.

This type of motivation refers to the drive to reach a certain goal because it is inherently interesting or enjoyable [26]. The following subsection discusses how the theories of motivation, which have been suggested by psychologists, have been further modeled in artificial intelligence systems.

2.1.2 Models of Motivation in Artificial Intelligence Systems

Compared to the theoretical study of motivation in psychology, the study of motivation in the domains of artificial intelligence systems and machine learning is relatively new. Traditionally, machines are designed with a goal that has been previously programmed by system engineers. It has been reported that such traditional approaches, however, are not sufficient to deal with complex problems in dynamic environments that require the agents to continuously learn [6]. Agents capable of such autonomous mental development will be particularly useful when the environments are complex and unknown. Furthermore, these developmental agents will be beneficial in performing tasks which humans are unwilling to perform or which are difficult or dangerous for humans, such as water or space exploration, clearing mines, or cleaning up nuclear waste [6].

Due to the benefits of using developmental agents mentioned above, early work in the field of artificial intelligence was conducted to embed agents with models of motivation. It has been demonstrated that agents endowed with models of motivation are able to perform certain tasks, such as adaptive exploration of the environment [29] and autonomous learning that is driven by self-motivation [30].

The success of using motivation in these areas has inspired the premise of this thesis that motivation may also be usefully applied in other areas. Specifically, this thesis aims to investigate the role of motivation in an optimization framework which, to the best of our knowledge, has not been considered in any of the previous studies. In order to see which models of motivation are suitable for use in optimization, the following subsections provide a review of several computational models which have been developed to date.


Existing Computational Models of Motivation

Over the last few decades, a number of computational models of motivation have been developed and their success has been demonstrated in several domains [31], [32], [33], [34]. These include approaches to motivation based on knowledge-based models, competence-based models, and morphological models [35].

In knowledge-based models, the measure of intrinsic motivation relates to the comparison between the situations experienced by an agent and the existing knowledge and expectations that the agent has about these situations. Approaches based on these models include information gain motivation [36], learning progress motivation [37], [38], novelty [39], [29], and curiosity [37].

One example of knowledge-based models where information gain motivation has been used can be found in [36], [35]. Based on the psychological theory that humans have a natural tendency to learn and assimilate, the notion of assimilation has been modeled by the decrease of uncertainty in the agent's knowledge of the world after an event has occurred [35].

Another example of knowledge-based models which has been shown to be particularly useful in artificial intelligence systems is the development of computational models of novelty [40]. Specifically, [40] presented a self-motivated developmental system resulting from competing pressures to achieve a balance between predictability and novelty. The aim of the experiment in [40] was to investigate whether a developing robot could predict and track the motion of a moving target. The proposed self-motivated system has provided insights into the development of a framework to explore open-ended learning and skill acquisition in developmental robotics.

Computational models based on novelty have also been proposed in [29]. Here, an inherent value (motivational) system was proposed to permit agents to perform autonomous exploration of the environment and continuously develop a more complex value system as they explore. The study in [29] has provided a basic foundation for the growth of developmental learning, in which the agents are driven by their innate motivation rather than being governed by a predefined task-specific goal.

Computational approaches to intrinsic motivation based on knowledge-based models have also drawn on other psychological theories, such as curiosity. For example, [30] proposed a model of curiosity for artificial intelligence agents.

To test the model, the curious agents were used to evaluate the layout of several artworks exhibited in an art gallery. This model of curiosity permits the agents to provide useful information about the 'interestingness' of the layouts. This work has presented a possible future direction for developing agent-based simulations that are able to learn from experience and report their individual evaluations as they explore the environment.

Another example of a model of curiosity can be found in [41]. The main question investigated in that study was whether a machine could be endowed with an intrinsic motivation system like the motivation observed in humans. To answer this question, the agents were endowed with a mechanism of intelligence named 'adaptive curiosity'. To test the proposed models, a physical robot with models of curiosity was placed on a baby play mat with several objects that could be learnt by the agents. The results of the experiments indicate that, at the first stage, the robot spends time focusing on objects which are easy to learn about. It then shifts its focus to situations in which the level of difficulty increases, and eventually the robot avoids situations where nothing new can be learnt. The model of motivation presented in [41] has provided a basic foundation for developing further intrinsic motivations which allow self-organization to emerge.

Besides the knowledge-based approach, another approach which has been used to measure intrinsic motivation is based on competence. This approach relates to the agent's competence for achieving self-determined goals [35]. The approach is inspired by several psychological theories which state that human behavior is driven by effectance [42], personal causation [43], competence and self-determination [44], and flow [45].

The third type of measure of intrinsic motivation is based on morphological models, which compare information characterizing several pieces of stimuli perceived at the same time [35]. In contrast with the two previous models of motivation, which are based on the long-term knowledge or competence of the agent, this model measures intrinsic motivation based on morphological mathematical properties of the current flow of sensorimotor values. One example of these models is the synchronicity motivation in [46].

The above studies have shown that models of motivation have played important roles in creating agents that are capable of autonomous mental development. These studies provided a foundation for the development of other models of motivation covering a wider range of psychological motivational theories, and specifically, the three models of achievement, affiliation, and power motivation proposed in [1].

These three models of motivation are constructed to influence different risk-taking behavior depending on the obtained incentives: (1) Achievement motivation is characterized by a preference for selecting goals of intermediate difficulty and goals with intermediate incentive values; (2) Affiliation motivation is characterized by a preference for avoiding conflict by minimizing risks and taking low-incentive goals; and (3) Power motivation is characterized by a need to be influential by taking extreme risks and selecting higher-incentive goals [1], [47].
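As a loose illustration of these characterizations, the following Python sketch (our own construction with assumed incentive values, not the formal models of [1], which are described later in this section) shows how the three motive profiles could lead agents to select different goals from the same set of candidates.

```python
# Hypothetical sketch of how the three motive profiles could bias goal
# selection. This is only a caricature of the tendencies described above;
# the formal computational models from [1] appear later in this section.

def select_goal(incentives, profile):
    """Return the index of the preferred goal given incentive values in [0, 1]."""
    ranked = sorted(range(len(incentives)), key=lambda i: incentives[i])
    if profile == "affiliation":       # minimize risk: low-incentive goals
        return ranked[0]
    if profile == "power":             # extreme risk: high-incentive goals
        return ranked[-1]
    if profile == "achievement":       # goals of intermediate incentive
        return ranked[len(ranked) // 2]
    raise ValueError(f"unknown motive profile: {profile}")

goal_incentives = [0.2, 0.9, 0.5, 0.7, 0.1]   # assumed example values
for motive in ("achievement", "affiliation", "power"):
    print(motive, "selects goal", select_goal(goal_incentives, motive))
```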

Based on the psychological theories in [48], the three models of motivation, namely achievement, affiliation, and power, have been considered fundamental underlying drivers of motivation in humans. These three models can be used to represent a broad range of human motivations. For example, in [49] it is argued that one factor that motivates humans in learning is "the result of someone's wish to reduce the discrepancy between one's ideal self (i.e., one's image of what one would become) and one's actual self (i.e., one's actual self-state)". Such a motive can be categorized as achievement motivation, as it involves the need to master, to attain a high standard, and to excel one's self. Another example of the motivation described in [49] may also relate to affiliation-motivated behavior, such as the consideration that learning will help the person to understand people around the world (the need for friendship).

In this thesis, the development of the above-mentioned models of motivation provides useful insights for selecting appropriate models of motivation in an optimization framework. Specifically, the present study considers the use of the three models of achievement, affiliation, and power motivation, as these models show potential to control decision-making mechanisms for goal selection.

Within an optimization framework, such a mechanism is particularly useful for choosing the best solution from a set of candidate solutions when more than one solution has to be considered. Furthermore, the use of the three models of motivation in artificial intelligence agents offers the opportunity to create a division of labor among the agents. The incorporation of such models of motivation into an optimization technique is expected to enhance the performance of existing optimization approaches. The following subsection discusses the three models of motivation in more detail, as they form the basis of the approach proposed in this study.

Achievement, Affiliation and Power Motivation

The following subsections provide a brief overview of achievement, affiliation, and power motivation, from their original psychological theories to how they have been modeled for use in artificial intelligence systems.

Models of Achievement Motivation

Compared to other motives, such as affiliation and power, achievement motivation has been identified as the most rigorously studied motive [8].

In [50], achievement motivation is defined as a drive to accomplish something difficult, to overcome obstacles, to attain a high standard, to excel one’s self, and to surpass others [8]. Specifically, a behavior is considered as an achievement if competition with a standard of excellence is involved [28].

One of the foremost psychological models of achievement, which has dominated achievement-motivation research, is Atkinson’s Risk Taking Model (RTM) [7].

The RTM was proposed to predict an individual’s preferences for accepting difficult goals. In the RTM, motivation is described by the distinction between a directional component and an intensity component of motivation. The directional component corresponds to the preferred level of task difficulty. This component relates to the dominance of a particular motive, e.g. dominance of the success motive or of the failure motive.

On the other hand, the intensity component of motivation relates to the efficiency of task performance. Specifically, achievement motivation is modeled in the RTM as conflicting desires to approach success and to avoid failure. Six variables are used to construct achievement motivation: the incentive for success ($I_s$), the probability of success ($P_s$), the strength of motivation to approach success ($M_s$), the incentive for avoiding failure ($I_f$), the probability of failure ($P_f$), and the strength of motivation to avoid failure ($M_f$). Using these six variables, a resultant motivational tendency for achievement motivation is defined as follows:

$$T_r = M_s I_s P_s + M_f I_f P_f \qquad (2.1)$$

where $P_s + P_f = 1$.

It is further assumed in Atkinson’s model that incentive is inversely proportional to the probability of success [51]. This assumption was made because it has been observed from everyday experience that the feeling of success increases as the probability of success decreases. Formally, the relation between incentive and probability of success for success-motivated individuals in Atkinson’s model can be written as $I_s = 1 - P_s$. For failure-motivated individuals, the incentive for avoiding failure is highest for very easy tasks [51]. Thus, for failure-motivated individuals, the relation between the incentive for avoiding failure and the probability of success can be written as $I_f = -P_s$.

Using the above assumptions, Equation (2.1) can be further simplified as follows [51]:

$$T_r = M_s(1 - P_s)P_s - M_f P_s(1 - P_s) = (M_s - M_f)(P_s - P_s^2) \qquad (2.2)$$
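As a concrete illustration of Equation (2.2), the following Python sketch (an illustration of our own, not code from [1] or [51]; the function and variable names are assumed) evaluates the resultant tendency over a range of probabilities of success using the motive strengths of Figure 2.1.

```python
# Illustrative evaluation of Atkinson's Risk Taking Model, Equation (2.2):
# T_r = (Ms - Mf)(Ps - Ps^2). Motive strengths follow Figure 2.1.

def resultant_tendency(p_success, m_success, m_failure):
    """Resultant achievement tendency for a given probability of success."""
    return (m_success - m_failure) * (p_success - p_success ** 2)

probabilities = [i / 10 for i in range(11)]
# Success-motivated individual (Ms = 2, Mf = 1): tendency peaks at Ps = 0.5.
# Failure-motivated individual (Ms = 1, Mf = 2): tendency is lowest at Ps = 0.5.
for p in probabilities:
    print(f"Ps = {p:.1f}  "
          f"success-motivated Tr = {resultant_tendency(p, 2.0, 1.0):+.2f}  "
          f"failure-motivated Tr = {resultant_tendency(p, 1.0, 2.0):+.2f}")
```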

Figure 2.1 illustrates examples of models of achievement motivation that are constructed using Equation (2.2).

[Figure 2.1 appears here: two plots of the resultant tendency versus the probability of success.]

Figure 2.1: Motivational curves for (a) an individual motivated to approach success ($M_s = 2$, $M_f = 1$) and (b) an individual motivated to avoid failure ($M_s = 1$, $M_f = 2$). In both figures (a) and (b): tendency to approach success (thin solid line), tendency to avoid failure (dashed line), resultant tendency (thick solid line) [1].

The relation between the probability of success and the resultant tendency in the RTM is represented in a quadratic form, as depicted in Figure 2.1. In the RTM, the resultant tendency of achievement motivation peaks at a moderate probability of success for success-motivated individuals. This reflects that success-motivated individuals are modeled as those who select goals of moderate difficulty. On the other hand, failure-motivated individuals are modeled as those who select either very easy goals or very difficult goals. In contrast to success-motivated individuals, the resultant motivational tendency for failure-motivated individuals is at a minimum for a moderate probability of success. Such a general trend of achievement motivation described in the RTM has been observed experimentally in humans with corresponding motive profiles.

There are, however, several limitations of the RTM [1]. It has been observed that, unlike the behavior predicted by the RTM, the point of maximum approach to success tends to fall below the critical level of $P_s = 0.5$. Moreover, failure-motivated individuals do not tend to select extremely difficult goals as predicted by the RTM.

To permit such characteristics to be captured in artificial intelligence agents, a more flexible model of achievement motivation was proposed in [1]. Inspired by how curiosity and interest were successfully modeled in artificial intelligence agents using a sigmoid function, the RTM was represented in a sigmoid form. The next subsection discusses the models of achievement motivation presented in [1] in more detail.

Computational Models of Achievement Motivation

Achievement motivation is characterized in [1] by the difference between two sigmoid functions for approach and avoidance of a goal $G$. Approach motivation has a higher resultant tendency for goals with a higher probability of success, until a certain threshold is reached. On the other hand, the resultant tendency for avoidance motivation is zero for goals with a very low probability of success, and negative for goals with a high probability of success. Mathematically, the resultant tendency to achieve a goal $G$ for an achievement-motivated agent can be represented using sigmoid functions as [1]:

$$T_{res} = \frac{S_{ach}}{1 + e^{\rho^{+}_{ach}\left(M^{+}_{ach} - P_s(G)\right)}} - \frac{S_{ach}}{1 + e^{\rho^{-}_{ach}\left(M^{-}_{ach} - P_s(G)\right)}} \qquad (2.3)$$

Here, $P_s(G)$ is the probability of success of achieving goal $G$, which takes values between zero (guaranteed failure) and one (guaranteed success);

$M^{+}_{ach}$ is the turning point of the sigmoid for approach motivation, whereas $M^{-}_{ach}$ is the turning point of the sigmoid for avoidance motivation. Note that when $M^{+}_{ach} < M^{-}_{ach}$, the shape of the resultant tendency is concave-down, which represents a success-motivated individual. On the contrary, when $M^{+}_{ach} > M^{-}_{ach}$, the shape of the resultant tendency is concave-up, which represents a failure-motivated individual.

Furthermore, in Equation (2.3), $\rho^{+}_{ach} > 0$ represents the gradient of approach to success and $\rho^{-}_{ach} > 0$ denotes the gradient of avoidance of failure, while $S_{ach}$ measures the relative strength of achievement motivation compared to other motives. Examples of sigmoid representations of the motivation to approach success and the motivation to avoid failure are illustrated in Figure 2.2.
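To make Equation (2.3) concrete, the following Python sketch (an illustrative implementation with assumed parameter values, not code or settings from [1]) evaluates the resultant tendency for a success-motivated configuration in which $M^{+}_{ach} < M^{-}_{ach}$, so that the resulting curve is concave-down and peaks at intermediate probabilities of success.

```python
import math

# Illustrative evaluation of the sigmoid achievement-motivation model,
# Equation (2.3). Parameter values below are assumed examples only.

def achievement_tendency(p_success, s_ach=1.0, rho_plus=20.0, rho_minus=20.0,
                         m_plus=0.3, m_minus=0.7):
    """Approach sigmoid minus avoidance sigmoid, as in Equation (2.3)."""
    approach = s_ach / (1.0 + math.exp(rho_plus * (m_plus - p_success)))
    avoidance = s_ach / (1.0 + math.exp(rho_minus * (m_minus - p_success)))
    return approach - avoidance

# M+ < M-: a success-motivated agent whose tendency peaks near Ps = 0.5.
for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(f"Ps = {p:.1f}  T_res = {achievement_tendency(p):+.3f}")
```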

Besides the need for achievement, early studies identified that humans also have a need for affiliation [50]. A brief background on affiliation motivation is provided in the next subsection.

[Figure 2.2 appears here: sigmoid curves of the resultant tendency versus the probability of success for the motivation to approach success and the motivation to avoid failure.]
