Conclusion and Future Work

In this paper, we proposed an algorithm for planning heterogeneous missions for UAVs. We formulated the mission planning problem as a vehicle routing problem, which can be solved with a variety of methods.

We adopted an attention-based deep reinforcement learning approach, aiming for fast computation and sufficiently good performance. We proposed a unified mission representation that encodes heterogeneous missions as same-sized vectors so that an attention-based neural network can process them, and we adjusted the action-masking strategy used when selecting an action from the network output to handle the constraints of the problem.
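
A minimal sketch of these two ideas is shown below; the mission types ("visit", "scan"), field layout, and function names are our assumptions for illustration, not the paper's exact scheme. Heterogeneous missions are packed into fixed-size vectors, and infeasible actions are masked out of the policy distribution:

```python
import numpy as np

def unify_mission(kind, geometry, dim=8):
    """Pack a heterogeneous mission into a fixed-size vector so that a
    single attention encoder can embed all mission types."""
    vec = np.zeros(dim)
    vec[:2] = {"visit": [1, 0], "scan": [0, 1]}[kind]  # type information
    vec[2:2 + geometry.size] = geometry                # geometric information, zero-padded
    return vec

def masked_softmax(logits, feasible):
    """Action masking: set logits of infeasible missions (e.g., already
    completed ones) to -inf so they receive zero selection probability."""
    masked = np.where(feasible, logits, -np.inf)
    exp = np.exp(masked - masked[feasible].max())
    return exp / exp.sum()

missions = np.stack([unify_mission("visit", np.array([0.2, 0.7])),
                     unify_mission("scan", np.array([0.5, 0.5, 0.1]))])
probs = masked_softmax(np.array([1.3, 0.4]), np.array([True, False]))
print(missions.shape, probs)  # (2, 8) [1. 0.]
```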

We compared the proposed algorithm against heuristic algorithms (OR-Tools) and a simple greedy algorithm. We trained the proposed algorithm on instances with 3 to 30 missions, which demonstrates that the attention-based neural network trains scalably across varying numbers of missions. We analyzed the solution cost as the performance measure of each algorithm, together with how the computation time grows with the number of missions. The proposed algorithm performs within a small gap of the state-of-the-art heuristic algorithm (OR-Type2) while computing solutions significantly faster. The results show that the proposed algorithm is a good choice with a reasonable trade-off between performance and computation time.
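
For reference, an OR-Tools routing baseline [28] can be set up along the following lines; the toy distance matrix and search parameters here are illustrative, not the paper's experimental configuration:

```python
from ortools.constraint_solver import pywrapcp, routing_enums_pb2

dist = [[0, 2, 9], [2, 0, 6], [9, 6, 0]]  # toy integer distance matrix

manager = pywrapcp.RoutingIndexManager(len(dist), 1, 0)  # nodes, vehicles, depot
routing = pywrapcp.RoutingModel(manager)

def distance_callback(i, j):
    return dist[manager.IndexToNode(i)][manager.IndexToNode(j)]

transit = routing.RegisterTransitCallback(distance_callback)
routing.SetArcCostEvaluatorOfAllVehicles(transit)

params = pywrapcp.DefaultRoutingSearchParameters()
params.first_solution_strategy = (
    routing_enums_pb2.FirstSolutionStrategy.PATH_CHEAPEST_ARC)
solution = routing.SolveWithParameters(params)
print("tour cost:", solution.ObjectiveValue())
```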

The ablation study shows that the unified representation of heterogeneous missions is effective. We compared the training cost of the unified representation against partially informed representations lacking geometric information, type information, or both. The results show that geometric information affects performance more than type information. The t-SNE visualization of the input-layer embedding space indicates that the unified representation is sufficient to train the neural network to good performance.
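
The embedding-space visualization can be reproduced with standard t-SNE; in the sketch below the embeddings are a random stand-in for the trained input-layer activations, and the perplexity is a typical default rather than the paper's setting:

```python
import numpy as np
from sklearn.manifold import TSNE

# Stand-in for learned embeddings of unified mission representations.
embeddings = np.random.rand(200, 128)

# Project to 2-D for visualizing the embedding space.
coords = TSNE(n_components=2, perplexity=30.0).fit_transform(embeddings)
print(coords.shape)  # (200, 2)
```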

The proposed algorithm plans missions for a single UAV. Operating multiple UAVs would require a cooperative strategy, which is not straightforward. In addition, deep-learning-based algorithms are limited in their ability to generalize to different types of environments. We have not yet fully demonstrated the generalization ability of the proposed neural network, only the scalable training ability of the attention-based architecture, in particular its ability to extrapolate and interpolate over the number of missions. Future work will extend the algorithm to multi-UAV environments and address the generalization problem.

REFERENCES

[1] Shakhatreh, H., Sawalmeh, A. H., Al-Fuqaha, A., Dou, Z., Almaita, E., Khalil, I., ... & Guizani, M. (2019). Unmanned aerial vehicles (UAVs): A survey on civil applications and key research challenges. IEEE Access, 7, 48572-48634.

[2] Atyabi, A., MahmoudZadeh, S., & Nefti-Meziani, S. (2018). Current advancements on autonomous mission planning and management systems: An AUV and UAV perspective. Annual Reviews in Control, 46, 196-215.

[3] Kumar, S. N., & Panneerselvam, R. (2012). A survey on the vehicle routing problem and its variants. Intelligent Information Management, 4, 66-74.

[4] Cao, W., & Yang, W. (2017). A survey of vehicle routing problem. In MATEC Web of Conferences (Vol. 100, p. 01006). EDP Sciences.

[5] Dai, H., Khalil, E. B., Zhang, Y., Dilkina, B., & Song, L. (2017). Learning combinatorial optimization algorithms over graphs. arXiv preprint arXiv:1704.01665.

[6] Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In Advances in neural information processing systems (pp. 5998-6008).

[7] Sutton, R. S., McAllester, D. A., Singh, S. P., & Mansour, Y. (2000). Policy gradient methods for reinforcement learning with function approximation. In Advances in neural information processing systems (pp. 1057-1063).

[8] Toth, P., & Vigo, D. (2002). Branch-and-bound algorithms for the capacitated VRP. In The vehicle routing problem (pp. 29-51). Society for Industrial and Applied Mathematics.

[9] Mingozzi, A., Roberti, R., & Toth, P. (2013). An exact algorithm for the multitrip vehicle routing problem. INFORMS Journal on Computing, 25(2), 193-207.

[10] Kytöjoki, J., Nuortio, T., Bräysy, O., & Gendreau, M. (2007). An efficient variable neighborhood search heuristic for very large scale vehicle routing problems. Computers & Operations Research, 34(9), 2743-2757.

[11] Fu, Z., Eglese, R., & Li, L. Y. (2005). A new tabu search heuristic for the open vehicle routing problem. Journal of the Operational Research Society, 56(3), 267-274.

[12] Baker, B. M., & Ayechew, M. A. (2003). A genetic algorithm for the vehicle routing problem. Computers & Operations Research, 30(5), 787-800.

[13] Vinyals, O., Fortunato, M., & Jaitly, N. (2015). Pointer networks. arXiv preprint arXiv:1506.03134.

[14] Mazyavkina, N., Sviridov, S., Ivanov, S., & Burnaev, E. (2021). Reinforcement learning for combinatorial optimization: A survey. Computers & Operations Research, 134, 105400.

[15] Bello, I., Pham, H., Le, Q. V., Norouzi, M., & Bengio, S. (2016). Neural combinatorial optimization with reinforcement learning. arXiv preprint arXiv:1611.09940.

[16] Kool, W., Van Hoof, H., & Welling, M. (2018). Attention, learn to solve routing problems!. arXiv preprint arXiv:1803.08475.

[17] Hassoun, M. H. (1995). Fundamentals of artificial neural networks. MIT Press.

[18] Liu, W., Wang, Z., Liu, X., Zeng, N., Liu, Y., & Alsaadi, F. E. (2017). A survey of deep neural network architectures and their applications. Neurocomputing, 234, 11-26.

[19] Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. MIT Press.

[20] Albawi, S., Mohammed, T. A., & Al-Zawi, S. (2017, August). Understanding of a convolutional neural network. In 2017 International Conference on Engineering and Technology (ICET) (pp. 1-6). IEEE.

[21] Medsker, L. R., & Jain, L. C. (2001). Recurrent neural networks. Design and Applications, 5, 64-67.

[22] Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in neural information processing systems (pp. 3104-3112).

[23] Graves, A., Wayne, G., & Danihelka, I. (2014). Neural turing machines. arXiv preprint arXiv:1410.5401.

[24] Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv preprint arXiv:1409.0473.

[25] Luong, M. T., Pham, H., & Manning, C. D. (2015). Effective approaches to attention-based neural machine translation. arXiv preprint arXiv:1508.04025.

[26] Li, Y. (2017). Deep reinforcement learning: An overview. arXiv preprint arXiv:1701.07274.

[27] Sutton, R. S., & Barto, A. G. (2018). Reinforcement learning: An introduction. MIT Press.

[28] Google OR-Tools: https://developers.google.com/optimization/routing/vrp
