In this section, we proposed a new framework, LG-DI, for latency-guaranteed DNN inference in next-generation (6G) cellular networks, designed to support delay-sensitive DNN-based applications that require application-level latency bounds under dynamic network conditions. To guarantee latency while preserving DNN performance as much as possible, the proposed framework adaptively exploits the lightweight DNN and compressive offloading, which offloads DNN inference by transmitting compressed data, based on computation-time estimators and available-bandwidth estimators.
Furthermore, we investigate which level (i.e., userspace, kernel, or device driver) is most appropriate for implementing the modules of LG-DI.
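To make the adaptive path selection concrete, the sketch below illustrates, under simplified assumptions, how such a per-inference decision could be made from the two estimators. It is not the actual LG-DI implementation: the function names, the additive latency model, the numeric example, and the preference for offloading (assumed here to run the full server-side DNN) are illustrative assumptions only.

```python
# Illustrative sketch of LG-DI-style path selection; all names, the additive
# latency model, and the numeric example are assumptions for illustration only.

def choose_inference_path(latency_bound_s,
                          est_local_time_s,      # lightweight-DNN compute time (estimated)
                          est_compress_time_s,   # on-device compression time (estimated)
                          est_server_time_s,     # server-side DNN compute time (estimated)
                          est_bandwidth_bps,     # available uplink bandwidth (estimated)
                          compressed_size_bits): # size of the compressed representation
    """Return the inference path expected to satisfy the latency bound,
    preferring compressive offloading, which is assumed here to run the
    full DNN and thus preserve more of the original accuracy."""
    offload_latency = (est_compress_time_s
                       + compressed_size_bits / max(est_bandwidth_bps, 1.0)
                       + est_server_time_s)
    local_latency = est_local_time_s

    if offload_latency <= latency_bound_s:
        return "compressive_offloading", offload_latency
    if local_latency <= latency_bound_s:
        return "lightweight_dnn", local_latency
    # Neither estimate meets the bound: fall back to whichever is faster.
    if local_latency < offload_latency:
        return "lightweight_dnn", local_latency
    return "compressive_offloading", offload_latency


# Hypothetical example: 100 ms bound, 5 Mbps estimated uplink, 60 KB compressed frame.
path, latency = choose_inference_path(
    latency_bound_s=0.100,
    est_local_time_s=0.045,
    est_compress_time_s=0.010,
    est_server_time_s=0.020,
    est_bandwidth_bps=5e6,
    compressed_size_bits=60 * 1024 * 8,
)
print(path, f"{latency * 1000:.1f} ms")  # -> lightweight_dnn 45.0 ms
```

In the framework itself, these inputs are supplied by the computation time estimators and available bandwidth estimators described above, so the selected path adapts as network conditions change.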
V Conclusion
This thesis presents methods for enabling neural network inference on resource-constrained devices. In Sec. II, we show that pre-processing for the lightweight DNN can reduce computation overhead while improving DNN performance on resource-constrained devices. In Sec. III, we propose the essential information extractor for DNN inference, which minimizes the transmission volume for offloading while incurring low compression overhead. Thanks to the essential information extractor, lightweight DNN inference is further accelerated by exploiting extra computing power at low cost. Finally, in Sec. IV, we present the LG-DI framework, which guarantees latency for DNN inference while maintaining DNN performance as much as possible by adaptively exploiting the lightweight DNN and offloading with the essential information.
Acknowledgements
First of all, I would like to express my deep appreciation to my advisors, Prof. Youngbin Im and Prof.
Kyunghan Lee, who provided motivation, encouragement, and guidance during the research and writing of this dissertation.
I am deeply grateful to the rest of the dissertation committee: Prof. Hyoil Kim, Prof. Sung Whan Yoon, and Prof. Changhee Joo, for offering insightful and valuable comments and constructive criticisms to improve my dissertation.
I would also like to thank the collaborators and experts who were involved in my research: Prof. Sangtae Ha and Jinsung Lee.
I sincerely thank my colleagues in the NXC lab: JunSeon Kim, Seongmin Ham, Seyeon Kim, Jeong-Min Bae, Shinik Park, JongYun Lee, Kyungmin Bin, GoodSol Lee, WooSeung Nam, SeongSik Cho, Sanghyun Han, DongGyu Yang, Taekyung Han, Jaeyoon Hwang, Serae Kim, Gibum Park, Jongseok Park, Byunggu Kang, Dongsu Kwak, and Gyulim Gu, who shared an unforgettable time with me.
Special thanks to all my friends who have always believed in and supported me.
Lastly, I would like to express my sincere gratitude to my parents and older sister for their unwavering love and support throughout my graduate school life.