Performance of End-to-end Model Based on Convolutional LSTM for Human Activity Recognition

Authors

  • Young Ghyu Sun, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea
  • Soo Hyun Kim, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea
  • Seongwoo Lee, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea
  • Joonho Seon, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea
  • SangWoon Lee, Department of Multimedia, Namseoul University, Cheonan, Korea
  • Cheong Ghil Kim, Department of Computer Science, Namseoul University, Cheonan, Korea
  • Jin Young Kim, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

DOI:

https://doi.org/10.13052/jwe1540-9589.21512

Keywords:

Human activity recognition, video-based model, deep learning, convolutional long short-term memory, end-to-end model

Abstract

Human activity recognition (HAR) is a key technology in many applications, such as smart signage, smart healthcare, and smart homes. Deep learning-based methods have been proposed to recognize activities effectively from video streams. In this paper, an end-to-end model based on convolutional long short-term memory (LSTM) is proposed to recognize human activities. Convolutional LSTM can learn spatial and temporal features simultaneously from video stream data. In addition, the number of trainable weights can be reduced by employing convolutional LSTM in an end-to-end model. The proposed HAR model was optimized under various simulation settings using activity data from the AI hub. The simulation results confirm that the proposed model outperforms the conventional model.
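
As a rough illustration of why convolutional LSTM can need fewer weights, the sketch below counts the parameters of a single fully connected LSTM layer fed with flattened frames against a single convolutional LSTM layer. It is not the architecture from the paper: the frame size, channel counts, and kernel size are hypothetical, both layers are assumed to have four gates and no peephole connections, and the two hidden states differ in size, so the comparison is indicative only.

```python
def lstm_param_count(input_dim: int, hidden_dim: int) -> int:
    """Weights and biases in one fully connected LSTM layer (4 gates, no peepholes)."""
    per_gate = hidden_dim * (input_dim + hidden_dim) + hidden_dim
    return 4 * per_gate


def conv_lstm_param_count(in_channels: int, hidden_channels: int, kernel: int) -> int:
    """Weights and biases in one convolutional LSTM layer (4 gates, no peepholes)."""
    per_gate = kernel * kernel * (in_channels + hidden_channels) * hidden_channels
    return 4 * (per_gate + hidden_channels)


# Hypothetical sizes chosen only for illustration (not taken from the paper):
# 64x64 RGB frames, 32 hidden units or channels, 3x3 convolution kernels.
height, width, channels, hidden, kernel = 64, 64, 3, 32, 3

fc_params = lstm_param_count(height * width * channels, hidden)
conv_params = conv_lstm_param_count(channels, hidden, kernel)

print(f"fully connected LSTM on flattened frames: {fc_params:,} parameters")
print(f"convolutional LSTM on image frames:       {conv_params:,} parameters")
```

The convolutional layer's count does not depend on the frame height or width, because the same kernels are shared across all spatial positions; the fully connected layer's count grows with the number of pixels.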

Author Biographies

Young Ghyu Sun, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

Young Ghyu Sun received the B.Sc. (summa cum laude) and M.Sc. degrees from the Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea, in 2018 and 2020, respectively, where he is currently pursuing the Ph.D. degree. He was a recipient of the IEEE Student Paper Gold Award in 2020. His research interests include wireless communication systems, deep learning, and internet of energy.

Soo Hyun Kim, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

Soo Hyun Kim received the B.Sc. and M.Sc. degrees from the Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea, in 2019 and 2021, respectively, where he is currently pursuing the Ph.D. degree. His research interests include wireless communication systems and machine learning applications for internet of energy.

Seongwoo Lee, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

Seongwoo Lee received the B.Sc. (magna cum laude) degree from the Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea, in 2021, where he is currently working toward the M.Sc. degree. He was a recipient of the Best Paper Award of IIBC in 2021. His research interests include smart grid, internet of energy, and deep learning.

Joonho Seon, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

Joonho Seon received the B.Sc. degree from the Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea, in 2021, where he is currently pursuing the Ph.D. degree. His research interests include internet of energy, deep learning, and smart grid. He received the Best Paper Awards from the Institute of Internet, Broadcasting and Communication (IIBC) conference (2021).

SangWoon Lee, Department of Multimedia, Namseoul University, Cheonan, Korea

SangWoon Lee received the B.S. and M.S. degrees in electrical engineering and the Ph.D. degree in electrical and electronics engineering from Yonsei University, Seoul, Korea, in 1987, 1989, and 2005, respectively. From 1991 to 2005, he worked as a research engineer and project manager at the R&D Center of MBC (MunHwa Broadcasting Corp.), Seoul, Korea. In 2005, he joined Yonsei University, Seoul, Korea, as a Research Professor in the Dept. of Electrical and Electronics and a research fellow at the CABT (Center for Advanced Broadcast Technology). He is currently a professor in the Department of Multimedia at Namseoul University, Cheonan, Korea. His main research areas are mobile multimedia broadcasting and intelligent transportation systems. He serves as a Korean Delegate for ITU-R SG6, SG1, and ISO TC204 and as President of the Korea ITS Society.

Cheong Ghil Kim, Department of Computer Science, Namseoul University, Cheonan, Korea

Cheong Ghil Kim received the B.S. degree in computer science from the University of Redlands, CA, USA, in 1987, and the M.S. and Ph.D. degrees in computer science from Yonsei University, South Korea, in 2003 and 2006, respectively. He is currently a Professor with the Department of Computer Science, Namseoul University, South Korea. His research areas include multimedia embedded systems, mobile AR, and 3-D contents.

Jin Young Kim, Department of Electronic Convergence Engineering, Kwangwoon University, Seoul, Korea

Jin Young Kim received the B.S., M.S., and Ph.D. degrees from the School of Electrical Engineering, Seoul National University (SNU), Seoul, Korea, in 1991, 1993, and 1998, respectively. He was a Member of Research Staff at the Institute of New Media and Communications (INMC) and at the Inter-university Semiconductor Research Center (ISRC) of SNU from 1994 to 1998. He was a Postdoctoral Research Fellow at the Department of Electrical Engineering, Princeton University, NJ, U.S.A., from 1998 to 2000. He was a Principal Member of Technical Staff at the Central Research and Development Center, SK Telecom, Korea, from 2000 to 2001. He is currently a Full Professor at the School of Electronics Engineering, Kwangwoon University, Seoul, Korea. He spent his sabbatical leave as a Visiting Scientist at the LIDS (Laboratory of Information and Decision Systems), Massachusetts Institute of Technology (M.I.T.), MA, U.S.A., from 2009 to 2010.

His research interests include artificial intelligence and the design and implementation of wireline/wireless multimedia communication systems based on modulation/demodulation, synchronization, and detection/estimation theory. He has received Best Paper Awards from several academic conferences and societies, including the Jack Neubauer Best Systems Paper Award from the IEEE VT Society (2001) and the Award of the Prime Minister of the Korean Government (2011). He is now a Senior Member of IEEE and a Regular Member of IET and IEICE.

Published

2022-08-27

How to Cite

Sun, Y. G., Kim, S. H., Lee, S., Seon, J., Lee, S., Kim, C. G., & Kim, J. Y. (2022). Performance of End-to-end Model Based on Convolutional LSTM for Human Activity Recognition. Journal of Web Engineering, 21(05), 1671–1690. https://doi.org/10.13052/jwe1540-9589.21512

Section

SPECIAL ISSUE ON Future Multimedia Contents and Technology on Web in the 5G Era