Performance of End-to-end Model Based on Convolutional LSTM for Human Activity Recognition
DOI:
https://doi.org/10.13052/jwe1540-9589.21512
Keywords:
Human activity recognition, video-based model, deep learning, convolutional long-short term memory, end-to-end model
Abstract
Human activity recognition (HAR) is a key technology in many applications, such as smart signage, smart healthcare, and smart homes. In HAR, deep learning-based methods have been proposed to recognize activities effectively from video streams. In this paper, an end-to-end model based on convolutional long short-term memory (LSTM) is proposed to recognize human activities. Convolutional LSTM can learn spatial and temporal features simultaneously from video stream data. In addition, the number of trainable weights can be reduced by employing convolutional LSTM in an end-to-end model. The proposed HAR model was optimized under various simulation settings using activity data from AI Hub. The simulation results confirm that the proposed model outperforms the conventional model.
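The claim about fewer trainable weights can be illustrated with a rough parameter count. The sketch below compares a fully connected LSTM applied to flattened frames with a convolutional LSTM layer without peephole terms. The frame size, filter count, kernel size, and hidden size are hypothetical values chosen for illustration and are not taken from the paper, and the comparison is not the paper's own baseline.

```python
def fc_lstm_params(input_dim: int, hidden_dim: int) -> int:
    """Weights of a standard LSTM layer: four gates, each with
    input weights, recurrent weights, and a bias vector."""
    return 4 * ((input_dim + hidden_dim) * hidden_dim + hidden_dim)


def conv_lstm_params(in_channels: int, out_channels: int, kernel: int) -> int:
    """Weights of a ConvLSTM layer without peephole terms: four gates,
    each a kernel x kernel convolution over the concatenated input and
    hidden channels, plus one bias per output channel."""
    return 4 * (kernel * kernel * (in_channels + out_channels) * out_channels
                + out_channels)


# Hypothetical sizes (assumptions for illustration only).
height, width, channels = 64, 64, 3   # one RGB video frame
filters, kernel = 32, 3               # ConvLSTM filters and kernel size
hidden_dim = 256                      # hidden units of the flattened-frame LSTM

fc = fc_lstm_params(height * width * channels, hidden_dim)
conv = conv_lstm_params(channels, filters, kernel)

print(f"fully connected LSTM on flattened frames: {fc:,} weights")
print(f"ConvLSTM layer:                           {conv:,} weights")
```

Under these assumptions the convolutional layer's weight count does not depend on the frame resolution, because the same small kernels are shared across all spatial positions, whereas the flattened-frame LSTM grows linearly with the number of pixels.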