A Survey on Light-weight Convolutional Neural Networks: Trends, Issues and Future Scope
Keywords: Lightweight CNNs, deep learning, survey, convolutional neural networks, limited-hardware devices
With the substantial increase in the computing power of small devices and systems, new challenges are emerging. For example, how should one control a small handheld device that has the computing capability of a desktop personal computer (PC) from five years ago? Devolving decision-making power to the device to make it more intelligent, e.g. in autonomous driving, is an interesting area. Deep learning has paved the way for this task thanks to its reliable and widely adopted decision-making capabilities. However, small devices face constraints such as limited computation hardware, limited power from small batteries, and the need for decisions that are both real-time and accurate. In this regard, lightweight Convolutional Neural Networks (CNNs) are a valuable tool. Lightweight CNNs such as MobileNets, ShuffleNets, and CondenseNets are deep networks that have far fewer layers and far fewer parameters than larger CNN counterparts such as GoogLeNet, Inception, and ResNets. These properties make them well suited to small stand-alone systems. This literature survey discusses the notable lightweight CNNs along with their architectures, design features, performance metrics, and advantages. The trends, issues, and future scope of the area are also discussed. It is hoped that this survey will encourage the reader to engage in research in this interesting area.