Distributed JPEG Compression and Decompression for Big Image Data Using Map-Reduce Paradigm
Keywords:Big image data, Map-Reduce, Distributed environment, JPEG Compression, JPEG decompression
Digital data is primarily created and delivered in the form of images and videos in today’s world. Storing and transmitting such a large number of images necessitates a lot of computer resources, such as storage and bandwidth. So, rather than keeping the image data as is, the data could be compressed and then stored, which saves a lot of space. Image compression is the process of removing as much redundant data from an image as feasible while retaining only the non-redundant data. In this paper, the traditional JPEG compression technique is executed in the distributed environment with map-reduce paradigm on big image data. This technique is carried out in serial as well as in parallel fashion with different number of workers in order to show the time comparisons between these setups with the self-created large image dataset. In this, more than one Lakh (121,856) images are compressed and decompressed and the execution times are compared with three different setups: single system, Map-Reduce (MR) with 2 workers and MR with 4 workers. Compression on more than one Million (1,096,704) images using single system and MR with 4 workers is also done. To evaluate the efficiency of JPEG technique, two performance measures such as Compression Ratio (CR) and Peak Signal to Noise Ratio (PSNR) are used.
Rafel C. Gonzalez and Richard E. Woods, “Digital Image Processing”, 3rd Edition, Person.
Kazuhiro Kobayashi and Qiu Chen Content-Based Image Retrieval Using Features in Spatial and Frequency Domains, ICSIIT 2015, CCIS 516, pp. 269–277, 2015. DOI: https://doi.org/10.1007/978-3-662-46742-8_25
J. A. Stuchi et al., “Improving image classification with frequency domain layers for feature extraction,” 2017 IEEE 27th International Workshop on Machine Learning for Signal Processing (MLSP), 2017, pp. 1–6, doi: https://doi.org/10.1109/MLSP.2017.8168168.
I. H. Witten, R. M. Neal, and J. G. Cleary, “Arithmetic coding for data compression,” Communications of the ACM, vol. 30, no. 6, pp. 520–540, 1987.
S. Golomb, “Run-length encodings (Corresp.),” IEEE Trans. on information theory, vol. 12, no. 3, pp. 399–401, 1966.
D. A. Huffman, “A method for the construction of minimum redundancy codes,” Proceedings of the IRE, vol. 40, no. 9, pp. 1098–1101, 1952.
H. Andrews and W. Pratt, “Fourier transform coding of images,” in Proc. Hawaii Int. Conf. System Sciences, 1968, pp. 677–679.
W. K. Pratt, J. Kane, and H. C. Andrews, “Hadamard transform image coding,” Proceedings of the IEEE, vol. 57, no. 1, pp. 58–68, 1969.
N. Ahmed, T. Natarajan, and K. R. Rao, “Discrete cosine transform,” IEEE Trans. on Computers, vol. 100, no. 1, pp. 90–93, 1974.
C. Harrison, “Experiments with linear prediction in television,” Bell System Technical Journal, vol. 31, no. 4, pp. 764–783, 1952.
C. Christopoulos, A. Skodras, and T. Ebrahimi, “The JPEG2000 still image coding system: an overview,” IEEE trans. on consumer electronics, vol. 46, no. 4, pp. 1103–1127, 2000.
D. Taubman, “High performance scalable image compression with EBCOT,” IEEE Trans. on image processing, vol. 9, no. 7, pp. 1158–1170, 2000.
Mayer-Schönberger Viktor, and Kenneth Cukier. Big data: A revolution that will transform how we live, work, and think. Houghton Mifflin Harcourt, 2013.
Uddin Muhammad Fahim, and Navarun Gupta. “Seven V’s of Big Data understanding Big Data to extract value.” In Proceedings of the 2014 zone 1 conference of the American Society for Engineering Education, pp. 1–5. IEEE, 2014.
The 42V’s of BIG Data and Data Science, https://www.kdnuggets.com/2017/04/42-vs-big-data-data-science.html
Fritz Venter, Andrew Stein (2012) Analytics: Driving better business decisions. http://analytics-magazine.org/images-a-videos-really-big-data/
Mark Sugrue (2015) CCTV- the challenge of sifting through Big Data. https://www.engineersireland.ie/Engineers-Journal/Technology/cctv-the-challenge-of-sifting-through-big-data
Wang Weining., Weijian Zhao, Chengjia Cai, Jiexiong Huang, Xiangmin Xu, and Lei Li. “An efficient image aesthetic analysis system using Hadoop.” Signal Processing: Image Communication 39 (2015): 499–508.
Lin Yuanqing., Fengjun Lv, Shenghuo Zhu, Ming Yang, Timothee Cour, Kai Yu, Liangliang Cao, and Thomas Huang. “Large-scale image classification: fast feature extraction and svm training.” In CVPR 2011, pp. 1689–1696. IEEE, 2011.
Zhang Shiliang., Ming Yang, Xiaoyu Wang, Yuanqing Lin, and Qi Tian. “Semantic-aware co-indexing for image retrieval.” In Proceedings of the IEEE international conference on computer vision, pp. 1673–1680. 2013.
Dong Le, Zhiyu Lin, Yan Liang, Ling He, Ning Zhang, Qi Chen, Xiaochun Cao, and Ebroul Izquierdo. “A hierarchical distributed processing framework for big image data.” IEEE Transactions on Big Data 2, no. 4 (2016): 297–309.
ProjectPro. Healthcare applications of Hadoop and Big data. https://www.dezyre.com/article/5-healthcare-applications-of-hadoop-and-big-data/85
Koppad Shaila H., and Anupamma Kumar. “Application of big data analytics in healthcare system to predict COPD.” In 2016 International Conference on Circuit, Power and Computing Technologies (ICCPCT), pp. 1–5. IEEE, 2016.
Chen Mao., Zhong Xugang, Wang Guansen, and Ma Jianxiao. “A preliminary discussion on the application of big data in urban residents travel guidance.” In 2015 International Conference on Intelligent Transportation, Big Data and Smart City, pp. 47–50. IEEE, 2015.
Im Hyeongsoon., Bonghee Hong, Seungwoo Jeon, and Jaegi Hong. “Bigdata analytics on CCTV images for collecting traffic information.” In 2016 International Conference on Big Data and Smart Computing (BigComp), pp. 525–528. IEEE, 2016.
Applications of Big Data Drive Industries. https://www.simplilearn.com/tutorials/big-data-tutorial/big-data-applications
Turkington G., (2013). Hadoop Beginner’s Guide. Packt Publishing Ltd.
Perera S., (2013). Hadoop MapReduce Cookbook. Packt Publishing Ltd.
Gonzalez, R.C., Woods, R.E.: http://www.imageprocessingplace.com/ (Accessed 27 August 2021).
Kwitt, R., Peter, M.: Salzburg Texture Image Dataset, http://www.wavelab.at/sources/STex/, (Accessed 27 August 2021).