Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge

Authors

  • Sonia Setia, Department of Computer Science and Engineering, SET, Sharda University, Greater Noida, Uttar Pradesh, India, 201310
  • Jyoti, Faculty of Computer Science, J. C. Bose University of Science and Technology, YMCA, Faridabad, India, 121006
  • Neelam Duhan, Faculty of Computer Science, J. C. Bose University of Science and Technology, YMCA, Faridabad, India, 121006
  • Aman Anand, ITS Engineering College, Greater Noida, Uttar Pradesh, India, 201310
  • Nikita Verma, Greater Noida Institute of Technology, Engineering Institute, Greater Noida, Uttar Pradesh, India, 201310

DOI:

https://doi.org/10.13052/jwe1540-9589.2332

Keywords:

Semantic prediction, web usage mining (WUM), web content mining (WCM), domain knowledge, usage data, access logs

Abstract

In intelligent web systems [2], web prefetching [27] plays a crucial role. To make accurate predictions for web prefetching, it is important but challenging to uncover valuable information from web usage data [16]. This paper presents a new approach, SPUDK, that combines usage data with domain knowledge for efficient prefetching, and shows how web access logs can be used efficiently for browsing prediction. Our main focus is on the technique needed to manage the queries found in web access logs so that valuable information can be extracted. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of queries. The SPUDK system then organises the usage data into semantic clusters. Our contributions in this paper are as follows: (1) a technique to exploit query keywords from access logs; (2) an approach to enrich queries with semantic information; (3) a new similarity measure for finding similarity among URLs present in access logs; (4) a novel clustering technique to find semantic clusters of URLs; and (5) an experimental evaluation of the proposed system. The proposed SPUDK system is evaluated using America Online (AOL) logs and achieves, on average, a 39% improvement in prediction precision, a 35% improvement in hit ratio and a 50.6% reduction in latency compared to other prediction techniques in the literature.
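
As a rough illustration of how a taxonomy can supply query semantics, the sketch below computes the Wu and Palmer similarity (cited in the References) between keywords over a small hand-made is-a taxonomy. The taxonomy, the keywords and all function names are hypothetical, and this is not the similarity measure proposed for SPUDK, which the abstract does not specify; it only shows the general idea of scoring keywords by their position in a concept hierarchy.

```python
# Toy is-a taxonomy (child -> parent). Hypothetical data for illustration only;
# a real system would draw these relations from WordNet or a domain taxonomy.
PARENT = {
    "entity": None,
    "vehicle": "entity",
    "car": "vehicle",
    "bicycle": "vehicle",
    "animal": "entity",
    "dog": "animal",
}


def path_to_root(term):
    """Return the chain [term, parent, ..., root]."""
    path = []
    while term is not None:
        path.append(term)
        term = PARENT[term]
    return path


def depth(term):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(term))


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(LCS) / (depth(a) + depth(b))."""
    ancestors_a = set(path_to_root(a))
    # Walking up from b, the first node shared with a is the lowest common subsumer.
    lcs = next(t for t in path_to_root(b) if t in ancestors_a)
    return 2 * depth(lcs) / (depth(a) + depth(b))


def query_similarity(q1, q2):
    """Assumed aggregation: average, over keywords in q1, of the best match in q2."""
    return sum(max(wu_palmer(k1, k2) for k2 in q2) for k1 in q1) / len(q1)


if __name__ == "__main__":
    print(wu_palmer("car", "bicycle"))  # siblings under "vehicle"
    print(wu_palmer("car", "dog"))      # related only through the root
    print(query_similarity(["car"], ["bicycle", "dog"]))
```

Keywords that share a deep common ancestor score close to 1, while keywords related only through the root score low, which is the kind of signal a semantic clustering step could use to group URLs reached by related queries.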

Author Biographies

Sonia Setia, Department of Computer Science and Engineering, SET, Sharda University, Greater Noida, Uttar Pradesh, India, 201310

Sonia Setia is presently working as an Associate Professor in the Department of Computer Science and Engineering, Sharda University, Greater Noida. She received her Ph.D. in Computer Science and Engineering from YMCAUST, Faridabad, India. She has broad research interests in web prediction, data mining, information retrieval, artificial intelligence and natural language processing. She has published more than 20 papers in reputed journals and conferences.

Jyoti, Faculty of Computer Science, J. C. Bose University of Science and Technology, YMCA, Faridabad, India, 121006

Jyoti is presently working as an Associate Professor in the Department of Computer Engineering, J. C. Bose University of Science and Technology, Faridabad, India. She received her Ph.D. in Computer Science Engineering from Maharishi Dayanand University, Rohtak in 2011. She has broad research interests in data mining and information retrieval. She has published more than 35 papers in refereed journals at national and international level.

Neelam Duhan, Faculty of Computer Science, J. C. Bose University of Science and Technology, YMCA, Faridabad, India, 121006

Neelam Duhan is presently working as an Associate Professor in the Department of Computer Engineering, J. C. Bose University of Science and Technology, Faridabad, India. She received her Ph.D. in Computer Science Engineering from Maharishi Dayanand University, Rohtak in 2011. She has broad research interests in data mining, information retrieval and databases. She has published more than 40 papers in reputed conferences and refereed journals.

Aman Anand, ITS Engineering College, Greater Noida, Uttar Pradesh, India, 201310

Aman Anand is presently working as an Assistant Professor in the Department of Computer Science and Engineering, ITS Engineering College, Greater Noida. He is pursuing his Ph.D. in Computer Science and Engineering at Gautam Buddha University, Greater Noida, India. He has broad research interests in networking, IoT, software engineering, artificial intelligence and natural language processing. He has published more than 10 papers in reputed journals and conferences.

Nikita Verma, Greater Noida Institute of Technology, Engineering Institute, Greater Noida, Uttar Pradesh, India, 201310

Nikita Verma is presently working as an Assistant Professor in the Department of Computer Science & Engineering (AI & ML), Greater Noida Institute of Technology, Engg. Institute, Greater Noida, Knowledge Park-2, U.P., India. She is pursuing her Ph.D. in Computer Science at Banasthali University, Tonk District, Rajasthan, India. She has broad research interests in wireless networking, software engineering and IoT. She has published 8 papers in reputed conferences and refereed journals.

References

P. M. Bharti and T. J. Raval, “Improving Web Page Access Prediction using Web Usage Mining and Web Content Mining,” 2019 3rd International conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2019, pp. 1268–1273, doi: 10.1109/ICECA.2019.8821950.

Acharya, Anal; Sinha, Devadatta, “An Intelligent Web-Based System for Diagnosing Student Learning Problems Using Concept Maps”, Journal of Educational Computing Research, vol. 55, no. 3, pp. 323–345, Jun 2017.

K. Mani and K. R. Suneetha, “Performance evaluation of Compact Prediction Tree algorithm for Web Page Prediction,” 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), Vellore, India, 2020, pp. 1–7, doi: 10.1109/ic-ETITE47903.2020.166.

V. Luckose, J. Chembath, J. A. R. Ponnusamy, S. Sharma, P. Kaur and S. Smiley, “Web Usage Pattern Detection Using Cohesive Markov Model With Apriori Algorithm,” 2022 IEEE International Conference on Automatic Control and Intelligent Systems (I2CACIS), Shah Alam, Malaysia, 2022, pp. 226–229, doi: 10.1109/I2CACIS54679.2022.9815465.

X. Zhu et al., “Similarity-Maintaining Privacy Preservation and Location-Aware Low-Rank Matrix Factorization for QoS Prediction Based Web Service Recommendation,” in IEEE Transactions on Services Computing, vol. 14, no. 3, pp. 889–902, 1 May-June 2021, doi: 10.1109/TSC.2018.2839741.

P. T. Siva Gurunathan, R. S, R. S and N. S, “Web Application-based Diabetes Prediction using Machine Learning,” 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 2023, pp. 296–302, doi: 10.1109/ICCMC56507.2023.10083583.

Suguna. R and Sharmila. D, “An Overview of Web Usage Mining”, International Journal of Computer Applications, vol. 39, no. 13, pp. 11–13, 2012, https://doi.org/10.5120/4879-7314.

CU. O and Bhargavi. P, “Analysis of Web Server Log by Web Usage Mining for Extracting Users Patterns”, International Journal of Computer Science Engineering and Information Technology Research, vol. 3, no. 2, pp. 123–136, 2013.

Goel. N, Gupta. S and Jha. C K, “Analyzing Web Logs of an Astrological Website Using Key Influencers”, International Research Journal, vol. 5, no. 1, pp. 2–11, 2015.

N. Ahmad, O. Malik, M. Hassan, M. S. Qureshi, and A. Munir, “Reducing User Latency in Web Prefetching Using Integrated Techniques”, IEEE Computer, 2011.

B. Parhami, “Introduction to Parallel Processing Algorithms and Architectures”, Kluwer Academic Publishers New York, Boston, pp. 111–112, 2002.

Thi Thanh Sang Nguyen, Hai Yan Lu, and Jie Lu, “Web-Page Recommendation Based on Web Usage and Domain Knowledge”, IEEE Transactions On Knowledge and Data Engineering, vol. 26, no. 10, 2014.

Yuening Hu, Changsung Kang, Jiliang Tang, Dawei Yin, and Yi Chang, “Large-scale Location Prediction for Web Pages”, IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 9, 2017.

S. Wang, T. Liu, J. Nam and L. Tan, “Deep Semantic Feature Learning for Software Defect Prediction,” in IEEE Transactions on Software Engineering, vol. 46, no. 12, pp. 1267–1293, 1 Dec. 2020, doi: 10.1109/TSE.2018.2877612.

C. I. Arthi, R. L. Priya and R. Rautela, “Analysis and Prediction of health issues for teaching profession using Semantic Techniques,” 2018 International Conference on Smart City and Emerging Technology (ICSCET), 2018, pp. 1–5, https://doi.org/10.1109/ICSCET.2018.8537368.

Sonia Setia, Jyoti, Neelam Duhan, “HPM: A Hybrid Model for User’s Behavior Prediction Based on N-Gram Parsing and Access Logs”, Scientific Programming, Hindawi vol. 2020, 2020, https://doi.org/10.1155/2020/8897244.

Kalaivani. S and Shyamala. K, “A Novel Technique to Pre-Process Web Log Data Using SQL Server Management Studio”, International Journal of Advanced Engineering, Management and Science, vol. 2, no. 7, pp. 973–977, 2016.

Sonia Setia, Jyoti, Neelam Duhan, “Efficient query keyword interpretation for semantic information retrieval”, IIOAB Journal, vol. 11, no. 2, pp. 64–68, May 2020.

Sonia Setia, Jyoti, Neelam Duhan, “A novel approach for Density based Optimal Semantic Clustering of Web Objects via identification of KingPins”, Recent Advances in Computer Science and Communications, vol. 14, no. 3, 2021.

Z. Wu and M. Palmer, “Verb semantics and lexical selection”, In Proc. 32nd annual meeting of the Association for Computational Linguistics, 1994.

Lee, D., “Methods for Web Bandwidth and Response Time Improvement”, World Wide Web: Beyond the Basics, 1998; 25.

http://www.researchpipeline.com/mediawiki/index.php?title=AOL_Search_Query_Logs accessed on Jan 2021.

Cheng-Zhong Xu and Tamer I. Ibrahim, “A Keyword-Based Semantic Prefetching Approach in Internet News Service”, IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 5, 2004.

C. D. Gracia and S. Sudha, “A case study on memory efficient prediction models for web prefetching”, In Proc. International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS), Pudukkottai, pp. 1–6, 2016.

Jyoti, A.K. Sharma and Amit Goel, “A novel approach to determine the rules for Web Page Prediction using Dynamically chosen K-order Markov Models”, International Journal of Research in Computer and Communication Technology, vol. 2, no. 12, 2013.

Setia Sonia, Verma Jyoti and Duhan Neelam, “A novel approach for semantic web prefetching using semantic information and semantic association”, Big data analytics, pp. 471–479, 2018.

Setia Sonia, Verma Jyoti and Duhan Neelam, “Semantic Prefetching Based Hybrid Prediction Model”, International Journal of Scientific & Technology Research, vol. 8, no. 12, pp. 3936–3941, December 2019.

Published

2024-05-25

How to Cite

Setia, S., Jyoti, Duhan, N., Anand, A., & Verma, N. (2024). Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge. Journal of Web Engineering, 23(03), 341–376. https://doi.org/10.13052/jwe1540-9589.2332

Section

Articles