Semantically Enriched Keyword Prefetching Based on Usage and Domain Knowledge
DOI:
https://doi.org/10.13052/jwe1540-9589.2332
Keywords:
Semantic prediction, web usage mining (WUM), web content mining (WCM), domain knowledge, usage data, access logs
Abstract
Web prefetching [27] plays a crucial role in intelligent web systems [2]. Accurate prefetching predictions depend on uncovering valuable information from web usage data [16], which is important but challenging. This paper presents SPUDK, a new approach that combines usage data with domain knowledge for efficient prefetching, and shows how web access logs can be used efficiently for browsing prediction. Our main focus is the technique for handling the queries found in web access logs so that valuable information can be extracted from them. We further process these access logs using a taxonomy and a thesaurus, WordNet, to find the semantics of the queries, and SPUDK organises the usage data into semantic clusters. Our contributions are as follows: (1) a technique to exploit query keywords from access logs; (2) an approach to enrich queries with semantic information; (3) a new similarity measure for finding similarity among the URLs present in access logs; (4) a novel clustering technique to find semantic clusters of URLs; and (5) an experimental evaluation of the proposed system. Evaluated on America Online (AOL) logs, SPUDK improves precision of prediction by 39% and hit ratio by 35%, and reduces latency by 50.6%, on average, compared with other prediction techniques in the literature.
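As a rough illustration of contributions (1) and (3), the sketch below collects query keywords per clicked URL from AOL-style log lines and compares URLs by keyword overlap. It is not the SPUDK method. It assumes the tab-separated AOL layout (AnonID, Query, QueryTime, ItemRank, ClickURL). The names `keywords_by_url`, `jaccard`, `STOPWORDS`, and the sample log lines are hypothetical. Plain Jaccard overlap stands in for the paper's similarity measure, and the WordNet enrichment step is omitted.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "for", "in", "to", "and"}


def keywords_by_url(log_lines):
    """Map each clicked URL to the set of query keywords that led to it."""
    url_keywords = defaultdict(set)
    for line in log_lines:
        fields = line.rstrip("\n").split("\t")
        # Skip malformed lines and queries with no recorded click.
        if len(fields) < 5 or not fields[4]:
            continue
        query, url = fields[1], fields[4]
        terms = {t for t in query.lower().split() if t not in STOPWORDS}
        url_keywords[url] |= terms
    return url_keywords


def jaccard(a, b):
    """Keyword-overlap similarity (a stand-in, not the paper's measure)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Made-up log lines in the assumed AOL layout.
sample_log = [
    "1\tcheap flights to paris\t2006-03-01 10:00:00\t1\thttp://travel.example",
    "2\tparis flights\t2006-03-01 10:05:00\t2\thttp://air.example",
    "3\tpasta recipes\t2006-03-02 09:00:00\t1\thttp://food.example",
    "4\tweather\t2006-03-02 09:30:00\t\t",
]

kw = keywords_by_url(sample_log)
sim_related = jaccard(kw["http://travel.example"], kw["http://air.example"])
sim_unrelated = jaccard(kw["http://travel.example"], kw["http://food.example"])
print(sim_related, sim_unrelated)
```

In this toy log the two travel URLs share keywords and score higher than the unrelated pair. A semantic variant would, under the same assumptions, replace exact keyword matching with a WordNet-based word similarity such as the Wu and Palmer measure cited in the references.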
References
P. M. Bharti and T. J. Raval, “Improving Web Page Access Prediction using Web Usage Mining and Web Content Mining,” 2019 3rd International conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 2019, pp. 1268–1273, doi: 10.1109/ICECA.2019.8821950.
Acharya, Anal; Sinha, Devadatta, “An Intelligent Web-Based System for Diagnosing Student Learning Problems Using Concept Maps”, Journal of Educational Computing Research, vol. 55, no. 3, pp. 323–345, Jun 2017.
K. Mani and K. R. Suneetha, “Performance evaluation of Compact Prediction Tree algorithm for Web Page Prediction,” 2020 International Conference on Emerging Trends in Information Technology and Engineering (ic-ETITE), Vellore, India, 2020, pp. 1–7, doi: 10.1109/ic-ETITE47903.2020.166.
V. Luckose, J. Chembath, J. A. R. Ponnusamy, S. Sharma, P. Kaur and S. Smiley, “Web Usage Pattern Detection Using Cohesive Markov Model With Apriori Algorithm,” 2022 IEEE International Conference on Automatic Control and Intelligent Systems (I2CACIS), Shah Alam, Malaysia, 2022, pp. 226–229, doi: 10.1109/I2CACIS54679.2022.9815465.
X. Zhu et al., “Similarity-Maintaining Privacy Preservation and Location-Aware Low-Rank Matrix Factorization for QoS Prediction Based Web Service Recommendation,” in IEEE Transactions on Services Computing, vol. 14, no. 3, pp. 889–902, 1 May-June 2021, doi: 10.1109/TSC.2018.2839741.
P. T. Siva Gurunathan, R. S, R. S and N. S, “Web Application-based Diabetes Prediction using Machine Learning,” 2023 7th International Conference on Computing Methodologies and Communication (ICCMC), Erode, India, 2023, pp. 296–302, doi: 10.1109/ICCMC56507.2023.10083583.
Suguna. R and Sharmila. D, “An Overview of Web Usage Mining”, International Journal of Computer Applications, vol. 39, no. 13, pp. 11–13, 2012, https://doi.org/10.5120/4879-7314.
CU. O and Bhargavi. P, “Analysis of Web Server Log by Web Usage Mining for Extracting Users Patterns”, International Journal of Computer Science Engineering and Information Technology Research, vol. 3, no. 2, pp. 123–136, 2013.
Goel. N, Gupta. S and Jha. C K, “Analyzing Web Logs of an Astrological Website Using Key Influencers”, International Research Journal, vol. 5, no. 1, pp. 2–11, 2015.
N. Ahmad, O. MaliIk, M. Hassan, M. S. Qureshi, and A. Munir, “Reducing User Latency in Web Prefetching Using Integrated Techniques”, IEEE Computer, 2011.
B. Parhami, “Introduction to Parallel Processing Algorithms and Architectures”, Kluwer Academic Publishers New York, Boston, pp. 111–112, 2002.
Thi Thanh Sang Nguyen, Hai Yan Lu, and Jie Lu, “Web-Page Recommendation Based on Web Usage and Domain Knowledge”, IEEE Transactions On Knowledge and Data Engineering, vol. 26, no. 10, 2014.
Yuening Hu, Changsung Kang, Jiliang Tang, Dawei Yin, and Yi Chang, “Large-scale Location Prediction for Web Pages”, IEEE Transactions on Knowledge and Data Engineering, vol. 29, no. 9, 2017.
S. Wang, T. Liu, J. Nam and L. Tan, “Deep Semantic Feature Learning for Software Defect Prediction,” in IEEE Transactions on Software Engineering, vol. 46, no. 12, pp. 1267–1293, 1 Dec. 2020, doi: 10.1109/TSE.2018.2877612.
C. I. Arthi, R. L. Priya and R. Rautela, “Analysis and Prediction of health issues for teaching profession using Semantic Techniques,” 2018 International Conference on Smart City and Emerging Technology (ICSCET), 2018, pp. 1–5, https://doi.org/10.1109/ICSCET.2018.8537368.
Sonia Setia, Jyoti, Neelam Duhan, “HPM: A Hybrid Model for User’s Behavior Prediction Based on N-Gram Parsing and Access Logs”, Scientific Programming, Hindawi vol. 2020, 2020, https://doi.org/10.1155/2020/8897244.
Kalaivani. S and Shyamala. K, “A Novel Technique to Pre-Process Web Log Data Using SQL Server Management Studio”, International Journal of Advanced Engineering, Management and Science. Vol 2(7), pages 973–977, 2016.
Sonia Setia, Jyoti, Neelam Duhan, “Efficient query keyword interpretation for semantic information retrieval”, IIOAB Journal, vol. 11, no. 2, pp. 64–68, May 2020.
Sonia Setia, Jyoti, Neelam Duhan, “A novel approach for Density based Optimal Semantic Clustering of Web Objects via identification of KingPins”, Recent Advances in Computer Science and Communications, vol. 14, no. 3, 2021.
Z. Wu and M. Palmer, “Verb semantics and lexical selection”, In Proc. 32nd annual meeting of the Association for Computational Linguistics, 1994.
Lee, D., “Methods for Web Bandwidth and Response Time Improvement”, World Wide Web: Beyond the Basics, 1998; 25.
http://www.researchpipeline.com/mediawiki/index.php?title=AOL_Search_Query_Logs accessed on Jan 2021.
Cheng-Zhong Xu; Tamer I. Ibrahim, “A Keyword-Based Semantic Prefetching Approach in Internet News Service”, Journal of IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 5, 2004.
C. D. Gracia and S. Sudha, “A case study on memory efficient prediction models for web prefetching”, In Proc. International Conference on Emerging Trends in Engineering, Technology and Science (ICETETS), Pudukkottai, pp. 1–6, 2016.
Jyoti, A.K. Sharma and Amit Goel, “A novel approach to determine the rules for Web Page Prediction using Dynamically chosen K-order Markov Models”, International Journal of Research in Computer and Communication Technology, vol. 2, no. 12, 2013.
Setia Sonia, Verma Jyoti and Duhan Neelam, “A novel approach for semantic web prefetching using semantic information and semantic association”, Big data analytics, pp. 471–479, 2018.
Setia Sonia, Verma Jyoti and Duhan Neelam, “Semantic Prefetching Based Hybrid Prediction Model”, International Journal of Scientific & Technology Research, vol. 8, no. 12, pp. 3936–3941, December 2019.