MODIFIED PAGERANK FOR CONCEPT BASED SEARCH
Keywords: PageRank, Semantic web based search, Concept based search, Physical link, Concept link, UNL
The traditional PageRank algorithm computes a weight for each hyperlinked document, indicating the importance of the page, based on its in-links and out-links. This is an offline, query-independent process that suits a keyword-based search strategy. However, owing to problems such as polysemy and synonymy in keyword-based search, new search methodologies such as concept-based search and semantic-web-based search have been developed. Concept-based search engines generally rely on content-based ranking, imparting semantics to web pages. While this approach is better than keyword-based ranking strategies, it does not consider the physical link structure between documents, which is the basis of the successful PageRank algorithm. Hence, we attempt to combine the power of link structure with content information to suit concept-based search engines. Our main contribution comprises two modifications to the traditional PageRank algorithm, both designed specifically for concept-based search engines. Inspired by the topic-sensitive PageRank algorithm, we maintain multiple PageRanks for each document, rather than the single PageRank of the traditional algorithm. We compared our methodologies with the ranking methodology of an existing concept-based search engine and found that our modifications considerably improve the ranking of conceptual search results. Furthermore, a statistical significance test showed that our Version-2 modification to the PageRank algorithm yields a statistically significant improvement in P@5 over the baseline.
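For orientation, the traditional link-based computation and the topic-sensitive idea of keeping several score vectors per document can be sketched as follows. This is a minimal illustration of standard power-iteration PageRank with an optional biased teleport vector, not the modified algorithms proposed in this work; the function name `pagerank`, the damping value 0.85, and the four-page graph are assumptions made only for the example.

```python
def pagerank(links, damping=0.85, teleport=None, tol=1e-10, max_iter=200):
    """Power-iteration PageRank over a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {q for outs in links.values() for q in outs})
    n = len(pages)
    # A uniform teleport vector gives the traditional, query-independent score.
    # A vector biased toward the pages of one topic gives a separate score
    # vector for that topic, so a document can carry several PageRanks.
    if teleport is None:
        teleport = {p: 1.0 / n for p in pages}
    else:
        total = sum(teleport.get(p, 0.0) for p in pages)
        teleport = {p: teleport.get(p, 0.0) / total for p in pages}
    rank = dict(teleport)
    for _ in range(max_iter):
        # Rank held by pages without out-links is redistributed via teleport.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new = {p: (1 - damping + damping * dangling) * teleport[p] for p in pages}
        for p in pages:
            outs = links.get(p, [])
            for q in outs:
                # Each page shares its rank equally among its out-links.
                new[q] += damping * rank[p] / len(outs)
        delta = sum(abs(new[p] - rank[p]) for p in pages)
        rank = new
        if delta < tol:
            break
    return rank


# Hypothetical four-page web graph (illustrative only).
graph = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
uniform_rank = pagerank(graph)                       # one score per page
topic_rank = pagerank(graph, teleport={"D": 1.0})    # scores biased toward D
```

Because both vectors depend only on the link graph and the chosen teleport distribution, they can be computed offline and looked up at query time.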