A HYBRID APPROACH USING PSO AND K-MEANS FOR SEMANTIC CLUSTERING OF WEB DOCUMENTS

Authors

  • J. AVANIJA, Velammal College of Engineering & Technology, Tamilnadu, India.
  • Dr. K. RAMAR, Einstein College of Engineering, Tamilnadu, India.

Keywords:

Ontology, Clustering, Particle Swarm Optimization, Semantic Similarity, K-Means

Abstract

With the massive growth and sheer volume of the web, it is very difficult to retrieve results that match user preferences. The next-generation web architecture, the semantic web, reduces the burden on the user by performing search based on semantics instead of keywords. Even in the context of semantic technologies, optimization problems occur but are rarely considered. In this paper, document clustering is applied to retrieve relevant documents. We propose an ontology-based clustering algorithm that uses a semantic similarity measure and Particle Swarm Optimization (PSO) and is applied to annotated documents to optimize the result. The proposed method uses the Jena API and the GATE tool API, and documents can be retrieved based on their annotation features and relations. A preliminary experiment comparing the proposed method with K-Means shows that the proposed method is feasible and performs better than K-Means.
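
The general idea of combining PSO with K-Means can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that PSO first searches for good initial centroids and that K-Means then refines them, an ordering the abstract does not specify. It also substitutes squared Euclidean distance on plain numeric vectors for the ontology-based semantic similarity over annotated documents, and it does not involve Jena or GATE. All function names and parameter values below are illustrative assumptions.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance (a stand-in for a semantic similarity measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def fitness(points, centroids):
    """Mean squared distance from each point to its nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points) / len(points)


def pso_centroids(points, k, n_particles=20, iters=30, w=0.72, c1=1.49, c2=1.49, rng=None):
    """Global-best PSO in which each particle encodes a full set of k centroids."""
    rng = rng or random.Random(0)
    dim = len(points[0])
    # Assumption: particles start at randomly sampled data points with zero velocity.
    pos = [[list(p) for p in rng.sample(points, k)] for _ in range(n_particles)]
    vel = [[[0.0] * dim for _ in range(k)] for _ in range(n_particles)]
    pbest = [[c[:] for c in p] for p in pos]
    pbest_fit = [fitness(points, p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = [c[:] for c in pbest[g]], pbest_fit[g]
    for _ in range(iters):
        for i in range(n_particles):
            for j in range(k):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    # Standard update: inertia + cognitive pull + social pull.
                    vel[i][j][d] = (w * vel[i][j][d]
                                    + c1 * r1 * (pbest[i][j][d] - pos[i][j][d])
                                    + c2 * r2 * (gbest[j][d] - pos[i][j][d]))
                    pos[i][j][d] += vel[i][j][d]
            fit = fitness(points, pos[i])
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = [c[:] for c in pos[i]], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = [c[:] for c in pos[i]], fit
    return gbest


def kmeans(points, centroids, iters=20):
    """Plain K-Means refinement starting from the given centroids."""
    centroids = [list(c) for c in centroids]
    for _ in range(iters):
        labels = assign(points, centroids)
        for j in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assign(points, centroids)


def hybrid_pso_kmeans(points, k, seed=0):
    """PSO proposes the starting centroids; K-Means then refines them."""
    seeds = pso_centroids(points, k, rng=random.Random(seed))
    return kmeans(points, seeds)
```

Calling `hybrid_pso_kmeans(points, k)` on a list of equal-length numeric tuples returns the refined centroids and one cluster label per point. In a semantic setting, `sq_dist` would be the natural place to plug in an ontology-based measure, though how the paper defines that measure is not stated in the abstract.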

 

Published

2013-01-28

How to Cite

AVANIJA, J., & RAMAR, K. (2013). A HYBRID APPROACH USING PSO AND K-MEANS FOR SEMANTIC CLUSTERING OF WEB DOCUMENTS. Journal of Web Engineering, 12(3-4), 249–264. Retrieved from https://journals.riverpublishers.com/index.php/JWE/article/view/4159
