SEMANTIC SIMILARITY BASED CONTEXT-AWARE WEB SERVICE DISCOVERY USING NLP TECHNIQUES
Keywords: Web service discovery, Semantic similarity, Automatic tagging, Service classification, Natural language processing
Because published web services are both abundant and distributed across the Web, efficiently discovering and retrieving the services that meet a user's requirements can be a challenging task. In this paper, we present a semantics-based web service retrieval framework that uses natural language processing techniques to extract a service’s functional information. The extracted information is used to compute the similarity between any given pair of services, to generate additional metadata for each service, and to classify the services by functional similarity. The framework also adds natural language querying capabilities that support exact and approximate matching of relevant services to a given user query. We present experimental results showing that the semantic analysis and automatic tagging effectively captured both the inherent functional details of a service and the similarity between different services. In addition, Web service retrieval through the framework's natural language querying interface showed a significant improvement in precision and recall over simple keyword-matching search.
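As a rough illustration of the idea of comparing two services by their extracted functional terms, the following minimal sketch splits operation names into terms and scores a service pair with Jaccard overlap. It is not the framework's method: the stop-word list, the helper names (`extract_terms`, `service_terms`, `jaccard`), the example operation names, and the choice of Jaccard overlap as the similarity measure are all assumptions made for this example.

```python
import re

# Hypothetical stop-word list (an assumption for this sketch only).
STOP_WORDS = {"get", "set", "by", "the", "of", "for", "service"}


def extract_terms(identifier):
    """Split a camelCase or underscore identifier into lower-case content terms."""
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", identifier)
    return {w.lower() for w in words} - STOP_WORDS


def service_terms(operation_names):
    """Collect the content terms of all operation names of one service."""
    terms = set()
    for name in operation_names:
        terms |= extract_terms(name)
    return terms


def jaccard(a, b):
    """Jaccard overlap of two term sets; 0.0 when both sets are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical services described only by their operation names.
weather_a = service_terms(["GetWeatherByCity", "GetCityForecast"])
weather_b = service_terms(["CityWeatherReport"])
currency = service_terms(["ConvertCurrency"])

print(jaccard(weather_a, weather_b))  # two shared terms out of four distinct terms
print(jaccard(weather_a, currency))   # no shared terms
```

Pure term overlap of this kind cannot relate synonyms such as "forecast" and "prediction"; a measure of semantic relatedness between terms would be needed to capture that.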