WEB EVENT STATE PREDICTION MODEL: COMBINING PRIOR KNOWLEDGE WITH REAL TIME DATA
Keywords:
web event, hidden Markov model, topic detection and tracking, multi-factor analysis
Abstract
State prediction plays a key role in analyzing the evolution of web events. It raises two issues: which factors influence the state transitions of a web event, and how prior knowledge can guide those transitions. For the first issue, we discuss two types of temporal features observed from the real-time webpages covering an event: statistical features and knowledge-structural features. For the second issue, a Fuzzy Cognitive Map (FCM) and a conditional dependency matrix are mined from training web events. As prior knowledge, they represent the relations among state transitions and the relations between the unobserved space (i.e., the six states of web events) and the observed space (i.e., the two types of features). On this basis, an improved hidden Markov model is developed to predict the state transitions of web events. Experimental results show that the model performs well and is robust because it combines prior knowledge with the real-time data of web events.
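As a rough illustration of the general idea, the sketch below runs a plain hidden Markov model forward filter and then projects the belief one step ahead. It is not the improved model described above: the matrices are made-up toy values, three hidden states are used instead of six for brevity, and treating the transition matrix as a stand-in for FCM-derived knowledge and the emission matrix as a stand-in for the conditional dependency matrix is an assumption made only for this example. The function name predict_next_state is hypothetical.

```python
def normalize(vec):
    """Scale a non-negative vector so that it sums to 1."""
    total = sum(vec)
    return [v / total for v in vec]


def predict_next_state(prior, transition, emission, observations):
    """Forward-filter the observations, then project one step ahead.

    prior[i]          -- initial probability of hidden state i
    transition[i][j]  -- probability of moving from state i to state j
                         (assumed here to come from prior knowledge)
    emission[i][k]    -- probability of observing symbol k in state i
    observations      -- non-empty list of observed symbol indices
    """
    n = len(prior)
    # Belief over hidden states after seeing the first observation.
    belief = normalize(
        [prior[i] * emission[i][observations[0]] for i in range(n)]
    )
    for obs in observations[1:]:
        # Propagate the belief through the transition matrix ...
        moved = [
            sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)
        ]
        # ... then reweight by how well each state explains the observation.
        belief = normalize([moved[j] * emission[j][obs] for j in range(n)])
    # Distribution over the next (not yet observed) hidden state.
    return [
        sum(belief[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


# Toy values, chosen only for illustration (not taken from the paper).
prior = [0.5, 0.3, 0.2]
transition = [
    [0.6, 0.3, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]
emission = [
    [0.7, 0.3],
    [0.4, 0.6],
    [0.1, 0.9],
]
next_dist = predict_next_state(prior, transition, emission, [0, 1, 1])
predicted_state = max(range(len(next_dist)), key=lambda j: next_dist[j])
print(next_dist, predicted_state)
```

In this sketch the prior knowledge enters only through the fixed matrices, while the real-time data enters through the observation sequence; how the paper actually combines the two is not specified in the abstract.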