SEMANTIC SPAM FILTERING FROM PERSONALIZED ONTOLOGIES
Keywords:
Ontology, Spam Filtering
Abstract
One of the biggest problems the Internet faces is the growth of email spam. The main drawback of previous anti-spam filters is that they rely either 1) on the syntactic features of words, without semantic analysis, or 2) on what the majority of users regard as spam, without considering the individual preferences of a particular user. In this paper we present a spam email filter that personalizes its filtering process using an email user profile that contains the user’s preferences regarding emails. Our innovative email user profile is based not only on common user profiling techniques but also on the knowledge contained in a domain ontology. The user profile is used to learn which spam emails are interesting to the user even though they are unsolicited and sent in bulk. The encouraging experimental results provide empirical evidence that an ontological approach to user profiling is effective in an email spam filter.