GOAL DRIVEN APPROACH TO MODEL INTERACTION BETWEEN VIEWPOINTS OF A MULTI-VIEW KDD PROCESS
Keywords:
KDD process, Viewpoint, Goal analysis, Ontologies, SWRL
Abstract
Knowledge Discovery in Databases (KDD) is a highly complex, iterative and interactive process that is goal-driven and domain-dependent. Its complexity stems mainly from the nature of the analyzed data (massive, distributed, incomplete and heterogeneous) and from the nature of the process itself, which is by definition interactive and iterative. Given this complexity, a KDD user faces two major challenges. On the one hand, the user must manipulate prior domain knowledge to better understand the data and the business objectives. On the other hand, the user must be able to choose, configure, compose and execute tools and methods from various fields (e.g., machine learning, statistics, artificial intelligence, databases) to achieve these goals. Furthermore, in real-world business settings, a data mining project usually involves several actors (domain experts, data analysts, KDD experts, etc.), each with a different viewpoint. In this paper we propose to tackle the complexity of the KDD process, and to enhance coordination and knowledge sharing between the actors of a multi-view KDD analysis, through a goal-driven modeling of the interactions between viewpoints. After a brief review of our viewpoint approach to KDD, we first develop a semantic Goal Model that allows the identification and representation of business objectives during the business understanding step of the KDD process. Then, based on this Goal Model, we define a set of semantic relations between the viewpoints of a multi-view analysis, namely equivalence, inclusion, conflict and requirement.