MARIA: A PROCESS TO MODEL ENTITY RECONCILIATION PROBLEMS

Authors

  • J.G. ENRÍQUEZ Computer Languages and Systems Department. University of Seville. Av. Reina Mercedes s/n, 41012, Seville, Seville.
  • M. OLIVERO Computer Languages and Systems Department. University of Seville. Av. Reina Mercedes s/n, 41012, Seville, Seville.
  • A. JIMÉNEZ-RAMÍREZ Computer Languages and Systems Department. University of Seville. Av. Reina Mercedes s/n, 41012, Seville, Seville.
  • M.J. ESCALONA Computer Languages and Systems Department. University of Seville. Av. Reina Mercedes s/n, 41012, Seville, Seville.
  • M. MEJÍAS Computer Languages and Systems Department. University of Seville. Av. Reina Mercedes s/n, 41012, Seville, Seville.

Keywords:

Web Engineering, Model-Driven Engineering, Entity Reconciliation, NDT

Abstract

Web applications may currently be among the most widespread kinds of software system, owing to advantages such as multiplatform support, fast access, and modest hardware requirements. Because so many web applications are being developed, the volume of information generated daily is enormous. Managing this information gives rise to the entity reconciliation (ER) problem: identifying objects that refer to the same real-world entity. This paper proposes a solution to this problem from a web perspective, based on the Model-Driven Engineering paradigm. To this end, the Navigational Development Techniques (NDT) methodology, which provides a formal and complete set of processes supporting software lifecycle management, has been taken as a reference and extended with new activities, artefacts and documents to cover ER. All these elements are defined by a process named Model-Driven Entity ReconcilIAtion (MaRIA), which can be integrated into any software development methodology and allows the ER problem to be defined from the early stages of development. In addition, the proposal has been validated in a real-world case study, helping companies to reduce costs when developing a software product that must solve an ER problem.


