LOD Construction Through Supervised Web Relation Extraction and Crowd Validation


  • Goran Rumin, Infobip, Zagreb, Croatia
  • Igor Mekterović, University of Zagreb Faculty of Electrical Engineering and Computing, Croatia


Relation Extraction, Machine Learning, RDF, Linked Open Data, Crowd Validation, Semantic Web, Web Application


Free, unstructured text is the dominant format in which information is stored and published. To interpret such a vast amount of data, one must employ a programmatic approach. In this paper, we describe a novel approach: a pipeline in which interesting relations are extracted from news texts on web portals, stored as RDF triples, and finally validated by end users via a browser extension. In the process, different machine learning algorithms were tested on relation extraction, enhanced with our own set of features, and thoroughly evaluated, achieving precision and recall that compare favorably with models used for semantic knowledge expansion. Building on those results, we implement and describe a component that resolves discovered entities to existing semantic entities from three major online repositories. Finally, we implement and describe the validation process, in which RDF triples are presented to the web portal reader for validation via a Chrome extension.


Author Biographies

Goran Rumin, Infobip, Zagreb, Croatia

Goran Rumin is a software engineer at Infobip, currently working on infrastructure application development. He received his B.Sc. and M.Sc. in computing from the University of Zagreb (Croatia), Faculty of Electrical Engineering and Computing, in 2015 and 2017, respectively. He has worked on various projects related to machine learning, high availability, service monitoring, and security. His interests include machine learning and the Semantic Web.

Igor Mekterović, University of Zagreb Faculty of Electrical Engineering and Computing, Croatia

Igor Mekterović is currently an associate professor at the University of Zagreb (Croatia), Faculty of Electrical Engineering and Computing. He received his Ph.D. degree in 2008 from the same university. His research interests are in the areas of databases, data warehouses, web development, and bioinformatics.


