Privacy-Preserving Reengineering of Model-View-Controller Application Architectures Using Linked Data
Keywords: Privacy by design, Web of data, Software architecture, Model-View-Controller
When the software architecture of a legacy system cannot be redesigned, implementing additional privacy requirements is often complex, unreliable, and costly to maintain. This paper presents a privacy-by-design approach to reengineering web applications so that they are linked data-enabled and enforce access control and privacy preservation. The method relies on knowledge of the application architecture, which, for the Web of data, is commonly based on the model-view-controller pattern. Whereas the wrapping techniques commonly used to link the data of web applications duplicate the security source code, the proposed approach allows the controlled disclosure of an application's data while preserving non-functional properties such as privacy. The solution has been implemented and compared with existing linked data frameworks in terms of reliability, maintainability, and complexity.
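To make the architectural idea concrete, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a toy model-view-controller application in which the controller is the single point that enforces access control, so a linked data view added next to the HTML view reuses that check instead of duplicating it in a wrapper. The `Patient` model, the role names, the `VISIBLE_FIELDS` policy, and the `http://example.org/` URIs are all hypothetical.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Patient:
    """Hypothetical model class, used only for illustration."""
    id: int
    name: str
    diagnosis: str


# Hypothetical policy: the fields each role is allowed to read.
VISIBLE_FIELDS = {
    "physician": {"id", "name", "diagnosis"},
    "researcher": {"id", "diagnosis"},
}


def authorize(role, record):
    """Return only the fields the role may see; reject unknown roles."""
    allowed = VISIBLE_FIELDS.get(role)
    if allowed is None:
        raise PermissionError(f"role {role!r} may not read this resource")
    return {k: v for k, v in asdict(record).items() if k in allowed}


def html_view(data):
    """Existing view: render the authorized fields as an HTML list."""
    items = "".join(f"<li>{k}: {v}</li>" for k, v in sorted(data.items()))
    return f"<ul>{items}</ul>"


def jsonld_view(data, base="http://example.org/patient/"):
    """Added view: render the same authorized fields as JSON-LD."""
    doc = {
        "@context": {"@vocab": "http://example.org/vocab#"},
        "@id": f"{base}{data['id']}",
    }
    doc.update({k: v for k, v in data.items() if k != "id"})
    return json.dumps(doc, sort_keys=True)


VIEWS = {"text/html": html_view, "application/ld+json": jsonld_view}


def controller(record, role, accept):
    """Single enforcement point: authorize once, then pick a view."""
    data = authorize(role, record)
    return VIEWS[accept](data)


if __name__ == "__main__":
    patient = Patient(id=7, name="Ada", diagnosis="flu")
    print(controller(patient, "researcher", "application/ld+json"))
    print(controller(patient, "physician", "text/html"))
```

In this sketch both views receive data that has already been filtered by `authorize`, so the linked data representation cannot disclose a field that the HTML representation would hide from the same role.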