Self-sovereign and Secure Data Sharing Through Docker Containers for Machine Learning on Remote Node

Authors

  • Jungchul Seo, Department of Computer Engineering, Hongik University, Seoul, South Korea, https://orcid.org/0009-0009-9606-7202
  • Younggyo Lee, Department of Computer Engineering, Hongik University, Seoul, South Korea
  • Young Yoon, Department of Computer Engineering, Hongik University, Seoul, South Korea

DOI:

https://doi.org/10.13052/jwe1540-9589.2352

Keywords:

Self-sovereignty, trusted execution environment, data sharing, containers, Web 3.0

Abstract

Collecting personal data from various sources and using it for machine learning (ML) is now common practice. However, concerns are growing about the monopolization and potential breach of private data by greedy or malicious organizations. Web 3.0 systems are attracting interest as an alternative. These systems aim to guarantee the self-sovereignty of personal data in a decentralized setting, where users share data directly with others for fair compensation. Nevertheless, malicious remote users can still violate the integrity and confidentiality of personal data. This paper therefore proposes a novel method for preventing unwanted leakage and counterfeiting of private data lent to the premises of remote users. The method exploits the decentralized nature of Web 3.0 by leveraging existing personal storage, which relieves the burden of collecting data securely. Data owners create a lightweight Docker container that encapsulates their private data sources. They then generate a second container, deployed on the remote premises, that receives and executes whatever ML algorithms the remote users create. The two containers form a distributed trusted execution environment (TEE), and data are read between them through a secure channel. Because the data owner strictly controls the TEE, a malicious ML application cannot leak or compromise the private information. This paper explains the engineering details of how the method is realized.
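
The two-container arrangement described in the abstract can be pictured with a minimal in-process sketch. It is not the authors' implementation: the class names DataContainer and ExecutionContainer, the "read-all" request, and the sample records are hypothetical, and a shared HMAC key stands in for the secure channel. HMAC provides authentication and integrity only, so a real deployment would also need encryption, actual Docker isolation, and control over what the algorithm may output.

```python
import hashlib
import hmac
import json
import secrets


def sign(key: bytes, payload: bytes) -> str:
    """Return a hex HMAC-SHA256 tag for payload under key."""
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


class DataContainer:
    """Hypothetical stand-in for the owner-side container wrapping private data."""

    def __init__(self, records, key: bytes):
        self._records = records
        self._key = key

    def read(self, request: bytes, tag: str):
        # Refuse any request that was not authenticated with the owner's key.
        if not hmac.compare_digest(sign(self._key, request), tag):
            raise PermissionError("unauthenticated read request")
        payload = json.dumps(self._records).encode()
        return payload, sign(self._key, payload)


class ExecutionContainer:
    """Hypothetical stand-in for the owner-issued container on the remote node."""

    def __init__(self, source: DataContainer, key: bytes):
        self._source = source
        self._key = key

    def run(self, algorithm):
        request = b"read-all"
        payload, tag = self._source.read(request, sign(self._key, request))
        # Reject data that was altered between the two containers.
        if not hmac.compare_digest(sign(self._key, payload), tag):
            raise ValueError("payload failed integrity check")
        records = json.loads(payload)
        # The remote user's algorithm runs here; output filtering is not modeled.
        return algorithm(records)


def mean_heart_rate(records):
    """Example remote-user algorithm: a trivial aggregate over the records."""
    return sum(r["heart_rate"] for r in records) / len(records)


# The owner provisions both sides with the same key (an assumption of this sketch).
owner_key = secrets.token_bytes(32)
data = DataContainer([{"heart_rate": 60}, {"heart_rate": 80}], owner_key)
runner = ExecutionContainer(data, owner_key)
result = runner.run(mean_heart_rate)
```

In this sketch the remote user supplies only the function passed to run; a runner built with any key other than the owner's is refused at the read step.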

Author Biographies

Jungchul Seo, Department of Computer Engineering, Hongik University, Seoul, South Korea

Jungchul Seo is a doctoral student at Hongik University. He also works as a developer at PHI Digital Healthcare. His research interests include computer security, artificial intelligence, distributed networks, and new Web 3.0 themes. Mr. Seo earned a master’s degree in computer engineering from Pukyong National University in 2003.

Younggyo Lee, Department of Computer Engineering, Hongik University, Seoul, South Korea

Younggyo Lee is currently a senior undergraduate student at Hongik University. His research interests include cloud service security, network design, and new Web 3.0 problems. He joined the undergraduate program in computer engineering at Hongik University in 2019.

Young Yoon, Department of Computer Engineering, Hongik University, Seoul, South Korea

Young Yoon is an associate professor in computer engineering at Hongik University. He also serves as CTO of Neouly Incorporated. His research interests include distributed systems, middleware, cyber security, AI applications, and emerging Web 3.0 issues. Dr. Yoon earned a B.A. and an M.S. in computer sciences at the University of Texas at Austin in 2003 and 2006, respectively. He earned his Ph.D. in computer engineering at the University of Toronto in 2013.

Published

2024-08-23

How to Cite

Seo, J., Lee, Y., & Yoon, Y. (2024). Self-sovereign and Secure Data Sharing Through Docker Containers for Machine Learning on Remote Node. Journal of Web Engineering, 23(05), 637–656. https://doi.org/10.13052/jwe1540-9589.2352

Section

Web 3.0 Applications Supported by Artificial Intelligence and Blockchain Technol