ISSN: 2245-4578 (Online Version) ISSN: 2245-1439 (Print Version)
Data Deduplication Method Based on CSP-DLP Asymmetric Homomorphic Encryption Algorithm for High-Density Scenarios

Keywords

Internet of vehicles
conjugate search problem
discrete logarithm problem
homomorphic encryption
federated learning
data deduplication

How to Cite

[1]
Q. Mo, W. Liao, H. Liao, J. Guo, X. Feng, and R. Guo, “Data Deduplication Method Based on CSP-DLP Asymmetric Homomorphic Encryption Algorithm for High-Density Scenarios”, JCSANDM, vol. 15, no. 03, pp. 747–776, Jun. 2026.

Abstract

As data volumes continue to grow in high-density scenarios such as the Internet of Things and the Internet of Vehicles, the repeated storage and frequent transmission of massive data not only wastes computing and storage resources but also significantly increases the risk of sensitive information leakage. This study therefore proposes a data deduplication method that integrates asymmetric homomorphic encryption and federated learning. First, a novel asymmetric homomorphic encryption algorithm is designed based on the conjugate search problem and the discrete logarithm problem. The algorithm ensures ciphertext indistinguishability while providing a cryptographic foundation for comparing data in the ciphertext state, resolving the inherent conflict between privacy protection and data deduplication. The proposed encryption algorithm is then combined with a federated learning framework to construct an efficient data processing flow that supports ciphertext deduplication, enabling secure identification and filtering of redundant private data. Experimental results show that, at 128-bit security strength, the encryption cost of the proposed algorithm is only 62.5% of that of the traditional Paillier scheme, and the ciphertext size is reduced by about 42.4%. In deduplication tests in Internet of Vehicles scenarios, the proposed method achieves a duplicate detection rate of 97.4% on a million-level dataset. Moreover, while keeping all processing fully encrypted, storage requirements are reduced by an average of 38.6%, and cross-node communication overhead is reduced by about 29.4%. In summary, the proposed method combines high security, a high detection rate, and low overhead in high-density scenarios, achieving a balance between privacy protection and data deduplication efficiency.
This research provides a scalable and deployable technical path of practical engineering value for private data management in the Internet of Things, the industrial Internet, smart cities, and other fields.
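
The conflict between privacy protection and deduplication mentioned in the abstract can be illustrated with the Paillier baseline. The sketch below is a toy Paillier implementation and is not the CSP-DLP algorithm proposed in the paper, whose construction is not given on this page. The function names, the tiny primes 1009 and 1013, and the optional `r` parameter are assumptions chosen for illustration only and provide no real security. The sketch shows that encrypting the same plaintext twice yields different ciphertexts, so duplicates cannot be found by naive ciphertext equality, while the additive homomorphism still holds.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation from two small primes (illustrative only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1                                          # standard simple generator
    # mu = L(g^lam mod n^2)^-1 mod n, where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m, r=None):
    """Encrypt m; r is the random blinding factor (sampled if not supplied)."""
    n, g = pub
    n2 = n * n
    while r is None or math.gcd(r, n) != 1:
        r = secrets.randbelow(n - 1) + 1
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover the plaintext from ciphertext c."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


if __name__ == "__main__":
    pub, priv = keygen(1009, 1013)  # hypothetical toy primes
    c1 = encrypt(pub, 42)
    c2 = encrypt(pub, 42)
    # Same plaintext, but the ciphertexts almost certainly differ,
    # so comparing ciphertexts directly does not reveal duplicates.
    print("ciphertexts equal:", c1 == c2)
    # Additive homomorphism: multiplying ciphertexts adds plaintexts.
    print("sum:", decrypt(pub, priv, (c1 * c2) % (pub[0] ** 2)))
```

Under these assumptions, a deduplication scheme needs some additional mechanism for comparing data in the ciphertext state, which is the gap the abstract says the proposed algorithm addresses.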

https://doi.org/10.13052/jcsm2245-1439.1539

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.

Copyright (c) 2026 Journal of Cyber Security and Mobility
