Abstract
With the rapid development of the Internet of Things (IoT), massive amounts of multi-modal data continue to emerge. Traditional anonymization techniques lack multi-modal adaptability and rely on static parameter configurations, so they struggle to contain privacy-leakage risks across diverse data types and cannot balance privacy protection against data utility in dynamic IoT environments; a more efficient and flexible solution is therefore urgently needed. To address the complexity of coexisting multi-modal features in IoT data, such as images, text, and time series, we design a generator based on a cross-modal attention mechanism that deeply models the features of each modality and expresses their privacy risks, and we introduce a multi-discriminator collaborative training strategy to strengthen privacy recognition. We further construct a reinforcement learning framework based on the deep deterministic policy gradient (DDPG) algorithm, in which an adaptive reward-adjustment mechanism realizes a dynamic trade-off between privacy protection and data utility. To couple the multi-modal generative adversarial network with reinforcement learning effectively, we propose using the generator gradient as the policy input to guide the direction of policy updates, further improving the accuracy and stability of anonymization. For large-scale IoT data processing, a distributed asynchronous training mechanism ensures the convergence and consistency of the model under multi-node parallelism. In IoT data anonymization experiments, the proposed collaborative optimization of the multi-modal generative adversarial network and reinforcement learning increased anonymization efficiency by 58.7% and reduced the data loss rate by 45.3%.
On a test set of 236,000 samples, the privacy leakage rate dropped from 67.9% to 34.8%, and accuracy increased by 12.5 percentage points. The method outperforms the baselines in 76.4% of scenarios while reducing computing-resource consumption by 21.3%. It thus improves data availability and processing efficiency while ensuring data privacy.
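The abstract does not specify the generator architecture beyond its use of a cross-modal attention mechanism. As a minimal sketch of the underlying idea, the snippet below shows one modality's features (image patches) attending to another's (text tokens) via scaled dot-product attention; all shapes, the random features, and the NumPy implementation are illustrative assumptions, not the paper's actual design.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(query_feats, context_feats):
    """Scaled dot-product attention in which features of one modality
    (queries, e.g. image patches) attend to features of another
    (context, e.g. text tokens), producing a fused representation."""
    d_k = query_feats.shape[-1]
    scores = query_feats @ context_feats.T / np.sqrt(d_k)  # (Nq, Nc)
    weights = softmax(scores, axis=-1)                     # each row sums to 1
    fused = weights @ context_feats                        # (Nq, d) fused features
    return fused, weights

rng = np.random.default_rng(0)
img = rng.normal(size=(4, 8))   # 4 image-patch features, dimension 8
txt = rng.normal(size=(6, 8))   # 6 text-token features, dimension 8
fused, w = cross_modal_attention(img, txt)
```

In a full generator, such fused features would feed subsequent layers so that the anonymization of one modality can be conditioned on privacy-relevant cues from the others.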
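The adaptive reward-adjustment mechanism that trades privacy against utility is likewise not detailed in the abstract. One plausible scheme, sketched below under assumed names and parameters (the weighting rule, target leakage rate, and adaptation step are all hypothetical), rewards low leakage and low utility loss while shifting its weighting toward privacy whenever recent leakage exceeds a target.

```python
class AdaptiveAnonymizationReward:
    """Illustrative reward that trades off a privacy term against a
    utility term, adapting the privacy weight toward observed leakage."""

    def __init__(self, alpha=0.5, target_leakage=0.1, lr=0.05):
        self.alpha = alpha                  # weight on the privacy term
        self.target_leakage = target_leakage
        self.lr = lr                        # adaptation step size

    def __call__(self, leakage_rate, utility_loss):
        # Reward privacy (low leakage) and penalize utility loss.
        reward = (self.alpha * (1.0 - leakage_rate)
                  - (1.0 - self.alpha) * utility_loss)
        # Adapt: if leakage exceeds the target, emphasize privacy more;
        # if leakage is below it, allow more weight on utility.
        self.alpha += self.lr * (leakage_rate - self.target_leakage)
        self.alpha = min(max(self.alpha, 0.0), 1.0)
        return reward

r = AdaptiveAnonymizationReward()
high = r(leakage_rate=0.4, utility_loss=0.1)  # high leakage: alpha increases
```

In a DDPG setup, this scalar would be the environment reward driving the policy update, so the balance between anonymization strength and data utility shifts automatically as leakage is observed.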

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Copyright (c) 2025 Journal of Cyber Security and Mobility
