Distributed Machine Learning Privacy Protection Algorithm for Tax Big Data Analysis

Ying Fang; Ling Pu; Ning Zhang; Jun Liang; Shaojuan Ouyang

doi:10.13052/jcsm2245-1439.1513

2026, Articles

Published 2026-03-26

Ying Fang

Department of Economics, Qinhuangdao Vocational and Technical College, Qinhuangdao 066100, China

Ling Pu

Department of Economics, Qinhuangdao Vocational and Technical College, Qinhuangdao 066100, China

Ning Zhang

Department of Economics, Qinhuangdao Vocational and Technical College, Qinhuangdao 066100, China

Jun Liang

Department of Economics, Qinhuangdao Vocational and Technical College, Qinhuangdao 066100, China

Shaojuan Ouyang

Department of Economics, Qinhuangdao Vocational and Technical College, Qinhuangdao 066100, China

Keywords

Tax big data
Distributed machine learning
Privacy protection
Homomorphic encryption
Secure multi-party computation
Robustness
Compliance

How to Cite

[1]

Y. Fang, L. Pu, N. Zhang, J. Liang, and S. Ouyang, “Distributed Machine Learning Privacy Protection Algorithm for Tax Big Data Analysis”, JCSANDM, vol. 15, no. 01, pp. 67–94, Mar. 2026.

Abstract

As informatization and digitization accelerate, the tax system generates massive amounts of data of diverse types. Sharing these data across regions and institutions, however, raises privacy protection and compliance challenges. This paper therefore proposes a distributed machine learning privacy protection algorithm for tax big data and designs a multi-layer secure transmission mechanism that combines differential privacy, homomorphic encryption, and secure multi-party computation. The experiments use real invoice and tax declaration data from provincial tax bureaus, together with simulated data generated from them, to compare the proposed method against several existing methods. In classification tasks, accuracy reached 0.87, which was 2.35%–8.75% higher than that of traditional distributed methods. In the regression task, the mean square error and mean absolute error were reduced by 10%–40% and 22.03%, respectively. Compared with homomorphic encryption methods, the proposed method reduced communication overhead by 59.67% and achieved a fault tolerance of 96.38% at a 10% node dropout rate. In poisoning attack scenarios, accuracy decreased by only 25.29%, a better result than the other methods achieved. The algorithm delivers high predictive performance and robustness while ensuring privacy and compliance, providing effective technical support for intelligent tax governance.
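To illustrate how two of the layers named in the abstract can fit together, the following is a minimal sketch, not the authors' algorithm. It assumes each node clips its model update and adds Gaussian noise (a common differential privacy step), then applies pairwise additive masks that cancel in the sum (a simple secure-aggregation form of multi-party computation). The homomorphic encryption layer is omitted. All function names, the clipping bound `max_norm`, and the noise scale `sigma` are hypothetical, and the floating-point masks are for illustration only; a real protocol would use modular integer arithmetic and agreed key material.

```python
import math
import random

def clip(update, max_norm):
    """Scale the update so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def add_gaussian_noise(update, sigma, rng):
    """Add independent zero-mean Gaussian noise to each coordinate."""
    return [x + rng.gauss(0.0, sigma) for x in update]

def pairwise_masks(n_nodes, dim, rng):
    """Build one mask per node; node i adds and node j subtracts each shared mask."""
    masks = [[0.0] * dim for _ in range(n_nodes)]
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            shared = [rng.uniform(-1e3, 1e3) for _ in range(dim)]
            for k in range(dim):
                masks[i][k] += shared[k]
                masks[j][k] -= shared[k]
    return masks

def secure_average(updates, max_norm=1.0, sigma=0.1, seed=0):
    """Return the average of the noisy updates, computed from masked values only."""
    rng = random.Random(seed)
    n, dim = len(updates), len(updates[0])
    # Hypothetical local step: clip, then perturb each node's update.
    noisy = [add_gaussian_noise(clip(u, max_norm), sigma, rng) for u in updates]
    masks = pairwise_masks(n, dim, rng)
    # The aggregator only ever sees these masked vectors.
    masked = [[noisy[i][k] + masks[i][k] for k in range(dim)] for i in range(n)]
    # Masks cancel in the sum, so the average matches that of the noisy updates.
    aggregate = [sum(masked[i][k] for i in range(n)) / n for k in range(dim)]
    return aggregate, noisy, masked

if __name__ == "__main__":
    example_updates = [[0.5, -0.2], [3.0, 4.0], [-0.1, 0.3]]
    avg, _, _ = secure_average(example_updates)
    print(avg)
```

In this sketch the aggregator recovers only the average of the perturbed updates; an individual masked vector reveals little on its own because it is offset by masks shared with every other node.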

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
