A Memory Driven Self-learning Combat Agent Architecture in a 3D Virtual Environment

Authors

  • Tianci Zhang, School of Automation, Beijing Institute of Technology, Beijing 100081, China
  • Yongyong Wei, School of Astronautics, Beijing Institute of Technology, Beijing 100081, China
  • Hao Fang, School of Automation, Beijing Institute of Technology, Beijing 100081, China

DOI:

https://doi.org/10.13052/jwe1540-9589.2451

Keywords:

Agent modelling, memory-driven architecture, reinforcement learning, military simulation, 3D virtual environment

Abstract

Agent behavior modeling in 3D virtual environments is a critical challenge in artificial intelligence and military simulation. While rule-based methods (e.g., finite state machines) are widely used, their limitations in adaptability and development efficiency hinder their application in dynamic combat scenarios. To address this, a memory-driven self-learning agent (MDSLA) architecture is proposed, integrating visual, auditory, and game features to simulate human-like battlefield decision-making. The architecture employs an asynchronous advantage actor-critic (A3C) framework to enhance training efficiency and incorporates a memory module for processing historical perception data. Experimental validation in the Vizdoom environment demonstrates that MDSLA outperforms traditional rule-based methods and mainstream reinforcement learning algorithms in convergence speed and combat effectiveness. Furthermore, a parallel simulation mechanism is implemented via high-speed middleware, enabling seamless deployment of the model on both Vizdoom and a high-precision simulation platform (HPSP). Results from HPSP experiments show a 33% reduction in task execution time and a 24.1% improvement in lethality compared to finite state machine-driven agents. This work provides a scalable framework for developing intelligent combat agents with enhanced adaptability and realism in 3D virtual environments.
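To make the pipeline described in the abstract concrete, the following is a minimal sketch (not the authors' implementation) of a memory-driven actor-critic agent in PyTorch: visual, auditory, and game features are encoded separately, fused, and passed through an LSTM "memory" before the A3C policy and value heads. All module names, feature dimensions, and the Atari-style visual encoder are illustrative assumptions.

# Minimal sketch (assumptions, not the authors' code) of a memory-driven
# actor-critic agent: visual, auditory, and game-state features are fused
# and passed through an LSTM memory before the A3C policy/value heads.
import torch
import torch.nn as nn


class MemoryDrivenAgent(nn.Module):
    def __init__(self, n_actions: int, game_feat_dim: int = 8, audio_feat_dim: int = 16):
        super().__init__()
        # Visual encoder for 84x84 grayscale frames (Atari-style CNN, assumed).
        self.vision = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 256), nn.ReLU(),
        )
        # Small encoders for auditory and game (health, ammo, ...) features.
        self.audio = nn.Sequential(nn.Linear(audio_feat_dim, 32), nn.ReLU())
        self.game = nn.Sequential(nn.Linear(game_feat_dim, 32), nn.ReLU())
        # Memory module: an LSTM cell that carries historical perception across steps.
        self.memory = nn.LSTMCell(256 + 32 + 32, 256)
        # A3C heads: policy logits (actor) and state value (critic).
        self.policy = nn.Linear(256, n_actions)
        self.value = nn.Linear(256, 1)

    def forward(self, frame, audio_feat, game_feat, hidden):
        # Fuse the three perception streams, update the memory, then score actions.
        fused = torch.cat(
            [self.vision(frame), self.audio(audio_feat), self.game(game_feat)], dim=-1
        )
        h, c = self.memory(fused, hidden)
        return self.policy(h), self.value(h), (h, c)


if __name__ == "__main__":
    agent = MemoryDrivenAgent(n_actions=6)
    h = c = torch.zeros(1, 256)                 # initial empty memory
    frame = torch.zeros(1, 1, 84, 84)           # one grayscale observation
    logits, value, (h, c) = agent(frame, torch.zeros(1, 16), torch.zeros(1, 8), (h, c))
    print(logits.shape, value.shape)            # torch.Size([1, 6]) torch.Size([1, 1])

In an A3C setup, several worker processes would each run a copy of such a network against their own Vizdoom instance and asynchronously push gradients to a set of shared parameters; the memory state (h, c) is carried across time steps within an episode so that historical perception informs each decision.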

Author Biographies

Tianci Zhang, School of Automation, Beijing Institute of Technology, Beijing 100081, China

Tianci Zhang was born in Harbin, China in 1991. He received a B.Sc. degree in automation from Hangzhou Dianzi University, Hangzhou, China in 2013 and an M.Sc. degree in artificial intelligence from Beijing Institute of Technology, Beijing in 2015, and is currently pursuing a Ph.D. degree in artificial intelligence at Beijing Institute of Technology. He has authored more than 10 articles. His research interests include agent modeling, optimization algorithms, and military simulation.

Yongyong Wei, School of Astronautics, Beijing Institute of Technology, Beijing 100081, China

Yongyong Wei was born in Guangshui, China in 1978. He received a B.Sc. degree in electronic information from Beijing Institute of Technology in 2009. He has been a researcher in agent modeling and simulation application design at the Ordnance Science and Research Academy of China since 2010. He has authored more than 10 articles. His research interests include multi-domain simulation, LVC simulation, and complex system evaluation.

Hao Fang, School of Automation, Beijing Institute of Technology, Beijing 100081, China

Hao Fang was born in March 1973, and is a professor and doctoral supervisor. He obtained a doctoral degree from Xi’an Jiaotong University in 2002 and worked as a postdoctoral fellow at INRIA Sophia Antipolis, France, from April 2002 to July 2003. From November 2003 to December 2004, he was a postdoctoral fellow in the LASMEA laboratory of the French National Centre for Scientific Research (CNRS). He has taught at Beijing Institute of Technology since 2005. His main research interests include intelligent robots, parallel robots, and multi-agent systems.

References

P. K. Davis, ‘Military Applications of Simulation: A Selected Review’, Applied System Simulation: Methodologies and Applications, 2003: 407–435.

M. McPartland, M. Gallagher, ‘Reinforcement learning in first person shooter games’, IEEE Transactions on Computational Intelligence and AI in Games, 2010, 3(1): 43–56.

V. Mnih, K. Kavukcuoglu, D. Silver, et al., ‘Playing Atari with deep reinforcement learning’, arXiv preprint arXiv:1312.5602, 2013.

C. Berner, G. Brockman, B. Chan, et al., ‘Dota 2 with large scale deep reinforcement learning’, arXiv preprint arXiv:1912.06680, 2019.

J. Dansie, ‘Game Development in Unity: Game Production, Game Mechanics and the Effects of Gaming’, 2013.

O. Vinyals, I. Babuschkin, W. M. Czarnecki, et al., ‘Grandmaster level in StarCraft II using multi-agent reinforcement learning’, Nature, 2019, 575(7782): 350–354.

R. S. Sutton, A. G. Barto, ‘Reinforcement Learning: An Introduction’, A Bradford Book, MIT Press, 1998.

N. Osman, C. Sierra, ‘Autonomous agents and multi-agent systems’, Kluwer Academic Publishers, Massachusetts, 2008.

A. Tolk, ‘Engineering Principles of Combat Modeling and Distributed Simulation’, 2012.

J. Schoffel, ‘Half-life 2: deathmatch’, Australian PC User, 2005.

R. Chin, ‘First-Person Shooter Game Framework’, Beginning iOS 3D Unreal Games Development, 2012: 283–318.

E. Larsen, ‘Unreal Tournament III’, Personal Computer World, 2008, 31(5), p. 77.

D. Long, M. Fox, ‘Progress in AI planning research and applications’, UPGRADE: The European Journal for the Informatics Professional, 2002, 3(5): 10–25.

M. Humphrys, ‘Action selection methods using reinforcement learning’, From Animals to Animats, 1996, 4: 135–144.

T. Bewley, J. Lawry, A. Richards, ‘Modelling agent policies with interpretable imitation learning’, International Workshop on the Foundations of Trustworthy AI Integrating Learning, Optimization and Reasoning. Cham: Springer International Publishing, 2020: 180–186.

H. M. Le, Y. Yue, P. Carr, et al., ‘Coordinated multi-agent imitation learning’, International Conference on Machine Learning. PMLR, 2017: 1995–2003.

M. L. Littman, ‘Markov games as a framework for multi-agent reinforcement learning’, Machine learning proceedings 1994. Morgan Kaufmann, 1994: 157–163.

M. Hannebauer, J. Wendler, E. Pagello, et al., ‘Situation based strategic positioning for coordinating a team of homogeneous agents’, Balancing Reactivity and Social Deliberation in Multi-Agent Systems: From RoboCup to Real-World Applications. Springer Berlin Heidelberg, 2001: 175–197.

L. Zhang, Y. Gu, X. Zhao, et al., ‘Generalizing soft actor-critic algorithms to discrete action spaces’, Chinese Conference on Pattern Recognition and Computer Vision. Singapore: Springer Nature Singapore, 2024: 34–49.

P. R. Gorton, A. Strand, K. Brathen, ‘A survey of air combat behavior modeling using machine learning’, arXiv preprint arXiv:2404.13954, 2024.

V. Karpov, V. Vorobiev, et al., ‘About some aspects of finite state machine models application to group control’, Mekhatronika, Avtomatizatsiya, Upravlenie, 2023.

J. Li, J. Su, Q. Gu, et al., ‘Behavior Tree Generation Study for Multi-agent’, International Conference on Man-Machine-Environment System Engineering. Singapore: Springer Nature Singapore, 2024: 503–509.

S. Taghipour, H. A. Namoura, M. Sharifi, et al., ‘Real-time production scheduling using a deep reinforcement learning-based multi-agent approach’, INFOR: Information Systems and Operational Research, 2024, 62(2): 186–210.

S. N. A. Jawaddi, A. Ismail, ‘Integrating OpenAI Gym and CloudSim Plus: A simulation environment for DRL agent training in energy-driven cloud scaling’, Simulation Modelling Practice and Theory, 2024, 130: 102858.

T. Papagiannis, G. Alexandridis, A. Stafylopatis, ‘Boosting Deep Reinforcement Learning Agents with Generative Data Augmentation’, Applied Sciences, 2023, 14(1): 330.

D. Ye, G. Chen, W. Zhang, et al., ‘Towards playing full moba games with deep reinforcement learning’, Advances in Neural Information Processing Systems, 2020, 33: 621–632.

S. Risi, M. Preuss, ‘From chess and atari to starcraft and beyond: How game AI is driving the world of AI’, KI-Künstliche Intelligenz, 2020, 34(1): 7–17.

C. Badica, L. Braubach, A. Paschke, ‘Rule-based distributed and agent systems’, Rule-Based Reasoning, Programming, and Applications: 5th International Symposium, RuleML 2011–Europe, Barcelona, Spain, July 19-21, 2011. Proceedings 5. Springer Berlin Heidelberg, 2011: 3–28.

E. Bonabeau, ‘Agent-based modeling: Methods and techniques for simulating human systems’, Proceedings of the National Academy of Sciences, 2002, 99(suppl_3): 7280–7287.

C. M. Macal, M. J. North, ‘Tutorial on agent-based modeling and simulation’, Proceedings of the Winter Simulation Conference, 2005. IEEE, 2005: 14 pp.

Bohemia Interactive, ‘Virtual Battle Space 4 product brochure’, 2024. Retrieved from https://bisimulations.com/products/vbs4.

M. Kempka, M. Wydmuch, G. Runc, et al., ‘ViZDoom: A Doom-based AI research platform for visual reinforcement learning’, 2016 IEEE Conference on Computational Intelligence and Games. IEEE, 2016: 1–8.

G. Lample, D. S. Chaplot, ‘Playing FPS games with deep reinforcement learning’, Proceedings of the AAAI Conference on Artificial Intelligence, 2017, 31(1).

Unity, ‘Unity ML-Agents Introduction’, 2024. Available: https://unity.cn/product/machine-learning-agents.

H. da Silva Corrêa Pinto, L. O. Alvares, ‘An extended behavior network for a game agent: An investigation of action selection quality and agent performance in unreal tournament’, Mexican International Conference on Artificial Intelligence. Berlin, Heidelberg: Springer Berlin Heidelberg, 2005: 287–296.

K. Adil, F. Jiang, S. Liu, et al., ‘Training an agent for fps doom game using visual reinforcement learning and Vizdoom’, International Journal of Advanced Computer Science and Applications, 2017, 8(12).

S. Hegde, A. Kanervisto, A. Petrenko, ‘Agents that listen: High-throughput reinforcement learning with multiple sensory systems’, 2021 IEEE Conference on Games. IEEE, 2021: 1–5.

Poznan University of Technology, ‘Vizdoom Documentation’, 2024. Retrieved from https://vizdoom.cs.put.edu.pl/.

Published

2025-08-26

How to Cite

Zhang, T., Wei, Y., & Fang, H. (2025). A Memory Driven Self-learning Combat Agent Architecture in a 3D Virtual Environment. Journal of Web Engineering, 24(05), 687–712. https://doi.org/10.13052/jwe1540-9589.2451

Issue

Vol. 24 No. 05 (2025)

Section

Advanced Practice in Web Engineering in Asia