Novel Parallelization of Discontinuous Galerkin Method for Transient Electromagnetics Simulation Based on Sunway Supercomputers
DOI:
https://doi.org/10.13052/2022.ACES.J.370706
Keywords:
discontinuous Galerkin method, electromagnetic analysis, memory access optimization, Sunway TaihuLight
Abstract
A novel parallelization of the discontinuous Galerkin time-domain (DGTD) method, hybridized with the local time step (LTS) method, is proposed for electromagnetic simulation on Sunway supercomputers. The proposed method includes a minimum number of roundtrip (MNR) strategy for processor-level parallelism and a double buffer strategy based on the remote memory access (RMA) of the Sunway processor. The MNR strategy optimizes the communication topology between nodes by recursively establishing the minimum spanning tree, and the double buffer strategy is designed to overlap communication with computation during RMA transmission. By combining the two strategies, the proposed method achieves unprecedented massive parallelism for the DGTD method. Several radiation and scattering examples are used to study the accuracy and validity of the proposed method. The numerical results show that the proposed method can effectively support parallelism on 16,000 nodes (1,040,000 cores) of the Sunway supercomputer, which enables the DGTD method to solve the transient electromagnetic field in a very short time.
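The abstract does not spell out the MNR strategy, so the following is only a minimal sketch of its stated building block: constructing a minimum spanning tree over a communication graph. It uses the generic Kruskal algorithm, not the authors' recursive procedure. The function name `minimum_spanning_tree`, the four-node graph, and the edge costs are all hypothetical, chosen purely for illustration.

```python
def minimum_spanning_tree(num_nodes, edges):
    """Generic Kruskal's algorithm (an illustration, not the paper's MNR strategy).

    edges is an iterable of (cost, u, v) tuples, with nodes numbered
    from 0 to num_nodes - 1. Returns the chosen edges as (u, v, cost).
    """
    parent = list(range(num_nodes))  # union-find forest, one set per node

    def find(x):
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for cost, u, v in sorted(edges):  # cheapest links first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the link joins two separate components
            parent[root_u] = root_v
            tree.append((u, v, cost))
    return tree


# Hypothetical communication costs between four compute nodes.
edges = [(4, 0, 1), (1, 0, 2), (3, 1, 2), (2, 1, 3), (5, 2, 3)]
tree = minimum_spanning_tree(4, edges)
total_cost = sum(cost for _, _, cost in tree)
print(tree, total_cost)
```

In this toy graph the tree keeps three of the five links, the fewest that still connect all four nodes. How the paper applies the tree recursively to reduce communication roundtrips is not described in the abstract and is not modeled here.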