A Novel Enhancing Technique for Parallel FDTD Method using Processor Affinity and NUMA Policy
Keywords:
NUMA, parallel FDTD, processor affinity, SMP
Abstract
Traditionally, the multiple CPUs mounted on one node of a high performance cluster are based on the Symmetric Multi-Processing (SMP) architecture. Memory bandwidth is a major bottleneck in high performance computing. Recently, Intel and AMD developed the Non-Uniform Memory Access (NUMA) architecture for multi-CPU servers, an important extension of the SMP computer. In a NUMA server, each CPU has its own local memory and can also access the memory attached to the other CPUs through the onboard network. For a parallel code, we can allocate the data for each CPU in its local memory to accelerate memory access. In this paper, we investigate how to achieve high performance for a parallel FDTD code on a computer cluster of 21 nodes with 42 CPUs and 168 cores. Numerical experiments demonstrate that different job binding schemes can significantly affect the performance of the parallel FDTD code.
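To make the idea of a job binding scheme concrete, the following is a minimal sketch and not the scheme used in the paper. It assumes 2 CPUs per node and 4 cores per CPU (derived from 21 nodes, 42 CPUs, and 168 cores), and it assumes core IDs are numbered contiguously per CPU (0-3 on the first CPU, 4-7 on the second), which varies between systems. The names `core_for_rank`, `pin_to_core`, "compact", and "scatter" are hypothetical labels chosen for illustration.

```python
import os

# Assumed node layout, derived from the cluster totals in the abstract.
SOCKETS_PER_NODE = 2   # 42 CPUs / 21 nodes
CORES_PER_SOCKET = 4   # 168 cores / 42 CPUs


def core_for_rank(local_rank, scheme="compact"):
    """Return the core ID that a node-local process rank would be bound to.

    Assumes core IDs are contiguous per socket (0-3, then 4-7).
    """
    cores_per_node = SOCKETS_PER_NODE * CORES_PER_SOCKET
    if not 0 <= local_rank < cores_per_node:
        raise ValueError("local_rank must be in [0, %d)" % cores_per_node)
    if scheme == "compact":
        # Fill the first socket completely before using the second.
        return local_rank
    if scheme == "scatter":
        # Alternate between sockets so consecutive ranks use different CPUs.
        socket = local_rank % SOCKETS_PER_NODE
        slot = local_rank // SOCKETS_PER_NODE
        return socket * CORES_PER_SOCKET + slot
    raise ValueError("unknown scheme: %r" % scheme)


def pin_to_core(core):
    """Bind the calling process to one core; return True if the binding was applied.

    os.sched_setaffinity is available on Linux only, so this is a no-op elsewhere.
    """
    if not hasattr(os, "sched_setaffinity"):
        return False
    if core not in os.sched_getaffinity(0):
        return False
    os.sched_setaffinity(0, {core})
    return True


if __name__ == "__main__":
    for scheme in ("compact", "scatter"):
        print(scheme, [core_for_rank(r, scheme) for r in range(8)])
```

Under the default Linux memory policy, pages are usually placed on the NUMA node of the core that first touches them, so a process that is bound before it allocates and initializes its FDTD field arrays tends to keep that data in local memory.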