Limits for Computational Electromagnetics Codes Imposed by Computer Architecture
Abstract
The innermost loops that determine the complexity of algorithms in computational electromagnetics (CEM) codes are analyzed according to their operation count and the impact of the underlying computer hardware. Because memory chips are much slower than arithmetic processors, codes that move a large amount of data relative to the number of arithmetic operations execute comparatively slowly. Hence, matrix-matrix multiplications are much faster, per arithmetic operation, than matrix-vector multiplications. It is therefore not sufficient to compare only the complexity of algorithms; their actual performance must also be compared to judge which executes faster. The implications concern FDTD loops, LU factorizations, and iterative solvers for dense matrices. Run times on two reference platforms, an Athlon 900 MHz and an HP PA 8600 processor, verify the findings.
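The contrast between matrix-vector and matrix-matrix products can be illustrated with a rough count of arithmetic operations per data word moved. The sketch below is not taken from the paper: the function names are hypothetical, and it assumes square n-by-n matrices and ideal cache reuse, so that each operand is transferred between memory and processor only once.

```python
def matvec_counts(n):
    """Flops and data words for y = A x with an n-by-n matrix A (assumed model)."""
    flops = 2 * n * n        # one multiply and one add per matrix entry
    words = n * n + 2 * n    # read A and x, write y
    return flops, words


def matmat_counts(n):
    """Flops and data words for C = A B with n-by-n matrices (assumed model)."""
    flops = 2 * n ** 3       # n multiply-add pairs for each of the n*n entries of C
    words = 3 * n * n        # read A and B, write C, assuming ideal cache reuse
    return flops, words


def intensity(flops, words):
    """Arithmetic operations performed per data word moved."""
    return flops / words


if __name__ == "__main__":
    for n in (100, 1000):
        mv = intensity(*matvec_counts(n))
        mm = intensity(*matmat_counts(n))
        print(f"n={n}: matvec {mv:.2f} flops/word, matmat {mm:.2f} flops/word")
```

Under these assumptions the matrix-vector product stays below 2 operations per word for any n, while the matrix-matrix product reaches 2n/3. This is one way to see why the latter can keep the arithmetic units busy while the former is limited by memory speed.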


