Using GPUs for Accelerating Electromagnetic Simulations
Abstract
The computational power and memory bandwidth of graphics processing units (GPUs) have turned them into attractive platforms for general-purpose applications, with significant speedups over their CPU counterparts [1]. In addition, an increasing number of today's state-of-the-art supercomputers include commodity GPUs to deliver unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. Inspired by the latest trends and developments in GPUs, we propose a new paradigm for implementing on GPUs some of the major aspects of electromagnetic simulations, a domain traditionally used as a benchmark for codes running on some of the most expensive and powerful supercomputers worldwide. After reviewing related achievements and ongoing projects, we provide a guideline for exploiting SIMD parallelism and high memory bandwidth using the CUDA programming model and the hardware architecture offered by Nvidia graphics cards at an affordable cost. As a result, performance gains of several orders of magnitude can be attained versus thread-level methods, such as pthreads, used to run those simulations on emerging multicore architectures.
References
[1] GPGPU, “General-purpose computation using graphics hardware,” http://www.gpgpu.org, 2009.
[2] J. D. Owens, D. Luebke, N. Govindaraju, M. Harris, J. Kruger, A. E. Lefohn, and T. J. Purcell, “A survey of general-purpose computation on graphics hardware,” Journal of Computer Graphics Forum, vol. 26, pp. –51, 2007.
[3] S. Guha, S. Krishnan, and S. Venkatasubramanian, “Data visualization and mining using the GPU,” Tutorial at 11th ACM Intl. Conference on Knowledge Discovery and Data Mining, 2005.
[4] N. K. Govindaraju, B. Lloyd, W. Wang, M. Lin, and D. Manocha, “Fast Computation of Database Operations Using Graphics Processors,” ACM SIGMOD International Conference on Management of Data, pp. –226, 2004.
[5] R. Yang and M. Pollefeys, “A Versatile Stereo Implementation on Commodity Graphics Hardware,” Real Time Imaging, vol. 11, no. 1, pp. 7–18, February 2005.
[6] T. Sumanaweera and D. Liu, “Medical Image Reconstruction with the FFT,” GPU Gems, March 2004.
[7] I. Viola, A. Kanitsar, and M. E. Groller, “Hardware Based Nonlinear Filtering and Segmentation Using High-Level Shading Languages,” IEEE Visualization, pp. 309–, October 2003.
[8] M. Hadwiger, C. Langer, H. Scharsach, and K. Buhler, “State of the art report on GPU-based segmentation,” VRVis Research Center, Tech. Rep. TR-VRVIS-2004-17,
[9] W. Wu and P. Heng, “A hybrid condensed finite element model with GPU acceleration for interactive 3D soft tissue cutting: Research articles,” Computer Animation and Virtual Worlds, vol. 15, no. 3-4, pp. 219–, 2004.
[10] M. Harris, “Fast Fluid Dynamics Simulation on the GPU,” GPU Gems, 2004.
[11] P. Sander, N. Tatarchuk, and J. L. Mitchell, “Explicit Early-Z Culling for Efficient Fluid Flow Simulation and Rendering,” ATI Research Journal Technical Report, August
[12] Y. Zhao, Y. Han, Z. Fan, F. Qiu, Y. Kuo, A. Kaufman, and K. Mueller, “Visual simulation of heat shimmering and mirage,” IEEE Trans. on Visualization and Computer Graphics, vol. 13, no. 1, pp. 179–189, 2007.
[13] CUDA, “Home page maintained by Nvidia,” http://developer.nvidia.com/object/cuda.html.
[14] Brook+, “Web Page maintained by AMD,” http://ati.amd.com/technology/streamcomputing/AMD-Brookplus.pdf, 2009.
[15] “Nvidia Tesla GPU computing solutions for HPC,” http://www.nvidia.com/object/tesla_computing_solutions.html, 2009.
[16] Firestream, “AMD Stream Computing,” http://ati.amd.com/technology/streamcomputing.
[17] The Khronos Group, “The OpenCL Core API Specification, Headers and Documentation,” http://www.khronos.org/registry/cl, 2009.
[18] E. Kelmelis, J. Durbano, P. Curt, and J. Zhang, “Field-programmable gate array accelerates FDTD calculations,” Laser Focus World, September 2006.
[19] S. E. Krakiwsky, L. E. Turner, and M. M. Okoniewski, “Acceleration of finite-difference time-domain (FDTD) using graphics processing units (GPU),” IEEE MTT-S Int. Conference, June 2004.
[20] T. Hartley, U. Catalyurek, A. Ruiz, M. Ujaldon, F. Igual, and R. Mayo, “Biomedical Image Analysis on a Cooperative Cluster of GPUs and Multicores,” 22nd ACM Intl. Conf. on Supercomputing, 2008.
[21] P. So, “EM-based simulation tools for signal and systems analysis,” International Symposium on Signals, Systems and Electronics, August 2007.
[22] M. Harris, “Manycore parallel computing with CUDA,” Keynote Session at the 22nd ACM Intl. Conference on Supercomputing, June 2008.
UJALDON: USING GPUS FOR ACCELERATING ELECTROMAGNETIC SIMULATIONS
[23] Ageia, “The PhysX co-processor,” http://www.nvidia.com/object/nvidia_physx.html.
[24] T. R. Halfhill, “Parallel Processing with CUDA,” Microprocessor Report Online, January 2008.
[25] Nvidia Compute Unified Device Architecture (CUDA) Programming Guide v. 1.1, Nov. 2007.
[26] Nvidia CUDA CUBLAS Library v. 1.1, Sep.
[27] Nvidia CUDA CUFFT Library v. 1.1, Oct.