Hardware Accelerated Design of Millimeter Wave Antireflective Surfaces: A Comparison of Field-Programmable Gate Array (FPGA) and Graphics Processing Unit (GPU) Implementations

Authors

  • Ozlem Kilic, Department of Electrical Engineering and Computer Science, The Catholic University of America, Washington, DC 20064, USA
  • Charles Conner, Department of Electrical Engineering and Computer Science, The Catholic University of America, Washington, DC 20064, USA
  • Miaoqing Huang, Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR 72701, USA
  • Mark S. Mirotznik, Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA

Abstract

Engineered materials that exhibit a specific response to incident electromagnetic energy are in high demand for antenna and radio frequency component design, driven by both military and commercial needs. Designing such materials typically requires numerically intensive simulations, either because the structure has electrically small features spread over a large area or because the overall system performance is required, which means modeling the entire integrated system. Furthermore, reaching an optimal design requires running these simulations many times until a desired solution is found, which is a major obstacle to arriving at a feasible solution in a reasonable amount of time. One example of such an application is the design of antireflective (AR) surfaces at millimeter wave frequencies, which often involves sub-wavelength gratings in an electrically large multilayer structure. This paper investigates the use of field-programmable gate arrays (FPGAs) and graphics processing units (GPUs) as coprocessors to the CPU in order to reduce the computation time. Preliminary results show that the hardware implementation (100 MHz) on a Xilinx Virtex-4 LX200 FPGA outperforms a single-thread software implementation on an Intel Itanium 2 processor (1.66 GHz) by a factor of 20. However, the FPGA implementation lags behind a single-thread implementation on a modern Xeon processor (2.26 GHz) by a factor of 3.6. Modern GPUs, on the other hand, demonstrate a clear advantage over both the CPU and the FPGA, achieving a 20-fold speedup over the Xeon processor.
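To put the three quoted ratios on a common scale, the short sketch below chains them together. It is only an illustration of the arithmetic: it assumes that all three factors refer to the same workload and that "lags by 3.6" means the Xeon is 3.6 times faster than the FPGA. The function and variable names are hypothetical, and the derived figures are implied by the quoted ratios rather than reported in the abstract.

```python
# Speedup factors quoted in the abstract (single-thread CPU baselines).
FPGA_OVER_ITANIUM = 20.0  # Virtex-4 LX200 FPGA vs. Itanium 2
XEON_OVER_FPGA = 3.6      # Xeon vs. FPGA (assumed reading of "lags by 3.6")
GPU_OVER_XEON = 20.0      # GPU vs. Xeon


def implied_speedups():
    """Chain the quoted ratios.

    Assumption: every ratio was measured on the same workload, so the
    factors compose by multiplication. These are implied values, not
    measurements from the paper.
    """
    xeon_over_itanium = FPGA_OVER_ITANIUM * XEON_OVER_FPGA
    gpu_over_fpga = XEON_OVER_FPGA * GPU_OVER_XEON
    gpu_over_itanium = xeon_over_itanium * GPU_OVER_XEON
    return {
        "xeon_over_itanium": xeon_over_itanium,
        "gpu_over_fpga": gpu_over_fpga,
        "gpu_over_itanium": gpu_over_itanium,
    }


if __name__ == "__main__":
    for name, factor in implied_speedups().items():
        print(f"{name}: about {factor:.0f}x")
```

Under these assumptions the GPU would be roughly 72 times faster than the FPGA implementation, which is the comparison the title of the paper highlights.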

Published

2022-05-02

Section

General Submission