An Approach to the Implementation of Laplace and a Broadband Helmholtz Fast Multipole Method as an Application Independent Library

Sanjay Velamparambil

Ansys Inc., 1995 N. 57th Court, Suite 100, Boulder, CO-80301, USA
sanjay.velamparambil@ansys.com

Submitted On: August 9, 2023; Accepted On: July 8, 2024

ABSTRACT

In this paper, we propose an approach to develop an application independent library of Laplace and Helmholtz fast multipole method (FMM) that can be used in different applications. For this purpose, we consider a generalized problem and a corresponding canonical problem (defined below). In the first main contribution, we show that it is possible to capture the essential characteristics of the canonical summation from sampling the values of certain potentials or signature functions. In the second main contribution, we show that partial derivatives of arbitrary orders acting on the far field can be represented as product of sparse matrices within the library, transparent to the user. Combining the two ideas, we show that once the FMM is configured to compute the canonical summation, the same setup can be used to work with a much wider, general class of problems.

Index Terms: Fast multipole methods, integral equations.

I. INTRODUCTION

In this paper, we describe an approach to develop a stand-alone, application independent library of the fast multipole method (FMM) [1, 2, 3] to be used in different physical applications and developed by multiple teams with differing backgrounds. In particular, we are concerned with the evaluation of summations of the form:

ym =n=1NAmnxn,1mM, (1a)
with:
Amn =Ωgm(x)Ω{\§𝒢§§ΩΩ (1b)
u,v=x,y,z;x,x3,
where and are linear partial differential operators with constant coefficients and:
G(x,x) ={1x-xfor the Laplace Equationeikx-xx-xfor the Helmholtz Equation. (1c)

Note that, in general, ΩΩ. Furthermore, following the terminology used by the method of moments (MoM) practitioners in electromagnetics, we shall refer to fj(x) and gi(x) as the basis and testing functions, respectively, with the understanding that they can be Dirac δ-functions to represent “point” particles.

Such computations can arise from a wide variety of applications, such as electromagnetics, acoustics and elastic wave scattering. Since its development in the 1980s [4, 5] for the Laplace equation to the elegant extension to the Helmholtz and Maxwell equations in the 1990s [2, 3, 6], FMM and its multilevel variants have transformed computational physics, especially, computational electromagnetics. It is beyond the scope of this paper to provide a comprehensive overview of these remarkable developments. Instead, we refer the reader to the recent review by He et al. [7].

Although the theory of FMM has been well established and many high quality implementations exist, both in the public domain and in proprietary products, the development of an application independent library is still a challenging problem due to various practical constraints such as:

Different formulations and physics lead to different choices for and .

Different domain and geometry; the integrals could be defined over surfaces or volumetric regions, discretized with surface elements such as flat or curvilinear triangles and/or quadrilaterals and volume elements such as tetrahedra. It is also possible to have mixed formulations involving both surface elements and volume elements. The quality of such a mesh can often be poor, with highly non-uniform elements.

Different basis and testing procedures in reducing the integral equations to a matrix equation; note that each basis and testing function can be supported by one or more mesh elements.

Familiarity/background knowledge of the library user. Even though one can expect the users to have a sound understanding of the integral equation and the underlying physics, it is not necessary that they have a sufficient understanding of FMM to develop the components needed to integrate their code with a library.

We illustrate this with an example. In an implementation of Laplace FMM that uses spherical harmonics, the library user may have to provide the multipole coefficients of basis/testing functions. Although standard implementations of spherical harmonics are available, it may be tricky to ensure consistency between third party implementation of spherical harmonics and the ones used in the FMM implementation.

One of the ways to address these issues is to replace the basis and testing functions with weighted (point) “particles” as shown in Fig. 1. Although this approach simplifies the implementation of the library, it has several drawbacks. First, as illustrated in Fig. 1, replacing mesh elements and basis/testing functions with weighted particles generally leads to incompatible partitioning of the geometry, requiring error-prone bookkeeping, complicated data interchange between the application and the library, and messy corrections in the near field. In addition, given that the number of quadrature points is generally larger than the number of basis/testing functions, the matrix as seen by the library tends to be larger as well, potentially affecting the performance.

images

Figure 1: The two triangles are near to each other, but some of the quadrature points can still be well separated (shown in ellipses) leading to tricky bookkeeping and near field corrections.

Therefore, the objective of this paper is to present an approach that can handle a wide class of problems through the use of good abstractions to capture the physics, the formulation and the discretization. The main contributions of this work are:

Abstractions to characterize the geometry of the elements in a way that allows consistent handling of non-uniform meshes.

A technique to compute required multipole coefficients of the shape/basis and testing functions from sampled values of their corresponding potentials or signature functions which are easy to define and simple to evaluate, without involving any special functions.

We show that the partial derivatives of the far field, represented by multipole or plane wave expansions, can be represented as products of sparse matrices independent of the basis and testing functions allowing the user to compute these derivatives without having to directly work with multipole expansions.

Finally, we demonstrate that once the library is integrated into an existing code to compute what we call the canonical problem, defined in equation (4), the same implementation can be used to evaluate summations in equation (1) without modifying the library and without adding new subroutines to the application code.

II. PRELIMINARIES

A. General class of problems

We assume that the reader is familiar with the basic mathematical formulation and the octree based computational structure of the Laplace and Helmholtz multi-level fast multipole methods (MLFMM). Except when specificity is needed, we shall use the abbreviation MLFMM to refer to either of the two versions.

In the context of MLFMM, we begin by noting that if gm and fn in equation (1) belong to well separated boxes, the Green’s function in equation (1) does not involve any singularities. Since:

Gu =-Gu,u=x,y,z, (2)

whenever xx, the differential operator appearing on the source point x can be transferred to the observation point x when gm and fn are geometrically well separated. Therefore, for far field computations, it is possible to transform equation (1b) to a linear combination of integrals of the form:

Amn=Ωgm(x)𝒟Ω{\§𝒢§§ΩΩ (3a)
where 𝒟§§§ (3b)
with p,q and r being non-negative integers.

One of the key results presented in this work is that the differential operator 𝒟 can be exactly represented as a product of certain sparse matrices in the calculation of far interactions. Anticipating this result, we define a new summation:

ym =n=1NPmnxn,1mM, (4a)
where Pmn =Ωgm(x)Ωfn(x)G(x,x)dΩdΩ, (4b)
and refer to it as the canonical problem.

B. Some notations and definitions

If r=x and x^=xx for x, we shall often write x=(r,x^)=(r,θ,ϕ) in spherical coordinates. We shall denote the unit sphere centered at the origin by Ω0={r3:|r|=1} and will typically use the symbol s^=(1,θ,ϕ) to denote a vector on Ω0.

Following [8, 9], we define the associated Legendre function Pnm(x) of degree n>0 and order -nmn and by the formula:

Pnm(x) =(-1)m(1-x2)m/22nn!dn+mdn+mx(x2-1)n. (5a)

Following Messiah [9], we define the spherical harmonic Ynm(θ,ϕ) of degree n and order m by the relation:

Ynm(θ,ϕ) =[2n+14π]1/2[(n-m)!(n+m)!]1/2Pnm(cosθ)eimϕ. (5b)

We have:

Yn-m(θ,ϕ)=(-1)m[Ynm(θ,ϕ)], (5c)
0π02πYnm(θ,ϕ)Ynm(θ,ϕ)sinθdθdϕ =δnnδmm, (5d)

where denotes the complex conjugate.

Let x=rx^=(r,θ,ϕ). We define inner and outer multipole functions for the Laplace equation by the formulae [10]:

Onm(x)=4π2n+1(-1)ni-mAnmYnm(θ,ϕ)rn+1, (6a)
Inm(x)=4π2n+1imAnmrnYnm(θ,ϕ), (6b)
where Anm=(-1)n(n-m)!(n+m)!. (6c)

Similarly, for the Helmholtz equation, we define the inner and outer functions, denoted by I~nm(x) and O~nm(x), respectively, by the formulae [10]:

I~nm(x) =jn(kr)Ynm(x^), (7a)
O~nm(x) =hn(kr)Ynm(x^), (7b)
where jn(kr) and hn(kr) are, respectively, spherical Bessel and Hankel functions of the first kind and of order n.

III. ABSTRACTIONS FOR THE FAR FIELD COMPUTATIONS

We first develop the abstractions needed to evaluate the canonical summation in equation (4). Assuming the existence of an octree structure, let Bs and Br be the boxes in which the basis function fj and the testing function gi, respectively, reside1. Referring to Fig. 2, let:

BsDs =def{x3||x-cs|<Rs}, (8)
BrDr =def{x3||x-cr|<Rr}. (9)

We assume that suppfjDs and suppgiDr and that D¯sD¯r=.

images

Figure 2: The basis function, fj, and the testing function, gi, (“generalized particles”) are enclosed in spheres Ds, with center at cs and Dr, with center at cr, respectively, under the condition that DsDr=.

To represent the matrix element in equation (4), we first let:

ψj(x) =suppfjfj(x)G(x,x)dx, (10a)
Pij =suppgigi(x)ψj(x)dx. (10b)

For a given accuracy ϵ>0, let N0 be given. Let 𝒪nm(x) and nm(x), respectively, denote the outer and inner multipoles of either Laplace or Helmholtz kernel. Then, we recall the following foundational results from the theory of MLFMM [2, 10]:

The potential ψj(x) can be approximated by a finite outer multipole series valid outside Ds:

ψj(x) n=0Nm=-nnαnm𝒪nm(x-cs),xD¯s. (11a)

The potential ψj(x) can be approximated as an inner multipole series valid inside Dr:

ψj(x) n=0Nm=-nnβnmnm(x-cr),xDr. (11b)

There exists a linear operator, 𝒯, called a translation operator, relating the coefficients {αnm} and {βnm} by:

βnm =p=0Nq=-pq𝒯\α (11c)

To evaluate Pij, we substitute equation (11b) into equation (10b), and obtain:

Pij n=0min{N,N}m=-nnβnmρnm, (11d)
ρnm =defsuppgigi(x)nm(x-cr)dx. (11e)

Borrowing the terminology from Laplace FMM parlance, we shall refer to αnm as the “Q2M” coefficients and ρnm as the “L2P” coefficients.

Expressing the Q2M and L2P coefficients in terms of multipole functions presents a challenge to implementation when multiple teams are involved. For example, a team developing a stand-alone library of FMM, to be used by multiple product teams, may not be deeply familiar with the applications themselves. On the other hand, a product team, although experts in, say, integral equations, may have little or no familiarity with FMM. Given that it is these coefficients that couple the application side (integral equation) with the FMM implementation, it is imperative that they are evaluated correctly and efficiently.

An obvious way to do this is to let the FMM library be aware of the basis and testing functions, and thus let the library evaluate these functions. This has the advantage that if the library decides to change, say, the definitions of inner functions, it will remain transparent to the application. However, this approach has the severe disadvantage that the library is deeply tied to a particular class of basis and testing functions. It is then left with the task of potentially adding a new class of basis and testing functions for every new application. This is inefficient and will eventually make the code unmaintainable.

A second approach is to let the application evaluate the coefficients (αnm,ρnm) using formulae similar to those in equation (11d) and then hand-over the results to the FMM kernel. This has the great advantage that the FMM library remains completely decoupled from the application. However, this approach has the disadvantage that the application team must have some familiarity with FMM and multipoles. This may not always be the case.

Therefore, our goal is to evaluate the coefficients αnm and ρnm within the library without exposing the functions (𝒪nm,nm) to the application.

We now present an approach that eliminates these difficulties and it is inspired by ideas from [11, 12, 13, 14].

A. Evaluation of Q2M and L2P coefficients for the Laplace equation

We begin by noting the basis function fj is defined by the application and that the application “knows” how to evaluate ψj(x) in equation (10a) at any point x3. Moreover, if xsuppfj, the integral can be accurately evaluated using simple quadrature.

Consider a sphere S=def{x3||x-cs|=R>Rs} so that Ssuppfj=. Let xS. Then x=cs+Rs^ where s^Ω0. Then, substituting the definition of Onm from equation (6a) into equation (11a) and using the orthogonality of Ynm(s^) from equation (5d), we obtain the well known relation (see, for example [14, equations 4-5]):

αnm =im2n+14π(n-m)!(n+m)!Rn+1
Ω0ψj(cs+Rs^)Ynm(s^)dΩs. (12)

Now assume that the degree of approximation used in the FMM is p>0. Then, the maximum degree of spherical harmonics under the integral that is relevant to the computations is 2p. Therefore, if we use a quadrature, such as a composite Gauss-Legendre-Trapezoidal rule, that can integrate spherical harmonics of degree 2p exactly, we will be able to compute the coefficients exactly as well. Let Nq be the number of quadrature points, {(s^k,wk)}, be quadrature points and their corresponding weights on Ω0, respectively, and let xk=cs+Rs^k, for 1kNq. Then, equation (12) can be approximated as:

α~nm im2n+14π(n-m)!(n+m)!Rn+1
k=1Nqwkψj(xk)Ynm(s^k). (13)

Therefore, to compute the coefficients αnm, all the library needs are the values of ψj(x), given by equation (10), at a set points on a specified sphere, which the application can readily supply. As remarked earlier, since the observation points xk are chosen to be far away from the support of fj, the integrand in (10a) is non-singular and hence ψj(xk) can be evaluated using simple quadrature rules.

Using the addition theorem [10]:

1x-x =n=0m=-nn(-1)nIn-m(x-cs)Onm(x-cs), (14a)
for x-cs>x-cs in equation (10a) and comparing with equation (11a), it is easy to see that:
αnm =(-1)nsuppfjfj(x)In-m(x-cs)dx. (14b)

Comparing equation (14b) with the definition of ρnm in equation (11d), we see that:

ρnm=(-1)nαn-m, (15)

for the fj if it were also used as a testing function in a Galerkin MoM. Since:

Ωgm(x)Ωfn(x)G(x,x)dΩdΩ
=Ωfn(x)Ωgm(x)G(x,x)dΩdΩ,

it follows that we can use exactly the same technique (equations (12) and (13)) for computing ρnm of any testing function gi by evaluating the corresponding Q2M coefficients and using the relation in equation (15).

By this process, we have thus decoupled the application and the library for computing the L2P and Q2M coefficients for the Laplace equation: the application does not have to work with Inm functions and the library does not have to know the nature of the basis and testing functions, (fj,gi).

A.1. Comparison of performance

Given a multipole degree p>0 and using a composite Gauss-Trapezoidal rule with 2(p+1)2 quadrature points, it is easy to see that the standard approach to computing Q2M (and L2P) coefficients and the potential based approach proposed above have the same asymptotic complexity. However, the proposed method has a larger constant associated with it. We now demonstrate that this additional cost for engineering a simplified interface is negligible in the context of the much larger initialization time (the time for the matrix-vectors are not affected).

images

Figure 3: Comparing the time for evaluating Q2M coefficients using standard (“Direct Multipole”) and using sampled potentials (“Potential”) for various test cases. We have also plotted the total setup time to provide the larger context.

To do this, we have taken 30 distinct triangular meshes ranging in size from 200 to about 1.2 million. They were generated by Ansys Q3D Extractor for different test cases and do not share a common geometry. For our purposes, it is sufficient to replace the triangles with point sources at their centroids. The canonical matrix in equation (4) is then constructed with a multipole degree of p=5 which gives roughly 4 digits of accuracy in the matrix vector products. All the computations were done on a single core of a virtual machine with two Intel(R) Xeon(R) 6238R @2.20GHz processors with 182GB of RAM. In Fig. 3, we compare the time for evaluating the Q2M coefficients using the standard approach and from the potential samples.

It is important to note that the bulk of the initialization/setup time is spent in computing the near matrix and that the near matrix evaluation for concentrated sources gives the least possible time. This is because the commonly used basis functions require more expensive handling of singularities and near singularities. In other words, we are comparing the cost of Q2M evaluations against the best case scenario for the rest of the initialization. It is clear from Fig. 3 that although the proposed method is slightly more expensive, the increase is negligible when compared with the overall setup time. We have also compared the accuracy of the computed matrix vector products and the largest 2-norm error observed was less than 2×10-4, which is comparable to the overall precision of the matrix-vector products corresponding to the degree of multipole approximation p=5.

B. Evaluation of Q2P and L2P coefficients for the Helmholtz equation

In the case of Helmholtz FMM that uses only multipole expansions involving O~nm and I~nm, it is clear that an approach analogous to the one described in section 3.1 can be readily developed. However, most applications of Helmholtz FMM involves electrically large structures where a diagonalized form is employed [2, 10], often combined with multipole expansions to handle multi-scale structures [15]. For such a mixed-form FMM, it is simpler to compute signature functions [10] rather than work with potentials. We now show how signature functions can be used to decouple the FMM library from the applications.

The signature function of the potential ψj(x) of equation (10a), now with the Green’s function for the Helmholtz equation (1), is defined by formula [10]:

ψ~j(s^) =limrkre-ikrψ(cs+rs^), (16a)
which when approximated by equation (11a) with 𝒪nm=O~nm, can be shown to be:
ψ~j(s^) =n=0m=-nmαnmYnm(s^)in+1. (16b)

Therefore, given a signature function ψ~(s^), the orthogonality of spherical harmonics implies that we can recover the coefficients of the multipole expansion in equation (11a), αnm, using the formula [16]:

αnm =in+1Ω0ψ~j(s^)Ynm(s^)dΩs. (17)

It is well known that [2, 10] the signature function of the Green’s function G(r,r)=eik|r-r||r-r| for rDs is given by:

G~(s^)=keiks^(cs-r).

Therefore, it follows that signature function of ψj(x) of the canonical problem in equation (10a) is given by:

ψ~j(s^) =ksuppfjfj(r)eiks^(cs-r)dr, (18)

and we define the signature of the basis function fj by f~j(s^)=ψ~j(s^).

Using the representation [9, 10]:

I~nm(x-c) =14πinΩ0eik(x-c)s^Ynm(s^)dΩs, (19a)
it is easy to show that the L2P coefficients:
ρnm =14πinΩ0g~i(s^)Ynm(s^)dΩs (19b)
where g~i(s^) =defsuppgigi(x)eik(x-c)s^dx, (19c)
are defined by the signature function of the testing function gi2.

Hence, in a mixed-form broadband FMM that uses both diagonalized and (possibly scaled) multipole expansions, the user only has to provide the signature functions in equation (18) and equation (19c) for setting up the far field calculations in Helmholtz FMM. Furthermore, these functions are easy to define and easy to calculate.

IV. COMPUTATION OF THE DERIVATIVES OF THE FAR FIELD

We claimed in the introduction that it is possible to represent the derivatives of far fields as product of sparse matrices. In this section, we shall demonstrate this claim. The approach relies on the following observation for the Helmholtz equation (A similar argument can be made for the Laplace equation by setting k=0 and replacing I~nm with Inm.)

Consider the potential field ψj(x) due to sources in Ds having an approximation of the form in equation (11b):
ψj(x) =n=0Nm=-nnβnmI~nm(x-cr),xDr. (20a)

Since:

2ψju+k2ψju =u{2ψj+k2ψj}=0,u=x,y,z, (20b)

it follows that ψju also must have an expansion of the form:

ψj(x)u =n=0Nm=-nnδnmI~nm(x-cr),xDr, (20c)

and that the coefficients {βnm} and {δnm} must be related through a linear operator Tu so that:

ψjψju {βnm}{δnm}=Tu({βnm}).

Thus, for non-negative integers p,q,r0, an inductive reasoning shows that:

ψjp+q+rψjxpyqzr {βnm}{δnm}=TzrTyqTxp({βnm}).

Therefore:

suppgigi(x)p+q+rψjxpyqzrdx
=n=0Nm=-nnδnmsuppgigi(x)I~nm(x-cr)dx (21a)
=n=0Nm=-nnδnmρnm, (21b)

where ρnm are the same receiving coefficients defined in equation (11d). Thus, we see that partial derivatives of arbitrary order manifest as product of linear operators that can be applied to the local expansion coefficients βnm transparent to the application/user.

So far, we have not said anything about the nature of Tu. In the next three sections, we shall demonstrate that the finite dimensional representation of Tu is a sparse matrix for the Laplace and low frequency Helmholtz FMM that use explicit multipole expansions and a diagonal matrix for the high frequency Helmholtz FMM that uses signature functions.

A. Derivatives for the Laplace kernel

Let x=(r,θ,ϕ)3. We begin with the integral representation of the inner function Inm(x) derived in [10, equation 2.33]:

Inm(x) =(-1)n+mn!12π
-ππ(z+ixcosα+iysinα)neimαdα. (22)

Without the loss of generality, consider the case of Inm(x)x. From Appendix 6.1, we have:

Inm(x)x =i2(In-1m+1(x)+In-1m-1(x)). (23)

Substituting equation (23) into equation (20a), we get:

ψj(x)x =n=0Nm=-nnβnmInm(x-cr)x
=n=0Nm=-nnβnm
i2(In-1m+1(x-cr)+In-1m-1(x-cr)). (24a)

Since Inm=0 if n<0 or |m|>n, the above equation can be rewritten as:

ψ(x)x =n=0N-1m=-nni2(βn+1m-1+βn+1m+1)Inm(x-cr). (24b)

Identifying:

δnm =i2(βn+1m-1+βn+1m+1),

and N=N-1 in equation (20c), we see that the matrix representation of Tx has at most two non-zero entries per row. It is clear from Appendix 6.1 that a similar reasoning can be made for Ty and Tz as well.

B. Derivatives for the Helmholtz kernel with multipoles

Without the loss of generality, we consider the partial derivative with respect to x. Then, it can be shown that I~nmx can be expressed as a four-term recurrence of the form (see Appendix 6.2 for the derivation and for the explicit expressions):

I~nm(x)x =νnν=|n-1|n+1p=-1,1Tνm+pI~νm+p(x). (25a)

A reasoning similar to that made in the previous section shows that each row of the matrix representation of Tx has at most four entries, once again showing that the derivative can be represented as a sparse matrix. Furthermore, a closer look at equation (25a) shows that if equation (20a) has degree N, then equation (20c) will have degree N=N+1. However, if the degree of the “L2P” coefficients ρnm is truncated to N, then the summation in equation (21b) will be truncated to degree N and we can set N=N.

C. Derivatives for the Helmholtz kernel in diagonalized form

A potential field ψj(x) having an approximation of the form in (20a), can also be represented as [2, 10]:

ψj(x) =i4πΩ0ψ~j(s^)eik(x-cr)s^dΩs, (26a)
where ψ~j(s^) =defn=0Nm=-nmβnmYnm(s^)in+1, (26b)
so that:
Pij =suppgigi(x)ψj(x)dx
=i4πΩ0g~i(s^)ψ~j(s^)dΩs, (26c)
where g~i(s^) =suppgigi(x)eik(x-cr)s^dx, (26d)
and we recognize g~i(s^) as the signature function defined in equation (19c). Since:
ψj(x)x =i4πΩ0(iksx)ψ~j(s^)eik(x-cr)s^dΩs, (26e)
from equation (26a), it follows that:
suppgigi(x)ψj(x)xdx
  =i4πΩ0g~i(s^)(iksx)ψ~j(s^)dΩs, (26f)
demonstrating that the partial differential operator appears as a diagonal modification to the incoming signature function ψ~j(s^).

Note that this result has been known from the very early days of FMM and has been effectively used for computing with derivatives [17]. The novelty in our approach is that we are using it to transfer the responsibility of computing the derivative to the library instead of requiring the application do it.

V. A PSEUDO-EXAMPLE FOR A LIBRARY INTERFACE

We now discuss how the results from sections 3 and 4 can be used to develop an application independent library of Laplace and Helmholtz FMM. Our objective is not to describe a concrete implementation, but to discuss the essential functionality.

To that end, we divide the library functionality into two parts: the matrix build/initialization functions and matrix apply (“matrix-vector” product) functions. During the initialization, we expect the user to provide what are called “call-back” functions. The library, in turn, repeatedly calls these functions to get the information needed to complete matrix initialization. After the matrix is built, the library provides the functionality to evaluate matrix-vector products, potentially involving different combinations of derivatives.

A. Matrix initialization

Building the matrix consists of three phases:

The tree construction phase

The evaluation of the near interactions

The evaluation of the far field representations (“Q2M” and “L2P” coefficients).

A.1. Constructing the tree

Rationale:

In general, each source/receiver can be assigned a “center” and a spatial extent. The former can be captured as a array of three floating point numbers. Given that the mesh used in the discretization can be extremely non-uniform in practical cases, it is often necessary to take the spatial extent into account to ensure sufficient accuracy. This can lead to non-uniform trees with internal nodes also “hosting” sources and/or receivers. The following listing shows an example of an interface.

images

A.2. Computing near interactions with multiple matrices

Rationale:

With the ability to compute arbitrary derivatives of the far field, it becomes necessary to handle multiple matrices as well. As a simple example, consider the following summations:

f(xi) =j=1NajG(xi,xj), (27a)
F(xi) =j=1NajG(xi,xj). (27b)

Although potential and the derivatives of the far field can be computed from a single FMM representation, the four components of the near matrix must be stored simultaneously. In other words, the library needs to allow for the possibility of multiple matrix entries resulting from each pair-wise interaction. If we assume that every pair-wise near interaction results in the same number of matrix entries (although there are exceptions, this usually the case), the following interface would be adequate for most purposes.

images

In the above example, dim(elements)=4 and a user can return the Green’s function and its gradient as:

elements =[G,Gx,Gy,Gz].

A.3. Computing Q2M and L2P coefficients

Rationale:

In section 3, we have shown that the required Q2M and L2P coefficients can be evaluated from sampled potential values or signature functions. For the Laplace equation, this can be implemented in the following way. Given a node with sources (or receivers) in it, we can construct a sphere with center, cs at the center of the box and a sufficiently large radius, Rd where d is the box length, and introduce a Gauss-Trapezoidal rule as discussed in section 3.1. Then, a function of the following nature can be invoked repeatedly to evaluate the potential, as defined by equation (10a) at each integration point xk=cs+Rs^k in equation (13).

images

For the Helmholtz equation, we expect the user to return the signature function, defined by equation (18).

images

Note that in this case, we need both the center of the box cs and the direction vector s^k passed to the application.

B. Evaluating matrix-vector products

In order to offer maximum flexibility, we split the evaluation of the matrix-vector products (MVP) into three steps:

multipole_pass: performs all the source-to-multipole (Q2M), outer-to-outer (M2M, aggregation), outer-to-inner (M2L) and inner-to-inner (L2L) translations in the standard FMM flow. This is a pre-requisite for calling the far field evaluation step below.

far_axpy: evaluates (the integral of) the potential (or its derivative) for every testing function using the results in Section 4.

near_axpy: evaluates the contribution due the near interaction matrix selected by the user (if there are multiple matrices involved).

The library can offer a functionality similar to the following one to accomplish these tasks:

images

where the array deriv specifies the orders of the partial derivatives. For example:

deriv[1,2,2]2z22y2x.

The motivation for this is best illustrated by an example. Towards this, consider the evaluation of the two sums in equation (27). A psuedo-code is given below.

images

Computation of the partial derivatives with respect y and z can be done analogously. Note that the multipole_pass is usually the most dominant part during a matrix-vector evaluation and it is invoked only once.

VI. SUMMARY AND CONCLUSIONS

In this paper, we have proposed an approach to developing a stand-alone, application independent library of Laplace and Helmholtz FMM. Towards this end, we have demonstrated a technique to capture the essential characteristics of the problem needed for setting up the FMM – the basis and testing functions, mesh etc. – using samples of either potentials or the signature functions of a canonical problem. Furthermore, we have shown that it is possible to transfer the responsibility of computing the partial derivatives of the far field to the library, instead of burdening the user to implement complicated integrals. This is accomplished by representing partial derivatives of arbitrary orders as product of certain sparse matrices. We have also outlined a pseudo interface that illustrates how the methods can be used.

Within Ansys Inc., we have developed a flexible library, called Q3Dfastmvpack incorporating these ideas and it is currently being used by multiple Ansys products.

ACKNOWLEDGEMENTS

I would like to thank my colleagues Andy Mathis, Dalian Zheng, Eric Bracken and Indranil Chowdhury for the many discussions. I would also like to thank Werner Thiel for supporting this work.

APPENDICES

A. Partial derivatives of Inm

Let h(α;x)=(z+ixcosα+iysinα) in equation (22). Consider:

Inm(x)x=(-1)n+m(n-1)!i2π-ππh(α;x)n-1eimαcosαdα
=(-1)n+m(n-1)!i2π-ππh(α;x)n-1eimαeiα+e-iα2dα
=i2(-1)n+m(n-1)!12π{-ππh(α;x)n-1ei(m+1)αdα
+-ππh(α;x)n-1ei(m-1)αdα}
=i2(-1)n+m(n-1)!12π-ππh(α;x)n-1ei(m+1)αdα
+i2(-1)n+m(n-1)!12π-ππh(α;x)n-1ei(m-1)αdα
=i2(In-1m+1(x)+In-1m-1(x)),

where we have used the facts n-1+m-1=n+m and that (-1)n+m-2=(-1)n+m. The following results can be similarly derived:

Inm(x)y =12(In-1m+1(x)-In-1m-1(x)),
Inm(x)z =-In-1m(x).

B. Partial derivatives of I~nm(x)

We have, from equation (19a):

p+q+rxpyqzrInm(x)
=(ik)p+q+r4πinΩ0sxpsyqszreikxs^Ynm(s^)dΩs, (28)

where s^=(sx,sy,sz)=(sinθcosϕ,sinθsinϕ,cosθ). Noting that:

sinθeiϕ =(-)8π3Y11(s^), (29)

from the definition of spherical harmonics Ynm(θ,ϕ) in equation (5b), and using equation (5c), we get:

sinθcosϕ =(-)2π3[Y11(s^)-Y1-1(s^)] (30a)
sinθsinϕ =i2π3[Y11(s^)+Y1-1(s^)] (30b)
cosθ =4π3Y10(s^). (30c)

Let:

𝒱§ =14πinΩ0Y1p(s^)eikxs^Ynm(s^)dΩs,p=-1,0,1. (31)

Then, substituting equation (30) into equation (28) and using equation (31), we get:

Inm(x)x=(-)ik2π3(𝒱§𝒱§ (32a)
Inm(x)y=(-)k2π3(𝒱§𝒱§ (32b)
Inm(x)z=ik4π3𝒱§ (32c)

Now, substituting the following well known addition theorem [9, 10]:

eikxs^ =n=0m=-nn4πinjn(kx)Ynm(x^)Ynm(s^),

into equation (31), we can express:

𝒱§ =14πinν=0μ=-νν4πiν(-1)μjν(kx)Yν-μ(x^)
Ω0Y1p(s^)Yνμ(s^)Yn(s^)dΩs (33a)
=ν=0μ=-ννiν-n(-1)μjν(kx)Yν-μ(x^)γ(ν1nμpm), (33b)
where γ is the Gaunt coefficient defined as
γ(n1n2n3m1m2m3) =defΩ0Yn1m1(s^)Yn2m2(s^)Yn3m3(s^)dΩs, (33c)
for integers ni0 and -nimini for i=1,2,3. Using the properties of Gaunt coefficients [9], for γ0, we have to have |n-1|νn+1, ν+n+1 even and μ+m+p=0. Thus, ν=|n-1| and ν=n+1 and therefore:
𝒱§ =νnν=|n-1|n+1iν-n(-1)m+pjν(kx)Yνm+p(x^)
γ(ν1n-(m+p)pm) (33d)
=(-1)m+pνnν=|n-1|n+1iν-n
γ(ν1n-(m+p)pm)Iνm+p(x), (33e)
showing that the right hand sides of equations (32a)-(32c) have at most four non-zero terms.

REFERENCES

[1] L. Greengard, “The rapid evaluation of potential fields in particle systems,” Ph.D. thesis, Yale University, 1987.

[2] V. Rokhlin, “Diagonal forms of translation operators for the Helmholtz equation in three dimensions,” Applied and Comp. Harmonic Analysis, vol. 1, pp. 82-93, 1993.

[3] W. C. Chew, J.-M. Jin, E. Michielssen, and J. M. Song, Fast and Efficient Algorithms in Computational Electromagnetics, chap. 9, Boston: Artech House, 2001.

[4] V. Rokhlin, “Rapid solution of integral equations of classical potential theory,” Journal Of Computational Physics, vol. 60, pp. 187-207, 1985.

[5] J. Carrier, L. Greengard, and V. Rokhlin, “A fast adaptive multipole algorithm for particle simulations,” SIAM Journal on Scientific and Statistical Computing, vol. 9, no. 4, pp. 669-686, July 1988.

[6] V. Rokhlin, “Rapid solution of integral equations of scattering theory in two dimensions,” Journal Of Computational Physics, vol. 86, pp. 414-439,1990.

[7] W.-J. He, X.-W. Huang, M.-L. Yang, and X.-Q. Sheng, “Massively parallel multilevel fast multipole algorithm for extremely large-scale electromagnetic simulations: A review,” Progress In Electromagnetics Research, vol. 173, pp. 37-52,2022.

[8] F. W. J. Olver, D. W. Lozier, R. F. Boisvert, and C. W. Clark, NIST Handbook of Mathematical Functions, Cambridge: Cambridge University Press, Cambridge, 2010.

[9] A. Messiah, Quantum Mechanics, New York: Dover Publications, Inc., 2014.

[10] M. A. Epton and B. Dembart, “Multipole translation theory for the three dimensional Laplace and Helmholtz equations,” SIAM J. of Scientific Computing, vol. 16, no. 4, pp. 865-897, July 1995.

[11] C. R. Anderson, “An implementation of the fast multipole method without multipoles,” SIAM Journal of Scientific and Statistical Computing, vol. 13, no. 4, pp. 923-947, July 1992.

[12] T. Eibert, “A diagonalized multilevel fast multipole method with spherical harmonics expansion of the k-space Integrals,” IEEE Transactions on Antennas and Propagation, vol. 53, no. 2, pp. 814-817, Feb. 2005.

[13] B. Dembart and E. Yip, “A 3D Fast multipole method for electromagnetics with multiple levels,” 11th Annual Review of Progress in Applied Computational Electromagnetics, Monterey, CA, vol. 1, pp. 621-628, Mar. 1995.

[14] J. Phillips and J. White, “A precorrected-FFT method for electrostatic analysis of complicated 3-D structures,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 16, no. 10, pp. 1059-1072, 1997.

[15] J.-S. Zhao and W. C. Chew, “Integral equation solution of Maxwell's equations from zero frequency to microwave frequencies,” IEEE Transactions on Antennas and Propagation, vol. 48, no. 10, pp. 1635-1645, Oct. 2000.

[16] B. Dembart and E. Yip, “The accuracy of fast multipole methods for Maxwell’s equations,” IEEE Computational Science and Engineering, vol. 5, no. 3, pp. 48-56, 1998.

[17] J. M. Song and W. C. Chew, “Multilevel fast multipole algorithm for solving combined field integral equations of electromagnetic scattering,” Micro. Opt. Tech. Lett., vol. 10, no. 1, pp. 14-19, Sep. 1995.

FOOTNOTES

1Note that we are not providing a strict definition for the term “reside.” It is simply a rule that the user provides to determine if a given basis/testing function can be considered as belonging to a given box. For example, in the case of point particles, if the location of the particle falls into a box, it can be considered to be a resident of that box.

2Compare with equation (18) for subtle differences.

BIOGRAPHIES

images

Sanjay Velamparam I obtained a Ph.D. in Engineering from the Indian Institute of Science, Bangalore, in 1997. My doctoral work was in the then nascent field of fast multipole methods (FMM). Subsequently, working with Dr. Weng Chew at the University of Illinois at Urbana-Champaign, I pioneered the parallelization of Helmholtz FMM on distributed memory computers.

I was with Ansoft Corporation (currently Ansys Inc.) from 2002-2005 working on fast integral equation methods for signal integrity applications. In 2007, I joined Acceleware Corporation, Calgary, and worked on linear algebraic algorithms for GPU computing. In 2010, I joined Apache Design Solutions, which was later acquired by Ansys. I have been working on fast integral equation methods for several Ansys products such as Ansys Q3D Extractor® and SIwave™ since 2010.

I enjoy long bicycle rides, especially mountainous routes, running, and generally love the outdoors. I am married and we are fortunate enough to live in the foothills of the majestic Rocky Mountains in Northern Colorado.

ABSTRACT

I. INTRODUCTION

images

II. PRELIMINARIES

A. General class of problems

B. Some notations and definitions

III. ABSTRACTIONS FOR THE FAR FIELD COMPUTATIONS

images

A. Evaluation of Q2M and L2P coefficients for the Laplace equation

images

B. Evaluation of Q2P and L2P coefficients for the Helmholtz equation

IV. COMPUTATION OF THE DERIVATIVES OF THE FAR FIELD

A. Derivatives for the Laplace kernel

B. Derivatives for the Helmholtz kernel with multipoles

C. Derivatives for the Helmholtz kernel in diagonalized form

V. A PSEUDO-EXAMPLE FOR A LIBRARY INTERFACE

A. Matrix initialization

images

images

images

images

B. Evaluating matrix-vector products

images

images

VI. SUMMARY AND CONCLUSIONS

ACKNOWLEDGEMENTS

APPENDICES

A. Partial derivatives of Inm

B. Partial derivatives of I~nm(x)

REFERENCES

FOOTNOTES

BIOGRAPHIES