
In this article we consider the application of Euler's homogeneous function theorem together with Stokes' theorem to exactly integrate families of polynomial spaces over general polygonal and polyhedral (polytopic) domains in two and three dimensions, respectively. This approach allows for the integrals to be evaluated based on only computing the values of the integrand and its derivatives at the vertices of the polytopic domain, without the need to construct a sub-tessellation of the underlying domain of interest. Here, we present a detailed analysis of the computational complexity of the proposed algorithm and show that this depends on three key factors: the ambient dimension of the underlying polytopic domain; the size of the requested polynomial space to be integrated; and the size of a directed graph related to the polytopic domain. This general approach is then employed to compute the volume integrals arising within the discontinuous Galerkin finite element approximation of the linear transport equation. Numerical experiments are presented which highlight the efficiency of the proposed algorithm when compared to standard quadrature approaches defined on a sub-tessellation of the polytopic elements.
Citation: Thomas J. Radley, Paul Houston, Matthew E. Hubbard. Quadrature-free polytopic discontinuous Galerkin methods for transport problems[J]. Mathematics in Engineering, 2024, 6(1): 192-220. doi: 10.3934/mine.2024009
[1] | Federica Botta, Matteo Calafà, Pasquale C. Africa, Christian Vergara, Paola F. Antonietti . High-order discontinuous Galerkin methods for the monodomain and bidomain models. Mathematics in Engineering, 2024, 6(6): 726-741. doi: 10.3934/mine.2024028 |
[2] | Paola F. Antonietti, Ilario Mazzieri, Laura Melas, Roberto Paolucci, Alfio Quarteroni, Chiara Smerzini, Marco Stupazzini . Three-dimensional physics-based earthquake ground motion simulations for seismic risk assessment in densely populated urban areas. Mathematics in Engineering, 2021, 3(2): 1-31. doi: 10.3934/mine.2021012 |
[3] | Andrea Borio, Martina Busetto, Francesca Marcon . SUPG-stabilized stabilization-free VEM: a numerical investigation. Mathematics in Engineering, 2024, 6(1): 173-191. doi: 10.3934/mine.2024008 |
[4] | Paola F. Antonietti, Chiara Facciolà, Marco Verani . Unified analysis of discontinuous Galerkin approximations of flows in fractured porous media on polygonal and polyhedral grids. Mathematics in Engineering, 2020, 2(2): 340-385. doi: 10.3934/mine.2020017 |
[5] | Tommaso Tassi, Alberto Zingaro, Luca Dede' . A machine learning approach to enhance the SUPG stabilization method for advection-dominated differential problems. Mathematics in Engineering, 2023, 5(2): 1-26. doi: 10.3934/mine.2023032 |
[6] | Yan Li, Gang Tian, Xiaohua Zhu . Singular Kähler-Einstein metrics on $ \mathbb Q $-Fano compactifications of Lie groups. Mathematics in Engineering, 2023, 5(2): 1-43. doi: 10.3934/mine.2023028 |
[7] | Manuel Friedrich . Griffith energies as small strain limit of nonlinear models for nonsimple brittle materials. Mathematics in Engineering, 2020, 2(1): 75-100. doi: 10.3934/mine.2020005 |
[8] | Andrea Manzoni, Alfio Quarteroni, Sandro Salsa . A saddle point approach to an optimal boundary control problem for steady Navier-Stokes equations. Mathematics in Engineering, 2019, 1(2): 252-280. doi: 10.3934/mine.2019.2.252 |
[9] | Simon Lemaire, Julien Moatti . Structure preservation in high-order hybrid discretisations of potential-driven advection-diffusion: linear and nonlinear approaches. Mathematics in Engineering, 2024, 6(1): 100-136. doi: 10.3934/mine.2024005 |
[10] | Caterina Ida Zeppieri . Homogenisation of high-contrast brittle materials. Mathematics in Engineering, 2020, 2(1): 174-202. doi: 10.3934/mine.2020009 |
In this article we consider the application of Euler's homogeneous function theorem together with Stokes' theorem to exactly integrate families of polynomial spaces over general polygonal and polyhedral (polytopic) domains in two and three dimensions, respectively. This approach allows for the integrals to be evaluated based on only computing the values of the integrand and its derivatives at the vertices of the polytopic domain, without the need to construct a sub-tessellation of the underlying domain of interest. Here, we present a detailed analysis of the computational complexity of the proposed algorithm and show that this depends on three key factors: the ambient dimension of the underlying polytopic domain; the size of the requested polynomial space to be integrated; and the size of a directed graph related to the polytopic domain. This general approach is then employed to compute the volume integrals arising within the discontinuous Galerkin finite element approximation of the linear transport equation. Numerical experiments are presented which highlight the efficiency of the proposed algorithm when compared to standard quadrature approaches defined on a sub-tessellation of the polytopic elements.
Over the past 10–15 years there has been increasing widespread interest in the development of numerical methods for the approximation of partial differential equations (PDEs) based on employing general element shapes such as polygons in 2D and polyhedra in 3D (which we collectively refer to as polytopic elements), cf. [1,3,5,6,9], for example, and the references cited therein. Exploiting such flexible computational meshes is highly attractive for a number of key reasons: Complicated geometries may be meshed by employing relatively coarse meshes, without the need to simplify the boundary or other features within the given domain; moving meshes and/or overlapping meshes arising, for example, in applications such as fluid structure interaction can easily be accommodated; hanging nodes are treated in a very simple and natural manner; finally, multi-level solvers, such as domain decomposition methods and multigrid can easily be implemented employing embedded polytopic meshes.
However, a major bottleneck in the exploitation of general polytopic elements within, for example, finite element methods is the construction of suitable element quadratures needed for the assembly of the underlying matrix system. With this in mind several approaches have been proposed within the literature. The simplest approach is to construct a sub-tessellation of a given polytope into standard element shapes, for example, simplices and/or tensor-product elements (quadrilaterals in 2D and hexahedra in 3D) and employ known quadrature rules on these sub-elements. We remark that when the underlying polytopic mesh is constructed from the agglomeration of a given fine mesh consisting of standard element shapes, then the sub-tessellation need not be computed, as it will already be provided. The major problem with this approach is that the number of quadrature points can be extremely large, depending on the cardinality of the sub-tessellation and the required accuracy. Of course, quadrature is naturally highly parallisable, and hence such an implementation can be accelerated, cf. [10] for example, who employed a GPU approach. Alternatively, one may for example, use this initial quadrature and attempt to optimise it by successively removing points until a minimal number is attained; here, we mention the moment fitting approaches, cf. [19,20,21,24,27], for example, and the references cited therein.
In this article we pursue an alternative approach based on the integration of homogeneous functions. This idea was first developed in the articles [7,18], cf. also [8]. Here, the essential idea is to employ Euler's homogeneous function theorem, together with Stokes' theorem, which allows for the integral of a homogeneous function over a given polytope to be written as a boundary integral. Recursively applying this approach allows for the integral to be exactly evaluated based on only computing values of the integrand and its derivatives at the vertices of the polytopic domain, and hence leads to an exact quadrature rule whose quadrature points are the vertices of the polytope. In our recent paper [2] we considered the application of this algorithm within the discontinuous Galerkin finite element (DGFEM) approximation of the Poisson problem. In this article, we extend this work by presenting a detailed complexity analysis of both a quadrature based algorithm and the proposed quadrature-free approach. Here, the primary focus is to consider the number of flops required to integrate an entire space of polynomials up to a given fixed degree, which is typically required within a finite element implementation. In particular, we show that the computational complexity of the proposed algorithm depends on three key factors: the ambient dimension of the underlying polytopic domain; the size of the requested polynomial space to be integrated; and the size of a directed graph related to the polytopic domain. By a careful selection of the so-called local origins needed on each of the polytope's lower dimensional facets, which is needed in the specification of the quadrature-free algorithm, we are able to considerably optimise the number of required flops within this approach by reducing the size of the associated directed graph. To demonstrate the application of this optimised quadrature-free algorithm, we utilise this within a DGFEM approximation of the linear transport equation, though we stress that this approach can naturally be employed for the numerical approximation of a wide range of PDE problems. We point out that the specific application we have in mind is the numerical approximation of the linear Boltzmann transport equation. This is a high-dimensional integro-differential equation employed within applications including neutron and radiation transport. In our recent article [13], we illustrate that by employing a judicious choice of local basis functions and quadratures within the angular and energy domains, the DGFEM approximation of the underlying problem may be computed by simply solving a sequence of linear transport equations in space corresponding to each angular quadrature point (which defines a constant advection direction) and each energy quadrature point. In typical 3D applications utilising either source iteration, or a source iteration preconditioner within a Krylov space solver, means that an extremely large number of these transport solves must be computed. With that in mind, efficiency is of paramount concern, and hence the need to develop optimised quadrature-free implementations of the linear transport equation. Numerical experiments presented here highlight the gains in efficiency with our proposed optimised quadrature-free algorithm in comparison with standard sub-tessellation based quadrature. In particular, we study the dependence of both algorithms on the shape of the underlying polytope.
The outline of this article is as follows. In Section 2 we recall the quadrature-free integration technique introduced in [7] and its application to the integration of families of monomial functions over general d-dimensional polytopes as in [2]. The time and space (memory) complexities of this algorithm in the case where the integration domain is polygonal or polyhedral are analysed. In Section 3 we introduce the DGFEM approximation of a linear first-order transport equation on general polytopic meshes. In Section 4 we compare the time complexities associated with the assembly of the local DGFEM element matrices via quadrature based and quadrature-free based approaches, where we exploit a known decomposition of the integrands into a linear combination of monomials. For simplicity of presentation, we do not consider the application of the quadrature-free approach for the evaluation of the face terms arising within the DGFEM, but instead refer to our previous paper [2] where this issue is discussed. In particular, we recall that the monomial coefficients of the underlying basis restricted to a given face must be computed online, i.e., for every face individually. As a consequence, the numerical experiments presented in [2] indicate that when the faces consist of standard shapes, for example, simplices or tensor-product shapes (when a sub-mesh is not required to compute an appropriate quadrature) the speed-up facilitated by employing the quadrature-free approach is relatively modest, compared to the cost of exploiting a standard quadrature rule. Though, again, we stress that when the faces consist of arbitrary planar polytopes the application of a sub-mesh quadrature approach may become very expensive. Several numerical experiments are performed in Section 5 in order to demonstrate the accelerated assembly time of the quadrature-free based approach. Finally, in Section 6 we summarise the work undertaken in this article and discuss future extensions.
In this section we review the procedure for the numerical integration of homogeneous functions over a polytopic domain as introduced by Lasserre [17,18] for convex polytopes and extended by Chin et al. in [7] to non-convex ones.
We consider the problem of evaluating integrals of the form ∫Pf(x)dx, where:
● f:P→R denotes a (positively) homogeneous function of degree q∈R; that is, for all λ>0 we have that f(λx)=λqf(x) for all x∈P. The function f satisfies Euler's homogeneous function theorem [22]: If f is a continuously differentiable positively homogeneous function of degree q, then we have
qf(x)=x⋅∇f(x) |
for all x in the domain of definition of f.
● P⊂Rd, d=2,3, denotes a closed polytope whose boundary ∂P is defined by m(P) polytopic facets {Fi}m(P)i=1 of dimension (d−1). To each facet Fi, we associate the hyperplane Hi containing Fi defined for any x0∈Rd by
Hi={x∈Rd:(x−x0)⋅ni=ai} |
for some ai∈R and some vector ni∈Rd of unit length. We note that ni may be chosen to be the unit outward normal to P on Fi and that ai denotes the (signed) distance between Hi and x0.
We also recall the generalised Stokes' theorem [26]: For a continuously differentiable vector field X:P→Rd defined on a neighbourhood of P, we have
∫P∇⋅Xdx=∫∂PX⋅ndσ, |
where dσ denotes the Lebesgue measure on ∂P and n denotes the unit outward normal to P on ∂P.
By setting X=(x−x0)f(x) and invoking Euler's homogeneous function theorem, it can be shown that
∫Pf(x)dx=1d+q[m(P)∑i=1ai∫Fif(x)dσ+∫Px0⋅∇f(x) dx]. | (2.1) |
Equation (2.1) relates integration of f over P in terms of integration of f over the facets of P, as well as the integration of ∇f over P. By selecting x0=0, (2.1) reduces to the more common expression relating the integrals of f over P and ∂P [2,7,8,18]:
∫Pf(x)dx=1d+qm(P)∑i=1ai∫Fif(x)dσ. |
As shown in [2,7,17,18], for example, one may recursively apply the generalised Stokes' theorem to express the integral ∫Fif(x)dσ in terms of integrals over the (d−2)-dimensional boundary facets {Fij}m(Fi)j=1 of Fi:
∫Fif(x)dσ=1d−1+q[m(Fi)∑j=1aij∫Fijf(x) dν+∫Fix1⋅∇f(x) dσ], | (2.2) |
where dν denotes the Lebesgue measure on ∂Fi, x1∈Hi is arbitrary and aij denotes the Euclidean distance from x1 to Fij.
Equations (2.1) and (2.2) can be generalised to give the integral of f over any k-dimensional facet F, 0≤k≤d, in terms of the integral of the same function over the boundary ∂F={∂Fi}m(F)i=1 and the integral of ∇f over F:
∫Ff(x)ds=1dimF+q[m(F)∑i=1dist(∂Fi,xF)∫∂Fif(x)dξ+∫FxF⋅∇f(x)ds], | (2.3) |
where ds (respectively dξ) denotes the k-dimensional (respectively (k−1)-dimensional) Lebesgue measure on F (respectively ∂F), xF is an arbitrary point contained in F (or the k-dimensional hyperplane containing F), and dist(∂Fi,xF) denotes the Euclidean distance from xF to (the k-dimensional hyperplane containing) ∂Fi. Finally, in the case where F=xF∈Rd (that is, dimF=0), the right-hand-side of (2.3) can be replaced with the point evaluation f(xF).
In the case where f(x)=xα=∏dk=1xαkk is a monomial function in d variables, (2.3) gives the following recursive formula for the integrals
I(F,α):=∫Fxαds=1dimF+|α|[m(F)∑i=1dist(∂Fi,xF)I(∂Fi,α)+d∑j=1αj(xF)jI(F,α−ej)], | (2.4) |
where ei, i=1,2,…,d, denote the standard unit basis vectors in Rd.
Equation (2.4) can be used to generate sets of integrals of monomial functions over P. To this end, we consider the problem of evaluating the following set of integrals:
I(F,J)={I(F,α)=∫Fxαdx:α∈J}. |
Here, J⊂Nd0 denotes a set of multi-indices satisfying the following property: For each α∈J and 1≤i≤d, we either have αi=0 or α−ei∈J. This motivates the definition of Algorithm 1, cf. [2].
Algorithm 1: Evaluation of the set I(F,J) for a given k-dimensional facet F, 0≤k≤d. |
1. procedure COMPUTEINTEGRALS F, J} |
2: I(F,α)←0 for all α∈J |
3: Select xF as any point in (the (dimF)-dimensional hyperplane containing) F |
4: Get boundary facets {∂Fi}m(F)i=1 |
5: for i=1,…,m(F) do |
6: if dist(∂Fi,xF)≠0 and I(∂Fi,J) not already computed then |
7: I(∂Fi,J)←COMPUTEINTEGRALS(∂Fi,J) |
8: end if |
9: end for |
10: for α∈J do |
11: I(F,α)←1dimF+|α|[∑m(F)i=1dist(∂Fi,xF)I(∂Fi,α)+∑dj=1αj(xF)jI(F,α−ej)]. |
12: end for |
13: return I(F,J) |
14: end procedure |
Remark 1 (Termination of Algorithm 1). The recursion in Algorithm 1 terminates when COMPUTEINTEGRALS(F,J) is called with dimF=0; that is, F=xF is a single point. Since ∂F=xF, the recursive function call in line 7 will not be executed.
The computational complexity of Algorithm 1 can be understood in terms of the size of the requested monomial set J, as well as the complexity of the domain of integration P. With this in mind, the main result of this article is stated below.
Theorem 1 (Time complexity of Algorithm 1). The time complexity of Algorithm 1 including the evaluation of the set
{dist(∂Fi,xF):1≤i≤m(F),0≤dimF≤d} |
is O(χ1(P)|J|+χ3(P)), where
χ1(P)=d∑k=0∑F⊆PdimF=k(m(F)+d),χ3(P)=dd∑k=0∑F⊆PdimF=kk2m(F), |
and |J| denotes the number of requested monomial integrals.
The proof of Theorem 1 will be pursued in the following two sections: Firstly, in Section 2.2.1 we consider both the time and space complexity of Algorithm 1 in the case when the required set of distance functions is pre-computed; the cost of the evaluation of this latter set is then determined in Section 2.2.2.
In this section we study the computational cost of computing I(P,J) using Algorithm 1 in the case when the desired set of distance functions is pre-computed; here, I(P,J) is defined in an analogous manner to I(F,J), with F replaced by P in (2.4). The main result of this section is given in Lemma 1 below; furthermore we also consider potential simplifications to Algorithm 1 which can be utilised to further reduce the computational expense.
Lemma 1 (Time and space complexity of Algorithm 1). Assuming that the set
{dist(∂Fi,xF):1≤i≤m(F),0≤dimF≤d} |
is pre-computed, the time complexity of Algorithm 1, measured as the total number of floating-point operations required to assemble I(P,J), is O(χ1(P)|J|), where |J| denotes the number of requested monomial integrals and
χ1(P)=d∑k=0∑F⊆PdimF=k(m(F)+d). |
The space complexity of Algorithm 1, measured as the total number of floating-point numbers required to store I(P,J), is O(χ2(P)|J|), where
χ2(P)=d∑k=0card{F⊆P:dimF=k}. |
Proof. We first analyse the number of floating-point operations required to compute the right-hand side of (2.4) for a single facet F:
● The sum
S1=m(F)∑i=1dist(∂Fi,xF)I(∂Fi,α) |
requires m(F) products and m(F)−1 additions;
● The sum
S2=d∑j=1αj(xF)jI(F,α−ej) |
requires 2d products and d−1 additions;
● The sum
S3=dimF+|α|=dimF+d∑j=1αj |
requires d additions;
● The final result I(F,α)=S1+S2S3 requires one addition and one division.
We deduce that the right-hand-side of (2.4) may be computed in 2m(F)+4d floating-point operations. Since this operation is called for each α∈J, lines 10–12 of COMPUTEINTEGRALS requires cF=(2m(F)+4d)|J| floating-point operations.
It is not difficult to see that COMPUTEINTEGRALS is executed exactly once for each k-dimensional facet F⊆P with 0≤k≤d. Thus, summing cF over each F, the time complexity given in the statement of the theorem is proven. The space complexity can be proven based on noting that I(F,J) is a set with |J| elements which must be stored for each facet F⊆P.
Remark 2 (Simplifications for dim(F)=d and dim(F)=0). When dim(F)=d, any selection of xF∈Rd can be made. By choosing xF=0, the second sum in (2.4) is eliminated and lines 10–12 in Algorithm 1 can be performed in (2m(F)+d)|J| floating-point operations.
When dim(F)=0 (i.e., F=xF∈Rd), I(F,α) can be performed in a couple of ways:
● Direct computation of I(F,α)=∏dk=1(xF)αkk, in this case, line 11 of Algorithm 1 can be performed in O(d+∑dk=1log(1+αi)) floating-point operations via binary exponentiation [16].
● Recursive computation of I(F,α)=αk(xF)kI(F,α−ek) for some 1≤k≤d, in this case, line 11 of Algorithm 1 can be performed in O(1) floating-point operations.
The scalings of the time and space complexities reported in Lemma 1 as functions of the geometric complexity of P can be understood by visualising the recursive nature of Algorithm 1 as a directed acyclic graph G=G(P)=(V,E). The vertex set V=V(P) is defined as the set of k-dimensional facets F⊆P for 0≤k≤d. The edge set E=E(P) is defined as follows: For any facets F1,F2∈V, the directed edge (F1,F2)∈E if and only if dimF2=dimF1−1 and F2 lies on the boundary of F1. Figure 1 gives an example of the construction of G(P).
Remark 3 (Facet lattice of P). It is convenient to add an extra vertex ∅ to V(P), which we define as having dimension −1, and extra directed edges to E(P) from each vertex of P to ∅. The resulting graph G(P) then resembles a Hasse diagram representing the facet lattice formed from the facets of P ordered by inclusion [4,14].
Remark 4 (Algorithm 1 as a depth-first search). The recursion of Algorithm 1 can be understood as a depth-first search of the graph G(P) starting at P where I(F,J) is assembled at each unvisited face F. By contrast, the implementation of Algorithm 1 given in [8] can be understood as a breadth-first search of the transpose graph G′(P) starting at ∅, where G′(P) differs from G(P) only by reversal of the directed edges.
The time complexity of Algorithm 1 can be understood in terms of the sizes of the vertex and edge sets V(P) and E(P). In particular, we have that
d∑k=0∑F⊂PdimF=k(2m(F)+4d)=2d∑k=0∑F⊂PdimF=km(F)+4dd∑k=0card{F⊆P:dimF=k}=2|E(P)|+4d|V(P)| |
and the time and space complexities reported in Lemma 1 can be alternatively be written as O((|E(P)|+d|V(P)|)|J|) and O(|V(P)||J|), respectively. Table 1 gives the time and space complexities of Algorithm 1 for a number of different classes of polytopes, as well as the sizes of the vertex and edge sets V(P) and E(P) in the graph G(P), respectively.
Family | |V(P)| | |E(P)| | Time complexity | Space complexity |
d-dimensional simplex | 2d+1 | 2d(d+1) | O(2dd|J|) | O(2d|J|) |
d-dimensional hypercube | 3d+1 | 2⋅3d−1d+2d | O(3dd|J|) | O(3d|J|) |
n-sided polygon | 2(n+1) | 4n | O(n|J|) | O(n|J|) |
n-gonal prism | 2(3n+2) | 15n+2 | O(n|J|) | O(n|J|) |
n-based pyramid | 4(n+1) | 2(5n+1) | O(n|J|) | O(n|J|) |
In practical finite element applications, P may denote a d-dimensional mesh element, d=2,3, in a polytopic mesh T of some domain of interest Ω. Historically, simplicial or tensor-product elements have been used to partition the domain, though general polytopic elements have been proposed more recently, cf. [1,3,5,6,9], for example. The following theorem characterises the time and space complexities of Algorithm 1 in these special cases.
Lemma 2 (Complexity of Algorithm 1 for d=2,3). Let P⊂Rd, d=2,3, denote a convex polygon or polyhedron. The time and space complexities of Algorithm 1, measured in the sense described in Lemma 1, are O(e|J|), where e denotes the number of (1-dimensional) edges of P.
Proof. It suffices to show that |V(P)|=O(e) and |E(P)|=O(e).
For the case d=2, a polygon P with e edges also has e vertices. Therefore, we have that |V(P)|=2(e+1) and |E(P)|=4e, as required.
For the case d=3, let v (respectively f) denote the number of vertices (respectively number of faces) of P. Since P is convex, the Euler characteristic of the surface of P is equal to 2 [12], and hence
v−e+f=2. |
The size of the vertex set V(P) is given by
|V(P)|=v+e+f+2=2(e+2). |
To compute the size of the edge set E(P), we note that each edge (or 1-dimensional facet) F in G(P) has in-degree and out-degree 2, since each edge lies on the boundary of two faces and has two vertices on its own boundary. We therefore have that
|E(P)|=v+2e+2e+f=5e+2. |
Remark 5 (Extension to non-convex polyhedra). The argument presented in the proof of Lemma 2 remains valid when both P and its (d−1)-dimensional facets {Fi}m(P)i=1 are simply-connected. That is, Lemma 2 holds if neither P nor any of its facets have any holes.
One may reduce the time and space complexities of Algorithm 1 through judicious selection of the reference points xF. When xF is chosen as a vertex of F, one may avoid a number of recursive calls to COMPUTEINTEGRALS for some boundary facets of F. We shall refer to the resulting implementation as pruned - this is illustrated in Figure 2. While a complete time and space complexity analysis of Algorithm 1 with pruning is not presented here, pruning can lead to significant computational savings on simple geometries. For instance, the time complexity of Algorithm 1 in the case where P is a d-dimensional simplex is O(2dd|J|); with pruning, this can be reduced to O(d2|J|). On more complicated domains, the effects of pruning are likely to be less significant. To our knowledge, finding an optimal pruning of G(P) for a general polytope, that is, a shortest sequence of visited facets (Fn)n≥0 and corresponding reference points (xFn)n≥0 for which Algorithm 1 can compute I(P,J) - is an open problem.
It is convenient to omit the computation of distances of the form dist(∂Fi,xF) in the proof of Lemma 1 since such quantities do not need to be re-computed for each I(F,α). In cases where |J| is very small or P has many facets, the evaluation of these distances can become the most computationally-expensive part of Algorithm 1. For instance, it is shown in [4] that the time complexity of the computation of the volume of a d-dimensional hypercube via the quadrature-free integration method is O(d43d), which is a factor of O(d3) larger than that predicted by Lemma 1. To remedy this, the following lemma measures the complexity of operations omitted in the proof of Lemma 1.
Lemma 3 (Time complexity of distance pre-computation). The time complexity of evaluating the set
{dist(∂Fi,xF):1≤i≤m(F),0≤dimF≤d} |
is O(χ3(P)), where
χ3(P)=dd∑k=0∑F⊆PdimF=kk2m(F). |
Alternatively, the time complexity may be expressed as O(d3|E(P)|).
Proof. Let F denote a k-dimensional facet with selected reference point xF. Let ∂F denote any of its (k−1)-dimensional boundary facets. Suppose that ∂F has n vertices {xi}n−1i=0 and note that k≤n. The (k−1)-dimensional hyperplane H containing ∂F can be uniquely defined by the points {xi}k−1i=0: For any x∈H, there exists t=(ti)k−1i=1∈Rk−1 such that
x=x(t)=x0+k−1∑i=1(xi−x0)ti. |
The distance dist(∂F,xF) (or, more precisely, the distance between H and xF) is the minimum value of ||x(t)−xF||2 over t∈Rk−1; this occurs when t solves
At=f, |
where the entries of A∈R(k−1)×(k−1) and f∈Rk−1 are given by (A)ij=(xi−x0)⋅(xj−x0) and (f)i=(x0−xF)⋅(xi−x0), respectively. The number of floating-point operations required to assemble and solve the linear system above is O(k2d) and O(k3), respectively; thus, the time complexity of computing dist(∂F,xF) for a single (k−1)-dimensional facet ∂F is O(k2d).
For each k-dimensional facet F, 0≤k≤d, dist(∂Fi,xF) is computed for each 1≤i≤m(F). Summing the time complexity of a single computation of dist(∂F,xF) over these limits yields the first result stated in the proof; the second result is obtained by bounding the complexity of a single computation of dist(∂F,xF) from above by O(d3).
The proof of Theorem 1 now follows immediately by combining Lemma 1, together with Lemma 3. Furthermore, employing the notation introduced in Section 2.2.1, the statement of Theorem 1 may be rewritten in the following compact form.
Corollary 1 (Time complexity of Algorithm 1). The time complexity of Algorithm 1 including the evaluation of the set
{dist(∂Fi,xF):1≤i≤m(F),0≤dimF≤d} |
is O(d3|E(P)|+(d|V(P)|+|E(P)|)|J|).
In this section we consider the application of the numerical integration algorithm outlined in the previous section for the computation of the volume integrals arising within the DGFEM discretisation of the linear transport equation. To this end, given an open bounded polyhedral domain Ω⊂Rd, d=2,3, we consider the following advection-reaction equation: Find u:Ω→R such that
∇⋅(bu)+cu=f in Ω,u=g on Γin, | (3.1) |
where c,f:Ω→R, g:Γin→R and b:Ω→Rd are given data terms, and Γin={x∈∂Ω:b⋅n(x)<0} denotes the inflow boundary of Ω, where n(x) denotes the outward unit normal to Ω at x∈∂Ω. In the following, we assume that b is a constant velocity vector, while c is a given as a piecewise-constant function with respect to the elements in the underlying finite element mesh T defined below. For the type of applications we have in mind, namely the numerical approximation of the linear Boltzmann transport problem, cf. [13], these assumptions are not restrictive. That said, the proposed quadrature-free implementation can easily be extended to include the case when b, c, f and g are polynomial functions (or piecewise-polynomial with respect to the elements of T).
We discretise the linear transport equation (3.1) using a DGFEM approach. To this end, let T denote a subdivision of the spatial domain Ω into open non-overlapping polytopic elements κ such that ˉΩ=⋃κ∈Tˉκ. We denote by E the set of faces in T, which are defined as the (d−1)-dimensional planar facets of elements κ∈T.
To each κ∈T we respectively denote by hκ>0 and pκ≥0 the diameter of κ and the polynomial degree on κ. The spatial finite element space is defined by
V={v∈L2(Ω):v|κ∈Ppκ(κ) for all κ∈T}, |
where Ppκ(κ) denotes the space of all d-variate polynomials with maximum total degree at most pκ on κ.
Given κ∈T, we define the inflow and outflow parts of ∂κ by
∂−κ={x∈∂κ:b⋅n(x)<0},∂+κ={x∈∂κ:b⋅n(x)≥0}, |
respectively, where n(x) denotes the outward unit normal to κ at x∈∂κ. For a sufficiently-smooth function v, we denote by v+κ (respectively, v−κ) the interior (respectively, exterior) trace of v on ∂κ (respectively, ∂κ∖∂Ω). Since it will always be clear which element κ∈T the quantities v±κ correspond to, the subscript κ will be suppressed for the remainder of this article.
The DGFEM discretisation of (3.1) with upwind numerical flux reads as follows: Find uh∈V such that
a(uh,vh)=ℓ(vh) | (3.2) |
for all vh∈V, where a:V×V→R and ℓ:V→R are defined, respectively, for all wh,vh∈V by
a(wh,vh)=∑κ∈T(∫κ(−whb⋅∇vh+cwhvh)dx+∫∂+κ|b⋅n| w+hv+hds−∫∂−κ∖∂Ω|b⋅n| w−hv+hds),ℓ(vh)=∑κ∈T(∫κfvhdx+∫∂−κ∩∂Ω|b⋅n| gv+hds). |
For the remainder of this article we will assume that each element κ∈T is equipped with a basis comprising of the polynomial space Pp(κ) for some fixed p≥0. Following [2,6], we construct a Cartesian bounding box Bκ for each element κ∈T such that ˉκ⊆¯Bκ. Furthermore, we define a reference bounding box ˆB=(−1,1)d and a family of affine mappings Fκ:ˆB→Bκ such that Fκ(ˆx)=Jκˆx+tκ, where Jκ∈Rd×d is the (diagonal) Jacobi matrix of Fκ and tκ∈Rd is the translation between the origin 0∈ˆB and the barycentre of Bκ.
We define a basis of Pp(ˆB) as follows: We denote by {Ln(t)}∞n=0 the family of orthogonal (or orthonormal) univariate Legendre polynomials on L2(−1,1). For each multi-index α of length d with 0≤|α|≤p we define the basis function ˆϕα:ˆB→R by
ˆϕα(ˆx)=d∏k=1Lαk(ˆxk). | (3.3) |
It is straightforward to see that {ˆϕα(ˆx)}0≤|α|≤p forms a basis for Pp(ˆB). The basis functions {ϕα,κ(x)}0≤|α|≤p for Pp(κ) are constructed upon application of the element mapping; more precisely, ϕα,κ(x)=ˆϕα(F−1κ(x)). The set
{ϕα,κ(x):κ∈T,0≤|α|≤p} |
forms a basis on each element κ, κ∈T, for the finite element space V. Henceforth, we will identify a bijection between the set of multi-indices {α}0≤|α|≤p and the set {1,…,dimPp(κ)} such that the ith basis function on κ is denoted by ϕα(i),κ(x).
We conclude this section by expanding the products of Ln(t) and their derivatives as sums of monomials:
Lm(t)Ln(t)=m+n∑k=0Cm,n,ktk, | (3.4) |
L′m(t)Ln(t)=m+n−1∑k=0C′m,n,ktk. | (3.5) |
The sets of coefficients {Cm,n,k:0≤m,n≤p,0≤k≤m+n} and {C′m,n,k:0≤m,n≤p,0≤k≤m+n−1} may be pre-computed before assembly.
In this section we focus on the efficient assembly of the DGFEM matrix arising on the left-hand side of (3.2). Typically, when quadrature is employed, the evaluation of the volume integrals appearing in the left-hand side of (3.2) are far more expensive to compute than the corresponding face integrals since in the former case significantly more quadrature points must be employed. With that in mind, we focus on the acceleration of the assembly of the local element matrix contributions Aκ∈RdimPp(κ)×dimPp(κ), where
(Aκ)ij=∫κ(−ϕα(j),κ(x)b⋅∇ϕα(i),κ(x)+cϕα(i),κ(x)ϕα(j),κ(x))dx. | (4.1) |
While we will not discuss the exploitation of quadrature-free methods to compute the face integrals appearing in (3.2), we refer to [2] for the application of the proposed numerical integration approach for the computation of the face integrals arising from a DGFEM discretisation of a second-order elliptic problem.
As a point of comparison, we will consider quadrature rules of the form
∫Pf(x)dx≈N∑i=1ωif(xi), | (4.2) |
where f:P→R, {xi}Ni=1⊂Rd denotes a set of quadrature points with non-negative quadrature weights {ωi}Ni=1⊂R.
A number of methods may be employed to construct the N-point quadrature scheme QN={(xi,ωi)}Ni=1. One of the most popular and simple to implement strategies is to sub-tessellate the integration domain P into simplicial subdomains (triangles in 2D, tetrahedra in 3D), on which standard quadrature schemes can be used [25]. However, the resulting quadrature scheme on P may contain an excessive number of points and weights. It has been demonstrated that numerical optimisation algorithms can generate efficient numerical quadrature schemes on arbitrary polygonal domains, cf., for example, [20].
In the case when f∈Pp(P), the approximation (4.2) can be exact. Indeed, it can be shown that the smallest N-point quadrature scheme which is exact for all polynomial functions of a given degree p contains
N≥(⌊p2⌋+dd) | (4.3) |
quadrature points and weights [23].
Algorithm 2 presents a pseudocode for a typical implementation of numerical quadrature to evaluate the integrals (4.1) appearing in the DGFEM discretisation of the transport equation. Noting that lines 8 and 9 of Algorithm 2 require 2(d+1) and 2 floating-point operations to evaluate, respectively, it can be seen that the number of floating-point operations performed in the main body (lines 5–12) is given by 2(d+2)(dimPp(κ))2N. Furthermore, noting that dimPp(κ)=(p+dd) and that the integrand of (4.1) is a polynomial of total degree at most 2p, the number of floating-point operations required to exactly evaluate Aκ for a single element κ using Algorithm 2 is at least 2(d+1)(p+dd)3. Here, we have assumed that the lower bound on the number of quadrature points in (4.3) is attainable; that is, we set N=Nopt=(p+dd).
Algorithm 2: Computation of Aκ using quadrature. |
1: procedure COMPUTEELEMENTMATRIX (κ) |
2: Compute quadrature scheme {(xq,ωq)}Nq=1 on κ |
3: Pre-compute {ϕα,κ}0≤|α|≤p and {∇ϕα,κ}0≤|α|≤p at quadrature points |
4: Aκ←0 |
5: for q=1,…,N do |
6: for i=1,…,dimPp(κ) do |
7: for j=1,…,dimPp(κ) do |
8: I←(cϕα(i),κ(xq)−b⋅∇ϕα(i),κ(xq))ϕα(j),κ(xq) |
9: (Aκ)i,j←(Aκ)i,j+ωqI |
10: end for |
11: end for |
12: end for |
13: return Aκ |
14: end procedure |
The quadrature-free integration method outlined in Section 2 is not immediately applicable to the case of exactly evaluating the entries of Aκ in (4.1), since the integrand is typically not a homogeneous function. We remedy this issue by decomposing the integrand as a sum of monomials which may be integrated separately using Algorithm 1. We shall skip the details for brevity; see [2] for a more detailed treatment of similar integrals.
Since the basis functions {ϕα,κ}0≤|α|≤p are only supported on κ, we may apply the inverse map F−1κ to obtain an expression for the matrix entry (Aκ)i,j as an integral over the mapped element
ˆκ=F−1κ(κ)⊆ˆB. |
It can be shown that
(Aκ)i,j=∫ˆκ(−ˆϕα(j)(ˆx)ˆbκ⋅ˆ∇ˆϕα(i)(ˆx)+cˆϕα(i)(ˆx)ˆϕα(j)(ˆx))|Jκ|dˆx, |
where ˆbκ=J−1κb denotes a scaled wind direction. Finally, by using the definition (3.3) of the basis functions {ˆϕα}0≤α≤p (which we remark are independent of κ) and the decompositions (3.4) and (3.5), we arrive at the following expression for (Aκ)i,j:
(Aκ)i,j=∑0≤α≤α(i)+α(j)c(i,j)α∫ˆκˆxαdˆx, | (4.4) |
where the coefficients {c(i,j)α}0≤α≤α(i)+α(j) are defined for each 1≤i,j≤dimPp(κ) by
c(i,j)α=(cd∏k=1Cα(i)k,α(j)k,αk−d∑k=1ˆbκ,kC′α(i)k,α(j)k,αkd∏ℓ=1ℓ≠kCα(i)ℓ,α(j)ℓ,αℓ)|Jκ|. | (4.5) |
We remark that the entries of Aκ are now in a form in which Algorithm 1 can be applied to generate the set of integrated monomials I(ˆκ,J). Here, we make the choice
J={α∈Nd0:0≤|α|≤2p}; |
this ensures that J contains each α for which c(i,j)α≠0. Moreover, by (4.4), the set I(ˆκ,J) can be computed once for each κ and re-used to assemble each entry of Aκ.
Algorithm 3 provides pseudocode for a typical implementation of the quadrature-free based integration method to evaluate the integrals (4.1) appearing in the DGFEM discretisation of the transport equation. We remark that the computational complexity associated with the execution of Algorithm 1 on line 4 has already been discussed in Section 2.2. Noting that lines 10 and 11 of Algorithm 3 require (d+1)2 and 2 floating-point operations, respectively, it can be seen that the number of floating-point operations performed in the main body (lines 7–14) is given by ((d+1)2+2)Qd(p), where
Qd(p)=∑0≤|α(i)|≤p∑0≤|α(j)|≤pcard{α:0≤α≤α(i)+α(j)}. | (4.6) |
Algorithm 3: Computation of Aκ via quadrature-free integration. |
1: procedureCOMPUTEELEMENTMATRIX (κ) |
2: Compute sets of coefficients {Cm,n,k} and {C′m,n,k} in (3.4) and (3.5) (if not already available) |
3: Map κ↦ˆκ |
4: Compute I(ˆκ,J) using Algorithm 1 |
5: Compute ˆbκ=J−1κb |
6: Aκ←0 |
7: for i=1,…,dimPp(κ) do |
8: for j=1,…,dimPp(κ) do |
9: for 0≤α≤α(i)+α(j) do |
10: Compute c(i,j)α as in (4.5) |
11: (Aκ)i,j←(Aκ)i,j+c(i,j)αI(ˆκ,α) |
12: end for |
13: end for |
14: end for |
15: return Aκ |
16: end procedure |
It can be shown that Qd(p)∼1(3d)!(4d2d)p3d as p→∞.
Remark 6 (Pre-computation of coefficients). One may optionally pre-compute the (κ-independent) coefficients
Cα(i),α(j),α=d∏k=1Cα(i)k,α(j)k,αk |
and
C(k)α(i),α(j),α=C′α(i)k,α(j)k,αkd∏ℓ=1ℓ≠kCα(i)ℓ,α(j)ℓ,αℓ |
for 0≤|α(i)|≤p, 0≤|α(j)|≤p, 0≤α≤α(i)+α(j) and 1≤k≤d. This allows (4.5) to be computed using 2(d+1) floating-point operations, which is the same as the number of floating-point operations needed to evaluate I in the quadrature based implementation given in Algorithm 2.
Remark 7 (p-refinement in quadrature-free based assembly). Algorithm 3 requires the following sets in order to assemble the matrix Aκ using a basis of Pp(κ):
C(p)={Ci,j,k:0≤i≤p,0≤j≤p,0≤k≤i+j},C′(p)={C′i,j,k:0≤i≤p,0≤j≤p,0≤k≤i+j},Iκ(p)={I(F,α):0≤dimF≤d,0≤|α|≤2p}. |
Suppose we wish to perform a p-refinement; that is, to construct Aκ using a basis of Pp+1(κ). Further, assume that C(p), C′(p) and Iκ(p) are already known. We have that
C(p+1)=C(p)∪{Ci,p,k:0≤i≤p+1, 0≤k≤p+i}∪{Cp,i,k:0≤i≤p+1, 0≤k≤p+i},C′(p+1)=C′(p)∪{C′i,p,k:0≤i≤p+1, 0≤k≤p+i}∪{C′p,i,k:0≤i≤p+1, 0≤k≤p+i},Iκ(p+1)=Iκ(p)∪{I(F,α):0≤dimF≤d, 2p+1≤|α|≤2p+2}. |
The computations of the updated coefficient sets C(p+1) and C′(p+1) are a one-time cost (since these sets may be used for every κ∈T) and the computation of the updated integral set Iκ(p+1) can be performed using a modification of Algorithm 1 that uses prior knowledge of Iκ(p) to avoid unnecessary re-computation.
It can be seen that the time complexities associated with the main loops in Algorithms 2 and 3, measured as the total number of floating-point operations required to assemble Aκ, are O(p3d) in the limit as p→∞. Moreover, the time complexity associated with the execution of Algorithm 1, used to compute I(ˆκ,J) in Algorithm 3, is O(χ1(κ)pd), where χ1(P) denotes the measure of complexity of the geometry of a polytope P given in the statement of Lemma 1. Thus, for large enough p, the main loops in Algorithms 2 and 3 are the most expensive contributions to the total assembly time.
A study of the loops in Algorithms 2 and 3 shows that the quadrature based method reaches the inner-most computations (dimPp(κ))2N times, while the quadrature-free based method reaches the inner-most computations Qd(p) times, where Qd(p) is the function defined in (4.6). Here, the number of quadrature points and weights N used in Algorithm 2 is chosen to exactly evaluate the integrals (Aκ)ij in (4.1) whose integrands are polynomial functions of total degree at most 2p. Thus, a lower bound for N is given by Nopt=(p+dd).
Figure 3 shows the expected number of floating-point operations required by Algorithm 2 (using the theoretically-optimal number of quadrature points Nopt) and Algorithm 3 used to assemble Aκ. The leading-order behaviour of both algorithms in the limit p→∞ is O(p3d). It is seen that, without taking the geometric complexity of κ into consideration, the performance of the quadrature-free based assembly method is expected to be comparable to that of the quadrature based assembly employing the minimal quadrature set that exactly evaluates Aκ.
However, the quadrature-free based assembly method can outperform the quadrature based assembly method when one takes into consideration the geometric complexity of κ. To see this, suppose that κ is decomposed into n subdomains for the purpose of numerical integration, on each of which an Nopt-point quadrature scheme can be applied. The leading-order behaviour of Algorithm 2 is O(np3d); that is, the time taken to assemble Aκ via the quadrature based assembly method will increase significantly if many integration subdomains are required. In contrast, the time taken to execute the main loop of Algorithm 3 is independent of κ; furthermore, taking into account the evaluation of the set of integrated monomials I(ˆκ,J), the leading-order behaviour of Algorithm 3 is O(χ1(ˆκ)pd+p3d), where χ1 denotes the function given in the statement of Lemma 1. We investigate this further in the following section.
In this section we consider the practical performance of the proposed quadrature-free algorithm.
We shall first study the effect of implementing Algorithm 1 with and without pruning as described in Section 2.2. The pruning strategy we adopt is to select the reference point xF as the first vertex of F for each face visited by Algorithm 1. We will apply Algorithm 1 to the problem of assembling the integral sets ⋃κ∈TI(κ,J) in the case where T is a simplicial or agglomerated simplicial mesh in two or three spatial dimensions and
J={α∈Nd0:0≤|α|≤p} |
for p∈{0,2,4,6,8,10,12}.
Tables 2 and 3 show the total CPU time taken by the unpruned and pruned versions of Algorithm 1 applied to two-dimensional triangular and agglomerated triangular meshes, respectively, while Tables 4 and 5 show the total CPU time taken by the unpruned and pruned versions of Algorithm 1 applied to three-dimensional tetrahedral and agglomerated tetrahedral meshes, respectively. An additional quantity, computed as the ratio of the CPU time taken by the pruned algorithm against the unpruned algorithm, is also reported; values of this ratio less than 1 indicate that Algorithm 1 with pruning computes the integral set I(P,J) faster than the same algorithm without pruning.
|T| | 32 | 128 | 512 | 2048 | 8192 | |
p=0 | Unpruned | 1.8926E-05 | 6.2850E-05 | 2.3568E-04 | 8.8175E-04 | 3.6211E-03 |
Pruned | 2.4888E-05 | 7.3720E-05 | 2.7315E-04 | 1.0641E-03 | 4.5114E-03 | |
Ratio | 1.32 | 1.17 | 1.16 | 1.21 | 1.25 | |
p=2 | Unpruned | 3.3402E-05 | 8.2150E-05 | 3.3563E-04 | 1.3032E-03 | 5.1413E-03 |
Pruned | 2.8993E-05 | 8.0898E-05 | 3.1069E-04 | 1.2043E-03 | 4.7641E-03 | |
Ratio | 0.87 | 0.98 | 0.93 | 0.92 | 0.93 | |
p=4 | Unpruned | 3.3678E-05 | 1.2722E-04 | 5.1513E-04 | 1.7583E-03 | 6.4381E-03 |
Pruned | 3.1136E-05 | 9.4751E-05 | 3.6298E-04 | 1.5314E-03 | 5.5922E-03 | |
Ratio | 0.92 | 0.74 | 0.70 | 0.87 | 0.87 | |
p=6 | Unpruned | 4.4647E-05 | 1.6264E-04 | 6.3257E-04 | 2.6062E-03 | 9.5336E-03 |
Pruned | 3.8436E-05 | 1.2308E-04 | 5.2076E-04 | 1.9159E-03 | 7.2723E-03 | |
Ratio | 0.86 | 0.76 | 0.82 | 0.74 | 0.76 | |
p=8 | Unpruned | 6.4398E-05 | 2.2496E-04 | 9.3172E-04 | 3.2624E-03 | 1.2828E-02 |
Pruned | 4.5497E-05 | 1.7977E-04 | 7.8489E-04 | 2.5205E-03 | 9.4220E-03 | |
Ratio | 0.71 | 0.80 | 0.84 | 0.77 | 0.73 | |
p=10 | Unpruned | 7.9538E-05 | 3.2234E-04 | 1.1563E-03 | 4.5004E-03 | 1.7702E-02 |
Pruned | 5.6676E-05 | 2.2770E-04 | 7.9667E-04 | 3.0367E-03 | 1.2133E-02 | |
Ratio | 0.71 | 0.71 | 0.69 | 0.67 | 0.69 | |
p=12 | Unpruned | 1.3727E-04 | 4.1301E-04 | 1.6069E-03 | 5.7918E-03 | 2.3696E-02 |
Pruned | 6.9956E-05 | 2.7148E-04 | 9.6113E-04 | 3.7934E-03 | 1.5079E-02 | |
Ratio | 0.51 | 0.66 | 0.60 | 0.65 | 0.64 |
|T| | 12 | 51 | 204 | 819 | 3276 | |
p=0 | Unpruned | 2.3893E-05 | 6.7615E-05 | 2.7051E-04 | 1.1335E-03 | 5.0569E-03 |
Pruned | 3.1631E-05 | 8.4488E-05 | 3.1938E-04 | 1.3849E-03 | 6.2978E-03 | |
Ratio | 1.32 | 1.25 | 1.18 | 1.22 | 1.25 | |
p=2 | Unpruned | 2.9957E-05 | 1.0833E-04 | 3.4219E-04 | 1.6801E-03 | 6.6029E-03 |
Pruned | 3.3662E-05 | 1.1878E-04 | 3.8369E-04 | 1.8276E-03 | 7.3026E-03 | |
Ratio | 1.12 | 1.10 | 1.12 | 1.09 | 1.11 | |
p=4 | Unpruned | 4.6595E-05 | 1.4459E-04 | 5.9718E-04 | 2.2960E-03 | 9.4577E-03 |
Pruned | 3.9385E-05 | 1.2878E-04 | 5.9189E-04 | 2.2926E-03 | 9.2390E-03 | |
Ratio | 0.85 | 0.89 | 0.99 | 1.00 | 0.98 | |
p=6 | Unpruned | 5.6086E-05 | 2.1821E-04 | 9.3389E-04 | 3.2810E-03 | 1.2832E-02 |
Pruned | 5.5423E-05 | 1.9397E-04 | 8.2830E-04 | 2.9239E-03 | 1.1699E-02 | |
Ratio | 0.99 | 0.89 | 0.89 | 0.89 | 0.91 | |
p=8 | Unpruned | 8.5570E-05 | 3.0688E-04 | 1.1368E-03 | 4.5906E-03 | 1.7794E-02 |
Pruned | 8.1338E-05 | 2.5567E-04 | 9.8448E-04 | 3.8316E-03 | 1.5385E-02 | |
Ratio | 0.95 | 0.83 | 0.87 | 0.83 | 0.86 | |
p=10 | Unpruned | 1.2774E-04 | 4.2015E-04 | 1.5151E-03 | 6.0202E-03 | 2.4115E-02 |
Pruned | 8.7737E-05 | 3.4614E-04 | 1.2627E-03 | 5.0212E-03 | 1.9915E-02 | |
Ratio | 0.69 | 0.82 | 0.83 | 0.83 | 0.83 | |
p=12 | Unpruned | 1.5864E-04 | 5.3115E-04 | 2.0558E-03 | 8.2953E-03 | 3.3196E-02 |
Pruned | 1.4783E-04 | 4.3479E-04 | 1.7454E-03 | 6.7921E-03 | 2.5964E-02 | |
Ratio | 0.93 | 0.82 | 0.85 | 0.82 | 0.78 |
|T| | 6 | 48 | 384 | 3072 | 24576 | |
p=0 | Unpruned | 2.4926E-05 | 1.0816E-04 | 8.1476E-04 | 6.8287E-03 | 4.8521E-02 |
Pruned | 1.9337E-05 | 7.1661E-05 | 5.1822E-04 | 4.0497E-03 | 3.1205E-02 | |
Ratio | 0.78 | 0.66 | 0.64 | 0.59 | 0.64 | |
p=2 | Unpruned | 3.5647E-05 | 2.0676E-04 | 1.5616E-03 | 1.3476E-02 | 9.5595E-02 |
Pruned | 2.1014E-05 | 9.8906E-05 | 7.1403E-04 | 5.6071E-03 | 4.3727E-02 | |
Ratio | 0.59 | 0.48 | 0.46 | 0.42 | 0.46 | |
p=4 | Unpruned | 6.3102E-05 | 4.2202E-04 | 3.2991E-03 | 2.5008E-02 | 1.9898E-01 |
Pruned | 3.1368E-05 | 1.5550E-04 | 1.2420E-03 | 9.1364E-03 | 7.2012E-02 | |
Ratio | 0.50 | 0.37 | 0.38 | 0.37 | 0.36 | |
p=6 | Unpruned | 1.1175E-04 | 8.2232E-04 | 5.9819E-03 | 4.7363E-02 | 3.7750E-01 |
Pruned | 4.4362E-05 | 2.5484E-04 | 1.8026E-03 | 1.3952E-02 | 1.1175E-01 | |
Ratio | 0.40 | 0.31 | 0.30 | 0.29 | 0.30 | |
p=8 | Unpruned | 1.8889E-04 | 1.4290E-03 | 1.0750E-02 | 8.6211E-02 | 6.8734E-01 |
Pruned | 6.2170E-05 | 2.8795E-04 | 3.0248E-03 | 2.3477E-02 | 1.8765E-01 | |
Ratio | 0.33 | 0.20 | 0.28 | 0.27 | 0.27 |
|T| | 4 | 38 | 307 | 2457 | 19660 | |
p=0 | Unpruned | 4.5272E-05 | 3.5574E-04 | 2.6056E-03 | 2.1919E-02 | 1.7550E-01 |
Pruned | 3.7088E-05 | 2.8895E-04 | 2.0201E-03 | 1.6717E-02 | 1.3413E-01 | |
Ratio | 0.82 | 0.81 | 0.78 | 0.76 | 0.76 | |
p=2 | Unpruned | 8.7692E-05 | 7.7001E-04 | 5.5442E-03 | 4.3656E-02 | 3.5131E-01 |
Pruned | 5.4265E-05 | 4.5130E-04 | 3.1994E-03 | 2.6491E-02 | 2.0647E-01 | |
Ratio | 0.62 | 0.59 | 0.58 | 0.61 | 0.59 | |
p=4 | Unpruned | 1.9519E-04 | 1.5893E-03 | 1.1656E-02 | 9.1985E-02 | 7.4515E-01 |
Pruned | 9.1346E-05 | 7.6676E-04 | 5.5029E-03 | 4.4103E-02 | 3.5562E-01 | |
Ratio | 0.47 | 0.48 | 0.47 | 0.48 | 0.48 | |
p=6 | Unpruned | 3.4222E-04 | 3.0026E-03 | 2.3204E-02 | 1.8303E-01 | 1.4867E+00 |
Pruned | 1.6222E-04 | 1.2524E-03 | 9.9395E-03 | 7.9390E-02 | 6.4008E-01 | |
Ratio | 0.47 | 0.42 | 0.43 | 0.43 | 0.43 | |
p=8 | Unpruned | 6.0944E-04 | 5.2648E-03 | 4.1087E-02 | 3.2613E-01 | 2.6380E+00 |
Pruned | 2.4756E-04 | 1.9114E-03 | 1.5453E-02 | 1.2295E-01 | 1.0089E+00 | |
Ratio | 0.41 | 0.36 | 0.38 | 0.38 | 0.38 |
For each fixed p, it is observed that the ratio of CPU time taken by the pruned algorithm against the unpruned algorithm remains roughly constant in all cases. For a fixed number of elements, this ratio decreases as p increases. It is expected that this ratio continues to decrease as p→∞ but remains bounded from below by a constant, for a single element κ, this constant is expected to depend on the number of nodes and edges of the graph G(κ) defined in Section 2.2 before and after pruning.
While pruning accelerates the assembly of ⋃κ∈TI(κ,J) for both tetrahedral and agglomerated tetrahedral elements in three dimensions, a greater improvement in assembly time is observed for tetrahedral meshes, this is because a greater proportion of the nodes and edges of G(κ) can be eliminated through pruning when κ is simplicial. We observe similar behaviour between the pruned and unpruned version of Algorithm 1 applied to triangular and agglomerated triangular elements in two dimensions, though pruning is seen to be less effective at reducing the CPU time spent assembling ⋃κ∈TI(κ,J). For small values of p, pruning actually slows down Algorithm 1, we speculate that this is because the extra time spent checking whether a given facet F is to be pruned outweighs the expense that would be incurred to assemble I(F,J) without pruning.
As a second example, we compare the quadrature and quadrature-free based integration algorithms for the evaluation of the sets
In,p={∫Pnxαdx:α∈N20,0≤|α|≤p}, |
where Pn⊂R2 denotes the regular n-gon, 5≤n≤16, with vertices {(cos2πkn,sin2πkn)}n−1k=0 and p∈{2,4,8,16,32}.
The quadrature based method is employed as follows: A sub-tessellation of Pn consisting of (n−2) triangles is constructed by joining the first vertex of Pn to every other vertex. On each triangle, a (q+1)2-point quadrature scheme, q=⌈p+12⌉, is defined by constructing a quadrature scheme on the unit square (−1,1)2 exactly integrating all bivariate polynomial functions of maximal degree p+1. The reference quadrature scheme on (−1,1)2 is then mapped to each triangle in the sub-tessellation of Pn via a Duffy transformation [11]. The resulting quadrature scheme on Pn therefore contains (n−2)(q+1)2 points and weights. We record the time taken for the quadrature based integration method to be executed for each element of In,p; here, we do not include the time taken to generate the quadrature scheme on Pn.
The quadrature-free based method is specialised to the two-dimensional setting. The integrals I(Pn,α) and I(Fn,k,α), for each boundary facet Fn,k⊂∂Pn, 1≤k≤n, are stored in two arrays. We employ pruning based on selecting xF as the first vertex of each visited face F.
Figures 4 and 5 show the CPU times taken by the quadrature based and quadrature-free based integration algorithms to evaluate In,p averaged over 100 function calls. The time complexity of the quadrature based method is O(np4), since the size of the requested set of integrals |In,p|=O(p2) and each integral requires O(n(q+1)2)=O(np2) flops to compute. On the other hand, the quadrature-free based method is seen to have time complexity O(np2). This is consistent with Lemma 2 since |J|=O(p2). It has been verified that the integrals in the set In,p computed using Algorithm 1 agree with the same set of integrals computed using quadrature to within machine precision.
As a third example, we compare the quadrature based and quadrature-free based assembly methods for a single element matrix arising from the DGFEM discretisation of linear first-order transport problems posed in two spatial dimensions. We first consider a single n-sided polygonal domain Pn⊂R2 with 3≤n≤64 and employ a basis of degree p=4, i.e., the finite element space is V=P4(Pn).
As before, the quadrature based method employs a sub-tessellation for the purposes of constructing a quadrature scheme on Pn. In this example, the sub-tessellation consists of n triangles constructed by joining each vertex of Pn to the centroid. Here, a (p+2)2-point quadrature scheme is employed on each triangle using the method outlined in the previous example; this ensures that the quadrature scheme exactly evaluates the element integrals appearing in (3.2). As before, we do not include the time taken to generate the quadrature scheme on Pn. The time taken to evaluate the basis functions at the quadrature points is also excluded.
The quadrature-free based method is specialised to the two-dimensional setting. The element integrals appearing in (3.2) are evaluated using the two-step procedure given in Algorithm 3. The monomial integrals ∫Pnxαdx for 0≤|α|≤2p are computed once based on employing Algorithm 1 using the first-vertex based pruning strategy as before; these are then used to assemble the local element matrix entry-wise through the decomposition (4.4) of the integrals given in (3.2). The time taken to generate the coefficients in these decompositions is not included, since for the given finite element basis, the individual terms present in the brackets in (4.5) can be precomputed once and for all, using, for example, maple.
Figure 6 shows the CPU time taken by the quadrature based assembly method using Algorithm 2 and the quadrature-free based assembly method using Algorithm 3 to assemble the element matrices arising from a DGFEM discretisation of a linear, constant-coefficient transport problem in two spatial dimensions. The CPU times are averaged over 10000 calls to both assembly methods on the same n-gon element Pn for 3≤n≤64 on which a basis of P4(Pn) is employed. The time taken by Algorithm 3 to assemble the element matrices is further broken down into contributions from line 4 (i.e., Algorithm 1) and lines 7–14 (i.e., the reconstruction of (Aκ)i,j via (4.4)).
As seen in the previous example, the time-complexity of Algorithm 2 scales linearly with n, the number of sides of the element Pn. The time-complexity associated with line 4 of Algorithm 3, which we remark is a call to Algorithm 1 also scales linearly with n, as expected.
The main body of Algorithm 3, namely the nested loops on lines 7–14, is seen to scale independently of n. Indeed, it was seen in the analysis of the previous section that this loop exhibits no dependence on the geometry of the element. Therefore, the actual run-time of the quadrature-free based assembly method depends on whether the integration phase (line 4) or reconstruction phase (lines 7–14) is more expensive. For small values of n, the reconstruction phase is more expensive and so the assembly time remains roughly constant; for large enough n the integration phase dominates the computational time and so a dependence of the assembly time on the geometric complexity of the element emerges.
We now consider fixing n to study the dependence on p; to this end, in Figure 7 we show the scalings of Algorithms 2 and 3 as functions of the polynomial degree of approximation for 1≤p≤12 on a fixed 6-sided polygonal domain. For large enough p, it can be seen that the time complexities of the quadrature based and quadrature-free based assembly methods are both O(p3d)=O(p6) as predicted earlier. In the case of the quadrature-free based assembly method, the time complexities (on a given geometry) of the monomial integration phase (line 4) and the reconstruction phase (lines 7–14) of Algorithm 3 are O(p3d)=O(p6) and O(pd)=O(p2), respectively. This is in agreement with our previous analysis, which predicted that the time complexity of the quadrature-free based assembly scales like O(χ1(Pn)pd+p3d)=O(χ1(Pn)p2+p6), where χ1(P) denotes the measure of complexity of the geometry of a polytope P given in the statement of Lemma 1. In this example, it is observed that the quadrature-free based assembly method is faster than the quadrature based assembly method by a factor of around an order of magnitude for all tested polynomial degrees.
As a final example, we will compare the quadrature based and quadrature-free based assembly methods for the system matrix arising from DGFEM discretisation of the linear first-order transport problem posed in three spatial dimensions. We will consider sequences of tetrahedral and agglomerated tetrahedral meshes T and employ local polynomial bases Pp(κ) for each κ∈T with 0≤p≤4.
For standard tetrahedral meshes, the quadrature based method employs quadrature schemes consisting of (p+2)3 points and weights on each tetrahedron; this ensures that the quadrature scheme exactly evaluates the element integrals appearing in (3.2). The agglomerated tetrahedral meshes are formed by partitioning a fine mesh Tfine into polyhedral coarse-mesh elements κ∈T using METIS [15]. The agglomeration strategy is chosen such that each coarse-mesh element κ is formed from an average of 10 fine-mesh elements κfine∈Tfine. The quadrature schemes on elements in Tfine are inherited by the coarse-mesh elements, ensuring that the integrals appearing in (3.2) can be evaluated exactly.
The quadrature-free based method is performed in two steps as before. On each element κ∈T, the monomial integrals ∫κxαdx for 0≤|α|≤2p are computed once using an implementation of Algorithm 1 specialised to the three-dimensional setting. As before, we employ a first-vertex based pruning strategy to reduce the CPU time spent in Algorithm 1. The integrals in (3.2) are then evaluated using known decompositions of the integrands in terms of the monomial basis. The time taken to generate the coefficients in these decompositions is not included.
Figure 8 shows the CPU time taken by the quadrature based and quadrature-free based methods (Algorithms 2 and 3, respectively) to assemble the global transport matrices arising from a DGFEM discretisation of a linear, constant-coefficient transport problem in three spatial dimensions. Both methods are tested on standard and agglomerated tetrahedral meshes for global polynomial degrees 0≤p≤4.
Both assembly methods are seen to scale linearly with the number of elements in the spatial mesh, as expected. For all tests recorded, the CPU time taken to assemble the system matrix using the quadrature-free method is consistently faster than the standard quadrature based approach by at most a constant multiplicative factor. For the tests performed on standard tetrahedral meshes, this multiplicative constant is between 2 and 3 for p≥1. For agglomerated tetrahedral meshes, this multiplicative constant improves to at least 5 for p≥1, with the quadrature based assembly taking almost 20 times longer than the quadrature-free based assembly in the case p=4. This improvement in assembly time due to switching to a quadrature-free approach is expected to be greater on agglomerated meshes than tetrahedral meshes. Indeed, on a given element κ∈T, the time complexity of the quadrature based method is O(nκp3d)=O(nκp9), where nκ denotes the number of fine-mesh elements in Tfine that comprise κ. In contrast, the time complexity of the quadrature-free based method is O(χ1(κ)pd+p3d)=O(χ1(κ)p3+p9).
Figures 9 and 10 present the breakdown of the contribution of line 4 and lines 7–14 of Algorithm 3 to the total CPU time taken by the quadrature-free based algorithm. Both the integration and reconstruction phases are seen to scale linearly with the number of elements in the mesh; this is to be expected since lines 4 and lines 7–14 of Algorithm 3 are executed once for each element. The reconstruction (4.4) performed on lines 7–14 of Algorithm 3 is seen to scale much faster as a function of p than the integration of the monomial basis via Algorithm 1; this is evidenced by the greater separation of the data corresponding to the CPU times for each polynomial degree p. However, the time taken in the reconstruction phase is seen to be independent of the geometry of the mesh elements, whereas a greater amount of CPU time is required to perform the integration phase on agglomerated tetrahedral meshes. Analogous behaviour is also observed when fine meshes consisting of hexahedral elements are agglomerated; for brevity these results have been omitted.
In this article we have analysed the computational complexity of computing the integral of families of polynomial spaces over general polytopic domains. Starting from the ideas developed in [7,17] for the integration of homogeneous functions, we have demonstrated that the time and space complexities required to integrate families of monomial functions are dependent on three factors: the ambient dimension of the polytopic domain; the size of the requested set of monomial integrals; and the size of a directed graph related to the polytopic domain. In the case of polygonal or polyhedral geometries, the monomial integration algorithm is shown to scale linearly with the number of graph edges. This algorithm was applied to the computation of element integrals arising in the DGFEM discretisation of the linear transport problem. We have shown that, by decomposing the integrand into a linear combination of monomial functions, these integrals could be evaluated at speeds comparable to methods based on employing quadrature schemes with a minimal number of points and weights. In comparison to quadrature based methods employing a sub-tessellation, the quadrature-free approach is seen to both significantly accelerate the assembly of the system matrix and also scale independently of the element geometry for sufficiently-high polynomial degrees.
The authors declare they have not used Artificial Intelligence (AI) tools in the creation of this article.
PH and MEH acknowledge the financial support of the EPSRC (grant EP/R030707/1). PH also acknowledges the financial support of the MRC (grant MR/T017988/1).
The authors declare no conflicts of interest.
[1] | P. F. Antonietti, A. Cangiani, J. Collis, Z. Dong, E. H. Georgoulis, S. Giani, et al., Review of discontinuous Galerkin finite element methods for partial differential equations on complicated domains, In: G. R. Barrenechea, F. Brezzi, A. Cangiani, E. H. Georgoulis, Building bridges: connections and challenges in modern approaches to numerical partial differential equations, Cham: Springer, 114 (2016), 281–310. https://doi.org/10.1007/978-3-319-41640-3_9 |
[2] |
P. F. Antonietti, P. Houston, G. Pennesi, Fast numerical integration on polytopic meshes with applications to discontinuous Galerkin finite element methods, J. Sci. Comput., 77 (2018), 1339–1370. https://doi.org/10.1007/s10915-018-0802-y doi: 10.1007/s10915-018-0802-y
![]() |
[3] |
L. Beirão Da Veiga, F. Brezzi, A. Cangiani, G. Manzini, L. D. Marini, A. Russo, Basic principles of virtual element methods, Math. Models Methods Appl. Sci., 23 (2013), 199–214. https://doi.org/10.1142/S0218202512500492 doi: 10.1142/S0218202512500492
![]() |
[4] | B. Büeler, A. Enge, K. Fukuda, Exact volume computation for polytopes: a practical study, In: G. Kalai, G. M. Ziegler, Polytopes–Combinatorics and computation, DMV Seminar, Basel: Birkhäuser, 29 (2000), 131–154. https://doi.org/10.1007/978-3-0348-8438-9_6 |
[5] | A. Cangiani, Z. Dong, E. H. Georgoulis, P. Houston, hp-version discontinuous Galerkin methods on polygonal and polyhedral meshes, Cham: Springer, 2017. https://doi.org/10.1007/978-3-319-67673-9 |
[6] |
A. Cangiani, E. H. Georgoulis, P. Houston, hp-version discontinuous Galerkin methods on polygonal and polyhedral meshes, Math. Models Methods Appl. Sci., 24 (2014), 2009–2041. https://doi.org/10.1142/S0218202514500146 doi: 10.1142/S0218202514500146
![]() |
[7] |
E. B. Chin, J. B. Lasserre, N. Sukumar, Numerical integration of homogeneous functions on convex and nonconvex polygons and polyhedra, Comp. Mech., 56 (2015), 967–981. https://doi.org/10.1007/s00466-015-1213-7 doi: 10.1007/s00466-015-1213-7
![]() |
[8] |
E. B. Chin, N. Sukumar, An efficient method to integrate polynomials over polytopes and curved solids, Comput. Aided Geom. Design, 82 (2020), 101914. https://doi.org/10.1016/j.cagd.2020.101914 doi: 10.1016/j.cagd.2020.101914
![]() |
[9] | M. Cicuttin, A. Ern, N. Pignet, Hybrid high-order methods: a primer with applications to solid mechanics, Cham: Springer, 2021. https://doi.org/10.1007/978-3-030-81477-9 |
[10] |
Z. Dong, E. H. Georgoulis, T. Kappas, GPU-accelerated discontinuous Galerkin methods on polytopic meshes, SIAM J. Sci. Comput., 43 (2021), C312–C334. https://doi.org/10.1137/20M1350984 doi: 10.1137/20M1350984
![]() |
[11] |
M. G. Duffy, Quadrature over a pyramid or cube of integrands with a singularity at a vertex, SIAM J. Numer. Anal., 19 (1982), 1260–1262. https://doi.org/10.1137/0719090 doi: 10.1137/0719090
![]() |
[12] | B. Grünbaum, V. Klee, M. A. Perles, G. C. Shephard, Convex polytopes, Vol. 16, 1 Ed., New York: Interscience, 1967. |
[13] | P. Houston, M. E. Hubbard, T. J. Radley, O. J. Sutton, R. S. J. Widdowson, Efficient high-order space-angle-energy polytopic discontinuous Galerkin finite element methods for linear Boltzmann transport, arXiv, 2023. https://doi.org/10.48550/arXiv.2304.09592 |
[14] |
V. Kaibel, M. E. Pfetsch, Computing the face lattice of a polytope from its vertex-facet incidences, Comp. Geom., 23 (2002), 281–290. https://doi.org/10.1016/S0925-7721(02)00103-7 doi: 10.1016/S0925-7721(02)00103-7
![]() |
[15] |
G. Karypis, V. Kumar, A fast and highly quality multilevel scheme for partitioning irregular graphs, SIAM J. Sci. Comput., 20 (1998), 359–392. https://doi.org/10.1137/S1064827595287997 doi: 10.1137/S1064827595287997
![]() |
[16] | D. E. Knuth, The art of computer programming, Vol. 2, Seminumerical Algorithms, 3 Eds., Addison-Wesley, 1981. |
[17] | J. Lasserre, Integration on a convex polytope, Proc. Amer. Math. Soc., 126 (1998), 2433–2441. |
[18] | J. B. Lasserre, Integration and homogeneous functions, Proc. Amer. Math. Soc., 127 (1999), 813–818. |
[19] |
J. N. Lyness, G. Monegato, Quadrature rules for regions having regular hexagonal symmetry, SIAM J. Numer. Anal., 14 (1977), 283–295. https://doi.org/10.1137/0714018 doi: 10.1137/0714018
![]() |
[20] |
S. E. Mousavi, H. Xiao, N. Sukumar, Generalized Gaussian quadrature rules on arbitrary polygons, Int. J. Numer. Methods Eng., 82 (2010), 99–113. https://doi.org/10.1002/nme.2759 doi: 10.1002/nme.2759
![]() |
[21] |
S. E. Mousavi, N. Sukumar, Numerical integration of polynomials and discontinuous functions on irregular convex polygons and polyhedrons, Comput. Mech., 47 (2011), 535–554. https://doi.org/10.1007/s00466-010-0562-5 doi: 10.1007/s00466-010-0562-5
![]() |
[22] | C. P. Simon, L. Blume, Mathematics for economists, Vol. 7, New York: Norton, 1994. |
[23] | A. Stroud, Approximate calculation of multiple integrals, Prentice-Hall series in automatic computation, Prentice-Hall, Inc., 1971. |
[24] |
Y. Sudhakar, W. A. Wall, Quadrature schemes for arbitrary convex/concave volumes and integration of weak form in enriched partition of unity methods, Comput. Methods Appl. Mech. Eng., 258 (2013), 39–54. https://doi.org/10.1016/j.cma.2013.01.007 doi: 10.1016/j.cma.2013.01.007
![]() |
[25] |
N. Sukumar, A. Tabarraei, Conforming polygonal finite elements, Int. J. Numer. Methods Eng., 61 (2004), 2045–2066. https://doi.org/10.1002/nme.1141 doi: 10.1002/nme.1141
![]() |
[26] | M. E. Taylor, Partial differential equations: basic theory, Vol. 1, Springer, 1996. |
[27] |
H. Xiao, Z. Gimbutas, A numerical algorithm for the construction of efficient quadrature rules in two and higher dimensions, Comput. Math. Appl., 59 (2010), 663–676. https://doi.org/10.1016/j.camwa.2009.10.027 doi: 10.1016/j.camwa.2009.10.027
![]() |
1. | Marco Feder, Andrea Cangiani, Luca Heltai, R3MG: R-tree based agglomeration of polytopal grids with applications to multilevel methods, 2025, 00219991, 113773, 10.1016/j.jcp.2025.113773 |
Family | |V(P)| | |E(P)| | Time complexity | Space complexity |
d-dimensional simplex | 2d+1 | 2d(d+1) | O(2dd|J|) | O(2d|J|) |
d-dimensional hypercube | 3d+1 | 2⋅3d−1d+2d | O(3dd|J|) | O(3d|J|) |
n-sided polygon | 2(n+1) | 4n | O(n|J|) | O(n|J|) |
n-gonal prism | 2(3n+2) | 15n+2 | O(n|J|) | O(n|J|) |
n-based pyramid | 4(n+1) | 2(5n+1) | O(n|J|) | O(n|J|) |
|T| | 32 | 128 | 512 | 2048 | 8192 | |
p=0 | Unpruned | 1.8926E-05 | 6.2850E-05 | 2.3568E-04 | 8.8175E-04 | 3.6211E-03 |
Pruned | 2.4888E-05 | 7.3720E-05 | 2.7315E-04 | 1.0641E-03 | 4.5114E-03 | |
Ratio | 1.32 | 1.17 | 1.16 | 1.21 | 1.25 | |
p=2 | Unpruned | 3.3402E-05 | 8.2150E-05 | 3.3563E-04 | 1.3032E-03 | 5.1413E-03 |
Pruned | 2.8993E-05 | 8.0898E-05 | 3.1069E-04 | 1.2043E-03 | 4.7641E-03 | |
Ratio | 0.87 | 0.98 | 0.93 | 0.92 | 0.93 | |
p=4 | Unpruned | 3.3678E-05 | 1.2722E-04 | 5.1513E-04 | 1.7583E-03 | 6.4381E-03 |
Pruned | 3.1136E-05 | 9.4751E-05 | 3.6298E-04 | 1.5314E-03 | 5.5922E-03 | |
Ratio | 0.92 | 0.74 | 0.70 | 0.87 | 0.87 | |
p=6 | Unpruned | 4.4647E-05 | 1.6264E-04 | 6.3257E-04 | 2.6062E-03 | 9.5336E-03 |
Pruned | 3.8436E-05 | 1.2308E-04 | 5.2076E-04 | 1.9159E-03 | 7.2723E-03 | |
Ratio | 0.86 | 0.76 | 0.82 | 0.74 | 0.76 | |
p=8 | Unpruned | 6.4398E-05 | 2.2496E-04 | 9.3172E-04 | 3.2624E-03 | 1.2828E-02 |
Pruned | 4.5497E-05 | 1.7977E-04 | 7.8489E-04 | 2.5205E-03 | 9.4220E-03 | |
Ratio | 0.71 | 0.80 | 0.84 | 0.77 | 0.73 | |
p=10 | Unpruned | 7.9538E-05 | 3.2234E-04 | 1.1563E-03 | 4.5004E-03 | 1.7702E-02 |
Pruned | 5.6676E-05 | 2.2770E-04 | 7.9667E-04 | 3.0367E-03 | 1.2133E-02 | |
Ratio | 0.71 | 0.71 | 0.69 | 0.67 | 0.69 | |
p=12 | Unpruned | 1.3727E-04 | 4.1301E-04 | 1.6069E-03 | 5.7918E-03 | 2.3696E-02 |
Pruned | 6.9956E-05 | 2.7148E-04 | 9.6113E-04 | 3.7934E-03 | 1.5079E-02 | |
Ratio | 0.51 | 0.66 | 0.60 | 0.65 | 0.64 |
|T| | 12 | 51 | 204 | 819 | 3276 | |
p=0 | Unpruned | 2.3893E-05 | 6.7615E-05 | 2.7051E-04 | 1.1335E-03 | 5.0569E-03 |
Pruned | 3.1631E-05 | 8.4488E-05 | 3.1938E-04 | 1.3849E-03 | 6.2978E-03 | |
Ratio | 1.32 | 1.25 | 1.18 | 1.22 | 1.25 | |
p=2 | Unpruned | 2.9957E-05 | 1.0833E-04 | 3.4219E-04 | 1.6801E-03 | 6.6029E-03 |
Pruned | 3.3662E-05 | 1.1878E-04 | 3.8369E-04 | 1.8276E-03 | 7.3026E-03 | |
Ratio | 1.12 | 1.10 | 1.12 | 1.09 | 1.11 | |
p=4 | Unpruned | 4.6595E-05 | 1.4459E-04 | 5.9718E-04 | 2.2960E-03 | 9.4577E-03 |
Pruned | 3.9385E-05 | 1.2878E-04 | 5.9189E-04 | 2.2926E-03 | 9.2390E-03 | |
Ratio | 0.85 | 0.89 | 0.99 | 1.00 | 0.98 | |
p=6 | Unpruned | 5.6086E-05 | 2.1821E-04 | 9.3389E-04 | 3.2810E-03 | 1.2832E-02 |
Pruned | 5.5423E-05 | 1.9397E-04 | 8.2830E-04 | 2.9239E-03 | 1.1699E-02 | |
Ratio | 0.99 | 0.89 | 0.89 | 0.89 | 0.91 | |
p=8 | Unpruned | 8.5570E-05 | 3.0688E-04 | 1.1368E-03 | 4.5906E-03 | 1.7794E-02 |
Pruned | 8.1338E-05 | 2.5567E-04 | 9.8448E-04 | 3.8316E-03 | 1.5385E-02 | |
Ratio | 0.95 | 0.83 | 0.87 | 0.83 | 0.86 | |
p=10 | Unpruned | 1.2774E-04 | 4.2015E-04 | 1.5151E-03 | 6.0202E-03 | 2.4115E-02 |
Pruned | 8.7737E-05 | 3.4614E-04 | 1.2627E-03 | 5.0212E-03 | 1.9915E-02 | |
Ratio | 0.69 | 0.82 | 0.83 | 0.83 | 0.83 | |
p=12 | Unpruned | 1.5864E-04 | 5.3115E-04 | 2.0558E-03 | 8.2953E-03 | 3.3196E-02 |
Pruned | 1.4783E-04 | 4.3479E-04 | 1.7454E-03 | 6.7921E-03 | 2.5964E-02 | |
Ratio | 0.93 | 0.82 | 0.85 | 0.82 | 0.78 |
|T| | 6 | 48 | 384 | 3072 | 24576 | |
p=0 | Unpruned | 2.4926E-05 | 1.0816E-04 | 8.1476E-04 | 6.8287E-03 | 4.8521E-02 |
Pruned | 1.9337E-05 | 7.1661E-05 | 5.1822E-04 | 4.0497E-03 | 3.1205E-02 | |
Ratio | 0.78 | 0.66 | 0.64 | 0.59 | 0.64 | |
p=2 | Unpruned | 3.5647E-05 | 2.0676E-04 | 1.5616E-03 | 1.3476E-02 | 9.5595E-02 |
Pruned | 2.1014E-05 | 9.8906E-05 | 7.1403E-04 | 5.6071E-03 | 4.3727E-02 | |
Ratio | 0.59 | 0.48 | 0.46 | 0.42 | 0.46 | |
p=4 | Unpruned | 6.3102E-05 | 4.2202E-04 | 3.2991E-03 | 2.5008E-02 | 1.9898E-01 |
Pruned | 3.1368E-05 | 1.5550E-04 | 1.2420E-03 | 9.1364E-03 | 7.2012E-02 | |
Ratio | 0.50 | 0.37 | 0.38 | 0.37 | 0.36 | |
p=6 | Unpruned | 1.1175E-04 | 8.2232E-04 | 5.9819E-03 | 4.7363E-02 | 3.7750E-01 |
Pruned | 4.4362E-05 | 2.5484E-04 | 1.8026E-03 | 1.3952E-02 | 1.1175E-01 | |
Ratio | 0.40 | 0.31 | 0.30 | 0.29 | 0.30 | |
p=8 | Unpruned | 1.8889E-04 | 1.4290E-03 | 1.0750E-02 | 8.6211E-02 | 6.8734E-01 |
Pruned | 6.2170E-05 | 2.8795E-04 | 3.0248E-03 | 2.3477E-02 | 1.8765E-01 | |
Ratio | 0.33 | 0.20 | 0.28 | 0.27 | 0.27 |
|T| | 4 | 38 | 307 | 2457 | 19660 | |
p=0 | Unpruned | 4.5272E-05 | 3.5574E-04 | 2.6056E-03 | 2.1919E-02 | 1.7550E-01 |
Pruned | 3.7088E-05 | 2.8895E-04 | 2.0201E-03 | 1.6717E-02 | 1.3413E-01 | |
Ratio | 0.82 | 0.81 | 0.78 | 0.76 | 0.76 | |
p=2 | Unpruned | 8.7692E-05 | 7.7001E-04 | 5.5442E-03 | 4.3656E-02 | 3.5131E-01 |
Pruned | 5.4265E-05 | 4.5130E-04 | 3.1994E-03 | 2.6491E-02 | 2.0647E-01 | |
Ratio | 0.62 | 0.59 | 0.58 | 0.61 | 0.59 | |
p=4 | Unpruned | 1.9519E-04 | 1.5893E-03 | 1.1656E-02 | 9.1985E-02 | 7.4515E-01 |
Pruned | 9.1346E-05 | 7.6676E-04 | 5.5029E-03 | 4.4103E-02 | 3.5562E-01 | |
Ratio | 0.47 | 0.48 | 0.47 | 0.48 | 0.48 | |
p=6 | Unpruned | 3.4222E-04 | 3.0026E-03 | 2.3204E-02 | 1.8303E-01 | 1.4867E+00 |
Pruned | 1.6222E-04 | 1.2524E-03 | 9.9395E-03 | 7.9390E-02 | 6.4008E-01 | |
Ratio | 0.47 | 0.42 | 0.43 | 0.43 | 0.43 | |
p=8 | Unpruned | 6.0944E-04 | 5.2648E-03 | 4.1087E-02 | 3.2613E-01 | 2.6380E+00 |
Pruned | 2.4756E-04 | 1.9114E-03 | 1.5453E-02 | 1.2295E-01 | 1.0089E+00 | |
Ratio | 0.41 | 0.36 | 0.38 | 0.38 | 0.38 |
Family | |V(P)| | |E(P)| | Time complexity | Space complexity |
d-dimensional simplex | 2d+1 | 2d(d+1) | O(2dd|J|) | O(2d|J|) |
d-dimensional hypercube | 3d+1 | 2⋅3d−1d+2d | O(3dd|J|) | O(3d|J|) |
n-sided polygon | 2(n+1) | 4n | O(n|J|) | O(n|J|) |
n-gonal prism | 2(3n+2) | 15n+2 | O(n|J|) | O(n|J|) |
n-based pyramid | 4(n+1) | 2(5n+1) | O(n|J|) | O(n|J|) |
|T| | 32 | 128 | 512 | 2048 | 8192 | |
p=0 | Unpruned | 1.8926E-05 | 6.2850E-05 | 2.3568E-04 | 8.8175E-04 | 3.6211E-03 |
Pruned | 2.4888E-05 | 7.3720E-05 | 2.7315E-04 | 1.0641E-03 | 4.5114E-03 | |
Ratio | 1.32 | 1.17 | 1.16 | 1.21 | 1.25 | |
p=2 | Unpruned | 3.3402E-05 | 8.2150E-05 | 3.3563E-04 | 1.3032E-03 | 5.1413E-03 |
Pruned | 2.8993E-05 | 8.0898E-05 | 3.1069E-04 | 1.2043E-03 | 4.7641E-03 | |
Ratio | 0.87 | 0.98 | 0.93 | 0.92 | 0.93 | |
p=4 | Unpruned | 3.3678E-05 | 1.2722E-04 | 5.1513E-04 | 1.7583E-03 | 6.4381E-03 |
Pruned | 3.1136E-05 | 9.4751E-05 | 3.6298E-04 | 1.5314E-03 | 5.5922E-03 | |
Ratio | 0.92 | 0.74 | 0.70 | 0.87 | 0.87 | |
p=6 | Unpruned | 4.4647E-05 | 1.6264E-04 | 6.3257E-04 | 2.6062E-03 | 9.5336E-03 |
Pruned | 3.8436E-05 | 1.2308E-04 | 5.2076E-04 | 1.9159E-03 | 7.2723E-03 | |
Ratio | 0.86 | 0.76 | 0.82 | 0.74 | 0.76 | |
p=8 | Unpruned | 6.4398E-05 | 2.2496E-04 | 9.3172E-04 | 3.2624E-03 | 1.2828E-02 |
Pruned | 4.5497E-05 | 1.7977E-04 | 7.8489E-04 | 2.5205E-03 | 9.4220E-03 | |
Ratio | 0.71 | 0.80 | 0.84 | 0.77 | 0.73 | |
p=10 | Unpruned | 7.9538E-05 | 3.2234E-04 | 1.1563E-03 | 4.5004E-03 | 1.7702E-02 |
Pruned | 5.6676E-05 | 2.2770E-04 | 7.9667E-04 | 3.0367E-03 | 1.2133E-02 | |
Ratio | 0.71 | 0.71 | 0.69 | 0.67 | 0.69 | |
p=12 | Unpruned | 1.3727E-04 | 4.1301E-04 | 1.6069E-03 | 5.7918E-03 | 2.3696E-02 |
Pruned | 6.9956E-05 | 2.7148E-04 | 9.6113E-04 | 3.7934E-03 | 1.5079E-02 | |
Ratio | 0.51 | 0.66 | 0.60 | 0.65 | 0.64 |
|T| | 12 | 51 | 204 | 819 | 3276 | |
p=0 | Unpruned | 2.3893E-05 | 6.7615E-05 | 2.7051E-04 | 1.1335E-03 | 5.0569E-03 |
Pruned | 3.1631E-05 | 8.4488E-05 | 3.1938E-04 | 1.3849E-03 | 6.2978E-03 | |
Ratio | 1.32 | 1.25 | 1.18 | 1.22 | 1.25 | |
p=2 | Unpruned | 2.9957E-05 | 1.0833E-04 | 3.4219E-04 | 1.6801E-03 | 6.6029E-03 |
Pruned | 3.3662E-05 | 1.1878E-04 | 3.8369E-04 | 1.8276E-03 | 7.3026E-03 | |
Ratio | 1.12 | 1.10 | 1.12 | 1.09 | 1.11 | |
p=4 | Unpruned | 4.6595E-05 | 1.4459E-04 | 5.9718E-04 | 2.2960E-03 | 9.4577E-03 |
Pruned | 3.9385E-05 | 1.2878E-04 | 5.9189E-04 | 2.2926E-03 | 9.2390E-03 | |
Ratio | 0.85 | 0.89 | 0.99 | 1.00 | 0.98 | |
p=6 | Unpruned | 5.6086E-05 | 2.1821E-04 | 9.3389E-04 | 3.2810E-03 | 1.2832E-02 |
Pruned | 5.5423E-05 | 1.9397E-04 | 8.2830E-04 | 2.9239E-03 | 1.1699E-02 | |
Ratio | 0.99 | 0.89 | 0.89 | 0.89 | 0.91 | |
p=8 | Unpruned | 8.5570E-05 | 3.0688E-04 | 1.1368E-03 | 4.5906E-03 | 1.7794E-02 |
Pruned | 8.1338E-05 | 2.5567E-04 | 9.8448E-04 | 3.8316E-03 | 1.5385E-02 | |
Ratio | 0.95 | 0.83 | 0.87 | 0.83 | 0.86 | |
p=10 | Unpruned | 1.2774E-04 | 4.2015E-04 | 1.5151E-03 | 6.0202E-03 | 2.4115E-02 |
Pruned | 8.7737E-05 | 3.4614E-04 | 1.2627E-03 | 5.0212E-03 | 1.9915E-02 | |
Ratio | 0.69 | 0.82 | 0.83 | 0.83 | 0.83 | |
p=12 | Unpruned | 1.5864E-04 | 5.3115E-04 | 2.0558E-03 | 8.2953E-03 | 3.3196E-02 |
Pruned | 1.4783E-04 | 4.3479E-04 | 1.7454E-03 | 6.7921E-03 | 2.5964E-02 | |
Ratio | 0.93 | 0.82 | 0.85 | 0.82 | 0.78 |
|T| | 6 | 48 | 384 | 3072 | 24576 | |
p=0 | Unpruned | 2.4926E-05 | 1.0816E-04 | 8.1476E-04 | 6.8287E-03 | 4.8521E-02 |
Pruned | 1.9337E-05 | 7.1661E-05 | 5.1822E-04 | 4.0497E-03 | 3.1205E-02 | |
Ratio | 0.78 | 0.66 | 0.64 | 0.59 | 0.64 | |
p=2 | Unpruned | 3.5647E-05 | 2.0676E-04 | 1.5616E-03 | 1.3476E-02 | 9.5595E-02 |
Pruned | 2.1014E-05 | 9.8906E-05 | 7.1403E-04 | 5.6071E-03 | 4.3727E-02 | |
Ratio | 0.59 | 0.48 | 0.46 | 0.42 | 0.46 | |
p=4 | Unpruned | 6.3102E-05 | 4.2202E-04 | 3.2991E-03 | 2.5008E-02 | 1.9898E-01 |
Pruned | 3.1368E-05 | 1.5550E-04 | 1.2420E-03 | 9.1364E-03 | 7.2012E-02 | |
Ratio | 0.50 | 0.37 | 0.38 | 0.37 | 0.36 | |
p=6 | Unpruned | 1.1175E-04 | 8.2232E-04 | 5.9819E-03 | 4.7363E-02 | 3.7750E-01 |
Pruned | 4.4362E-05 | 2.5484E-04 | 1.8026E-03 | 1.3952E-02 | 1.1175E-01 | |
Ratio | 0.40 | 0.31 | 0.30 | 0.29 | 0.30 | |
p=8 | Unpruned | 1.8889E-04 | 1.4290E-03 | 1.0750E-02 | 8.6211E-02 | 6.8734E-01 |
Pruned | 6.2170E-05 | 2.8795E-04 | 3.0248E-03 | 2.3477E-02 | 1.8765E-01 | |
Ratio | 0.33 | 0.20 | 0.28 | 0.27 | 0.27 |
|T| | 4 | 38 | 307 | 2457 | 19660 | |
p=0 | Unpruned | 4.5272E-05 | 3.5574E-04 | 2.6056E-03 | 2.1919E-02 | 1.7550E-01 |
Pruned | 3.7088E-05 | 2.8895E-04 | 2.0201E-03 | 1.6717E-02 | 1.3413E-01 | |
Ratio | 0.82 | 0.81 | 0.78 | 0.76 | 0.76 | |
p=2 | Unpruned | 8.7692E-05 | 7.7001E-04 | 5.5442E-03 | 4.3656E-02 | 3.5131E-01 |
Pruned | 5.4265E-05 | 4.5130E-04 | 3.1994E-03 | 2.6491E-02 | 2.0647E-01 | |
Ratio | 0.62 | 0.59 | 0.58 | 0.61 | 0.59 | |
p=4 | Unpruned | 1.9519E-04 | 1.5893E-03 | 1.1656E-02 | 9.1985E-02 | 7.4515E-01 |
Pruned | 9.1346E-05 | 7.6676E-04 | 5.5029E-03 | 4.4103E-02 | 3.5562E-01 | |
Ratio | 0.47 | 0.48 | 0.47 | 0.48 | 0.48 | |
p=6 | Unpruned | 3.4222E-04 | 3.0026E-03 | 2.3204E-02 | 1.8303E-01 | 1.4867E+00 |
Pruned | 1.6222E-04 | 1.2524E-03 | 9.9395E-03 | 7.9390E-02 | 6.4008E-01 | |
Ratio | 0.47 | 0.42 | 0.43 | 0.43 | 0.43 | |
p=8 | Unpruned | 6.0944E-04 | 5.2648E-03 | 4.1087E-02 | 3.2613E-01 | 2.6380E+00 |
Pruned | 2.4756E-04 | 1.9114E-03 | 1.5453E-02 | 1.2295E-01 | 1.0089E+00 | |
Ratio | 0.41 | 0.36 | 0.38 | 0.38 | 0.38 |