Confidentiality Notice: non-confidential report, publishable on the Internet
Author: Armand Touminet
Promotion: 2019
ENSTA ParisTech Tutor: Éliane Bécache
Host Organization Tutor: Maxime Theillard
Confidentiality Notice
This document is not confidential. It may be communicated externally in paper format
or distributed in electronic format.
Abstract
Discontinuous Galerkin methods are the state of the art in high-accuracy numerical analysis.
First introduced by Reed and Hill [10], they allow solving partial differential equations with an
arbitrary order of convergence and show many advantages over classic finite element methods
when it comes to highly scalable parallel computing, thanks to a low computational stencil.
In this work, we introduce a modal discontinuous Galerkin method on non-graded quadtrees
that implements an adaptive mesh refinement technique. We then extend the described
method to handle level set interfaces with high precision, in order to preserve the convergence
orders. We finally test our method on the Poisson equation and discuss the impact of the
chosen numerical flux on the observed order of convergence. Alongside this work comes a
scalable parallel two-dimensional implementation in C++.
Acknowledgment
I would like to thank Maxime Theillard for being my supervisor during this internship.
He helped me a lot in getting this project working and in solving many of the problems I
encountered during my work.
I would also like to thank Thomas Bellotti, my coworker during this internship. My discussions
with him brought me a lot of insight into my work.
I thank Éliane Bécache for agreeing to be my ENSTA tutor for this internship.
Finally, I would like to thank all of the graduate students of UC Merced, without whom my
internship would not have been as pleasant as it was.
Contents
Confidentiality Notice
Abstract
Acknowledgment
Contents
Introduction
II Implementation
  II.1 Numerical integration
  II.2 Parallelization
III Examples
  III.1 The Poisson equation
  III.2 Choosing β
  III.3 A non-graded quadtree
  III.4 Higher order is cheaper
IV Adding an interface
  IV.1 The new problem
  IV.2 Modal basis on cut cells
    IV.2.1 Legendre polynomials on bounding boxes
    IV.2.2 Algebraic components on cut cells
    IV.2.3 Warped Legendre polynomials
V Conclusion
Bibliography
Introduction
The discontinuous Galerkin method aims to provide a high-order alternative to the finite
element method. It can be seen as an extension of the finite volume method, as the latter is a
particular case of the first-order discontinuous Galerkin method. Its idea is quite simple: we
project the solution of a problem on a vector space of piecewise polynomial functions. The
main difference with the finite element method is that no continuity of the solution is enforced
at the interface between two cells of the mesh. This has the benefit of keeping a low
computational stencil, which brings a better parallelization potential. Throughout this work, we
show how to set up a modal discontinuous Galerkin method using a non-graded quadtree mesh
for solving a simple diffusion equation. Quadtrees are mainly used in an adaptive mesh
refinement context to increase accuracy at specific locations. The main focus of this work is
to study the orders of accuracy that the method can achieve. To do so, we explain how to
deal with the common implementation pitfalls that come with high-order methods and validate
our implementation on simple examples. We also show how high order can be used to reduce
the computation time while preserving the overall solution accuracy. In the second part, we
show how to handle adding an interface to model various kinds of geometries thanks to the
level set method. We show how this integrates into our discontinuous Galerkin solver and
present several approaches to deal with the main problems encountered while implementing it.
Part I
I.1 Theory
We first construct a modal Discontinuous Galerkin method for the following stationary
Poisson equation
αu − µ∆u = f (I.1)
on a rectangle-shaped domain Ω ⊂ R², with Neumann or Dirichlet boundary conditions on
the computational domain boundary, given as
u = uD on ∂Ω (Dirichlet) (I.2)
∇u · n = uN on ∂Ω (Neumann) (I.3)
We first consider this equation without any interface and thus α and µ are constants.
[Figure: quadtree meshes at refinement levels L = 1, L = 2 and L = 3]
In both approaches, we subdivide the computational domain Ω into small cells on which the
solution is locally computed. The nodal approach consists in defining computation nodes on
each cell, computing the solution at these nodes, and then interpolating a Lagrange polynomial
through them to construct the solution on the whole cell. The more computation nodes we
define on each cell, the higher the degree of the constructed Lagrange polynomial, and the
more accurate the solution. By assembling all of the locally defined polynomials, we construct
a piecewise continuous solution.
On the other hand, in the modal variant we define a functional basis on each cell and define
the equation's solution as a vector in the span of that basis. By cleverly choosing our functions,
we can construct a basis such that each function added to the basis increases the accuracy of
the computation. More precisely, we can construct bases with (N + 1)² functions in each cell,
yielding at least an N-th order of accuracy.
We define the order of accuracy (or order of convergence) of the method as follows: considering
a mesh constructed from a typical spatial step h, the order of accuracy is computed from the
L² or L∞ errors E_h and E_{h/2} yielded by applying the method on meshes constructed from
steps h and h/2, and is given by the following formula

order_h = ln(E_h / E_{h/2}) / ln 2.   (I.4)

Its meaning is quite straightforward: dividing the spatial step by 2 divides the error of the
solution by 2^{order_h}. We use this definition of the order of accuracy throughout this paper
to evaluate the performance of the method.
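For illustration, formula (I.4) translates directly into code; a minimal sketch (the function name is ours, not taken from the implementation described later):

```cpp
#include <cassert>
#include <cmath>

// Observed order of accuracy from the errors E_h and E_{h/2}
// obtained on meshes of spatial step h and h/2 (formula (I.4)).
double convergence_order(double error_h, double error_h2) {
    return std::log(error_h / error_h2) / std::log(2.0);
}
```

For instance, errors of 10⁻² and 2.5 × 10⁻³ on steps h and h/2 give an observed order of 2.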
In this work we choose to follow the modal approach, and we use shifted and normalized
Legendre polynomials as our basis functions. They show several benefits, such as forming
orthonormal bases, which greatly simplifies the implementation and reduces the computational
cost.
We consider a mesh of K cells Ω_k = [x_k; x_{k+1}] × [y_k; y_{k+1}]. Then, for all
0 ≤ k ≤ K − 1, we define the shifted and normalized Legendre polynomial of degree m with
respect to x and n with respect to y on Ω_k by

∀(x, y) ∈ Ω,  φ^k_{mn}(x, y) = sqrt( (2n + 1)(2m + 1) / (h^k_x h^k_y) ) P_{mn}( 2(x − x_k)/h^k_x − 1, 2(y − y_k)/h^k_y − 1 ) 1_{Ω_k}(x, y)   (I.5)

where h^k_x = x_{k+1} − x_k, h^k_y = y_{k+1} − y_k and where P_{mn}(x, y) := P_m(x) P_n(y). Here,
for all n ∈ N, P_n : [−1; 1] → R denotes the Legendre polynomial of degree n, defined as the
solution of Legendre's differential equation

d/dx[ (1 − x²) dP_n(x)/dx ] + n(n + 1)P_n(x) = (1 − x²)P_n''(x) − 2xP_n'(x) + n(n + 1)P_n(x) = 0.
We finally define B^k_N := span(φ^k_{00}, ..., φ^k_{mn}, ..., φ^k_{NN}) to be the modal functional
basis we use on each cell for our DG method, and B_N = ⊕_{k=0}^{K} B^k_N.
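To make the basis concrete, here is a small sketch of how the φ^k_{mn} of (I.5) could be evaluated; the Legendre polynomials are computed with the standard Bonnet three-term recurrence, and all names are ours:

```cpp
#include <cassert>
#include <cmath>

// Legendre polynomial P_n on [-1, 1] via the Bonnet recurrence
// (k+1) P_{k+1}(x) = (2k+1) x P_k(x) - k P_{k-1}(x).
double legendre(int n, double x) {
    if (n == 0) return 1.0;
    double pm1 = 1.0, p = x;
    for (int k = 1; k < n; ++k) {
        double pp1 = ((2 * k + 1) * x * p - k * pm1) / (k + 1);
        pm1 = p;
        p = pp1;
    }
    return p;
}

// Shifted, normalized basis function phi^k_{mn} of (I.5) on the cell
// [xk, xk + hx] x [yk, yk + hy]; the indicator 1_{Omega_k} is left
// to the caller.
double phi(int m, int n, double xk, double yk, double hx, double hy,
           double x, double y) {
    double norm = std::sqrt((2.0 * n + 1.0) * (2.0 * m + 1.0) / (hx * hy));
    return norm * legendre(m, 2.0 * (x - xk) / hx - 1.0)
                * legendre(n, 2.0 * (y - yk) / hy - 1.0);
}
```

On the unit cell [0; 1]², φ^k_{00} is the constant 1, which is consistent with the normalization making the basis orthonormal.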
Before integrating by parts, we must break down φ on each cell, since it is only piecewise
continuous on Ω:

Σ_{k=0}^{K} Σ_{0≤p,q≤N} ( α ∫_{Ω_k} u φ^k_{pq} dω − μ ∫_{Ω_k} Δu φ^k_{pq} dω ) = Σ_{k=0}^{K} Σ_{0≤p,q≤N} ∫_{Ω_k} f φ^k_{pq} dω   (I.7)
The value of the solution u is ambiguous at the interfaces between cells, since u is not
continuous at those locations. We must thus choose how to properly define a so-called
numerical flux. Choosing a numerical flux is a complex topic discussed in many papers (see,
e.g., [7]). We choose the numerical flux described in [8]

\widehat{∇u} = (β/2) [[u]] + \overline{∇u}

where [[u]] = (u⁺ − u⁻)n (see figure I.2 for their definition) denotes the jump of u at the
interface and \overline{∇u} = (∇u⁺ + ∇u⁻)/2 is the average of ∇u at the interface. We will
see later how the choice of β can influence the convergence order of the method.
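For a single point of an interface, the normal component of this flux is a one-liner; a minimal scalar sketch with hypothetical names, where dudn_minus and dudn_plus denote the traces of ∇u · n on the two sides:

```cpp
#include <cassert>
#include <cmath>

// Scalar normal component of the numerical flux of [8]:
// (beta/2) * jump of u + average of the normal derivative,
// i.e. a penalized jump plus the mean gradient across the interface.
double numerical_flux(double beta, double u_minus, double u_plus,
                      double dudn_minus, double dudn_plus) {
    double jump = u_plus - u_minus;
    double average = 0.5 * (dudn_plus + dudn_minus);
    return 0.5 * beta * jump + average;
}
```

When the solution is continuous across the interface the jump vanishes, and the flux reduces to the average normal derivative, whatever the value of β.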
[Figure I.2: the traces u⁺ and u⁻ on the edges of a cell Ω_k and its neighbors]
We thus construct the vector of unknowns U := (u^k_{mn})_{0≤m,n≤N}^{0≤k≤K} and the
right-hand-side vector F := (f^k_{mn})_{0≤m,n≤N}^{0≤k≤K}. Since the function f is known, we
can compute the f^k_{mn} values by evaluating ⟨f, φ^k_{mn}⟩_{L²(Ω_k)}. Including this in (I.2)
gives
Σ_{k=0}^{K} Σ_{0≤p,q≤N} [ Σ_{0≤n,m≤N} ( α⟨φ^k_{mn}, φ^k_{pq}⟩ + μ⟨∇φ^k_{mn}, ∇φ^k_{pq}⟩ ) u^k_{mn} − μ⟨\widehat{∇u}·n, φ^k_{pq}⟩ ] = Σ_{k=0}^{K} Σ_{0≤p,q≤N} ⟨f, φ^k_{pq}⟩   (I.9)
We can conveniently rewrite (I.2) as a linear algebra problem by defining the following
matrices, where k, ℓ ∈ [[0; K − 1]]:

M := (⟨φ^k_{mn}, φ^ℓ_{pq}⟩),  S := (⟨∇φ^k_{mn}, ∇φ^ℓ_{pq}⟩),  F := (⟨f, φ^k_{pq}⟩)

Note that we omitted writing some of the indices to avoid obvious readability issues.
The term Σ_{0≤p,q≤N} μ⟨\widehat{∇u}·n, φ^k_{pq}⟩ can also be written as a matrix F, but we
will discuss its exact expression in the following sections, as it requires particular attention.
In the next subsection, we discuss the properties of both M and S and how to construct them;
we then give some insight about the flux matrix F.
ii. P'_{2n+1} = 1 + Σ_{i=1}^{n} (4i + 1) P_{2i}

iii. P'_{2n} = Σ_{i=0}^{n−1} (4i + 3) P_{2i+1}

iv. ⟨P'_{2n+1}, P'_{2m}⟩ = 0

v. ⟨P'_{2n+1}, P'_{2m+1}⟩ = 2n(2n + 3) + 2

vi. ⟨P'_{2n}, P'_{2m}⟩ = 2n(2n + 1)

vii. ∇P_{nm} = ( P'_n P_m , P_n P'_m )ᵀ
These properties show that we can avoid numerical quadratures to construct the mass and
stiffness matrices. Note that this no longer holds when the cell is cut, i.e. when an interface
crosses a cell, which we will discuss later. But even when considering an interface in Ω, we
can still take advantage of those formulae to handle the cells that do not intersect the interface.
From (2.ii) we deduce that M is the identity matrix. The stiffness matrix is block sparse,
because ⟨∇φ^k_{nm}, ∇φ^ℓ_{pq}⟩ = 0 when k ≠ ℓ.
\widehat{∇u^k_{mn}}·n = (β/2) [[u^k_{mn}]] + \overline{∇u^k_{mn}}·n.   (I.12)
2
The choice of this numerical flux is discussed in [8]. It is shown that for some values of
β this flux ensure time stability of the solution when we consider a time dependent equation.
Time discretization is not the main topic of this work however, but we’ll see later that the
choice of β is important even for stationnary equation, as we observe that for high polynomial
degrees used, a wrong value can lead to a loss of convergence of the method.
The numerical flux is what links the different cells to each other, as computing the jump
and the average of the basis functions involves evaluating basis functions from the neighbor
cells. Let us first assume that each cell has four distinct neighbors (one for each edge, which
is the case when we consider a structured mesh). Let N^L_{Ω_k}, N^R_{Ω_k}, N^B_{Ω_k}, N^T_{Ω_k}
be the four neighbor cells of Ω_k. The term from the left neighbor is

\widehat{∇_x u^k_{mn}} = (β/2) (u^{k_L}_{mn} − u^k_{mn}) φ^k_{mn} − ((u^{k_L}_{mn} + u^k_{mn})/2) ∇_x φ^k_{mn}   (I.13)

where k_L is defined such that Ω_{k_L} = N^L_{Ω_k}, and ∇_x u = ∇u · (1, 0)ᵀ. The flux matrix
is made of the contributions of every neighbor cell. We obtain a similar formula for the other
neighbors, except that the normal vector n can give a different sign, e.g. in the term from the
top neighbor

\widehat{∇_y u^k_{mn}} = (β/2) (u^{k_T}_{mn} − u^k_{mn}) φ^k_{mn} + ((u^{k_T}_{mn} + u^k_{mn})/2) ∇_y φ^k_{mn}.
We then construct a square flux matrix F of size K(N + 1)² × K(N + 1)² in which we store
the different flux contributions as follows: on the (kN² + pN + q)-th row we store the different
terms coming from \widehat{∇u^k_{mn}}, and on this row, on the (k_i N² + mN + n)-th column
we store the contribution from the cell N^i_{Ω_k} of degree p with respect to x and q with
respect to y. For instance, if we write F(i, j) for the coefficient on row i and column j, then
we have

F(kN² + pN + q, k_L N² + mN + n) = −∫_{∂Ω^L_k} ( (β/2) φ^{k_L}_{mn} − (1/2) ∇_x φ^{k_L}_{mn} ) φ^k_{pq} dγ.   (I.14)
[Figure: a cell Ω_k, its left neighbor Ω_{k_L}, and a wall boundary]
Note. Applying this discretization with a polynomial degree N = 0 is identical to using the
finite volume method. We can see that the flux we chose is compatible with the flux used
in finite volume solvers (such as in [11]) if β = 2/Δ_{k,k_i}, where Δ_{k,k_i} is a typical distance
that depends on the size of the cells Ω_k and Ω_{k_i}. In this case, the value of β has a
geometrical meaning, and choosing another value leads to a wrong solution. More precisely,
the flux represents the solution variation between the centers of two adjacent cells. When
using a higher-order basis, the numerical flux also penalizes the discontinuity of the solution's
gradient at the interface, and we observe that when N ≥ 1, several values of β are possible.
Unfortunately, the properties of the modal basis shown previously are not always useful
when computing the flux matrix. Indeed, when considering an irregular mesh, the integration
domain ∂Ωki is smaller than the domain on which the functions of the basis are defined, and
integrating Legendre polynomials on subdomains of their definition domain does not have any
useful properties. We thus use numerical quadrature to compute the flux matrix.
and we modify the right-hand-side vector accordingly to represent the Dirichlet boundary
conditions:

F(kN² + mN + n) = ⟨f, φ^k_{mn}⟩ − (β/2) ∫_{∂Ω^L_k} u_D φ^k_{mn} dγ.   (I.17)

Note that for the sake of readability we only wrote the flux coming from the left cell boundary;
in the implementation we also take into account the fluxes coming from every neighbor cell.
Imposing Neumann boundary conditions is quite similar. The flux coming from the left fake
cell is given by

\widehat{∇u^k}·n = −u_N − Σ_{0≤m,n≤N} (u^k_{mn}/2) ∇_x φ^k_{mn}   (I.18)

which translates to a new expression in the flux matrix

F(kN² + pN + q, k_L N² + mN + n) = −∫_{∂Ω^L_k} (∇_x φ^{k_L}_{mn} / 2) φ^k_{pq} dγ   (I.19)
α ∂u/∂t − μΔu = f   (I.21)

Most previous works discretize the time-dependent term with a fourth-order Runge-Kutta
scheme (see [13]). The tricky part of combining it with a discontinuous Galerkin method
is that high orders strongly constrain the admissible time steps, and thus the CFL conditions
are quite restrictive. For instance, in the CFL condition given in [4] for an advection equation,
the upper bound on the time step depends on the polynomial degree as follows

Δt ≤ h / (a(2N + 1))   (I.22)
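As a sketch of how such a time discretization could look, here are the bound (I.22) and one classical RK4 step for a scalar ODE u' = f(t, u); this is an illustration with our own function names, not the solver's implementation:

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Upper bound (I.22) on the admissible time step for an advection
// equation with speed a, mesh step h and polynomial degree N.
double cfl_time_step(double h, double a, int N) {
    return h / (a * (2 * N + 1));
}

// One step of the classical fourth-order Runge-Kutta scheme for u' = f(t, u).
double rk4_step(const std::function<double(double, double)>& f,
                double t, double u, double dt) {
    double k1 = f(t, u);
    double k2 = f(t + 0.5 * dt, u + 0.5 * dt * k1);
    double k3 = f(t + 0.5 * dt, u + 0.5 * dt * k2);
    double k4 = f(t + dt, u + dt * k3);
    return u + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4);
}
```

Note how quickly the bound shrinks with the degree: for h = 0.1, a = 1 and N = 3, the admissible step is already h/7.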
Part II
Implementation
We implement the modal discontinuous Galerkin method described previously in a full C++
software package. The main advantage of the method is its low computational stencil, which
allows a highly parallel implementation, as assembling the different matrices only requires
information about the current cell and its neighbor cells. We propose an implementation that
uses distributed parallelism over multiple processors, can be run on a cluster containing many
computational nodes, and takes advantage of the available computing power to considerably
speed up the computation. The implementation uses PETSc as a parallel framework to store
the algebraic components and to solve the obtained linear system through iterative solvers.
Implementing the method requires dealing with numerical computation very carefully, as
insufficient numerical approximations can lead to losing the orders of convergence given by
the method. In this part, we explain how to deal with the common pitfalls encountered while
implementing the modal method.
Property 3 (Gauss-Legendre quadrature). Let f be a real function defined on [−1; 1], and
let n be a positive integer. The integral of f over [−1; 1] can be approximated as follows

∫_{−1}^{1} f(x) dx ≈ Σ_{i=1}^{n} w_i f(x_i)

where (w_i, x_i)_{1≤i≤n} are the Gauss-Legendre weights and abscissae, x_i being the i-th root
of P_n, and w_i = 2 / ((1 − x_i²) [P'_n(x_i)]²).
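Property 3 suggests a direct way of computing the weights and abscissae: find the roots of P_n by Newton's method and apply the weight formula. A self-contained sketch (the cosine initial guess is the classical one; names are ours):

```cpp
#include <cassert>
#include <cmath>
#include <utility>
#include <vector>

// Gauss-Legendre nodes and weights on [-1, 1]: the nodes are the roots of
// P_n, found by Newton's method from the classical cosine initial guess,
// and w_i = 2 / ((1 - x_i^2) P_n'(x_i)^2).
std::pair<std::vector<double>, std::vector<double>> gauss_legendre(int n) {
    const double pi = std::acos(-1.0);
    std::vector<double> x(n), w(n);
    for (int i = 0; i < n; ++i) {
        double xi = std::cos(pi * (i + 0.75) / (n + 0.5));
        double dp = 1.0;
        for (int it = 0; it < 100; ++it) {
            // Evaluate P_n and P_n' at xi with the three-term recurrence.
            double pm1 = 1.0, p = xi;
            for (int k = 1; k < n; ++k) {
                double pp1 = ((2 * k + 1) * xi * p - k * pm1) / (k + 1);
                pm1 = p;
                p = pp1;
            }
            dp = n * (xi * p - pm1) / (xi * xi - 1.0);
            double dx = p / dp;
            xi -= dx;
            if (std::fabs(dx) < 1e-14) break;
        }
        x[i] = xi;
        w[i] = 2.0 / ((1.0 - xi * xi) * dp * dp);
    }
    return {x, w};
}
```

For n = 2 this recovers the nodes ±1/√3 with weights 1, and for n = 3 the nodes ±√(3/5), 0 with weights 5/9 and 8/9.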
The idea behind the Gauss-Legendre quadrature is the same as the one behind the
discontinuous Galerkin discretization: approximate the function f with a Legendre polynomial
of degree n. The higher the degree, the more accurate the computation. More precisely, if f
is of class C^{2n}, then according to [6]

E_n := ∫_{−1}^{1} f(x) dx − Σ_{i=1}^{n} w_i f(x_i) = ( 2^{2n+1} (n!)⁴ / ((2n + 1) [(2n)!]³) ) f^{(2n)}(ξ)   (II.1)
where ξ ∈ [−1; 1]. We can adapt this method to integrate functions defined on a segment
[a; b] ⊂ R using the substitution

∫_a^b f(x) dx = ∫_{−1}^{1} ((b − a)/2) f( a + ((b − a)/2)(1 + x) ) dx

which yields

∫_a^b f(x) dx = ((b − a)/2) Σ_{i=1}^{n} w_i f( a + ((b − a)/2)(1 + x_i) ) + ( (b − a)^{2n+1} (n!)⁴ / ((2n + 1) [(2n)!]³) ) f^{(2n)}(ξ).   (II.2)
What is interesting is that E_n = O((b − a)^{2n+1}), which is an admissible accuracy that will
not lower the convergence orders of the solver as described previously in this part, provided
that the functions we integrate are sufficiently smooth.
Integrating two-dimensional functions is quite easy by extending the method. Let
f : [a; b] × [c; d] → R be such a function. We then have

∫_a^b ∫_c^d f(x, y) dx dy ≈ ((b − a)(d − c)/4) Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j f( a + ((b − a)/2)(1 + x_i), c + ((d − c)/2)(1 + x_j) )   (II.3)

which gives an O([(b − a)(d − c)]^{2n}) error. We can even integrate a function f : D ⊂ R² → R,
provided that we can find a mapping γ : [−1; 1]² → D that is bijective and of class C^{2n},
which gives

∫_D f(x, y) dx dy ≈ Σ_{i=1}^{n} Σ_{j=1}^{n} w_i w_j f̃(x_i, x_j)   (II.4)

where f̃(x, y) = f ∘ γ(x, y) |det J_γ(x, y)| and J_γ is the Jacobian matrix of γ. This gives an
O(|D|^{2n}) error.
We use that last formula to compute integrals on cut cells when we consider an interface
in Ω. Finding a good mapping γ can be very tricky when we increase the polynomial degree
used since we need γ to be increasingly smooth in order to keep the orders of convergence.
We show in (III) how a wrong mapping γ can ruin the accuracy of a whole simulation.
We numerically illustrate this phenomenon by comparing the convergence orders obtained in
two simple cases: integrating f : x ↦ |x| and g : x ↦ eˣ over [−h; h] using a Gauss-Legendre
quadrature with n = 4 points.
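This experiment can be reproduced with a few lines; a sketch using the tabulated 4-point Gauss-Legendre nodes and weights (the exact integrals are h² for |x| and e^h − e^{−h} for eˣ):

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// 4-point Gauss-Legendre quadrature of f over [-h, h]; the nodes and
// weights for [-1, 1] are the standard tabulated values, scaled by h.
double quad4(const std::function<double(double)>& f, double h) {
    const double x[2] = {0.3399810435848563, 0.8611363115940526};
    const double w[2] = {0.6521451548625461, 0.3478548451374538};
    double sum = 0.0;
    for (int i = 0; i < 2; ++i)
        sum += w[i] * (f(h * x[i]) + f(-h * x[i]));
    return h * sum;
}

// Observed order of the quadrature error on [-h, h] when h is halved,
// given the exact integral as a function of h.
double observed_order(const std::function<double(double)>& f,
                      const std::function<double(double)>& exact, double h) {
    double e1 = std::fabs(quad4(f, h) - exact(h));
    double e2 = std::fabs(quad4(f, h / 2.0) - exact(h / 2.0));
    return std::log(e1 / e2) / std::log(2.0);
}
```

For the non-smooth f(x) = |x| the observed order stalls near 2, while for the smooth g(x) = eˣ it stays close to the theoretical 2n + 1 = 9.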
II.2 Parallelization
Our modal discontinuous Galerkin method can take advantage of cutting edge high per-
formance computing techniques. Particularly, it can be parallelized very efficiently. There are
two main stages in solving a partial differential equation which are very computation heavy:
constructing the left hand side matrix and solving the obtained linear system. Doing either one
of those sequentially leads to disastrous performance. For instance, using an efficient parallel
linear solver is pointless if the matrix is constructed sequentially. In our implementation we use
the PETSc library to help with the parallelization (see [2], [3]). PETSc uses MPI internally to
handle the distributed computing part.
We construct the left-hand-side matrix of (I.2) in parallel using parallel storage. As discussed
previously, the matrix is block sparse and can thus be efficiently stored in a parallel
CRS (Compressed Row Storage) matrix object. As explained in [3], the key to an efficient
construction is to properly preallocate the object's memory, which requires a reasonable
estimate of how many nonzero coefficients are stored in each row of the matrix.
Let us see what is inside a given matrix row. We keep the notations introduced in (I). Let
i ≥ 0 be such that i = kN² + pN + q. Given a matrix M, we write M(i, :) for its i-th row,
and let A be the left-hand-side matrix.
We have seen that M is the identity matrix; it thus only has a single nonzero coefficient, on
the i-th column.
Concerning the stiffness matrix, we know that ⟨∇φ^k_{nm}, ∇φ^ℓ_{pq}⟩ = 0 whenever
• k ≠ ℓ,
• n ≢ p (mod 2), or
• m ≢ q (mod 2),
which means that the total number of nonzero coefficients per row is always less than (N + 1)²/4.
The flux matrix is the one that brings the largest number of nonzeros. If a cell Ω_k has
r neighbors, then F(i, :) contains r + 1 blocks of (N + 1)² coefficients, which are results of
numerical quadrature and thus cannot be treated as zeros.
We thus use this last value, (r + 1)(N + 1)², to preallocate each row of the matrix.
Computing the numerical flux also shows some pitfalls when it comes to parallel computing.
Indeed, computing the coefficients corresponding to a given cell involves retrieving information
from the neighbor cells. This can introduce additional overhead if we choose to also use
parallel storage for the quadtree mesh, since it means retrieving data that may be owned by a
different process.
Solving (I.2) in parallel is entirely done by the PETSc implementation of the various linear
solvers. What can be done to decrease the computation time is to use an iterative linear solver
algorithm, such as the biconjugate gradient stabilized method (see [5]).
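For illustration only, here is a textbook, unpreconditioned BiCGSTAB [5] on a small dense system; in practice the solve goes through PETSc's parallel solvers rather than code like this, and the early exit on a small intermediate residual is a common safeguard:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;

static double dot(const Vec& a, const Vec& b) {
    double s = 0.0;
    for (size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

static Vec matvec(const Mat& A, const Vec& x) {
    Vec y(A.size(), 0.0);
    for (size_t i = 0; i < A.size(); ++i)
        for (size_t j = 0; j < x.size(); ++j) y[i] += A[i][j] * x[j];
    return y;
}

// Textbook unpreconditioned BiCGSTAB for Ax = b, starting from x = 0.
Vec bicgstab(const Mat& A, const Vec& b, int max_iter = 200,
             double tol = 1e-12) {
    size_t n = b.size();
    Vec x(n, 0.0), r = b, rhat = b, p(n, 0.0), v(n, 0.0);
    double rho = 1.0, alpha = 1.0, omega = 1.0;
    for (int it = 0; it < max_iter && std::sqrt(dot(r, r)) > tol; ++it) {
        double rho_new = dot(rhat, r);
        double beta = (rho_new / rho) * (alpha / omega);
        for (size_t i = 0; i < n; ++i)
            p[i] = r[i] + beta * (p[i] - omega * v[i]);
        v = matvec(A, p);
        alpha = rho_new / dot(rhat, v);
        Vec s(n);
        for (size_t i = 0; i < n; ++i) s[i] = r[i] - alpha * v[i];
        if (std::sqrt(dot(s, s)) < tol) {  // already converged mid-step
            for (size_t i = 0; i < n; ++i) x[i] += alpha * p[i];
            break;
        }
        Vec t = matvec(A, s);
        omega = dot(t, s) / dot(t, t);
        for (size_t i = 0; i < n; ++i) {
            x[i] += alpha * p[i] + omega * s[i];
            r[i] = s[i] - omega * t[i];
        }
        rho = rho_new;
    }
    return x;
}
```

On the 2 × 2 system A = [[4, 1], [1, 3]], b = (1, 2), the method recovers the exact solution (1/11, 7/11) in two iterations.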
Part III
Examples
In this part we show the results obtained using our Discontinuous Galerkin solver con-
structed following (I) and (II). The goal is to validate that we obtain the expected convergence
orders and to discuss the impact of several parameters.
Table III.1: L2 errors for different values of the quadtree level L and polynomial degree N
L\N  0     1           2           3           4           5           6
0    2.00  2.00        0.83        0.82        7.80×10^-2  7.80×10^-2  3.70×10^-3
1    2.00  0.41        0.41        1.10×10^-2  1.10×10^-2  2.20×10^-4  2.10×10^-4
2    0.97  0.39        0.33        4.00×10^-2  5.40×10^-3  4.30×10^-4  3.50×10^-5
3    0.43  6.90×10^-2  9.00×10^-2  2.70×10^-3  4.00×10^-4  6.40×10^-6  6.20×10^-7
4    0.13  2.40×10^-2  2.20×10^-2  1.30×10^-4  2.40×10^-5  9.60×10^-8  9.30×10^-9
5    0.18  5.20×10^-3  5.70×10^-3  9.80×10^-6  1.60×10^-6  1.50×10^-9  1.53×10^-10
6    0.08  1.30×10^-3  1.40×10^-3  5.80×10^-7  1.00×10^-7  2.60×10^-11 2.30×10^-12
We list L² errors for higher polynomial degrees as well in table III.3. Because we are
limited by the precision of the computer's floating-point operations, which is roughly
around 10^-15 considering the values involved in the computations, we need to decrease the
solver accuracy in order to obtain relevant results. To do so, we increase the size of the
computational domain, and for what follows we chose Ω = [−4; 4]².
Finally, we list the computation time for each simulation in tables III.5 and III.6, tested on
12 processes.
Table III.2: Convergence orders for different values of the quadtree level L and polynomial
degree N
L\N 0 1 2 3 4 5 6
0 - - - - - - -
1 0 2.3 1.0 6.2 2.8 8.4 4.1
2 1.0 0.1 0.3 -1.8 1.0 -1.0 2.6
3 1.2 2.5 1.9 3.9 3.8 6.1 5.8
4 1.7 1.5 2.0 4.4 4.1 6.1 6.1
5 -0.4 2.2 1.9 3.7 3.9 6.0 5.9
6 1.2 2.0 2.0 4.1 4.0 5.9 6.1
Figure III.1: The error (on the left) and the solution of the problem with L = 0 and N = 40
Table III.3: L2 errors for different values of the quadtree level L and polynomial degree N
L\N  15          20          25          30
0    4.5         2.3×10^-2   1.6×10^-4   2.6×10^-9
1    1.2×10^-3   4.4×10^-8   1.7×10^-11  2.0×10^-13
2    6.1×10^-8   3.1×10^-13  7.7×10^-13  -
III.2 Choosing β
Let us see how β impacts the error on the first example. We first choose N = 0 and L = 5.
We show the results in figure III.2. We see that choosing β ≠ 1 leads to wrong results:
increasing β makes the solution converge towards the null function. As explained before, this
happens because N = 0 is a special case in which β has a geometrical meaning inherited from
the finite volume method. We thus must keep β = 1 to preserve convergence. Let us repeat
the simulations with N = 3 and look at the results in figure III.3. Here we see that increasing
β barely changes the error: rather surprisingly, choosing a different numerical flux does not
change the solution. Let us try a higher order, N = 9, in figure III.4. Again, the results are
rather surprising. With β = 1, the linear solver fails to converge in under 1000 iterations and
yields a very inaccurate solution. With β = 2 the linear solver converges in only 109 iterations
and yields a very accurate solution. Increasing β again, up to 200, increases both the error
and the linear solver convergence time (473 iterations required). For high orders there seems
to be an optimal value of β for both computation time and accuracy.
Figure III.2: The error map for β = 1, 2, 20, 200 (from top left to bottom right) with N = 0
and L = 5
Figure III.4: The error map for β = 1, 2, 20, 200 with N = 9 and L = 5
Figure III.5: The solution and error map for a random non graded quadtree, with N = 5
Table III.4: L2 errors for different values of the quadtree level L and polynomial degree N
L\N 2 3 5
r 6.7 × 10−1 1.6 × 10−1 1.5 × 10−3
r+1 1.0 × 10−1 2.5 × 10−2 5.6 × 10−5
r+2 1.8 × 10−2 3.2 × 10−3 1.8 × 10−6
Table III.5: Execution time in seconds for the main computation loop
L\N 1 2 3 4 5 6
2 < 0.1 < 0.1 < 0.1 < 0.1 < 0.1 0.1
3 < 0.1 < 0.1 < 0.1 < 0.1 0.1 0.2
4 < 0.1 < 0.1 0.1 0.2 0.5 1.0
5 < 0.1 0.1 0.2 0.9 2.2 5.1
6 0.1 0.4 1.9 5.0 10.4 23.6
Table III.6: Execution time in seconds for the main computation loop with high orders
L\N 15 20 25 30
0 0.2 1.3 4 12.9
1 2.6 23.0 50.0 129.2
2 16.6 81.2 - -
3 57.5 - - -
We can see that increasing the polynomial degree does not increase the execution time
as much as increasing the quadtree level. In fact, to reach a given accuracy, it is faster (in
terms of execution time) to increase the polynomial degree than to refine the mesh. For
instance, to reach an L² accuracy of around 10^-7, one can choose, according to table III.1,
either L = 6 and N = 3 or L = 3 and N = 6. However, according to table III.5, the first
option is ten times slower than the second one. Actually, the fastest way to reach a given
accuracy is always to choose a quadtree of level 0 and an appropriate polynomial degree: the
best mesh is no mesh at all. Of course, this is unrealistic because real simulations require a
mesh with sufficient refinement to model any kind of geometry, but our results show that if
it is possible to coarsen a mesh for any reason, then it is possible to do so while preserving
the L² accuracy and decreasing the computation time.
Part IV
Adding an interface
In this part we want to extend our solver to be able to solve an equation on a specific
domain rather than a square. The computational domain can be described in different ways.
The simplest approach is to import a geometry defined by third-party tools, usually described
by a set of vertices and a connectivity representing a set of triangles. This method is great
for representing a real object on which we want to solve an equation; for instance, we could
import a mesh representing a plane in order to study its aerodynamics. The second approach
is called the level set technique. It consists in defining a geometry implicitly through a level
function ψ on a rectangle-shaped domain Ω such that ψ(x, y) < 0 inside the computational
domain, ψ(x, y) > 0 outside of the computational domain, and ψ(x, y) = 0 on the
computational domain boundary. This method was first introduced in [9] and reused and
developed in many different works. It is particularly efficient for modeling interface advection,
such as in two-phase flow simulations (see, e.g., [1]). Note that the first method can be
combined with a level set representation, since it is possible to explicitly define a level function
that is compatible with a given geometry defined by a mesh. In this work, we approximate
the interface by a segment on each cell. This means that the interface can be approximated
better by refining the quadtree mesh around it. It also means that the accuracy gain given by
increasing the polynomial degree is quite limited by the approximation of the interface. A
possible approach to keep the benefits of polynomial degree refinement without mesh
refinement would be to approximate the interface with higher precision on each cell; for
instance, one could approximate the interface with a polynomial curve of the same degree as
the one used for the method. This is a possible improvement for future work.
αu − µ∆u = f on Ω̃ (IV.1)
u = 0 on Ω\Ω̃ (IV.2)
u = uD on ∂ Ω̃ (Dirichlet) (IV.3)
∂u/∂n = u_N on ∂Ω̃ (Neumann)   (IV.4)
According to the level set technique, we assume that there exists a function ψ : Ω → R such
that
ψ < 0 on Ω̃
ψ > 0 on Ω\Ω̃
ψ = 0 on ∂ Ω̃
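As a minimal illustration of the level set description, consider a disk-shaped domain; the helper names are ours, and the corner-sign test below is a simplification that can miss cells the interface only grazes:

```cpp
#include <cassert>
#include <cmath>

// Level function for a disk of radius r centered at (cx, cy):
// psi < 0 inside the domain, psi > 0 outside, psi = 0 on the boundary.
double psi_disk(double cx, double cy, double r, double x, double y) {
    return std::sqrt((x - cx) * (x - cx) + (y - cy) * (y - cy)) - r;
}

// A cell [x0, x1] x [y0, y1] is flagged as cut when psi changes sign at
// its corners. Note: this simple test can miss some cut configurations.
bool cell_is_cut(double cx, double cy, double r,
                 double x0, double y0, double x1, double y1) {
    const double corners[4] = {psi_disk(cx, cy, r, x0, y0),
                               psi_disk(cx, cy, r, x1, y0),
                               psi_disk(cx, cy, r, x0, y1),
                               psi_disk(cx, cy, r, x1, y1)};
    bool has_neg = false, has_pos = false;
    for (double v : corners) {
        has_neg = has_neg || (v < 0.0);
        has_pos = has_pos || (v > 0.0);
    }
    return has_neg && has_pos;
}
```

Classifying the cells of the quadtree this way is what lets the solver decide where the special cut-cell treatment of the following sections is needed.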
We observe first-order convergence when N = 0, but unfortunately, for N ≥ 1, the solver
does not work very well. We observe really high error levels, which get worse when we try
even higher degrees. This can be explained by the fact that truncated Legendre polynomials
do not have interesting properties; in particular, the basis functions are no longer orthogonal.
We observe that the obtained linear system is numerically unstable, as the obtained matrix is
very close to singular. We thus need a more subtle approach to define a high-order functional
basis.
Figure IV.1: The three kinds of cut cells and their bounding box
The major pitfall of defining a functional basis on bounding boxes is that it is no longer
orthogonal for the L2 (Ω̃k ) dot product. This means that the mass matrix is no longer the
identity matrix, and the modal basis properties described in I.3 cannot be used on cut cells.
This means we have to use numerical quadrature to integrate functions on Ω̃k . Unfortunately,
integrating a function f over Ω̃k is not as easy as integrating f 1Ω̃k on Ω̂k (where 1D is
the characteristic function on D) because f 1Ω̃k is not smooth and thus the Gauss-Legendre
quadrature would yield wrong convergence orders. Implementing it properly involves cutting
Ω̃_k into several simple shapes and applying a Gauss-Legendre quadrature on each of those
shapes. In our case, as we approximate the interface with a segment on each cell, we only
have to deal with three different cut-cell shapes: the triangle, the trapezoid and the pentagon,
as shown in figure IV.1. Each of those shapes can be split into several triangles, and we can
then integrate a function f on a triangle T_i by integrating f ∘ γ_i, where γ_i is a mapping
from Ω̂_k to T_i that is
bijective and of class C^∞ on the interior of Ω̂_k (note that it is not on ∂Ω̂_k). We can
construct a reference mapping γ from the square [−1; 1]² to the triangle defined by the three
vertices (0, 0), (1, 0) and (0, 1) by

γ(x, y) = ( (1 + x)/2 , (1 − x)(1 + y)/4 )   (IV.5)
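Combined with a tensor Gauss-Legendre rule, the mapping (IV.5) gives a quadrature over the reference triangle; a sketch, where the Jacobian determinant of γ is (1 − x)/8 and the 3-point nodes ±√(3/5), 0 with weights 5/9, 8/9, 5/9 are the standard tabulated values:

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Integrate f over the reference triangle (0,0), (1,0), (0,1) by mapping
// the square [-1,1]^2 through gamma(x, y) = ((1+x)/2, (1-x)(1+y)/4) of
// (IV.5); |det J_gamma| = (1 - x)/8. Uses a 3-point rule per direction.
double integrate_triangle(const std::function<double(double, double)>& f) {
    const double x[3] = {-std::sqrt(0.6), 0.0, std::sqrt(0.6)};
    const double w[3] = {5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0};
    double sum = 0.0;
    for (int i = 0; i < 3; ++i) {
        for (int j = 0; j < 3; ++j) {
            double u = (1.0 + x[i]) / 2.0;
            double v = (1.0 - x[i]) * (1.0 + x[j]) / 4.0;
            double jac = (1.0 - x[i]) / 8.0;
            sum += w[i] * w[j] * f(u, v) * jac;
        }
    }
    return sum;
}
```

Since the mapped integrand of a low-degree polynomial stays polynomial, the rule recovers the triangle area 1/2 and the moment ∫ x = 1/6 exactly.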
We also need to adapt the computation of the numerical flux, as the shape of a cell boundary
changes. For a cut cell crossed by the interface, we also need to integrate a numerical flux
over the interface to enforce a given boundary condition there. If we write I_k for the interface
portion on the cell Ω̃_k, then we need to consider the following additional terms in the
numerical flux described in I.4:

∫_{I_k} ( (β/2) φ̃^k_{mn} + ∇φ̃^k_{mn} · n ) φ̃^k_{pq} dγ   (Dirichlet)   (IV.6)

∫_{I_k} ( (∇φ̃^k_{mn} · n) / 2 ) φ̃^k_{pq} dγ   (Neumann)   (IV.7)

where
∀(x, y) ∈ Ω̃_k, φ̃^k_{mn}(x, y) = φ^k_{mn} ∘ γ^{-1}(x, y).
The kind of mapping we use depends on the geometry of the cut cell. We need to find one
for each of the three different cut cell geometries possible.
Pentagon. Mapping a pentagon to a square smoothly is not intuitive. The first idea is to
split the pentagon into a rectangle and a trapezoid, and to use the identity function on the
rectangle and the previous mapping on the trapezoid. However, this mapping is only piecewise
smooth and leads to wrong convergence orders in our tests. A trick that can be used is to add
a C^∞ connection between the rectangle and the trapezoid to make the whole mapping of
class C^∞. We do not give the exact expression of this mapping here, as it is quite cumbersome.
Even though this approach can seem quite crude, as the resulting functional basis is neither
polynomial nor orthogonal, we observe accurate results in our tests for µ = 0 (meaning that
we do not handle the numerical flux; the case µ ≠ 0 was not numerically tested). Implementing
this properly would need more work, though.
Part V
Conclusion
In this work we presented a modal discontinuous Galerkin method on non-graded quadtrees
with a scalable parallel implementation. We showed that our method reaches arbitrary
convergence orders for solving the Poisson equation and that it does not require any grading
property of the mesh. We also showed that polynomial degree refinement is more scalable
than mesh refinement. Finally, we presented several approaches to handling an interface
added to the geometry of the problem following the level set technique, and to adapting the
method to handle it properly, even though our implementation is not fully feature-complete
for this part and could be continued and improved in future work. This work could also be
extended by taking a closer look at time dependency and how it would integrate into the
modal approach.
Bibliography
[1] Moataz O. Abu-Al-Saud, Cyprien Soulaine, Amir Riaz, and Hamdi A. Tchelepi. Level-set
method for accurate modeling of two-phase immiscible flow with moving contact lines.
2017.
[2] Satish Balay, Shrirang Abhyankar, Mark F. Adams, Jed Brown, Peter Brune, Kris Buschel-
man, Lisandro Dalcin, Victor Eijkhout, William D. Gropp, Dinesh Kaushik, Matthew G.
Knepley, Dave A. May, Lois Curfman McInnes, Richard Tran Mills, Todd Munson, Karl
Rupp, Patrick Sanan, Barry F. Smith, Stefano Zampini, Hong Zhang, and Hong Zhang.
PETSc users manual. Technical Report ANL-95/11 - Revision 3.9, Argonne National
Laboratory, 2018.
[3] Satish Balay, William D. Gropp, Lois Curfman McInnes, and Barry F. Smith. Efficient
management of parallelism in object oriented numerical software libraries. In E. Arge,
A. M. Bruaset, and H. P. Langtangen, editors, Modern Software Tools in Scientific
Computing, pages 163–202. Birkhäuser Press, 1997.
[4] B. Cockburn and C.-W. Shu. TVB Runge-Kutta local projection discontinuous Galerkin
methods for scalar conservation laws II: General framework. Mathematics of Computation,
1989.
[5] H.A. van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for the
solution of nonsymmetric linear systems. SIAM J. Sci. and Stat. Comput., 1992.
[6] David Kahaner, Cleve Moler, and Stephen Nash. Numerical Methods and Software. 1989.
[7] Robert M. Kirby and George Em Karniadakis. Selecting the numerical flux in discontinuous
Galerkin methods for diffusion problems. Journal of Scientific Computing, 2005.
[8] Hailiang Liu and Jue Yan. The direct discontinuous Galerkin (DDG) method for diffusion
with interface corrections. Commun. Comput. Phys., 8(3), 2010.
[10] W.H. Reed and T.R. Hill. Triangular mesh methods for the neutron transport equation.
Tech. Report LA-UR-73-479, 1973.
[11] Maxime Theillard, Landry Fokoua Djodom, Jean-Léopold Vié, and Frédéric Gibou. A
second-order sharp numerical method for solving the linear elasticity equations on irregular
domains and adaptive grids – application to shape optimization. Journal of Computational
Physics 233, 2012.
[12] Wikipedia contributors. Legendre polynomials — Wikipedia, the free encyclopedia, 2018.
[Online; accessed 23-August-2018].
[13] Wikipedia contributors. Runge–Kutta methods — Wikipedia, the free encyclopedia, 2018.
[Online; accessed 26-August-2018].
Appendix A
Proof.
i. See [12].
iv.

⟨P'_{2n+1}, P'_{2m}⟩ = ⟨ 1 + Σ_{i=1}^{n} (4i + 1)P_{2i} , Σ_{j=0}^{m−1} (4j + 3)P_{2j+1} ⟩
= Σ_{j=0}^{m−1} (4j + 3) ⟨1, P_{2j+1}⟩ + Σ_{i=1}^{n} Σ_{j=0}^{m−1} (4i + 1)(4j + 3) ⟨P_{2i}, P_{2j+1}⟩

Since ∀i, j, 2i ≢ 2j + 1 (mod 2), ⟨P_{2i}, P_{2j+1}⟩ = 0. Moreover, ⟨1, P_n⟩ = 2δ_{0n}, and
thus for j ≥ 0, ⟨1, P_{2j+1}⟩ = 0. Finally, ⟨P'_{2n+1}, P'_{2m}⟩ = 0.

v. Assuming without loss of generality that n ≤ m,

⟨P'_{2n+1}, P'_{2m+1}⟩ = ⟨ 1 + Σ_{i=1}^{n} (4i + 1)P_{2i} , 1 + Σ_{j=1}^{m} (4j + 1)P_{2j} ⟩
= ⟨1, 1⟩ + Σ_{i=1}^{n} (4i + 1)⟨1, P_{2i}⟩ + Σ_{j=1}^{m} (4j + 1)⟨1, P_{2j}⟩ + Σ_{i=1}^{n} Σ_{j=1}^{m} (4i + 1)(4j + 1)⟨P_{2i}, P_{2j}⟩
= 2 + Σ_{i=1}^{n} Σ_{j=1}^{m} (4i + 1)(4j + 1) (2 / (4i + 1)) δ_{ij}
= 2 + 2 Σ_{i=1}^{n} (4i + 1) = 2 + 2n + 4n(n + 1)
= 2 + 2n(2n + 3)

vi. Again assuming n ≤ m,

⟨P'_{2n}, P'_{2m}⟩ = ⟨ Σ_{i=0}^{n−1} (4i + 3)P_{2i+1} , Σ_{j=0}^{m−1} (4j + 3)P_{2j+1} ⟩
= Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} (4i + 3)(4j + 3) ⟨P_{2i+1}, P_{2j+1}⟩
= Σ_{i=0}^{n−1} Σ_{j=0}^{m−1} (4i + 3)(4j + 3) (2 / (4i + 3)) δ_{ij}
= 2 Σ_{i=0}^{n−1} (4i + 3) = 4n(n − 1) + 6n
= 2n(2n + 1)