Jean-Marc GRATIEN
IFPEN
1 et 4 av Bois Préau
92588 Rueil-Malmaison, FRANCE
jean-marc.gratien@ifpen.fr
(ii) defining for all $(u_h, v_h) \in U_h \times V_h$ a bilinear form $a_h(u_h, v_h)$ and a linear form $b_h(v_h)$. Solving the discrete problem then consists in finding $u_h \in U_h$ such that:
\[ a_h(u_h, v_h) = b_h(v_h) \quad \forall v_h \in V_h. \]
The definition of a discrete function space $U_h$ is based on three main ingredients:
• $T_h$, the mesh representing $\Omega$, and $S_h$, a submesh of $T_h$ where $\forall S \in S_h,\ \exists T_S \in T_h,\ S \subset T_S$;
• $V_h$, the space of vectors of degrees of freedom with components indexed by the mesh entities (cells, faces or nodes);
• $G_h$, a linear gradient operator that defines for each vector $v_h \in V_h$ a constant gradient on each element of $S_h$, and $\nabla_h$ the broken gradient operator.
Using the above ingredients, we can define for all $v_h \in V_h$ a piecewise affine function $v_h \in U_h \subset \mathbb{P}^1_d(S_h)$ such that:
\[ \forall S \in S_h,\ S \subset T_S,\ T_S \in T_h,\ \forall x \in S, \qquad v_h(x)|_S = v_{T_S} + G_h(v_h)|_S \cdot (x - x_{T_S}). \]
Usually three kinds of submesh $S_h$ are considered: $T_h$, the mesh itself; $P_h$, the submesh with pyramidal subcells based on the face entities of $T_h$; and $N_h$, with subcells based on the nodes of $T_h$.

The discrete space for the ccG method is obtained with: (i) $S_h = T_h$, (ii) $V_h = T_h$, (iii) $G_h = G_h^{ccg}$ with $G_h^{ccg}$ such that
\[ \forall v_h \in V_h, \qquad G_h^{ccg}(v_h) = G_h^{green}(v_h, T_h^g(v_h)), \tag{3} \]
where $T_h^g$ is a linear trace reconstruction operator on the faces of $T_h$.
For all $(u_h, v_h) \in V_h^{ccg} \times V_h^{ccg}$, let
\[ a_h^{ccg}(u_h, v_h) \overset{\mathrm{def}}{=} \int_\Omega \kappa \nabla_h u_h \cdot \nabla_h v_h - \sum_{F \in F_h} \int_F \big[ \{\kappa \nabla_h u_h\}_\omega \cdot n_F\, [\![ v_h ]\!] + [\![ u_h ]\!]\, \{\kappa \nabla_h v_h\}_\omega \cdot n_F \big] + \sum_{F \in F_h} \eta \frac{\gamma_F}{h_F} \int_F [\![ u_h ]\!] [\![ v_h ]\!]. \tag{4} \]
The method reads:
\[ \text{Find } u_h \in V_h^{ccg} \text{ s.t. } a_h^{ccg}(u_h, v_h) = \int_\Omega f v_h \text{ for all } v_h \in V_h^{ccg}. \tag{5} \]

The hybrid finite volume method recovers the SUSHI scheme [14, 13, 8, 7]. The discrete space with hybrid unknowns is then obtained with: (i) $S_h = P_h$, (ii) $V_h = T_h \times F_h$, (iii) $G_h = G_h^{hyb}$ with $G_h^{hyb}$ such that, for all $(v_h^T, v_h^F) \in T_h \times F_h$, all $T \in T_h$ and all $F \in F_T$,
\[ G_h^{hyb}(v_h^T, v_h^F)|_{P_{T,F}} = G_h^{green}(v_h^T, v_h^F)|_T + r_h(v_h^T, v_h^F)|_{P_{T,F}}\, n_{T,F}. \tag{6} \]

    auto Vh = newSUSHISpace(Th);
    auto u = Vh->trial("U");
    auto v = Vh->test("V");
    BilinearForm a =
        integrate(allCells(Th), dot(K*grad(u), grad(v)));
    LinearForm b =
        integrate(allCells(Th), f*v);
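The piecewise affine reconstruction $v_h(x)|_S = v_{T_S} + G_h(v_h)|_S \cdot (x - x_{T_S})$ introduced in this section can be illustrated with a minimal sketch in plain C++. The names `Vec3` and `evalAffine` are hypothetical and are not part of the DSEL; the sketch assumes that the constant gradient on the subcell has already been computed by some gradient operator.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical 3D point/vector type used only for this illustration.
typedef std::array<double, 3> Vec3;

// Evaluate v_h(x) = v_T + G . (x - x_T) on one subcell S contained in cell T.
//   vT : cell unknown attached to T
//   G  : constant discrete gradient on S (assumed to be given)
//   xT : center of T
//   x  : evaluation point, assumed to lie in S
double evalAffine(double vT, const Vec3& G, const Vec3& xT, const Vec3& x) {
  double value = vT;
  for (std::size_t i = 0; i < 3; ++i) {
    value += G[i] * (x[i] - xT[i]);
  }
  return value;
}
```

Evaluating at the cell center returns the cell unknown itself, which is the property that makes the cell value a natural degree of freedom.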
Function space types also define the subtypes FunctionType, TestFunctionType and TrialFunctionType, corresponding to the mathematical notions of discrete functions, test functions and trial functions in variational formulations. Instances of TrialFunctionType and FunctionType are associated with a Variable object containing a vector of DOFs stored in memory and identified by a string key corresponding to the variable name. For functions, the vector of DOFs is used in the evaluation at a point x ∈ Ω, while for trial functions this vector receives the solution of the discrete problem. Test functions, which implicitly represent the space basis, are associated with neither a Variable object nor a vector of DOFs. Unlike FunctionType, the evaluation of TrialFunctionType and TestFunctionType is lazy in the sense that it returns a linear combination. This linear combination can be used to build local linear or bilinear contributions to the global system, or to postpone the evaluation with the variable data.

The BilinearForm and LinearForm concepts represent the linear and bilinear forms described in §2. They make it possible to define expressions using test and trial functions together with unary and binary operators.

3.3 DSEL implementation
The main goal of the DSEL is to allow a notation as close as possible to the mathematical notation presented in §2. This section focuses on bilinear forms, as the ingredients for linear forms are essentially similar. The exposition is not meant to be exhaustive, but instead to present a few significant examples from which others can be inferred. We first define our DSEL by giving some production rules, written in the Extended Backus–Naur Form (EBNF) [1], that make it possible to create trial and test expressions as well as bilinear terms; we then detail how the DSEL has been implemented using the tools provided by the Boost Proto framework.

3.3.1 Language definition

Terminals and Keywords.
The DSEL terminals are composed of a number of predefined types categorized in the following families: (i) the BaseType family for the standard C++ types representing integers and reals; (ii) the VarType family for all discrete

Listing 2: Examples of production rules for the coefficient γ
    BaseExpr = BaseType | BaseExpr * BaseExpr ;
    VarExpr  = VarType | BaseExpr * VarExpr | VarExpr * VarExpr ;
    CoefExpr = BaseExpr | VarExpr ;

To obtain trial and test expressions, we introduce linear operators acting on test and trial functions. A few examples are provided in Listing 3, and include (i) grad, the gradient of the trial/test function; (ii) trace operators like jump and avg representing, respectively, the jump and average of a trial/test function across a face. Besides linear operators, the production rules for trial and test expressions in Listing 3 include various products by coefficients resulting from the production rules of Listing 2 (dot denotes the vector inner product).

Listing 3: Production rules for trial and test expressions
    LinearOperator = "grad" | "jump" | "avg" ;
    TrialExpr = TrialFunction |
                CoefExpr * TrialExpr |
                "dot(" CoefExpr , TrialExpr ")" |
                LinearOperator "(" TrialExpr ")" ;
    TestExpr  = TestFunction |
                CoefExpr * TestExpr |
                "dot(" CoefExpr , TestExpr ")" |
                LinearOperator "(" TestExpr ")" ;

Bilinear forms.
Once test and trial expressions are available, bilinear terms can be obtained as contraction products of trial and test expressions or as sums thereof, as described in Listing 4.

Listing 4: Production rules for bilinear terms
    BilinearTerm = TrialExpr * TestExpr |
                   "dot(" TrialExpr , TestExpr ")" |
                   CoefExpr * BilinearTerm |
                   BilinearTerm + BilinearTerm ;

Bilinear forms finally result from the integration of bilinear terms on groups of mesh items (cf. Table 3.1). Production rules for bilinear forms are given in Listing 5.

Listing 5: Production rules for bilinear forms
    IntegrateBilinearTerm = "integrate(" MeshGroup , BilinearTerm ")" ;
    BilinearForm = IntegrateBilinearTerm |
                   IntegrateBilinearTerm + BilinearForm ;
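The way production rules such as those for trial expressions, test expressions and bilinear terms can be enforced at compile time may be illustrated without Boost Proto. The following is a minimal sketch in plain C++ using hand-written expression templates and type traits. All names (`Grad`, `Scaled`, `Dot`, `IsTrialExpr`, `IsTestExpr`, `IsBilinearTerm`) are hypothetical, only the `grad`, scalar product and `dot` rules are covered, and this is not the implementation used by the DSEL.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical terminals standing in for trial and test functions.
struct TrialFunction {};
struct TestFunction {};

// Expression nodes: grad(e), c * e and dot(l, r).
template <class E> struct Grad { E e; };
template <class E> struct Scaled { double c; E e; };
template <class L, class R> struct Dot { L l; R r; };

// TrialExpr = TrialFunction | coefficient * TrialExpr | grad(TrialExpr)
template <class E> struct IsTrialExpr : std::false_type {};
template <> struct IsTrialExpr<TrialFunction> : std::true_type {};
template <class E> struct IsTrialExpr<Grad<E> > : IsTrialExpr<E> {};
template <class E> struct IsTrialExpr<Scaled<E> > : IsTrialExpr<E> {};

// TestExpr = TestFunction | coefficient * TestExpr | grad(TestExpr)
template <class E> struct IsTestExpr : std::false_type {};
template <> struct IsTestExpr<TestFunction> : std::true_type {};
template <class E> struct IsTestExpr<Grad<E> > : IsTestExpr<E> {};
template <class E> struct IsTestExpr<Scaled<E> > : IsTestExpr<E> {};

// BilinearTerm = dot(TrialExpr, TestExpr)
template <class E> struct IsBilinearTerm : std::false_type {};
template <class L, class R>
struct IsBilinearTerm<Dot<L, R> >
    : std::integral_constant<bool,
                             IsTrialExpr<L>::value && IsTestExpr<R>::value> {};

// Free functions that build the expression tree without evaluating it.
template <class E> Grad<E> grad(E e) { return Grad<E>{e}; }

template <class E>
typename std::enable_if<IsTrialExpr<E>::value || IsTestExpr<E>::value,
                        Scaled<E> >::type
operator*(double c, E e) { return Scaled<E>{c, e}; }

template <class L, class R> Dot<L, R> dot(L l, R r) { return Dot<L, R>{l, r}; }

// Convenience query on a built expression.
template <class E> bool isBilinearTerm(const E&) {
  return IsBilinearTerm<E>::value;
}
```

With these definitions, `dot(2.0 * grad(u), grad(v))` is recognized as a bilinear term when `u` is a `TrialFunction` and `v` a `TestFunction`, whereas `dot(grad(v), grad(v))` is not. Proto grammars play the same role in a far more general and less verbose way.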
form, structures, (see more details in the framework documentation [16]). In this section, we detail how we have translated the formal language definition of §3.3.1 into Proto objects that make it possible to define expressions, the language Grammar, and the Context and Transform structures used to evaluate expressions and implement algorithms.

Language front ends.
The language front end is defined by (i) the terminals, (ii) the keywords listed in 1, (iii) and the grammar based on

Listing 9: Free function and meta-function associated to fvdsel::tag::dot
    template <typename L, typename R>
    typename
    proto::result_of::make_expr<
        fvdsel::tag::dot
      , FVDSLDomain
      , L const &
      , R const &
    >::type
    dot(L const &l, R const &r)
    {
      return proto::make_expr<fvdsel::tag::dot,
                              FVDSLDomain>(boost::ref(l),
                                           boost::ref(r));
    }

    // Expressions in the pde domain will be wrapped in FVDSLExpr<>
    // and must conform to the FVDSLGrammar
    struct FVDSLDomain
      : proto::domain<proto::generator<FVDSLExpr>,
                      FVDSLGrammar> {};
    template <typename Expr>
    struct FVDSLExpr
      : proto::extends<Expr, FVDSLExpr<Expr>, FVDSLDomain>
    {
      explicit FVDSLExpr(Expr const &expr)
        : proto::extends<Expr,
                         FVDSLExpr<Expr>,
                         FVDSLDomain>(expr)
      {}
      BOOST_PROTO_EXTENDS_USING_ASSIGN(FVDSLExpr)
    };
    struct BaseTypeGrammar
      : proto::terminal<proto::convertible_to<Real> >
    {};

    struct MeshVarTypeGrammar
      : proto::and_<proto::terminal<proto::_>,
                    proto::if_<fvdsel::is_mesh_var<proto::_value>()> >
    {};

    struct TestFunctionTerminal
      : proto::and_<FunctionTerminal,
                    proto::if_<fvdsel::is_test_function<proto::_value>()> >
    {};

    struct TrialFunctionTerminal
      : proto::and_<FunctionTerminal,
                    proto::if_<fvdsel::is_trial_function<proto::_value>()> >
    {};

    struct CoefExprGrammar;
    struct BilinearGrammar
      : proto::or_<
          proto::multiplies<TrialExprGrammar,
                            TestExprGrammar>,
          fvdsel::dotop<TrialExprGrammar,
                        TestExprGrammar>,
          PlusBilinear,
          MinusBilinear,
          MultBilinear
        >
    {};
    }

Listing 10: terminal meta-function

3.3.3 Evaluation contexts and transforms

Language back ends: expression evaluation, algorithm implementation.
The DSEL back ends are composed of algebraic structures (matrices, vectors, linear combinations) used in different kinds of algorithms based on iterations over mesh entities, on the evaluation of matrices and vectors, or on assembly operations. The implementation of these algorithms is based on the evaluation and manipulation of FVDSLDomain expressions. Such evaluations are based on two kinds of Proto concepts: Context and Transform structures.

• A Context is like a function object that is passed along [...] node types in an expression, but with rules in a Proto grammar. In this way, they are like semantic actions in other compiler-construction toolkits.

Algorithms are implemented as specific expression tree evaluations, as a sequence of pieces of algorithm associated with the behaviour of the Evaluation context on each node or with Transforms that match production rules. These pieces of algorithm are written respectively in the operator()() of the structure Context::eval for Context objects, in the

linear context with a factor equal to the measure of the cell.

To implement the integration algorithm associated with a linear variational formulation, we have used both Context and Transform structures. A BilinearContext object has been developed to evaluate the global expression; it references a linear system back end object used to build the global linear system with different linear algebra packages. On an Integrate node, this object calls an IntegratorOp transform on the expression tree. In Listing 12, we detail the implementation of this transform, which in our example matches the expression with the tag fvdsel::tag::integrate, the MeshGroup expression allCells(Th) and the term dot(K*grad(u),grad(v)).

Listing 12: Integrator transform
    struct Integrator : proto::callable
    {
      // ... callable object that will use a BilinearIntegrator transform on
      // a bilinear expression
      typedef int result_type;
      // call a transform that analyze ExprT
      // and dispatch to the appropriate transform
      return 0;
      }
    };
    struct IntegratorOp
      : proto::or_<
          proto::when<
            fvdsel::IntegratorGrammar,
            fvdsel::Integrator(proto::_child_c<2>,
                               proto::_child_c<1>,
                               proto::_state,
                               proto::_data
            )
          >
          proto::when<
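The idea of lazily evaluated trial and test functions contributing to a global system, scaled by the measure of the cell, can be sketched in plain C++ independently of Proto. In the sketch below, `LinearCombination`, `DenseMatrix` and `addBilinearContribution` are hypothetical names introduced for illustration only; a dense matrix is assumed for brevity, whereas an actual linear system back end would typically use a sparse format.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// A lazily evaluated value: the sum over its entries of coef * unknown[dof].
typedef std::vector<std::pair<std::size_t, double> > LinearCombination;

// Dense storage, assumed here only to keep the example short.
typedef std::vector<std::vector<double> > DenseMatrix;

// Accumulate measure * trial * test into A.
// Rows follow the DOFs of the test combination, columns those of the trial one.
void addBilinearContribution(DenseMatrix& A, double measure,
                             const LinearCombination& trial,
                             const LinearCombination& test) {
  for (std::size_t a = 0; a < test.size(); ++a) {
    const std::size_t row = test[a].first;
    assert(row < A.size());
    for (std::size_t b = 0; b < trial.size(); ++b) {
      const std::size_t col = trial[b].first;
      assert(col < A[row].size());
      A[row][col] += measure * test[a].second * trial[b].second;
    }
  }
}
```

For two unknowns with trial and test combinations both equal to u0 - u1 and a cell of measure 0.5, the call adds 0.5 on the diagonal and -0.5 off the diagonal of a 2 x 2 matrix.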
    struct BilinearIntegrator
      : proto::or_<
          proto::when<proto::multiplies<TrialExprGrammar,
                                        TestExprGrammar>,
                      MultIntegrator(proto::_left,   //! lexpr
                                     proto::_right,  //! rexpr
                                     proto::_state,  //! state
                                     proto::_data    //! context
                      )>,
          proto::when<fvdsel::dotop<TrialExprGrammar,
                                    TestExprGrammar>,
                      DotIntegrator(proto::_child_c<0>,  //! left
                                    proto::_child_c<1>,  //! trial
                                    proto::_state,       //! state
                                    proto::_data         //! context
                      )
          >,
          proto::when<proto::plus<BilinearGrammar,
                                  BilinearGrammar>,
                      BilinearIntegrator(proto::_right,
                                         BilinearIntegrator(proto::_left,
                                                            proto::_state,
                                                            proto::_data),
                                         proto::_data
                      )
          >
        >
    {};

4. APPLICATIONS
Our benchmark is based on the following exact solution for the diffusion problem (2):
\[ u(\mathbf{x}) = \sin(\pi x)\sin(\pi y)\sin(\pi z), \qquad \kappa = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \]
on the cubic domain $\Omega = [0, 1]^3$ with
\[ f(x, y, z) = 3\pi^2 \sin(\pi x)\sin(\pi y)\sin(\pi z). \]
We compare the following methods: (i) the DSEL implementation of the ccG method (5) provided in Listing 15; (ii) the DSEL implementation of the SUSHI method with face unknowns (6) provided in Listing 16; (iii) the DSEL implementation of the G method (17) provided in Listing 17.
The benchmark test cases are run on a workstation with a quad-core Intel Xeon W3530 processor (2.80 GHz, 8 MB cache).
In our numerical tests we consider a family of h-refined meshes with h decreasing from 0.1 to 0.0125.
The linear systems are solved using the PETSc library [6] with the BiCGStab solver preconditioned by the Euclid ILU(0) preconditioner, with relative tolerance set to $10^{-6}$.
The benchmarks monitor various metrics:

(i) Accuracy. The accuracy of the methods is evaluated in terms of the $L^2$ norm of the error. For the methods of §2, the $L^2$-norm of the error is evaluated using the cell center as a quadrature node, i.e.,
\[ \|u - u_h\|_{L^2(\Omega)} \approx \Big( \sum_{T \in T_h} |T| \,(u(x_T) - u_T)^2 \Big)^{1/2}. \]
The convergence order of a method is classically expressed by relating the error to the mesh size h.

5. CONCLUSION AND PERSPECTIVES
Our DSEL for lowest-order methods makes it possible to describe and solve various non-trivial academic problems. Different numerical methods were implemented with a high-level language close to the one used in the unified mathematical framework. The analysis of the performance results of our study cases shows that the overhead of the language is not significant compared with standard hand-written codes.

In future work, we plan to extend our DSEL to take into account: (i) various types of boundary conditions, (ii) nonlinear formulations, hiding the complexities of derivative computation.

Within the HAMM [2] project (Hybrid architecture and multi-level model), we plan to handle multi-level methods and illustrate the interest of our approach in seamlessly taking advantage of the performance of new hybrid hardware architectures with GP-GPU.
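As a complement to the accuracy metric of §4, the cell-center quadrature of the $L^2$ error and the observed convergence order between two meshes can be sketched as follows. The function names `l2Error` and `convergenceOrder` are hypothetical, and the mesh is assumed to be given simply as arrays of cell measures, exact values at cell centers and discrete cell values.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Discrete L2 error with the cell center as quadrature node:
//   sqrt( sum_T |T| * (u(x_T) - u_T)^2 )
double l2Error(const std::vector<double>& measure,
               const std::vector<double>& exactAtCenter,
               const std::vector<double>& cellValue) {
  assert(measure.size() == exactAtCenter.size());
  assert(measure.size() == cellValue.size());
  double sum = 0.0;
  for (std::size_t t = 0; t < measure.size(); ++t) {
    const double diff = exactAtCenter[t] - cellValue[t];
    sum += measure[t] * diff * diff;
  }
  return std::sqrt(sum);
}

// Observed order between two meshes of sizes h1 > h2 with errors e1 and e2:
//   log(e1 / e2) / log(h1 / h2)
double convergenceOrder(double e1, double h1, double e2, double h2) {
  return std::log(e1 / e2) / std::log(h1 / h2);
}
```

An error divided by four when h is halved corresponds to an observed order close to 2.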
[Figure: (a) L2-error vs. h; (b) L2-error vs. NDOF; (c) L2-error vs. Nnz. Legend: G, SUSHI, ccG.]
[Figure: (c) ccG: CPU cost vs. h. Legend: tinit, tass, tsolve.]
Figure 4: Comparison of different methods and implementation for the 2D test case of §4 (h = 0.00625). Series: ccG-fvCpp, ccG-DSEL, SUSHI, G; timings: tinit, tass, tsolve.