
Implementing a Domain Specific Embedded Language

for lowest-order variational methods with Boost Proto


Jean-Marc Gratien

To cite this version:


Jean-Marc Gratien. Implementing a Domain Specific Embedded Language for lowest-order
variational methods with Boost Proto. 2012. <hal-00788281>

HAL Id: hal-00788281


https://hal-ifp.archives-ouvertes.fr/hal-00788281
Submitted on 14 Feb 2013


Jean-Marc GRATIEN
IFPEN
1 et 4 av Bois Préau
92588 Rueil-Malmaison, FRANCE
jean-marc.gratien@ifpen.fr

ABSTRACT
In this paper we propose an original implementation of a large family of lowest-order methods for diffusive problems, based on a FreeFEM-like domain-specific language targeted at defining discrete linear and bilinear forms. We discuss how we have developed the back end and the front end of the language using the Boost Proto framework. We validate the proposed DSEL design by implementing several academic problems. The overhead of the language is evaluated by comparison with a more traditional implementation.

General Terms
Concepts and Generic Programming

Keywords
DSEL, Generative programming, Framework, Boost Proto

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
C++Now Aspen CO, USA, May 13–18, 2012
Copyright 20XX ACM X-XXXXX-XX-X/XX/XX ...$10.00.

1. INTRODUCTION
Industrial simulation software has to manage: (i) the complexity of the underlying physical models, usually expressed in terms of a PDE system completed with algebraic closure laws, (ii) the complexity of the numerical methods used to solve the PDE systems, and finally (iii) the complexity of the low-level computer science services required to have efficient software on modern hardware. Robust and effective finite volume (FV) methods as well as advanced programming techniques need to be combined in order to fully benefit from massively parallel architectures (implementation of parallelism, memory handling, design of connections). Moreover, the above methodologies and technologies become more and more sophisticated and too complex to be handled by physicists alone. Managing this complexity has become a key issue for the development of scientific software.

Some frameworks already offer a number of advanced tools to deal with the complexity related to parallelism in a transparent way. Hardware complexity is hidden, and the low-level algorithms that need to deal directly with hardware specificities, for performance reasons, are provided. These frameworks often offer mesh data management services and linear algebra services, which are key elements of efficient parallel software. However, they often provide only partial answers to the problem, as they only deal with hardware complexity and low-level numerical complexity such as linear algebra. The complexity related to discretization methods and physical models still lacks tools to help physicists develop complex applications. New paradigms for scientific software must be developed to help them handle the different levels of complexity seamlessly, so that they can focus on their specific domain.

Generative programming, component engineering and domain-specific languages (either DSL or DSEL) are key technologies to make the development of complex applications easier for physicists, hiding the complexity of numerical methods and low-level computer science services. These paradigms make it possible to write code in a high-level expressive language and to take advantage of the efficiency of generated code for low-level services close to hardware specificities. Their application to scientific computing has been limited up to now to Finite Element (FE) methods, for which a unified mathematical framework has existed for a long time. Such DSLs have been developed for finite element or Galerkin methods in projects such as FreeFEM, GetDP, GetFEM++, Sundance, Feel++ and the FEniCS project.

We try to extend this kind of approach to lowest-order methods to solve the PDE systems of geomodeling applications. A recent unified mathematical framework allows a consistent description of a large family of these methods and thus enables, as for FE methods, the design of a high-level language inspired by the mathematical notation. Such a language could help physicists implement their applications by writing the mathematical formulation at a high level, while hiding the complexity of the numerical methods and of the low-level computer science services that guarantee high performance. We have developed such a language, embedded in C++, on top of the Arcane platform [15], with the Boost Proto library [16], a powerful framework providing tools to design DSELs.

We focus on the main ingredients of the language and detail how, using the Boost Proto framework, we have implemented our new language in a user-friendly declarative way. We check the capability of our DSEL to describe and solve various complex problems with different lowest-order methods. We validate the design of the DSEL on the implementation of several academic problems. We present some numerical results and compare the performance of their implementation with the DSEL to their
hand-written counterpart, thereby evaluating the overhead of the language and illustrating the interest of C++ as the host language for our DSL.

The paper is organized as follows: in the first section we present the mathematical domain targeted by the proposed DSEL, in the second section we discuss the implementation of the DSEL with the Boost Proto framework, and in the last section we validate our approach with numerical results.

2. MATHEMATICAL SETTING
The unified mathematical framework presented in [11, 12] allows a unified description of a large family of lowest-order methods. The key idea is to reformulate the method at hand as a (Petrov)-Galerkin scheme based on a possibly incomplete, broken affine space. This is done by introducing a piecewise constant gradient reconstruction, which is used to recover a piecewise affine function starting from cell (and possibly face) centered unknowns.

For example, consider the following heterogeneous diffusion model problem (2):
$$ -\nabla\cdot(\kappa\nabla u) = f \ \text{in } \Omega, \qquad u = 0 \ \text{on } \partial\Omega, $$
with source term $f \in L^2(\Omega)$ and $\kappa$ piecewise constant. The continuous weak formulation reads: Find $u \in H_0^1(\Omega)$ such that
$$ a(u, v) = \int_\Omega f v \quad \forall v \in H_0^1(\Omega), $$
with
$$ a(u, v) \overset{\mathrm{def}}{=} \int_\Omega \kappa \nabla u \cdot \nabla v. $$

In this framework, for a given partition $\mathcal{T}_h$ of $\Omega$, a specific lowest-order method is defined by (i) selecting a trial function space $U_h(\mathcal{T}_h)$ and a test function space $V_h(\mathcal{T}_h)$, and (ii) defining for all $(u_h, v_h) \in U_h \times V_h$ a bilinear form $a_h(u_h, v_h)$ and a linear form $b_h(v_h)$. Solving the discrete problem then consists in finding $u_h \in U_h$ such that
$$ a_h(u_h, v_h) = b_h(v_h) \quad \forall v_h \in V_h. $$

The definition of a discrete function space $U_h$ is based on three main ingredients:
• $\mathcal{T}_h$, the mesh representing $\Omega$, and $\mathcal{S}_h$, a submesh of $\mathcal{T}_h$ such that $\forall S \in \mathcal{S}_h,\ \exists T_S \in \mathcal{T}_h,\ S \subset T_S$;
• $\mathbb{V}_h$, the space of vectors of degrees of freedom with components indexed by the mesh entities (cells, faces or nodes);
• $G_h$, a linear gradient operator that defines for each vector $\mathbf{v}_h \in \mathbb{V}_h$ a constant gradient on each element of $\mathcal{S}_h$, and $\nabla_h$, the broken gradient operator.

Using the above ingredients, we can define for all $\mathbf{v}_h \in \mathbb{V}_h$ a piecewise affine function $v_h \in U_h \subset \mathbb{P}_d^1(\mathcal{S}_h)$ such that
$$ \forall S \in \mathcal{S}_h,\ S \subset T_S,\ T_S \in \mathcal{T}_h,\ \forall x \in S, \qquad v_h(x)|_S = v_{T_S} + G_h(\mathbf{v}_h)|_S \cdot (x - x_{T_S}). $$

Usually three kinds of submesh $\mathcal{S}_h$ are considered: $\mathcal{T}_h$, the mesh itself; $\mathcal{P}_h$, the submesh with pyramidal subcells based on the face entities of $\mathcal{T}_h$; and $\mathcal{N}_h$, with subcells based on the nodes of $\mathcal{T}_h$.

We denote by $\mathbb{T}_h$ the space of degrees of freedom with components indexed by cells and by $\mathbb{F}_h$ the space of degrees of freedom with components indexed only by faces. Usually the following choices are considered:
$$ \mathbb{V}_h = \mathbb{T}_h \quad \text{or} \quad \mathbb{V}_h = \mathbb{T}_h \times \mathbb{F}_h. \tag{1} $$

With this framework, the model problem can be solved with various methods:
• the cell centered Galerkin (ccG) method and the G-method, with cell unknowns only;
• the hybrid finite volume method, with both cell and face unknowns, which recovers the mimetic finite difference (MFD) and mixed/hybrid finite volume (MHFV) family.

The G-method [4].
The trial space is obtained with (i) $\mathcal{S}_h = \mathcal{P}_h$, (ii) $\mathbb{V}_h = \mathbb{T}_h$, and (iii) $G_h = G_h^{g}$, a gradient operator, piecewise constant on the elements $S \in \mathcal{P}_h$, based on the L construction detailed in [4]. The method then reads:
$$ \text{Find } u_h \in V_h^{g} \text{ s.t. } a_h^{g}(u_h, v_h) = \int_\Omega f v_h \ \text{ for all } v_h \in \mathbb{P}_d^0(\mathcal{T}_h), $$
where
$$ a_h^{g}(u_h, v_h) \overset{\mathrm{def}}{=} \sum_{F \in \mathcal{F}_h} \int_F \{\kappa \nabla_h u_h\} \cdot n_F \, [\![ v_h ]\!]. $$

The cell centered Galerkin method [9, 10].
We introduce the linear gradient operator $G_h^{green} : \mathbb{T}_h \times \mathbb{F}_h \to [\mathbb{P}_d^0(\mathcal{T}_h)]^d$ such that, for all $(v^{\mathcal{T}}, v^{\mathcal{F}}) \in \mathbb{T}_h \times \mathbb{F}_h$ and all $T \in \mathcal{T}_h$,
$$ G_h^{green}(v^{\mathcal{T}}, v^{\mathcal{F}})|_T = \frac{1}{|T|_d} \sum_{F \in \mathcal{F}_T} |F|_{d-1} \, (v_F - v_T) \, n_{T,F}. \tag{2} $$
The discrete space for the ccG method is obtained with (i) $\mathcal{S}_h = \mathcal{T}_h$, (ii) $\mathbb{V}_h = \mathbb{T}_h$, (iii) $G_h = G_h^{ccg}$, with $G_h^{ccg}$ such that
$$ \forall \mathbf{v}_h \in \mathbb{V}_h, \qquad G_h^{ccg}(\mathbf{v}_h) = G_h^{green}(\mathbf{v}_h, T_h^{g}(\mathbf{v}_h)), \tag{3} $$
where $T_h^{g}$ is a linear trace reconstruction operator on the faces of $\mathcal{T}_h$.
Let, for all $(u_h, v_h) \in V_h^{ccg} \times V_h^{ccg}$,
$$ a_h^{ccg}(u_h, v_h) \overset{\mathrm{def}}{=} \int_\Omega \kappa \nabla_h u_h \cdot \nabla_h v_h - \sum_{F \in \mathcal{F}_h} \int_F \left[ \{\kappa \nabla_h u_h\}_\omega \cdot n_F \, [\![ v_h ]\!] + [\![ u_h ]\!] \, \{\kappa \nabla_h v_h\}_\omega \cdot n_F \right] + \sum_{F \in \mathcal{F}_h} \eta \frac{\gamma_F}{h_F} \int_F [\![ u_h ]\!] [\![ v_h ]\!]. \tag{4} $$
The method reads:
$$ \text{Find } u_h \in V_h^{ccg} \text{ s.t. } a_h^{ccg}(u_h, v_h) = \int_\Omega f v_h \ \text{ for all } v_h \in V_h^{ccg}. \tag{5} $$

The hybrid finite volume method.
This method recovers the SUSHI scheme [14, 13, 8, 7]. The discrete space with hybrid unknowns is then obtained with (i) $\mathcal{S}_h = \mathcal{P}_h$, (ii) $\mathbb{V}_h = \mathbb{T}_h \times \mathbb{F}_h$, (iii) $G_h = G_h^{hyb}$, with $G_h^{hyb}$ defined by (6).
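As a small numerical check of the Green formula (2) and of the affine reconstruction $v_h(x)|_S = v_{T_S} + G_h(\mathbf{v}_h)|_S \cdot (x - x_{T_S})$, the following standalone C++ sketch evaluates both on a single 2D cell. The types `Vec2` and `Face` and all function names are hypothetical and introduced only for this illustration; they are not part of the DSEL or of Arcane. The sketch assumes face values taken at the face barycenters, in which case the formula should return the exact gradient of an affine field.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical minimal geometry types for a single 2D cell (not Arcane types).
struct Vec2 { double x, y; };

struct Face {
    double measure;  // |F|_{d-1}: here the edge length
    Vec2   normal;   // n_{T,F}: unit normal pointing out of the cell
    double value;    // v_F: face unknown
};

// Green gradient (2): G|_T = 1/|T| * sum_F |F| (v_F - v_T) n_{T,F}.
Vec2 green_gradient(double cell_measure, double cell_value,
                    const std::vector<Face>& faces)
{
    assert(cell_measure > 0.0);
    Vec2 g{0.0, 0.0};
    for (const Face& f : faces) {
        const double w = f.measure * (f.value - cell_value) / cell_measure;
        g.x += w * f.normal.x;
        g.y += w * f.normal.y;
    }
    return g;
}

// Affine reconstruction: v_h(x) = v_T + G . (x - x_T).
double reconstruct(double cell_value, Vec2 grad, Vec2 x_T, Vec2 x)
{
    return cell_value + grad.x * (x.x - x_T.x) + grad.y * (x.y - x_T.y);
}

// Example data: the affine field v(x, y) = 1 + 2x + 3y on the unit square,
// sampled at the four edge midpoints.
double affine(Vec2 p) { return 1.0 + 2.0 * p.x + 3.0 * p.y; }

std::vector<Face> unit_square_faces()
{
    return {
        {1.0, {-1.0,  0.0}, affine({0.0, 0.5})},  // left edge
        {1.0, { 1.0,  0.0}, affine({1.0, 0.5})},  // right edge
        {1.0, { 0.0, -1.0}, affine({0.5, 0.0})},  // bottom edge
        {1.0, { 0.0,  1.0}, affine({0.5, 1.0})},  // top edge
    };
}
```

With the cell value taken at the center (0.5, 0.5), calling `green_gradient(1.0, affine({0.5, 0.5}), unit_square_faces())` yields the gradient (2, 3) of the sampled field, and `reconstruct` then reproduces the field at any point of the cell.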
For all $(v_h^{\mathcal{T}}, v_h^{\mathcal{F}}) \in \mathbb{T}_h \times \mathbb{F}_h$, all $T \in \mathcal{T}_h$ and all $F \in \mathcal{F}_T$,
$$ G_h^{hyb}(v_h^{\mathcal{T}}, v_h^{\mathcal{F}})|_{P_{T,F}} = G_h^{green}(v_h^{\mathcal{T}}, v_h^{\mathcal{F}})|_T + r_h(v_h^{\mathcal{T}}, v_h^{\mathcal{F}})|_{P_{T,F}} \, n_{T,F}, \tag{6} $$
where the linear residual operator $r_h : \mathbb{T}_h \times \mathbb{F}_h \to \mathbb{P}_d^0(\mathcal{P}_h)$ is defined as follows: for all $T \in \mathcal{T}_h$ and all $F \in \mathcal{F}_T$,
$$ r_h(v_h^{\mathcal{T}}, v_h^{\mathcal{F}})|_{P_{T,F}} = \frac{d^{1/2}}{d_{T,F}} \left[ v_F - v_T - G_h^{green}(v_h^{\mathcal{T}}, v_h^{\mathcal{F}})|_T \cdot (x_F - x_T) \right]. $$
This method with hybrid unknowns reads:
$$ \text{Find } u_h \in V_h^{hyb} \text{ s.t. } a_h^{sushi}(u_h, v_h) = \int_\Omega f v_h \ \text{ for all } v_h \in V_h^{hyb}, $$
with
$$ a_h^{sushi}(u_h, v_h) \overset{\mathrm{def}}{=} \int_\Omega \kappa \nabla_h u_h \cdot \nabla_h v_h, \tag{7} $$
and $\nabla_h$ the broken gradient on $\mathcal{P}_h$.

3. IMPLEMENTATION
The framework described in §2 allows a unified description of a large family of lowest-order methods and, as for FE/DG methods, it enables the design of a high-level language inspired by the mathematical notation. Such a language makes it possible to express the variational discretization formulation of a PDE problem with various methods by defining bilinear and linear forms. Algorithms are then generated to solve the problems by evaluating the forms representing the discrete problem. The language is based on concepts (mesh, function space, test and trial functions, differential operators) close to their mathematical counterparts. They are the front end of the language. Their implementations use algebraic objects (vectors, matrices, linear operators), which are the back end of the language. Linear and bilinear forms are represented by expressions built with the terminals of the language linked with unary and binary operators (+, -, *, /, dot(.,.)) and with free functions like grad(.), div(.) and integrate(.,.). The purpose of these expressions is first to express the variational discretization formulation of the user problem, but also to compute the solution of the problem by evaluating them with a specific context.

In the first part of this section, we present the different C++ concepts defining the front end of our language, their mapping onto their mathematical counterparts and their links with the algebraic objects corresponding to the back end of the language. We then introduce the DSEL that makes it possible to manipulate these concepts to build complex expressions close to the mathematical discretization formulation of continuous PDE problems. We finally explain how, by evaluating these expressions, we can generate source code that solves discrete problems.

For our diffusion model problem (2), such a DSEL makes it possible, for instance, to express the variational discretization formulation (5) with the programming counterpart presented in Listing 1.

Listing 1: Diffusion problem implementation
    MeshType Th;
    Real K;
    auto Vh = newSUSHISpace(Th);
    auto u = Vh->trial("U");
    auto v = Vh->test("V");
    BilinearForm a =
      integrate(allCells(Th), dot(K*grad(u), grad(v)));
    LinearForm b =
      integrate(allCells(Th), f*v);

3.1 Algebraic back-end
In this section we focus on the elementary ingredients used to build the terms appearing in the linear and bilinear forms of §2, which constitute the back end of the DSEL presented in §3.3.

Mesh.
The mesh concept is an important ingredient of the mathematical framework. Mesh types and data structures are a very standard issue, and different kinds of implementation already exist in various frameworks. On top of the Arcane mesh data structures, we have developed a mesh concept defining (i) MeshType::dim, the space dimension, and (ii) the subtypes Cell, Face and Node for mesh elements of dimension MeshType::dim, MeshType::dim-1 and 0, respectively. Some free functions like allCells(<mesh>), allFaces(<mesh>), boundaryCells(<mesh>), boundaryFaces(<mesh>), internalCells(<mesh>) and internalFaces(<mesh>) are provided to manipulate the mesh and to extract different parts of it.

Vector spaces, degrees of freedom and discrete variables.
The class Variable, with template parameters ItemT and ValueT, manages vectors of values of type ValueT and provides data accessors to these values with either mesh elements of type ItemT, integer ids or iterators identifying these elements. Instances of the class Variable are managed by VariableMng, a class that associates each variable with its unique string key label corresponding to the variable name.

Linear combination, linear and bilinear contribution.
The point of view presented in §2 naturally leads to a finite element-like assembly of local contributions stemming from integrals over elements or faces. This procedure requires manipulating local vectors indexed by mesh entities, represented by the LinearCombination concept. Associated with an efficient linear algebra, this concept makes it possible to create the LinearContribution (local vectors) and BilinearContribution (local matrices) objects used in the assembly of the matrix and vector of the global linear system.

3.2 Functional front-end

Function spaces.
Incomplete broken polynomial spaces defined by (2) are mapped onto C++ types according to the FunctionSpace concept. The key role of a FunctionSpace is to bridge the gap between the algebraic representation of DOFs and the functional representation used in the methods of §2. This is achieved by the functions grad and eval, which are the C++ counterparts of the linear operators $G_h$ and $R_h$, respectively. More specifically,
(i) for all $S \in \mathcal{S}_h$, grad(S) returns a vector-valued linear combination corresponding to the (constant) restriction $G_h|_S$;
(ii) for all $S \in \mathcal{S}_h$ and all $x \in S$, eval(S, x) returns a scalar-valued linear combination corresponding to $R_h|_S(x)$ defined according to (2).
The linear combinations returned by grad and eval can be used to build LinearContributions and BilinearContributions as described in the previous sections.

Function space types also define the subtypes FunctionType, TestFunctionType and TrialFunctionType corresponding to the mathematical notions of discrete functions, test functions and trial functions in variational formulations. Instances of TrialFunctionType and FunctionType are associated with a Variable object containing a vector of DOFs stored in memory and identified by a string key corresponding to the variable name. For functions, the vector of DOFs is used in the evaluation at a point $x \in \Omega$, while for trial functions this vector is used to receive the solution of the discrete problem. Test functions implicitly represent the space basis and are therefore not associated with any Variable object or vector of DOFs. Unlike that of FunctionType, the evaluation of TrialFunctionType and TestFunctionType is lazy in the sense that it returns a linear combination. This linear combination can be used to build local linear or bilinear contributions to the global system, or to postpone the evaluation with the variable data.

The BilinearForm and LinearForm concepts represent the bilinear and linear forms described in §2. They make it possible to define expressions using test and trial functions and unary and binary operators.

3.3 DSEL implementation
The main goal of the DSEL is to allow a notation as close as possible to the mathematical notation presented in §2. This section focuses on bilinear forms, as the ingredients for linear forms are essentially similar. The exposition is not meant to be exhaustive, but instead to present a few significant examples from which others can be inferred. We first define our DSEL by giving, in Extended Backus–Naur Form (EBNF) [1], some production rules that make it possible to create trial and test expressions as well as bilinear terms; we then detail how the DSEL has been implemented using the tools provided by the Boost Proto framework.

3.3.1 Language definition

Terminals and Keywords.
The DSEL Terminals are composed of a number of predefined types categorized in the following families: (i) the BaseType family for the standard C++ types representing integers and reals; (ii) the VarType family for all discrete variable types defined in §3.1; (iii) the MeshGroupType family for types representing collections of mesh entities such as the ones listed in §3.1; (iv) the Function, TestFunction and TrialFunction families representing the functions, test functions and trial functions defined in §3.2.
The DSEL is based on some predefined keywords, listed in Table 1, that are semantically close to their counterparts in the mathematical framework.

Trial and test expressions.
Trial (resp. test) expressions are obtained as the product of a coefficient $\gamma_u$ (resp. $\gamma_v$) by a linear operator $L_u$ (resp. $L_v$) acting on a trial (resp. test) function. The coefficient can result from the algebraic combination of constant values and Variables evaluated at item I. Listing 2 defines the production rules that make it possible to create coefficient expressions involving, in particular, constant values, Variables over Cells and products thereof.

Listing 2: Examples of production rules for the coefficient γ
    BaseExpr = BaseType | BaseExpr * BaseExpr ;
    VarExpr  = VarType | BaseExpr * VarExpr | VarExpr * VarExpr ;
    CoefExpr = BaseExpr | VarExpr ;

To obtain trial and test expressions, we introduce linear operators acting on test and trial functions. A few examples are provided in Listing 3, and include (i) grad, the gradient of the trial/test function; (ii) trace operators like jump and avg representing, respectively, the jump and average of a trial/test function across a face. Besides linear operators, the production rules for trial and test expressions in Listing 3 include various products by coefficients resulting from the production rules of Listing 2 (dot denotes the vector inner product).

Listing 3: Production rules for trial and test expressions
    LinearOperator = "grad" | "jump" | "avg" ;
    TrialExpr = TrialFunction |
                CoefExpr * TrialExpr |
                "dot(" CoefExpr , TrialExpr ")" |
                LinearOperator "(" TrialExpr ")" ;
    TestExpr  = TestFunction |
                CoefExpr * TestExpr |
                "dot(" CoefExpr , TestExpr ")" |
                LinearOperator "(" TestExpr ")" ;

Bilinear forms.
Once test and trial expressions are available, bilinear terms can be obtained as contraction products of trial and test expressions or as sums thereof, as described in Listing 4.

Listing 4: Production rules for bilinear terms
    BilinearTerm = TrialExpr * TestExpr |
                   "dot(" TrialExpr , TestExpr ")" |
                   CoefExpr * BilinearTerm |
                   BilinearTerm + BilinearTerm ;

Bilinear forms finally result from the integration of bilinear terms on groups of mesh items (cf. Table 3.1). Production rules for bilinear forms are given in Listing 5.

Listing 5: Production rules for bilinear forms
    IntegrateBilinearTerm = "integrate(" MeshGroup , BilinearTerm ")" ;
    BilinearForm = IntegrateBilinearTerm |
                   IntegrateBilinearTerm + BilinearForm ;

3.3.2 Language implementation with Boost Proto
We have based our implementation on the boost::proto library by Niebler [16], a powerful framework to build DSELs in C++. This library provides a collection of generic concepts and metafunctions that help to design a DSL, its grammar, and tools to parse and evaluate expressions. It provides tools for constructing, type-checking, transforming and executing expression templates [3, 5, 17].
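The expression-template idea that Proto builds on can be illustrated without the library. The following sketch is plain standard C++, not Boost Proto code, and every name in it (`Coef`, `Trial`, `Test`, `Unary`, `Binary`, `count`, `is_bilinear_term`) is hypothetical. It only shows, under these simplifying assumptions, how free functions and overloaded operators can record an expression such as dot(K*grad(u), grad(v)) as a typed tree, and how a compile-time walk over that tree can check a grammar-like property in the spirit of Listing 4.

```cpp
#include <cassert>      // assert, used by the usage example
#include <type_traits>

namespace sketch {

// Node tags, in the spirit of the DSEL keyword tags (hypothetical names).
namespace tag { struct mult {}; struct grad {}; struct dot {}; }

// Terminals of the toy language.
struct Coef  {};  // stands for a coefficient such as K
struct Trial {};  // stands for a trial function u
struct Test  {};  // stands for a test function v

// Interior nodes: the whole tree is encoded in the C++ type.
template <typename Tag, typename Child>         struct Unary  {};
template <typename Tag, typename L, typename R> struct Binary {};

// Trait marking the types that belong to the toy language.
template <typename E> struct is_expr : std::false_type {};
template <> struct is_expr<Coef>  : std::true_type {};
template <> struct is_expr<Trial> : std::true_type {};
template <> struct is_expr<Test>  : std::true_type {};
template <typename T, typename C>
struct is_expr<Unary<T, C>> : std::true_type {};
template <typename T, typename L, typename R>
struct is_expr<Binary<T, L, R>> : std::true_type {};

// Tree-building functions: they only compute a type, nothing is evaluated.
template <typename E>
Unary<tag::grad, E> grad(E) { return {}; }

template <typename L, typename R>
Binary<tag::dot, L, R> dot(L, R) { return {}; }

// operator* is restricted to language types so that it does not hijack
// multiplications between unrelated classes.
template <typename L, typename R>
typename std::enable_if<is_expr<L>::value && is_expr<R>::value,
                        Binary<tag::mult, L, R>>::type
operator*(L, R) { return {}; }

// Compile-time tree walk: count the occurrences of a given terminal type.
template <typename Leaf, typename E>
struct count
    : std::integral_constant<int, std::is_same<Leaf, E>::value ? 1 : 0> {};
template <typename Leaf, typename T, typename C>
struct count<Leaf, Unary<T, C>> : count<Leaf, C> {};
template <typename Leaf, typename T, typename L, typename R>
struct count<Leaf, Binary<T, L, R>>
    : std::integral_constant<int,
                             count<Leaf, L>::value + count<Leaf, R>::value> {};

// Simplified stand-in for a bilinear-term grammar: the term must contain
// exactly one trial function and exactly one test function.
template <typename E>
constexpr bool is_bilinear_term(E)
{
    return count<Trial, E>::value == 1 && count<Test, E>::value == 1;
}

} // namespace sketch
```

Given `sketch::Coef K{}; sketch::Trial u{}; sketch::Test v{};`, the expression `dot(K * grad(u), grad(v))` has the type `Binary<tag::dot, Binary<tag::mult, Coef, Unary<tag::grad, Trial>>, Unary<tag::grad, Test>>` and satisfies `is_bilinear_term`, whereas `dot(grad(u), grad(u))` does not. Proto generalizes this pattern with a uniform node type, grammars for matching and transforms for evaluation.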
    expr<tag::integrate>
    ├─ allCells(Th)
    └─ expr<tag::dot>
       ├─ expr<tag::mult>
       │  ├─ K
       │  └─ expr<tag::grad>
       │     └─ uh
       └─ expr<tag::grad>
          └─ vh

Figure 1: Expression tree for the bilinear form defined at line 7 of Listing 1. Expressions are in light gray, language terminals in dark gray.

More specifically, Proto provides: (i) an expression tree data structure, (ii) a mechanism for giving expressions additional behaviors and members, (iii) operator overloads for building the tree from an expression, (iv) utilities for defining the grammar to which an expression must conform, (v) an extensible mechanism for immediately executing an expression template, (vi) an extensible set of tree transformations to apply to expression trees. This framework makes it possible to design a DSEL in a declarative way with mechanisms based on concepts such as (i) tag, (ii) meta-function, (iii) grammar, (iv) context and (v) transform structures (see the framework documentation [16] for more details). In this section, we detail how we have translated the formal definition of our language (§3.3.1) into Proto objects that make it possible to define expressions, the language Grammar, and the Context and Transform structures used to evaluate expressions and implement algorithms.

Language front ends.
The language front end is defined by (i) the terminals, (ii) the keywords listed in Table 1, and (iii) the grammar based on the production rules of Listings 2, 3, 4 and 5.
Expressions are implemented with Proto expression tree structures where each node is an object of type proto::base_expr identified by a Tag and where the leaves of the tree are occupied by Terminals (cf. Listing 2), meshes (cf. Listing 5), and test and trial functions (cf. Listing 3).
The bilinear form $a_h^{sushi}$ defined by (7) has the programming counterpart given in Listing 1, and the corresponding expression tree is detailed in Fig. 1.

Tag structures and meta functions.
The implementation of a Proto expression tree is based on Tag structures and on associated meta-functions that make it possible to create nodes and to implement Grammar or Transform structures.
The boost::proto framework already provides standard Tags for the standard unary and binary C++ operators (cf. Table 2) and meta-functions (like proto::result_of::tag_of<.>, proto::result_of::child_c<.,.> or proto::result_of::value<.>) to easily navigate in expression trees.
We have completed them with tags representing: (i) the different types of the DSEL terminals (the leaves of the tree); (ii) the DSEL keywords corresponding to the nodes of the tree.

Listing 6: Tags definition
    namespace fvdsel {
    namespace tag {
      //! DSEL terminal tags
      struct basetype {};
      struct meshvartype {};
      struct testfunctiontype {};
      struct trialfunctiontype {};
      struct meshzonetype {};
      struct nulltype {};

      //! DSEL keyword tags
      struct dot {};
      struct grad {};
      struct jump {};
      struct avg {};
      struct integrate {};
    }
    }

FVDSL domain definition.
We have defined the domain FVDSLDomain (Listing 7), where all expressions are encapsulated in an FVDSLExpr that conforms to the grammar FVDSLGrammar detailed in §3.3.2. This mechanism then enables the framework to overload most C++ operators.

Listing 7: FVDSL expression domain definition
    template <typename Expr> struct FVDSLExpr;

    struct FVDSLGrammar
      : proto::or_< proto::terminal<boost::proto::_>,
                    proto::nary_expr<boost::proto::_,
                                     proto::vararg<FVDSLGrammar>
                                    >
                  > {};

    // Expressions in the pde domain will be wrapped in FVDSLExpr<>
    // and must conform to the FVDSLGrammar
    struct FVDSLDomain
      : proto::domain<proto::generator<FVDSLExpr>,
                      FVDSLGrammar> {};

    template <typename Expr>
    struct FVDSLExpr
      : proto::extends<Expr, FVDSLExpr<Expr>, FVDSLDomain>
    {
      explicit FVDSLExpr(Expr const& expr)
        : proto::extends<Expr,
                         FVDSLExpr<Expr>,
                         FVDSLDomain>(expr)
      {}
      BOOST_PROTO_EXTENDS_USING_ASSIGN(FVDSLExpr)
    };

DSEL keywords.
The DSEL keywords listed in Table 1 are associated with specific tags. For each tag, we have implemented a free function that creates an associated tree node, a meta-function that generates the type of that node, and a grammar element that matches expressions and dispatches to the proto::pass_through<> transform, as a PrimitiveTransform (cf. [16], 3.3.3). For instance, Listing 8 illustrates the definition of the unary free function grad(.), which creates nodes associated with fvdsel::tag::grad, and the definition of fvdsel::gradop<ExprT>, the meta-function that matches grad expressions or dispatches transforms. Listing 9 illustrates the definition of the binary free function dot(.,.), which creates nodes associated with the tag fvdsel::tag::dot, and the definition of fvdsel::dotop<LExprT,RExprT>, the meta-function that matches inner product expressions or dispatches transforms.

Listing 8: Free function and meta-function associated to fvdsel::tag::grad
    //! grad free function
    template <typename A>
    typename proto::result_of::make_expr< fvdsel::tag::grad,
                                          FVDSLDomain,
                                          A const &
                                        >::type
    grad(A const& a)
    {
      return proto::make_expr<fvdsel::tag::grad,
                              FVDSLDomain>(boost::ref(a));
    }

    //! grad metafunction
    template <typename T>
    struct gradop : proto::transform< gradop<T> >
    {
      // types
      typedef proto::expr< fvdsel::tag::grad,
                           proto::list1<T>
                         > type;
      typedef proto::basic_expr< fvdsel::tag::grad,
                                 proto::list1<T>
                               > proto_grammar;

      // member classes/structs/unions
      template <typename Expr,
                typename State,
                typename Data>
      struct impl
        : proto::pass_through<gradop>::template impl<Expr,
                                                     State,
                                                     Data>
      {};
    };

Listing 9: Free function and meta-function associated to fvdsel::tag::dot
    template <typename L, typename R>
    typename
    proto::result_of::make_expr<
        fvdsel::tag::dot
      , FVDSLDomain
      , L const &
      , R const &
    >::type
    dot(L const& l, R const& r)
    {
      return proto::make_expr< fvdsel::tag::dot,
                               FVDSLDomain >(boost::ref(l),
                                             boost::ref(r));
    }

    template <typename LeftT, typename RightT>
    struct dotop : proto::transform< dotop<LeftT, RightT> >
    {
      // types
      typedef proto::expr< fvdsel::tag::dot,
                           proto::list2<LeftT,
                                        RightT>
                         > type;
      typedef proto::basic_expr< fvdsel::tag::dot,
                                 proto::list2<LeftT,
                                              RightT>
                               > proto_grammar;

      // member classes/structs/unions
      // (a Proto transform's impl takes the Expr, State, Data triple,
      //  as in Listing 8)
      template <typename Expr,
                typename State,
                typename Data>
      struct impl
        : proto::pass_through<dotop>::template impl<Expr,
                                                    State,
                                                    Data>
      {};
    };

Table 3 lists the main keywords with their associated tags, free functions and meta-functions.

Grammar definition.
The Grammar of our language is based on the production rules detailed in §3.3.1. Proto provides a set of tools that make it possible to implement each production rule in a user-friendly declarative way. Terminal structures are detected with the meta-function defined in Listing 10. Each production rule is implemented by a grammar structure composed of other grammar structures, Proto predefined transforms (cf. Table 2) or some of our specific transforms (cf. Table 3).

Listing 10: Terminal meta-function
    template <typename T> struct is_base_type;
    template <typename T> struct is_mesh_var;
    template <typename T> struct is_mesh_group;
    template <typename T> struct is_function;
    template <typename T> struct is_test_function;
    template <typename T> struct is_trial_function;

    template <typename T>
    struct IsFVDSLTerminal
      : mpl::or_<
          fvdsel::is_function<T>,
          fvdsel::is_base_type<T>,
          fvdsel::is_mesh_var<T>,
          fvdsel::is_mesh_group<T>
        >
    {};

In Listing 11 we can compare the implementation of the DSEL grammar, with the BaseTypeGrammar, MeshVarTypeGrammar, TestFunctionTerminal, TrialFunctionTerminal, CoefExprGrammar and BilinearGrammar structures, to the EBNF definition of the production rules of Listings 2, 3, 4 and 5 specifying bilinear expressions.

Listing 11: Bilinear expression grammar
    namespace fvdsel {

      struct BaseTypeGrammar
        : proto::terminal< proto::convertible_to<Real> >
      {};

      struct MeshVarTypeGrammar
        : proto::and_< proto::terminal<proto::_>,
                       proto::if_< fvdsel::is_mesh_var<proto::_value>() > >
      {};

      struct TestFunctionTerminal
        : proto::and_< FunctionTerminal,
                       proto::if_< fvdsel::is_test_function<proto::_value>() > >
      {};

      struct TrialFunctionTerminal
        : proto::and_< FunctionTerminal,
                       proto::if_< fvdsel::is_trial_function<proto::_value>() > >
      {};

      struct CoefExprGrammar;

      struct CoefExprGrammar
        : proto::or_<
            BaseTypeGrammar,
            MeshVarTypeGrammar,
            proto::plus<CoefExprGrammar,
                        CoefExprGrammar>,
            proto::multiplies<CoefExprGrammar,
                              CoefExprGrammar>,
            proto::divides<CoefExprGrammar,
                           CoefExprGrammar>
          >
      {};

      struct TrialExprGrammar
        : proto::or_< TrialFunctionTerminal,
                      proto::multiplies<CoefExprGrammar,
                                        TrialExprGrammar>,
                      fvdsel::jumpop<TrialExprGrammar>,
                      fvdsel::avgop<TrialExprGrammar>,
                      fvdsel::gradop<TrialExprGrammar>,
                      fvdsel::traceop<TrialExprGrammar>
                    >
      {};

      struct TestExprGrammar
        : proto::or_< TestFunctionTerminal,
                      proto::multiplies<CoefExprGrammar,
                                        TestExprGrammar>,
                      fvdsel::jumpop<TestExprGrammar>,
                      fvdsel::avgop<TestExprGrammar>,
                      fvdsel::gradop<TestExprGrammar>,
                      fvdsel::traceop<TestExprGrammar>
                    >
      {};

      struct BilinearGrammar;

      struct PlusBilinear
        : proto::plus< BilinearGrammar, BilinearGrammar >
      {};

      struct MinusBilinear
        : proto::minus< BilinearGrammar, BilinearGrammar >
      {};

      struct MultBilinear
        : proto::multiplies< CoefExprGrammar, BilinearGrammar >
      {};

      struct BilinearGrammar
        : proto::or_<
            proto::multiplies<TrialExprGrammar,
                              TestExprGrammar>,
            fvdsel::dotop<TrialExprGrammar,
                          TestExprGrammar>,
            PlusBilinear,
            MinusBilinear,
            MultBilinear
          >
      {};
    }

3.3.3 Evaluation contexts and transforms

Language back ends: expression evaluation, algorithm implementation.
The DSEL back ends are composed of algebraic structures (matrices, vectors, linear combinations) used in different kinds of algorithms based on iterations over mesh entities and on the evaluation or assembly of matrices and vectors. The implementation of these algorithms is based on the evaluation and manipulation of FVDSLDomain expressions. Such evaluations are based on two kinds of Proto concepts: Context and Transform structures.

• A Context is like a function object that is passed along

linear context with a factor equal to the measure of the cell.

To implement the integration algorithm associated with a linear variational formulation, we have used both Context and Transform structures. A BilinearContext object, referencing a linear system back end object used to build the global linear system with different linear algebra packages, has been developed to evaluate the global expression. On an Integrate node, this object calls an IntegratorOp transform on the expression tree. In Listing 12, we detail the implementation of this transform, which matches in our example the expression with the tag fvdsel::tag::integrate, the MeshGroup expression allCells(Th) and the term dot(K*grad(u),grad(v)).

Listing 12: Integrator transform
    struct Integrator : proto::callable
    {
      // ... callable object that will use a BilinearIntegrator transform on
      // a bilinear expression
      typedef int result_type;

with an expression to the proto::eval() function. It t e m p l a t e <typename ZoneT ,


typename ExprT ,
associates behaviors with node types. proto::eval() typename S t a t e T ,
typename DataT>
walks the expression and invokes your context at each int
node. o p e r a t o r ( ) ( ExprT c o n s t& e x p r ,
ZoneT c o n s t& z o n e ,
S t a t e T& s t a t e ,
• A Transform is a way to associate behaviors, not with {
DataT c o n s t& d a t a ) c o n s t

node types in an expression, but with rules in a Proto / / call a transform that analyze ExprT
/ / and dispatch to the appropriate transform
grammar. In this way, they are like semantic actions return 0 ;
}
in other compiler-construction toolkits. } ;

st ruc t IntegratorOp
Algorithms are implemented as specific expression tree eval- : proto : : or <
uation, as a sequence of piece of algorithms associated to p r o t o : : when<
f v d s e l : : IntegratorGrammar ,
the behaviour of Evaluation context on each node or on f v d s e l : : I n t e g r a t o r ( p r o t o : : c h i l d c <2>,
p r o t o : : c h i l d c <1>,
Transforms that match production rules. Theses pieces of proto : : s t a t e ,
proto : : data
algorithm are written respectively in the operator()() of )
the structure Context::eval for Context objects, in the p r o t o : : when<
>

operator()() of callable transforms objects for . Trans- p r o t o : : plus<I n t e g r a t o r O p , I n t e g r a t o r O p >,


IntegratorOp ( proto : : l e f t ,
forms. IntegratorOp ( proto : : r i g h t ,
proto : : s t a t e ,
proto : : data
For instance, in the expression defined in listing 1, proto : : data )
),

allCells(Th), K, u, v are terminals of the language. integrate, >


>
dot and grad are specific keywords of the language associ- {};

ated to the tags fvdsel::tag::integrate, fvdsel::tag::dot


and fvdsel::tag::grad. The binary operator * is associ- In the callable transform Integrator, analyzing the inte-
ated to the tag proto::tag::mult grate expression term, when a bilinear expression is matched,
At evaluation, the expression is analyzed as follows : another transform BilinearIntegrator (listing 13) match-
ing a DotExpr associated to fvdsel::tag::dot and the pro-
1. The root node of the tree, associated to the tag duction rules matching the test and trial part of the bilin-
tag::integrate is composed of an MeshGroup expres- ear expressions. The algorithm (listing 14) is called by the
sion (allCells(Th)) and the BilinearTerm expression callable transform DotIntegrator. Note that the
(dot(K*grad(u),grad(v))); BilinearContext is passed along the expression tree with
the proto::_data structure.
2. The integration algorithm consists in iterating on the
elements of the allCells(Th) collection and in evalu-
ating the bilinear expression on each cell. This bilinear Listing 13: BilinearIntegrator transform
expression is composed of: (i) a TrialExpr expression s t r u c t MultIntegrator : proto : : c a l l a b l e
: K*grad(u); (ii) a TestExpr expression : grad(v) {
typedef int r e s u l t t y p e ;
(iii) a binary operator associated to the tag : tag::dot t e m p l a t e <typename T r i a l E x p r T ,
typename TestExprT ,
The evaluation of the TrialExpr expression and of the typename S t a t e T ,
typename DataT>
TestExpr expression on a cell return two linear combi- int
nation objects which, associated to the binary opera- o p e r a t o r ( ) ( T r i a l E x p r T c o n s t& l e x p r ,
TestExprT c o n s t& r e x p r ,
tor tag lead to a bilinear contribution which is a local S t a t e T& s t a t e ,
DataT c o n s t& d a t a ) c o n s t
matrix contributing to the global linear system of the {
      // call integrate algorithm
      // with tag proto::tag::multiplies
      integrate<proto::tag::multiplies>(getMesh(data),
                                        getGroup(data),
                                        lexpr,
                                        rexpr,
                                        GetContext(data));
      return 0;
    }
  };

  struct DotIntegrator : proto::callable
  {
    typedef int result_type;

    template <typename TrialExprT,
              typename TestExprT,
              typename StateT,
              typename DataT>
    int
    operator()(TrialExprT const& lexpr,
               TestExprT const& rexpr,
               StateT& state,
               DataT const& data) const
    {
      // call integrate algorithm
      // with tag fvdsel::tag::dot
      integrate<fvdsel::tag::dot>(getMesh(data),
                                  getGroup(data),
                                  lexpr,
                                  rexpr,
                                  GetContext(data));
      return 0;
    }
  };

  struct BilinearIntegrator
    : proto::or_<
        proto::when< proto::multiplies<TrialExprGrammar, TestExprGrammar>,
                     MultIntegrator(proto::_left,   //! lexpr
                                    proto::_right,  //! rexpr
                                    proto::_state,  //! state
                                    proto::_data    //! context
                                   )
                   >,
        proto::when< fvdsel::dotop<TrialExprGrammar, TestExprGrammar>,
                     DotIntegrator(proto::_child_c<0>,  //! trial (left)
                                   proto::_child_c<1>,  //! test (right)
                                   proto::_state,       //! state
                                   proto::_data         //! context
                                  )
                   >,
        proto::when< proto::plus<BilinearGrammar, BilinearGrammar>,
                     BilinearIntegrator(proto::_right,
                                        BilinearIntegrator(proto::_left,
                                                           proto::_state,
                                                           proto::_data),
                                        proto::_data)
                   >
      >
  {};

Listing 14 is a simple assembly algorithm. We iterate on each entity of the mesh group and evaluate the test and trial expressions on each entity. For such evaluations, we have defined different kinds of context objects. The structure EvalContext<ItemT> computes the linear combination objects that result from the evaluation of the test or trial expression. Combined according to the binary operator tag, these lead to a bilinear contribution: a local matrix contributing to the global linear system of the linear context with a factor equal to the measure of the cell. Note that BilinearContextT is parametrized by a phase_type parameter that makes it possible to optimize and factorize the construction of the global linear system: intermediate computations can be stored in a cache system and reused. For instance, when a global linear system is built, the phase that sets the global system dimensions, the phase that defines the sparse matrix structure and the matrix filling phase can be separated. The first two phases can easily be factorized over several filling phases in iterative algorithms.

Listing 14: Integration assembly algorithm

  // tag_op comes first so that it can be given explicitly by the caller
  template <typename tag_op,
            typename ItemT,
            typename TrialExprT,
            typename TestExprT,
            typename BilinearContextT>
  void integrate(Mesh const& mesh,
                 GroupT<ItemT> const& group,
                 TrialExprT const& trial,
                 TestExprT const& test,
                 BilinearContextT& ctx)
  {
    static const Context::ePhaseType phase =
      BilinearContextT::phase_type;
    auto matrix = ctx.getMatrix();
    for (auto cell : group)
    {
      EvalContext<ItemT> eval_ctx(cell);         //! eval context on mesh item
      auto lu = proto::eval(trial, eval_ctx);    //! trial linear combination
      auto lv = proto::eval(test, eval_ctx);     //! test linear combination
      BilinearContribution<tag_op> uv(lu, lv);
      assemble<phase>(matrix,                    //! matrix
                      measure(mesh, cell),       //! cell measure
                      uv);                       //! bilinear contribution
    }
  }

In the same way, the evaluation of a linear form expression with a linear context leads to the construction of the right-hand side of a global linear system.

Once the global linear system is built, it can be solved with a linear system solver provided by the linear algebra layer.

4. APPLICATIONS
Our benchmark is based on the following exact solution for the diffusion problem (2):

$$u(x, y, z) = \sin(\pi x)\,\sin(\pi y)\,\sin(\pi z), \qquad \kappa = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix},$$

on the cubic domain $\Omega = [0, 1]^3$ with $f(x, y, z) = 3\pi^2 \sin(\pi x)\,\sin(\pi y)\,\sin(\pi z)$.

We compare the following methods: (i) the DSEL implementation of the ccG method (5) provided in Listing 15; (ii) the DSEL implementation of the SUSHI method with face unknowns (6) provided in Listing 16; (iii) the DSEL implementation of the G method (17) provided in Listing 17.

Listing 15: ccG method implementation

  MeshType Th;
  Real K;
  auto Vh = newCCGSpace(Th);
  auto u = Vh->trial("U");
  auto v = Vh->test("V");
  auto lambda = eta * val(gamma) / val(H(Th));
  BilinearForm a =
    integrate(allCells(Th), dot(K*grad(u), grad(v))) +
    integrate(allFaces(Th), jump(u)*dot(N(Th), avg(grad(v))) -
                            dot(N(Th), avg(K*grad(u)))*jump(v) +
                            lambda*jump(u)*jump(v));
  LinearForm b =
    integrate(allCells(Th), f*v);

Listing 16: SUSHI method implementation

  MeshType Th;
  Real K;
  auto Vh = newSUSHISpace(Th);
  auto u = Vh->trial("U");
  auto v = Vh->test("V");
  BilinearForm a =
    integrate(allCells(Th), dot(K*grad(u), grad(v)));
  LinearForm b =
    integrate(allCells(Th), f*v);

Listing 17: G method implementation

  MeshType Th;
  Real K;
  auto Uh = newGSpace(Th);
  auto Vh = newP0Space(Th);
  auto u = Uh->trial("U");
  auto v = Vh->test("V");
  BilinearForm a =
    integrate(allFaces(Th), dot(N(Th), avg(K*grad(u)))*jump(v));
  LinearForm b =
    integrate(allCells(Th), f*v);
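As a sanity check on the benchmark data of this section, the source term can be verified numerically against the exact solution. The following is a minimal sketch, not part of the DSEL: it assumes that problem (2) reads $-\nabla\cdot(\kappa\nabla u) = f$ with $\kappa$ the identity, in which case $-\Delta u = 3\pi^2 \sin(\pi x)\sin(\pi y)\sin(\pi z)$. The function names exact_u, source_f and minus_laplacian_fd are hypothetical helpers introduced only for this illustration.

```cpp
#include <cmath>

// Exact solution of the benchmark: u(x,y,z) = sin(pi x) sin(pi y) sin(pi z).
double exact_u(double x, double y, double z)
{
    const double pi = std::acos(-1.0);
    return std::sin(pi * x) * std::sin(pi * y) * std::sin(pi * z);
}

// Source term, ASSUMING the diffusion problem is -div(kappa grad u) = f
// with kappa equal to the identity, so that f = -Laplace(u) = 3 pi^2 u.
double source_f(double x, double y, double z)
{
    const double pi = std::acos(-1.0);
    return 3.0 * pi * pi * exact_u(x, y, z);
}

// Second-order central finite-difference approximation of -Laplace(u)
// at the point (x,y,z) with step h (7-point stencil).
double minus_laplacian_fd(double x, double y, double z, double h)
{
    const double center = exact_u(x, y, z);
    const double neighbours =
        exact_u(x + h, y, z) + exact_u(x - h, y, z) +
        exact_u(x, y + h, z) + exact_u(x, y - h, z) +
        exact_u(x, y, z + h) + exact_u(x, y, z - h);
    return (6.0 * center - neighbours) / (h * h);
}
```

Comparing minus_laplacian_fd(x, y, z, h) with source_f(x, y, z) at a few interior points, for a small step such as h = 1e-3, should give agreement up to the O(h^2) truncation error of the stencil. exact_u also vanishes on the boundary of the unit cube, up to rounding.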
The code is compiled with the gcc 4.5 compiler with the following compile options:

  -O3 -fno-builtin
  -mfpmath=sse -msse -msse2 -msse3
  -mssse3 -msse4.1 -msse4.2
  -fno-check-new -g -Wall -std=c++0x
  --param max-inline-recursive-depth=32
  --param max-inline-insns-single=2000

The benchmark test cases are run on a workstation with a quad-core Intel Xeon W3530 processor (GenuineIntel), 2.80 GHz, with an 8 MB cache.
In our numerical tests we consider a family of h-refined meshes with h decreasing from 0.1 to 0.0125.
The linear systems are solved using the PETSc library [6] with the BiCGStab solver preconditioned by the Euclid ILU(0) preconditioner, with the relative tolerance set to $10^{-6}$.
The benchmarks monitor various metrics:

(i) Accuracy. The accuracy of the methods is evaluated in terms of the L2 norm of the error. For the methods of §2, the L2 norm of the error is evaluated using the cell center as a quadrature node, i.e.,

$$\|u - u_h\|_{L^2(\Omega)} \approx \Big( \sum_{T \in \mathcal{T}_h} |T| \, \big(u(x_T) - u_T\big)^2 \Big)^{1/2}.$$

The convergence order of a method is classically expressed by relating the error to the mesh size h.

(ii) Memory consumption. When comparing methods featuring different numbers of unknowns and different stencils, a fairer comparison in terms of system size and memory consumption is obtained by relating the error to the number of DOFs (NDOF) and to the number of nonzero entries of the corresponding linear system (Nnz).

(iii) Performance. The last set of parameters is meant to evaluate the CPU cost of each method and implementation. To provide a detailed picture of the different stages and to estimate the overhead associated to the DSEL, we separately evaluate

  • tinit, the time to build the discrete space;
  • tass, the time to fill the linear systems (local/global assembly). When DSEL-based implementations are considered, this stage carries the additional cost of evaluating the expression tree for bilinear and linear forms;
  • tsolve, the time to solve the linear system.

The accuracy and memory consumption analysis is provided in Figure 2. We can check the expected linear convergence behaviour of the methods. A superlinear convergence effect can be observed for the G method, due to the regularity of our meshes.
The CPU cost analysis is provided in Figure 3. The cost of each stage of the computation is related to the number of DOFs to check that the expected complexity is achieved. This is the case for all the methods considered.
A comparison in terms of absolute computation time is provided in Figure 4 on the 2D version of the test case with h = 0.00625. The overhead of the DSEL is estimated by comparing the timing results of the ccG method in its DSEL version with those of the fvC++ implementation, a hand-written STL-based implementation of the back end discussed in §3.1. The difference in tass is due to the fact that, in the hand-written implementation, computation stages used several times in the assembly phase are naturally pre-computed and stored, while in our first DSEL implementation cache mechanisms for such computations are not yet available.

Table 1: DSEL keywords

  keyword          notation   meaning
  integrate(.,.)   ∫(.)       integration of expression
  dot(.,.)         (. · .)    vector inner product
  jump(.)          ⟦.⟧        jump across a face
  avg(.)           {.}        average across a face

Table 2: Proto standard tags and meta-functions

  operator   arity   tag                      meta-function
  +          2       proto::tag::plus         proto::plus<.,.>
  -          2       proto::tag::minus        proto::minus<.,.>
  *          2       proto::tag::multiplies   proto::multiplies<.,.>
  /          2       proto::tag::divides      proto::divides<.,.>

Table 3: DSEL keywords

  operator         arity   tag                      meta-function
  integrate(.,.)   2       fvdsel::tag::integrate   integrateop<.,.>
  grad(.)          1       fvdsel::tag::grad        gradop<.>
  jump(.)          1       fvdsel::tag::jump        jumpop<.>
  avg(.)           1       fvdsel::tag::avg         avgop<.>
  dot(.,.)         2       fvdsel::tag::dot         dotop<.,.>

Figure 2: Accuracy and memory consumption analysis for the example of Sect. 4. Panels: (a) L2-error vs. h; (b) L2-error vs. NDOF; (c) L2-error vs. Nnz. Curves: G, SUSHI, ccG.

Figure 3: Performance analysis for the example of Sect. 4. Panels: (a) G: CPU cost vs. h; (b) SUSHI: CPU cost vs. h; (c) ccG: CPU cost vs. h. Curves: tinit, tass, tsolve.

Figure 4: Comparison of different methods and implementation for the 2D test case of §4 (h = 0.00625). Bars: ccG-fvCpp, ccG-DSEL, SUSHI. Stages: tinit, tass, tsolve.

5. CONCLUSION AND PERSPECTIVES
Our DSEL for lowest-order methods makes it possible to describe and solve various non-trivial academic problems. Different numerical methods were implemented with a high-level language close to the one used in the unified mathematical framework. The analysis of the performance results of our study cases shows that the overhead of the language is not significant compared with standard hand-written codes.

In future work, we plan to extend our DSEL to take into account: (i) various types of boundary conditions; (ii) non-linear formulations, hiding the complexity of the computation of derivatives.

Within the HAMM [2] project (Hybrid architecture and multi-level model), we plan to handle multi-level methods and to illustrate the interest of our approach for seamlessly taking advantage of the performance of new hybrid hardware architectures with GP-GPU.

6. REFERENCES
[1] ISO/IEC 14977, 1996(E).
[2] HAMM Web page, 2010. http://www.agence-nationale-recherche.fr/en/anr-funded-project/?tx_lwmsuivibilan_pi2[CODE]=ANR-10-COSINUS-009.
[3] D. Abrahams and L. Gurtovoy. C++ template metaprogramming: Concepts, tools, and techniques from Boost and beyond. C++ in Depth Series. Addison-Wesley Professional, 2004.
[4] L. Agélas, D. A. Di Pietro, and J. Droniou. The G method for heterogeneous anisotropic diffusion on general meshes. M2AN Math. Model. Numer. Anal., 44:597–625, 2010.
[5] P. Aubert and N. Di Césaré. Expression templates and forward mode automatic differentiation. Computer and Information Science, chapter 37, pages 311–315. Springer, New York, NY, 2001.
[6] S. Balay, J. Brown, K. Buschelman, W. D. Gropp, D. Kaushik, M. G. Knepley, L. Curfman McInnes, B. F. Smith, and H. Zhang. PETSc Web page, 2011. http://www.mcs.anl.gov/petsc.
[7] F. Brezzi, K. Lipnikov, and M. Shashkov. Convergence of mimetic finite difference methods for diffusion problems on polyhedral meshes. SIAM J. Numer. Anal., 45:1872–1896, 2005.
[8] F. Brezzi, K. Lipnikov, and V. Simoncini. A family of mimetic finite difference methods on polygonal and polyhedral meshes. M3AS, 15:1533–1553, 2005.
[9] D. A. Di Pietro. Cell-centered Galerkin methods. C. R. Math. Acad. Sci. Paris, 348:31–34, 2010.
[10] D. A. Di Pietro. A compact cell-centered Galerkin method with subgrid stabilization. C. R. Acad. Sci. Paris, Ser. I., 348(1–2):93–98, 2011.
[11] D. A. Di Pietro. Cell centered Galerkin methods for diffusive problems. M2AN Math. Model. Numer. Anal., 46(1):111–144, 2012.
[12] D. A. Di Pietro and J.-M. Gratien. Lowest order methods for diffusive problems on general meshes: A unified approach to definition and implementation. In J. Fořt, J. Fürst, J. Halama, R. Herbin, and F. Hubert, editors, Finite Volumes for Complex Applications VI, pages 3–19. Springer–Verlag, 2011.
[13] J. Droniou, R. Eymard, T. Gallouët, and R. Herbin. A unified approach to mimetic finite difference, hybrid finite volume and mixed finite volume methods. M3AS, 20(2):265–295, 2010.
[14] R. Eymard, T. Gallouët, and R. Herbin. Discretization of heterogeneous and anisotropic diffusion problems on general nonconforming meshes. SUSHI: a scheme using stabilization and hybrid interfaces. IMA J. Numer. Anal., 30:1009–1043, 2010.
[15] G. Grospellier and B. Lelandais. The Arcane development framework. In Proceedings of the 8th workshop on Parallel/High-Performance Object-Oriented Scientific Computing, POOSC '09, pages 4:1–4:11, New York, NY, USA, 2009. ACM.
[16] E. Niebler. boost::proto documentation, 2011. http://www.boost.org/doc/libs/1_47_0/doc/html/proto.html.
[17] T. Veldhuizen. Using C++ template metaprograms. C++ Report, 7(4):36–43, May 1995. Reprinted in C++ Gems, ed. Stanley Lippman, 1995.
