
Chapter 4

Principal Components Analysis


for turbulence-chemistry
interaction modeling

As discussed in Chapter 2, turbulent combustion modeling is a very broad
subject and includes a wide range of coupled physics. Our primary interest is
focused on combustion models, adopted in the framework of RANS and LES
approaches to close the reacting scalar source terms. Such models must provide
an adequate coupling of turbulent mixing and chemical reactions for the
unresolved scales. Moreover, an accurate kinetic description is needed in situations
where finite-rate chemistry effects may be relevant, as in the flameless combustion
regime [78, 79, 45]. However, detailed combustion mechanisms for fuels as
simple as methane involve 53 species and 325 reactions [63], and the number of
species and reactions dramatically increases with the molecular weight of the
hydrocarbon fuel [83]. Therefore, the solution of the species transport equations
for a turbulent reacting system can become very computationally intensive, if a
reaction rate approach is adopted and no simplification is made.
The reduction of the number of species equations to be solved can be ac-
complished in two ways:

• Reduction of the kinetic mechanism [84, 85, 86]. This approach is based
on the analysis of the dominant reaction rates at the conditions of inter-
est and proceeds through the elimination of species and reactions in the
original kinetic mechanism, ultimately leading to a reduced set of species
equations to be solved.

• State space parametrization. This approach relies on the assumption


that the thermodynamic state of a reacting system relaxes onto a low-
dimensional, strongly attracting, manifold in chemical state space [87,
88]. The thermochemical state of a single-phase reacting fluid having Ns
species is uniquely determined by Ns + 1 parameters (T , p, and Ns − 1


species mass fractions). Yet, if a set of “optimal” variables is identified, the


whole thermochemical state can be re-parametrized with a lower number
of variables, which nevertheless must provide a satisfactory approxima-
tion of the system in a lower dimensional space [87]. In the context of
turbulent non-premixed flames, a widely used approach is represented by
the flamelet model (Chapter 2). The model adopts two parameters for the
parametrization of the thermochemical state: the mixture fraction, f, and
the scalar dissipation rate, χ. The choice of f is particularly appealing,
as this parameter is a conserved scalar if all species diffusivities are
equal [88], and it does not require any source term closure.

The present Chapter focuses on the second of these approaches, the parametriza-
tion of the thermochemical state by a small number of parameters based on the
existence of a low-dimensional manifold. Most of the existing models exploiting
the existence of such manifolds are based on the a priori prescription of the
manifold dimensionality. For example, the flamelet model [76, 89] assumes that
the system can be satisfactorily described by means of two parameters.
However, such an approach restricts the subspace that thermochemistry may access,
without providing any quantitative error analysis. Indeed, as mixing and reac-
tion timescales increasingly overlap, the dimensionality of a manifold increases,
as does the error associated with a parametrization of fixed dimensionality [88].
Such consideration has prompted our interest towards the development of a
methodology to automate the selection of the optimal basis for the representa-
tion of the manifolds in thermochemical space. Principal Components Analysis
(PCA) [90, 91] offers this potential, as it provides a rigorous mathematical
formalism for reducing the dimensionality of a data set consisting of a large
number of correlated variables, while retaining most of the variation present in
the original data. The reduction is achieved by transforming to a new set of
variables, called the principal components (PCs), which are uncorrelated and
ordered so that the first few account for most of the variation present in all the
original variables. PCA provides an optimal representation of the system based
on q optimal variables, the PCs, which are linear combinations of the Ns + 1
primitive variables T , p, and Yi . The linearity of the method is a particularly
appealing aspect since, once the reaction variables are identified, a few linear
combinations of the original variables could be transported in a numerical simu-
lation, if a proper closure for the source terms is employed. Nevertheless, since
the reaction variables provided by PCA are not conserved scalars, an a priori
analysis of the ability of the principal components to parametrize their source
terms must be carried out, as a mandatory requirement for the generated manifold
method. One of the main advantages of PCA lies in the potential to obtain
the principal components from a target system and to apply them to a similar
system. This potential could remove one of the main drawbacks that, to date,
affect PCA: the derivation of the manifold model via PCA requires
the availability of a data set for the extraction of the principal components.


In the following, a background on PCA is provided. In particular, the
definition and derivation of the PCs, the criteria used for selecting optimal
parameters, or variables, from large data sets, and the approaches to aid PC
interpretability and deal with highly non-linear systems will be discussed. Then,
various examples of the application of PCA to turbulent reacting systems will
be presented, to investigate the feasibility of PCA for the identification of
low-dimensional manifolds in thermochemical space. Finally, the possibility of
exploiting PCA as a predictive modeling approach will be investigated.
All the analyses carried out in this Chapter have been performed with a
MATLAB® code written for this purpose, available upon request. The results
have been published in [92, 93].

4.1 Definition and derivation of Principal Components
A primary goal when dealing with multivariate data is to reduce their dimen-
sionality to the smallest number of meaningful dimensions, in order to help
data exploration and any further processing. Principal Components Analysis
(PCA) can be successfully exploited for this purpose. PCA was first introduced
by Pearson in the early 1900’s [94]. A formal treatment of the method is due
to Hotelling [95] and Rao [96].
Suppose that X is a vector of p random variables, i.e. X = (x1 , x2 , . . . , xp ),
with mean µ and covariance matrix Σ. The (i, j)th element of Σ represents the
covariance between the ith and jth variables of X, if i ≠ j, or the variance of the
jth element of X, if i = j. PCA is concerned with finding a few (≪ p) derived
variables, called Principal Components (PCs), which nevertheless preserve most
of the information present in the original data. The PCs are linear combinations
of the original variables; moreover, they are uncorrelated (i.e. orthogonal) and
derived so that the variance on the jth component is maximal.
The first PC of X is defined as the linear combination:

z1 = Xa1 . (4.1)
To determine z1, a vector a1 is sought so that var(z1) = a1'Σa1 is maximized,
subject to the constraint a1'a1 = 1. If we adopt the standard approach of
Lagrange multipliers to solve this constrained problem, we need to maximize:

a1'Σa1 − λ(a1'a1 − 1)    (4.2)
where λ is a Lagrange multiplier. Differentiating with respect to a1 gives:

Σa1 − λa1 = (Σ − λIp ) a1 = 0 (4.3)


where Ip is the (p x p) identity matrix. Thus, λ is an eigenvalue of Σ and a1 is
the corresponding eigenvector. It is easy to prove that the eigenvector a1 which


maximizes the variance of z1 is the one corresponding to the largest eigenvalue


of Σ, being:

a1'Σa1 = a1'(λa1) = λ a1'a1 = λ = λ1.    (4.4)
The second PC, z2 = Xa2, maximizes the variance var(z2) = a2'Σa2
subject to the constraints cov(Xa1, Xa2) = 0 (z1 and z2 uncorrelated) and
a2'a2 = 1. Being

cov(Xa1, Xa2) = a1'Σa2 = a2'Σa1 = λ1 a1'a2 = λ1 a2'a1,    (4.5)

any of the equations

a1'Σa2 = 0,    a2'Σa1 = 0,    a1'a2 = 0,    a2'a1 = 0    (4.6)

could be used to specify no correlation between z1 and z2 . Choosing arbitrarily


the last expression in Eq. (4.6), the quantity to be maximized becomes:

a2'Σa2 − λ(a2'a2 − 1) − φ a2'a1    (4.7)

where λ and φ are Lagrange multipliers. Differentiating with respect to a2 and
pre-multiplying by a1', we get:

a1'Σa2 − λ a1'a2 − φ a1'a1 = 0    (4.8)

which reduces to

φ = 0, (4.9)

being

a1'Σa2 = λ a1'a2    (4.10)

due to the constraint of z1 and z2 being uncorrelated. Then, Eq. (4.8) reduces
to:

Σa2 − λa2 = 0. (4.11)

Then, λ is once more an eigenvalue of Σ and a2 is the corresponding eigenvec-


tor. Again, a2'Σa2 = λ. Thus, assuming that Σ has all distinct eigenvalues,
λ is the second largest eigenvalue of Σ, λ2, and a2 is the corresponding
eigenvector.
In general, the kth PC of X is zk = Xak and var(zk) = ak'Σak = λk, where
λk is the kth largest eigenvalue of Σ and ak is the corresponding eigenvector.
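As an illustration, the eigenvalue problem above can be checked numerically. The following Python sketch (given here in place of the MATLAB code used for the analyses in this Chapter; the synthetic data and all names are purely illustrative) computes the PCs of a random sample and verifies that the variance of the kth PC equals the kth eigenvalue, as in Eq. (4.4):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic sample standing in for X: n observations of p correlated variables
n, p = 5000, 4
X = rng.standard_normal((n, p)) @ rng.standard_normal((p, p))

# Solve the eigenproblem (S - lambda I) a = 0 for the estimated covariance S
S = np.cov(X, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(S)       # returned in ascending order
order = np.argsort(eigvals)[::-1]          # sort so that lambda_1 >= lambda_2 >= ...
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# The k-th PC is z_k = X a_k and its variance equals lambda_k (Eq. 4.4)
Z = (X - X.mean(axis=0)) @ eigvecs
print(np.allclose(Z.var(axis=0, ddof=1), eigvals))   # True
print(np.allclose(eigvecs.T @ eigvecs, np.eye(p)))   # True: orthonormal basis
```

Note that the eigenvectors returned by a numerical routine are ordered by ascending eigenvalue and must be re-sorted to obtain the convention λ1 ≥ λ2 ≥ . . . used here.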


4.2 Sample PCA


In Section 4.1, the definition and derivation of PCs have been discussed for an
infinite population of measures. In practice, a random sample of n observations
of the p variables is available, so that Xi = (xi1 , xi2 , . . . , xip ) represents the ith
observation from the data set. Thus, the data available for PCA is a (n x p)
data matrix and an unbiased estimator of Σ, S,¹ is employed.
For a single observation of X, Xi , zi1 is given by:

zi1 = Xi a1 , i = 1, 2 . . . , n (4.12)
where the vector of coefficients a1 is chosen to maximize the variance

(1/(n − 1)) Σ_{i=1}^{n} (z_{i1} − z̄1)²,    (4.13)

subject to the constraint a1'a1 = 1. Then, for the second PC:

zi2 = Xi a2 , i = 1, 2 . . . , n (4.14)
where a2 is chosen to maximize the sample variance of z_{i2}, subject to the
constraints a2'a2 = 1 and cov(Xa1, Xa2) = 0.
Continuing the process in an obvious manner, zk = Xak is the kth sam-
ple PC (k = 1, 2, . . . , p) and zik is the score for the ith observation on the
kth sample PC. If the derivation of Section 4.1 is followed, but replacing popu-
lation quantities with sample variances and covariances, then it turns out that
the sample variance of the kth sample PC is lk , the kth largest eigenvalue of
the sample covariance matrix S, and that ak is the corresponding eigenvector
of S.
Let Z be the (n x p) matrix of PCs scores, with (i, k)th element equal to
zik ; then, Z and X are related by

Z = XA (4.15)

where A is the (p x p) orthogonal matrix whose columns are the eigenvectors of


S. Here we assume, for simplicity, that the variables have zero mean, otherwise
the mean of each variable is subtracted from the columns of X before PCA is
applied. Then, the sample covariance matrix, S, of X can be defined as:

S = (1/(n − 1)) X'X.    (4.16)
Recalling the eigenvector decomposition of a symmetric, non-singular matrix,
S can be decomposed as:
¹The matrix S represents the approximation of Σ for a finite sample, i.e. the random
sample consisting of n observations of p variables.


S = A L A'    (4.17)
where L is a (p x p) diagonal matrix containing the eigenvalues of S in de-
scending order, l1 > l2 > . . . > lp .
The linear transformation given by Eq. (4.15) simply recasts the original
variables into a set of new uncorrelated variables, whose coordinate axes are
described by A. Then, the original variables can be expressed as a function of the
PCs as:

X = Z A'    (4.18)

being A orthonormal and, hence, A⁻¹ = A'. This means that, given Z, the
values of the original variables can be uniquely recovered. However, the main
objective of PCA is to replace the p elements of X with a much smaller number,
q, of PCs, which nevertheless discard a small fraction of the variance originally
contained in the data. If a subset of size q ≪ p is used, the truncated PCs are
defined as:

Zq = XAq . (4.19)
Eq. (4.19) can be inverted to obtain:
Xq = Zq Aq'.    (4.20)
The linear transformation provided by Eq. (4.20) is particularly appealing
for size reduction in multivariate data analysis due to some optimal properties
of the PCA transformation, described in the following text.
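The size reduction of Eqs. (4.19)-(4.20) can be sketched in a few lines; the following Python example (synthetic data with an approximately two-dimensional structure, all names illustrative) builds the truncated scores and the rank-q reconstruction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic (n x p) data matrix with an approximately two-dimensional structure
n, p, q = 2000, 6, 2
latent = rng.standard_normal((n, q))
X = latent @ rng.standard_normal((q, p)) + 0.01 * rng.standard_normal((n, p))
X = X - X.mean(axis=0)                  # centre the variables (Section 4.2)

S = X.T @ X / (n - 1)                   # sample covariance matrix, Eq. (4.16)
l, A = np.linalg.eigh(S)
idx = np.argsort(l)[::-1]               # descending eigenvalues l1 > l2 > ...
l, A = l[idx], A[:, idx]

Aq = A[:, :q]                           # first q eigenvectors
Zq = X @ Aq                             # truncated scores, Eq. (4.19)
Xq = Zq @ Aq.T                          # rank-q reconstruction, Eq. (4.20)

# Almost all variance lies in the first q eigenvalues, so the error is small
err = np.linalg.norm(X - Xq) / np.linalg.norm(X)
print(err < 0.05)   # True
```

Since the data were generated from two latent directions plus weak noise, retaining q = 2 of the p = 6 components discards only the noise variance.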

4.2.1 Optimal properties of the PCA reduction


It can be shown [90] that PCA satisfies the following optimal properties:

• Property 1. For any integer q, 1 ≤ q ≤ p, consider the orthonormal
transformation Z = XB, where B is a (p x q) matrix, and let Sz = B'SB
be the variance-covariance matrix for Z. Then, the trace of Sz, tr (Sz),
is maximized by taking B = Aq, where Aq contains the first q columns
of A. Property 1 emphasizes that the PCs explain, successively, as much
as possible of the total univariate variance in the original data, being
tr (Sz) = tr (S) for q = p.

• Property 2. Consider the orthonormal transformation Z = XB. Then,
tr (Sz) is minimized by taking B = A∗q, where A∗q consists of the last q
columns of A. The statistical implication of Property 2 is that the last few
PCs are not simply unstructured leftovers after removing the important
PCs. Since the variances of the last PCs are small, they can help to detect
unsuspected near-constant linear dependencies among the elements of X,


to select a subset of variables from X in regression analysis or to detect


outliers from a data set.

• Property 3. The sample covariance matrix, S, can be expressed as (spectral
decomposition):

S = l1 a1 a1' + l2 a2 a2' + . . . + lp ap ap'.    (4.21)

This result shows that we can decompose the whole covariance matrix
into decreasing contributions due to each PC.

• Property 4. Consider the orthogonal transformation Z = XB. Then,


the determinant of Sz , det (Sz ), is maximized by taking B = Aq . The
statistical importance of this property follows because the determinant of
a covariance matrix, called generalized variance, can be used as a simple
measure of spread for a multivariate random variable.

• Property 5. Each element of X can be predicted by a linear function of Z,
Z = XB. If σj² is the residual variance in predicting xj from Z, then
Σ_{j=1}^{p} σj² is minimized by taking B = Aq. The statistical implication
of Property 5 is that Eq. (4.20) is the best linear predictor of X in a
q-dimensional subspace, in terms of squared prediction error.
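Properties 1 and 3 are easy to verify numerically; a short Python check on synthetic data (purely illustrative, not part of the original analysis):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.standard_normal((1000, 5)) @ rng.standard_normal((5, 5))
X = X - X.mean(axis=0)

S = X.T @ X / (X.shape[0] - 1)
l, A = np.linalg.eigh(S)
idx = np.argsort(l)[::-1]
l, A = l[idx], A[:, idx]

# Property 1 (for q = p): the PCs redistribute, but preserve, the total variance
Sz = A.T @ S @ A
print(np.allclose(np.trace(Sz), np.trace(S)))   # True

# Property 3: spectral decomposition S = l1 a1 a1' + ... + lp ap ap', Eq. (4.21)
S_rebuilt = sum(l[k] * np.outer(A[:, k], A[:, k]) for k in range(len(l)))
print(np.allclose(S, S_rebuilt))                # True
```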

4.2.2 Data preprocessing: centering and scaling


As it was anticipated in Section 4.2, data are usually centered before PCA is
carried out. When the variable means are subtracted from the data sample, all
the observations are converted to fluctuations, thus leaving only the relevant
variation for analysis. Moreover, when working with centered variables, cen-
tered PCs are obtained. Centering is usually used with all the scaling criteria
described below.
Scaling is an essential operation when the elements of X are in different
units or when they have very different variances. These aspects have both
to be faced when analyzing the thermochemical state of a reacting system
since temperature and species concentrations have different units. Moreover,
temperature may range from ambient conditions to thousands of degrees while
species mass fractions vary between zero and one. Besides, even among species
mass fractions, there may be need for scaling. For example, radicals appear in
small concentrations and their mass fractions may range from zero to something
far less than one (i.e. 10−3 − 10−6 ), while major species mass fractions range
from 0 to 1. Taking into account centering, it is possible to define a scaled
variable, xej as:

x̃j = (xj − x̄j) / dj    (4.22)


where x̄j = (1/n) Σ_{i=1}^{n} x_{ij}, j = 1, 2, . . . , p, while dj is the scaling parameter relative
to variable xj. In matrix form, Eq. (4.22) becomes:

X̃ = (X − X̄) D⁻¹    (4.23)
where D is the diagonal matrix containing the scaling parameters, dj . When
scaling is applied, Eqs. (4.15)-(4.19) are modified as:

S = (1/(n − 1)) D⁻¹ (X − X̄)' (X − X̄) D⁻¹
Z = (X − X̄) D⁻¹ A    (4.24)
Zq = (X − X̄) D⁻¹ Aq
The choice of the scaling parameters is very important, and has a potentially
strong impact on the resulting eigenvectors. The following choices are available:
1. Auto scaling, also called unit variance scaling. It is commonly applied and
uses the standard deviation, sj, as the scaling factor. After auto scaling,
all the elements of X have a standard deviation equal to 1 and, therefore,
the data are analyzed on the basis of correlations instead of covariances.
2. Vast scaling [97]. Vast is an acronym of variable stability scaling and
it is an extension of auto scaling. It focuses on stable variables, i.e. the
variables that do not show strong variation, using the product of the standard
deviation and the so-called coefficient of variation as scaling factor. The use
of the coefficient of variation, defined as the ratio of the standard deviation
to the mean, sj/x̄j, results in a higher importance for variables with a small
relative standard deviation.
3. Range scaling. Range scaling adopts the difference between the minimal
and the maximal value, (xj,max − xj,min ), as scaling factor. A disadvan-
tage of range scaling with respect to other scaling methods is that only
two values are used to estimate the range, while for the standard devia-
tion all measurements are taken into account. This makes range scaling
more sensitive to outliers. To increase the robustness of range scaling,
the range could also be determined by using robust range estimators or
after the outliers have been removed.
4. Level scaling. The mean values of the variables, xj , are used as scaling
factors. Level scaling converts deviations from the mean (the mean is
always subtracted) into percentages of the mean values. As for
range scaling, level scaling too can be affected by outliers. Then,
a more robust estimator of the mean, the median, could be used or the
mean could be determined after outlier removal. Level scaling can be
used when large relative changes are of specific interest. However, in
the case of the thermochemical state of a system, this could lead to an
overestimation of the role of chemical species which appear in very small
concentrations, i.e. radicals.


5. Max scaling. The variables are normalized by their maximum values,
xj,max, so that they are all bounded between zero and one. As for
range and level scaling, a robust estimator of the maximum values or a
procedure for outlier removal should be employed.
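The scaling criteria above can be collected in a small helper; the following Python sketch (the function name and keywords are illustrative, not part of the MATLAB implementation used in this work) applies them to a sample mimicking a temperature and a radical mass fraction:

```python
import numpy as np

def scale_data(X, method="auto"):
    """Centre X and divide each column by a scaling factor d_j (Eq. 4.22).

    The methods follow the criteria listed above; the function name and
    keywords are illustrative, not part of the original implementation.
    """
    xbar = X.mean(axis=0)
    if method == "auto":                        # unit variance scaling
        d = X.std(axis=0, ddof=1)
    elif method == "vast":                      # std times coefficient of variation
        d = X.std(axis=0, ddof=1) ** 2 / xbar
    elif method == "range":
        d = X.max(axis=0) - X.min(axis=0)
    elif method == "level":
        d = xbar
    elif method == "max":
        d = X.max(axis=0)
    else:
        raise ValueError(f"unknown scaling method: {method}")
    return (X - xbar) / d

rng = np.random.default_rng(3)
# Two variables of very different magnitude, mimicking T and a radical mass fraction
X = np.column_stack([rng.normal(1500.0, 400.0, 1000),
                     np.abs(rng.normal(1e-4, 5e-5, 1000))])
Xs = scale_data(X, "auto")
print(np.allclose(Xs.std(axis=0, ddof=1), 1.0))   # True: unit variance per column
```

Without scaling, the temperature-like column would dominate the covariance matrix by many orders of magnitude; after auto scaling both variables contribute on an equal footing.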

As pointed out in the above discussion, range, level and max scaling
can be affected by the presence of outliers in the sample X. This problem can
be overcome by means of robust estimators of the quantities of interest (e.g.
the median in place of the sample average); however, a procedure for the detection
and removal of outlying observations provides a viable alternative.
PCA can be effectively employed for outlier detection in large data sets.

4.2.2.1 Outlier detection and removal with PCA


Experimental data sets usually contain a few unusual observations. If we refer
to a one-dimensional problem, the outliers can be classified as those observa-
tions which are either very large or very small with respect to the others. In
high dimensions, there can be outliers that do not appear as outlying observa-
tions when considering each dimension separately and, therefore, they will not
be detected using univariate criteria. Thus, a multivariate approach must be
pursued.
The usual procedure for outlier detection in multivariate data analysis is
to measure the distance of each observation, Xi = (xi1 , xi2 , . . . , xip ), from the
data center, using the so-called Mahalanobis distance:
DM = (Xi − X̄) S⁻¹ (Xi − X̄)'.    (4.25)


The observations associated with large values of DM are considered outliers and
are discarded. However, a procedure must be employed to ensure a
robust estimation of the sample mean and covariance. For this purpose, PCA can
be effectively exploited [98].
PCA has long been used for multivariate outlier detection. The sum of
squares of the PCs, standardized by the eigenvalue size, equals the Mahalanobis
distance for an observation i:

Σ_{k=1}^{p} z_{ik}²/lk = z_{i1}²/l1 + z_{i2}²/l2 + . . . + z_{ip}²/lp = DM.    (4.26)
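The identity of Eq. (4.26) can be checked directly; a minimal Python verification on synthetic data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(4)
X = rng.standard_normal((2000, 4)) @ rng.standard_normal((4, 4))
Xc = X - X.mean(axis=0)

S = Xc.T @ Xc / (Xc.shape[0] - 1)
l, A = np.linalg.eigh(S)
idx = np.argsort(l)[::-1]
l, A = l[idx], A[:, idx]
Z = Xc @ A

# Mahalanobis distance of each observation from the data centre, Eq. (4.25)
d_mahal = np.einsum('ij,jk,ik->i', Xc, np.linalg.inv(S), Xc)

# Sum of squared PC scores standardized by the eigenvalues, Eq. (4.26)
d_pcs = np.sum(Z ** 2 / l, axis=1)

print(np.allclose(d_mahal, d_pcs))   # True: the two distances coincide
```

The identity follows from S⁻¹ = A L⁻¹ A', so that (Xi − X̄) S⁻¹ (Xi − X̄)' reduces to the eigenvalue-standardized sum of squared scores.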

The first few principal components have large variances and explain most of
the variation in X. Therefore, these major components are strongly affected
by variables with relatively large variances and covariances. Consequently, the
observations that are outliers with respect to the first few components usually
correspond to outliers on one or more of the original variables. On the other
hand, the last few principal components represent linear functions of the orig-
inal variables with minimal variance. These components are sensitive to the


observations that are inconsistent with the correlation structure of the data but
are not outliers with respect to the original individual variables.
Based on the above considerations, the following detection scheme can be
proposed [98]:
1. Multivariate trimming. A fraction γ (0.005-0.01) of the data points
characterized by the largest values of DM is classified as outliers and removed.
New trimmed estimators for X̄ and S are then computed from the remaining
observations. The trimming process can be iterated to ensure
that X̄ and S are resistant to outliers.
2. Principal components classifier (PCC). The PCC consists of two functions,
one from the major PCs, Σ_{k=1}^{q} z_{ik}²/lk, and one from the minor PCs,
Σ_{k=p−r+1}^{p} z_{ik}²/lk. The first function can easily detect observations with large
values on some of the original variables; in addition, the second function
helps detect the observations that do not conform to the correlation structure
of the sample. The number of major components, q, is determined
to explain about 50% of the original data variance, while r is chosen so
that the minor components used for the definition of the PCC are those
whose variance is less than 0.20, thus indicating the existence of almost
linear relations among the variables. Based on the PCC definition, an
observation Xi is classified as an outlier if:

Σ_{k=1}^{q} z_{ik}²/lk > c1    or    Σ_{k=p−r+1}^{p} z_{ik}²/lk > c2    (4.27)

where c1 and c2 are chosen as the 0.99 quantiles of the empirical distributions
of Σ_{k=1}^{q} z_{ik}²/lk and Σ_{k=p−r+1}^{p} z_{ik}²/lk.
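The PCC step can be sketched as follows; a hedged Python implementation (the function name, thresholds and data are illustrative, and the preliminary trimming step is omitted for brevity):

```python
import numpy as np

def pcc_outliers(X, var_major=0.5, minor_cut=0.2, quantile=0.99):
    """Flag outliers with the principal components classifier (PCC).

    Sketch of the scheme described above: q major PCs explain about
    `var_major` of the variance, the minor PCs are those with variance
    below `minor_cut`, and c1, c2 are empirical quantiles of the two
    functions. All names are illustrative.
    """
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / (Xc.shape[0] - 1)
    l, A = np.linalg.eigh(S)
    idx = np.argsort(l)[::-1]
    l, A = l[idx], A[:, idx]
    Z = Xc @ A

    q = int(np.searchsorted(np.cumsum(l) / l.sum(), var_major) + 1)
    minor = l < minor_cut
    f1 = np.sum(Z[:, :q] ** 2 / l[:q], axis=1)        # major-PC function
    f2 = np.sum(Z[:, minor] ** 2 / l[minor], axis=1)  # minor-PC function
    c1 = np.quantile(f1, quantile)
    c2 = np.quantile(f2, quantile)
    return (f1 > c1) | (f2 > c2)                      # Eq. (4.27)

# 50 gross outliers on the original variables are flagged by the major-PC function
rng = np.random.default_rng(5)
X = rng.standard_normal((5000, 5))
X[:50] += 8.0
flags = pcc_outliers(X)
print(flags[:50].mean())
```

On this synthetic sample the shifted observations dominate the first PC, so the major-PC function isolates essentially all of them.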

An example of application of the outlined detection scheme is shown in Figure
4.1 and Figure 4.2, with reference to a data set consisting of 62766 observations
of 10 variables [99]. Outliers have been artificially introduced in the
experimental data: specifically, 1000 random numbers between 0 and 1 have been
generated and successively scaled using the standard deviation, sj, of the
variables xj. The effect of the outliers on the PCs is very clear from Figure 4.1.
The introduced unusual observations are outliers with respect to the original
variables and they are reflected in the first few PCs; a small cluster of points,
separated from the majority of observations, appears in the plot of the first and
second PCs scores (Figure 4.1 (a)). If the outlier detection scheme is applied,
the introduced outliers are completely removed (Figure 4.1 (b)); in addition,
outliers present in the original experimental data set, affecting the last PCs
scores (multivariate outliers), are also detected with the procedure described.
Outliers must be treated with care as they can strongly affect the determination
of the covariance matrix, thus leading to the identification of false PCs (Figure
4.2).
All the data sets analyzed in the present Chapter have been preprocessed
to remove outlying observations.


Figure 4.1: Principal components scores with (a) and without (b) outliers.

Figure 4.2: Eigenvalues size with (a) and without (b) outliers.


Figure 4.3: Size reduction process with PCA.

4.2.3 Choosing a subset of Principal Components or Variables


The major objective in many applications of PCA is to replace the p elements
of X by a much smaller number q of PCs, which nevertheless discard very little
information. It is then crucial to determine how small q can be taken without
serious information loss. Several criteria have been proposed and will be
discussed in the following Sections. Once a criterion for selecting q is adopted,
size reduction can be accomplished, as described by Figure 4.3 for a simple
two-dimensional case. The procedure outlined in Figure 4.3 is extremely attractive
for the analysis of turbulent reacting systems, as it provides a mathematical
formalism for the automated selection of the parameters which maximize the
variance in state space.

4.2.3.1 Cumulative percentage of total variance


The most obvious criterion for choosing q is to select a cumulative fraction
of the total variance that the PCs have to account for, e.g. 0.8 or 0.9. The
required number of PCs is then the smallest value of q for which this chosen
percentage is exceeded. As discussed in Section 4.2, the PCs are successively
selected to maximize their variance, expressed by the associated eigenvalue lk.
Thus, being Σ_{k=1}^{p} lk = Σ_{j=1}^{p} var(xj), the desired fraction of the total variance
can be defined as:

tq = Σ_{k=1}^{q} lk / Σ_{k=1}^{p} lk.    (4.28)

Following the derivation of tq, an appropriate measure of the lack-of-fit of the
rank-q approximation of X is:

Σ_{i=1}^{n} Σ_{j=1}^{p} (x_{q,ij} − x_{ij})² = Σ_{k=q+1}^{p} lk.    (4.29)


For a given number of retained PCs, it is possible to define the variance
accounted for, for each variable, by the retained eigenvectors as:

tq,j = Σ_{k=1}^{q} (a_{jk} √lk / sj)²    (4.30)

where a_{jk} is the weight of the jth variable on the kth eigenvector and sj is the
standard deviation of variable xj.
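The criterion of Eq. (4.28) translates into a few lines of code; a Python sketch (function name and synthetic data are illustrative):

```python
import numpy as np

def choose_q(l, threshold=0.9):
    """Smallest q whose cumulative variance fraction t_q (Eq. 4.28) reaches threshold."""
    t = np.cumsum(l) / np.sum(l)
    return int(np.searchsorted(t, threshold) + 1), t

rng = np.random.default_rng(6)
latent = rng.standard_normal((3000, 2))
X = latent @ rng.standard_normal((2, 8)) + 0.05 * rng.standard_normal((3000, 8))
X = X - X.mean(axis=0)
l = np.linalg.eigvalsh(X.T @ X / (X.shape[0] - 1))[::-1]   # descending order

q, t = choose_q(l, 0.9)
print(q)   # the two latent directions carry almost all the variance
```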

4.2.3.2 Variance of Principal Components

The rule described in this Section originally applies to correlation matrices,


although it can be easily adapted to covariance matrices.
The idea behind this rule is that if all elements of X are independent, then
the PCs are the same as the original variables and, in the case of a correlation
matrix (x̃j = xj/sj), they all have unit variance. Thus, any PC with variance
less than 1 explains less variation than an original variable and can be
discarded. According to this rule, also known as Kaiser's rule [100], only
those PCs whose variances lk exceed 1 are retained. However, a cut-off at lk =
1 could lead to discarding important information in some circumstances. In fact,
if a variable from the sample is more or less independent of all the others, it will
be characterized by small coefficients in (p − 1) PCs but it will dominate one PC,
whose variance will be close to 1 when using the correlation matrix. However,
deletion will occur if Kaiser's rule is used and if, due to sampling variation,
lk < 1. It is therefore advisable to choose a cut-off lower than 1, to allow for
sampling variation. Jolliffe [101] suggests that 0.7 is roughly the correct level.
The rule just described can be easily extended to covariance matrices by taking
as cut-off l* the average value, l̄, of the eigenvalues, or a somewhat
lower cut-off such as l* = 0.7 l̄.
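A minimal sketch of this rule in Python (the function name and the example eigenvalues are illustrative):

```python
import numpy as np

def kaiser_q(l, correlation=True, factor=0.7):
    """Number of PCs retained by Kaiser's rule with Jolliffe's lowered cut-off.

    For a correlation matrix the cut-off is factor * 1; for a covariance
    matrix it is factor times the average eigenvalue, l* = 0.7 l_bar.
    """
    cutoff = factor * (1.0 if correlation else np.mean(l))
    return int(np.sum(l > cutoff))

# Illustrative eigenvalues of a correlation matrix
l = np.array([3.1, 1.4, 0.9, 0.3, 0.2, 0.1])
print(kaiser_q(l))   # 3: the PC with variance 0.9 survives the lowered cut-off
```

With the strict cut-off at 1 the third component would have been discarded, illustrating why the lowered level is advisable.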

4.2.3.3 Broken Stick Model

Frontier [102] proposed a broken-stick method to select the number of PCs. If


we have a stick of unit length, broken at random into p segments, then it can be
shown that the expected length of the qth longest segment is:

l*q = (1/p) Σ_{k=q}^{p} 1/k.    (4.31)

This method compares the eigenvalues from the observed sample
with the eigenvalues expected from random data. Based on Eq. (4.31), the observed
eigenvalues, expressed as fractions of the total variance, are considered
interpretable if they exceed l*q.
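The broken-stick thresholds of Eq. (4.31) are straightforward to compute; a Python sketch (the eigenvalues are illustrative):

```python
import numpy as np

def broken_stick(p):
    """Expected fractional length of the q-th longest of p random segments, Eq. (4.31)."""
    return np.array([np.sum(1.0 / np.arange(q, p + 1)) / p for q in range(1, p + 1)])

# Retain the PCs whose fraction of total variance exceeds the broken-stick value
l = np.array([3.1, 1.4, 0.9, 0.3, 0.2, 0.1])   # illustrative eigenvalues
frac = l / l.sum()
thresholds = broken_stick(len(l))
print(int(np.sum(frac > thresholds)))   # 1: only the first PC beats the model
```

Note that the thresholds sum to one, as the expected segment lengths of the unit stick must.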


4.2.3.4 Scree plot


Another common method to determine the number of PCs is the Scree Plot
(Figure 4.2). This is a simple plot of the eigenvalues sorted in descending order
against their indexes. The number of eigenvalues to retain is based on finding
the index q at which the slopes of the lines joining the plotted points
are steep to the left of q and not steep to the right of it. Cattell [103] originally
proposed that the points to the left of the straight line defined by the smaller
eigenvalues (three components for the data in Figure 4.2) should be considered
important. Afterwards, Cattell and Vogelmann [104] concluded that the
first eigenvalue to the right of this point should also be included (four components
for the data in Figure 4.2). Often the Scree Plot approach is complicated by
either the lack of any obvious break or the presence of multiple break points.

4.2.3.5 Choosing a subset of Variables


Principal components (PCs) are linear combinations of all the available variables.
However, these variables are not necessarily equally important to the formation
of the PCs: some of the variables might be critical, but some might be redundant.
Motivated by this fact, attempts have been made to link the PCs back to a subset of
the original variables, selecting critical variables or eliminating irrelevant ones.
Such an approach is very appealing as it overcomes one of the major issues
related to PCA: being linear combinations of all the original variables, the PCs
often lack physical meaning. Therefore, working in terms of the original
variables could be helpful and straightforward. However, it should be noted
that, for a given number, m, of retained variables, an equal number of PCs will
always explain more of the original data variance.
A number of methods exist for selecting a subset of m original variables
which preserve most of the variation in X. Some of them are directly related
to the PCs.

• B4 Forward method [101]. PCA is performed on the original matrix of
p variables and n observations, i.e. size (X) = (n, p). The eigenvalues of
the covariance/correlation matrix are then computed and a criterion is
chosen to retain q of them (l* = 0.7 l̄ [101]). If p1 components have
eigenvalues less than l*, the eigenvectors associated with the remaining p − p1
eigenvalues are evaluated, starting with the first component. The variable
associated with the highest eigenvector coefficient is then retained from
each of the p − p1 components, as it is highly correlated with an important
PC. A second PCA is performed such that p − p1 − p2 variables remain.
PCA is then repeated until all the components have eigenvalues larger
than l*.

• B2 Backward method [101]. PCA is performed on the original matrix of
p variables and n observations, i.e. size (X) = (n, p). The eigenvalues
of the covariance/correlation matrix are then computed and a criterion
is chosen to retain q of them (l* = 0.7 l̄ [101]). If p1 components have
eigenvalues larger than l*, the eigenvectors associated with the remaining
p − p1 eigenvalues are evaluated, starting with the last component.
The variable associated with the highest eigenvector coefficient is then
discarded from each of the p − p1 components, as it is highly correlated with
an unimportant PC. A second PCA is performed such that p − p1 − p2
variables are discarded. PCA is then repeated until all the components
have eigenvalues larger than l*.

• M2 backward method [105]. PCA is performed on the original matrix of
p variables and n observations, i.e. size (X) = (n, p). The eigenvalues
of the covariance/correlation matrix are then computed and a criterion
is chosen to retain q of them. The (n x q) matrix of PCs scores, Zq, is
then evaluated. The goal is to select m (m < p and m ≥ q) variables
from X which preserve most of the data variation. The PCs scores from
the reduced data are denoted by Ẑ. Since q is the true dimensionality of
the data, as determined with the PCA analysis, Zq is the true matrix of
scores, also indicated as the true configuration, while Ẑ is the corresponding
approximation based on m variables. The discrepancy between the two
configurations is evaluated with a Procrustes analysis². The idea is to
compare the shapes of Zq and Ẑ, to establish which set of m original
variables best reproduces the true configuration Zq. In practice, this
consists in the following steps:

– Find the sum of squared differences between corresponding points of
Zq and Ẑ, after they have been matched as well as possible under
translation, rotation and reflection.
∗ Matching under translation is ensured by centering both Zq and Ẑ.
∗ Matching under rotation and reflection is ensured by considering
Zq as the fixed configuration and transforming Ẑ.
– The quantity to be minimized in the selection of variables is the
following sum of squared differences between the configurations:

M² = trace( Zq′Zq + Ẑ′Ẑ − 2QẐ′Zq ) = trace( Zq′Zq + Ẑ′Ẑ − 2Σ )    (4.32)

Q = VU′,    Ẑ′Zq = UΣV′    (4.33)

where Σ is the matrix of singular values from the SVD of Ẑ′Zq .
² In statistics, Procrustes analysis is a form of statistical shape analysis used to analyze
the distribution of a set of shapes.

The M2 algorithm employs the following backward elimination procedure
for the retention of m variables from the original data:

1. Initially, set m = p and, for a fixed q, compute the matrix of PC
scores, Zq .
2. Obtain and store the matrices of PC scores obtained by deleting in
turn each variable from X.
3. Compute M² for each matrix of scores and identify the variable xu
which yields the smallest M². Let Ẑu denote the corresponding
matrix of scores.
4. Delete variable xu . Set Zq = Ẑu and return to stage 2 with p − 1
variables. Continue the cycle until only m variables are left.
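The Procrustes statistic of Eqs. (4.32)-(4.33) and the backward elimination
loop above can be sketched as follows. This is a minimal NumPy illustration;
the function names are hypothetical and the score computation via SVD is an
implementation choice, not part of [105]:

```python
import numpy as np

def pc_scores(A, q):
    """Scores of the first q PCs of a sample, computed via SVD of the
    centered data."""
    Ac = A - A.mean(axis=0)
    _, _, Vt = np.linalg.svd(Ac, full_matrices=False)
    return Ac @ Vt[:q].T

def procrustes_m2(Z_true, Z_hat):
    """Procrustes statistic M2 of Eqs. (4.32)-(4.33)."""
    Z_true = Z_true - Z_true.mean(axis=0)           # matching under translation
    Z_hat = Z_hat - Z_hat.mean(axis=0)
    sigma = np.linalg.svd(Z_hat.T @ Z_true, compute_uv=False)
    return (np.trace(Z_true.T @ Z_true) + np.trace(Z_hat.T @ Z_hat)
            - 2.0 * sigma.sum())

def m2_backward(X, q, m):
    """Backward elimination: repeatedly delete the variable whose removal
    perturbs the configuration of scores the least (steps 1-4 above)."""
    Z_q = pc_scores(X, q)
    cols = list(range(X.shape[1]))
    while len(cols) > m:
        candidates = [[c for c in cols if c != u] for u in cols]
        errors = [procrustes_m2(Z_q, pc_scores(X[:, cand], q))
                  for cand in candidates]
        cols = candidates[int(np.argmin(errors))]   # drop the least damaging variable
        Z_q = pc_scores(X[:, cols], q)              # step 4: Zq assigned the best scores
    return cols
```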

• McCabe criteria [106]. This approach originates from the observation that
the PCs satisfy a certain number of optimality criteria (Section 4.2.1). A
subset of the original variables that optimizes one of these criteria is
called a set of principal variables by McCabe [106]. Suppose that the
set of variables of X is partitioned into subsets X(1) and X(2) . The
covariance matrix of X can be partitioned as:

S = ( S11  S12 )
    ( S21  S22 ) .    (4.34)

Then, the partial covariance matrix for X(2) given X(1) is:

S22,1 = S22 − S21 S11⁻¹ S12 .    (4.35)

The criteria proposed by McCabe [106] for the definition of the principal
variables are:

MC1:  max |S11| = min |S22,1| = min Π(k=1..m) δk
MC2:  min tr(S22,1) = min Σ(k=1..m) δk
MC3:  min ||S22,1||² = min Σ(k=1..m) δk²                         (4.36)
MC4:  max Σ(k=1..r) ρk² , with r = min(m, p − m)

where δk are the eigenvalues of S22,1 and ρk are the canonical correlations
between the selected and unselected variables. As McCabe [106]
points out, after the selection of the PVs, S22,1 represents the information
left in the remaining unselected variables; it is then quite plausible
that three of the optimality criteria should be functions of this matrix.
McCabe's [106] criteria are very appealing as they satisfy well-defined
properties. For instance, criterion MC1 maximizes the variance of the data
explained by the subset of variables, while MC2 and MC3 both minimize
the reconstruction error. However, the criteria rapidly become computationally
unfeasible for very large data sets.
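A brute-force evaluation of criterion MC2 can be sketched as follows; the
exhaustive loop over all subsets also makes the combinatorial cost of the
criteria evident. This is illustrative NumPy code, not McCabe's original
algorithm:

```python
import numpy as np
from itertools import combinations

def mccabe_mc2(X, m):
    """Exhaustive search for m principal variables under criterion MC2:
    minimize the trace of the partial covariance S22,1 (Eq. 4.35)."""
    S = np.cov(X, rowvar=False)
    p = S.shape[0]
    best, best_trace = None, np.inf
    for subset in combinations(range(p), m):
        rest = [j for j in range(p) if j not in subset]
        S11 = S[np.ix_(subset, subset)]
        S12 = S[np.ix_(subset, rest)]
        S22 = S[np.ix_(rest, rest)]
        S221 = S22 - S12.T @ np.linalg.solve(S11, S12)  # S21 = S12', Eq. (4.35)
        tr_val = np.trace(S221)
        if best_trace > tr_val:
            best, best_trace = list(subset), tr_val
    return best, best_trace
```

The number of subsets grows as a binomial coefficient in p, which is why the
criteria are impractical for the large data sets considered in this work.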


• Principal Features [107]. Using this method, the dimension reduction is
accomplished by choosing a subset of the original features that contains
most of the essential information, both in the sense of maximum variability
of the variables in the lower-dimensional space and in the sense
of minimizing the reconstruction error. The rows of the eigenvector matrix,
Aq , denoted as Vj , represent the projection of the jth variable onto
the lower-dimensional space; that is, the q elements of Vj correspond
to the weights of xj on each axis of the subspace. The key observation
of the method is that variables that are highly correlated or have high
mutual information will have similar weight vectors Vj . At the two
extremes, two independent variables have maximally separated weight
vectors, while two fully correlated variables have identical weight vectors
(up to a change of sign). Therefore, the structure of the rows Vj is first
analyzed to find the subsets of variables that are highly correlated; then,
a variable is extracted from each subset. The chosen variables represent
each group optimally in terms of spread in the lower dimension, reconstruction
and insensitivity to noise. The principal features algorithm can
be summarized in the following five steps [107]:

1. Compute the sample covariance/correlation matrix, S.
2. Compute the PCs and eigenvalues of S, lk .
3. Choose the subspace dimension q and construct the matrix Aq .
4. Cluster the p vectors, Vj , into m ≥ q clusters using the k-means
algorithm [108]. The distance measure used for the k-means algorithm
is the Euclidean distance. Choosing m greater than q is usually
necessary if the same variability as the PCs subset is desired.
5. For each cluster, find the vector Vj which is closest to the mean of
the cluster and choose the corresponding feature, xj , as a principal
variable. This step yields the choice of m features. The reason for
choosing the vector nearest to the mean is twofold: this feature can
be thought of as the central, dominant feature of that cluster, and
it holds the least redundant information with respect to features in
other clusters.
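The five steps can be sketched as follows. This is a minimal NumPy version
with a hand-rolled k-means; the sign ambiguity of the weight vectors noted
above is ignored in this sketch, and the function name is illustrative:

```python
import numpy as np

def principal_features(X, q, m, n_iter=50, seed=0):
    """Principal-features sketch: cluster the rows V_j of A_q with k-means
    and pick, per cluster, the variable closest to the cluster mean."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)       # auto scaling
    eigvals, eigvecs = np.linalg.eigh(np.cov(Xs, rowvar=False))
    Aq = eigvecs[:, np.argsort(eigvals)[::-1][:q]]  # (p, q): rows are the V_j
    rng = np.random.default_rng(seed)
    centers = Aq[rng.choice(len(Aq), size=m, replace=False)]
    for _ in range(n_iter):                         # plain Euclidean k-means
        dists = ((Aq[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for c in range(m):
            if np.any(labels == c):
                centers[c] = Aq[labels == c].mean(axis=0)
    chosen = []
    for c in range(m):                              # central feature per cluster
        members = np.where(labels == c)[0]
        if members.size:
            d = ((Aq[members] - centers[c]) ** 2).sum(axis=1)
            chosen.append(int(members[d.argmin()]))
    return sorted(set(chosen))
```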

4.2.4 Interpretation of principal components

The principal components are, by construction, linear combinations of all the
measured variables. Therefore, their physical interpretation is usually not
straightforward. Various attempts have been made to overcome this difficulty;
among them, PC rotation represents a very common solution. Through rotation,
the weights can be redefined to meet alternative criteria. In particular,
rotation is aimed at attaining a simple structure for Aq , so that weights on a
principal component are either close to unity or close to zero and, thus, variables
have large weights on only a few or (ideally) one principal component.


More formally, rotation is concerned with finding an orthogonal matrix, T ,
so that orthogonally rotated weights (or loadings) for the PCs can be defined
as:

Bq = Aq T . (4.37)

The matrix T is chosen to optimize one of the many simplicity criteria available
[109]. The most common orthogonal rotation is based on the maximization of
the VARIMAX criterion [110]:

VMAX(Aq) = (1/p) Σ(k=1..q) [ Σ(i=1..p) aik⁴ − (1/p) ( Σ(i=1..p) aik² )² ] .    (4.38)

Kaiser [110] refers to this as raw VARIMAX, but it is the version that has
become most popular. Verbally, this is simply the sum of the column-wise
variances of the squared elements of Aq . In other words, a criterion is defined
to maximize the amount of variance explained for any of the original variables
on single PCs. After VARIMAX rotation, Aq will generally have fewer large
loadings in its columns, thereby making the columns more easily interpretable.
A simple analytical solution for the maximization of the criterion in Eq. (4.38)
exists for the two-dimensional case [110]. Indicating the columns of Aq with k
and l, the two-dimensional solution is:
t = 2p Σ(i=1..p) (aik² − ail²)(2 aik ail) − 2 [Σ(i=1..p) (aik² − ail²)] [Σ(i=1..p) (2 aik ail)]
                                                                                               (4.39)
b = p Σ(i=1..p) [(aik² − ail²)² − (2 aik ail)²] − [Σ(i=1..p) (aik² − ail²)]² + [Σ(i=1..p) (2 aik ail)]²

thus leading to the following definition of the optimal rotation angle, φ,
where arctan(t, b) denotes the four-quadrant inverse tangent:

φ = (1/4) arctan(t, b) .    (4.40)

The optimal rotation matrix is then given by:

B2 = ( cos(φ)  −sin(φ) )
     ( sin(φ)   cos(φ) ) .    (4.41)

The two-dimensional solution just presented can be extended to the more
general case of a sample X with dimensionality p. In this case, a possible
algorithm consists in applying the planar solution to all pairs of columns,
through a sequence of (2 x 2) orthogonal B2 rotations converging to a final
solution. Little is known about the convergence behavior of the algorithm,
but in practice it usually converges to a local maximizer of the VARIMAX
criterion.
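The pairwise algorithm can be sketched as follows, applying the planar solution
of Eqs. (4.39)-(4.41) to every pair of columns over a fixed number of sweeps.
This is illustrative NumPy code; the sweep count and in-place column update
are implementation choices:

```python
import numpy as np

def varimax(A, n_sweeps=20):
    """Pairwise VARIMAX rotation of a (p x q) loading matrix, using the
    planar analytical solution of Eqs. (4.39)-(4.41) on every column pair."""
    B = A.copy()
    p, q = B.shape
    for _ in range(n_sweeps):
        for k in range(q):
            for l in range(k + 1, q):
                u = B[:, k] ** 2 - B[:, l] ** 2
                v = 2.0 * B[:, k] * B[:, l]
                t = 2.0 * p * (u * v).sum() - 2.0 * u.sum() * v.sum()
                b = p * (u ** 2 - v ** 2).sum() - u.sum() ** 2 + v.sum() ** 2
                phi = 0.25 * np.arctan2(t, b)        # Eq. (4.40)
                c, s = np.cos(phi), np.sin(phi)
                B[:, [k, l]] = B[:, [k, l]] @ np.array([[c, -s], [s, c]])
    return B
```

Since each planar step is an orthogonal rotation, the rotated loadings span the
same subspace as the original ones: B B′ = A A′ is preserved.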


Figure 4.4: Schematic illustration of the VARIMAX rotation [110].

4.3 Local Principal Components Analysis

The PCA transformation described in Section 4.2 can suffer from its reliance on
second-order statistics. In fact, the PCs are uncorrelated, i.e. their second-order
product moment is zero, but they can still be highly statistically dependent.
This is particularly important when the relationships among the correlated
variables are non-linear, as usually happens for a reacting system. In this
case, PCA fails to find the most compact description of the data and usually
requires a larger number of components to model the low-dimensional hyperplane
embedded in the original space than a non-linear technique would.
This simple realization has prompted the development of non-linear alternatives
to PCA. A considerable amount of work has been done in the context of neural
networks. Nevertheless, here we are more interested in a different approach,
introduced by Kambhatla and Leen [111] in the field of image processing and
known as Local Principal Components Analysis (LPCA).
LPCA employs a local linear approach to reduce the statistical dependency
between the variables of a sample and to achieve the desired optimal dimension
reduction. According to LPCA, a Vector Quantization (VQ) algorithm first par-
titions the data space into disjoint regions and then PCA is performed in each
cluster, relying on the observation that, if the local regions are small enough,
the data manifold will not curve much over the extent of the region and the
linear model will be a good fit. For the LPCA to be effective, the VQ algorithm
should not be independent of the PCA analysis. For example, a partitioning
based on the Euclidean distance is very intuitive and easy to implement but
the sample clustering is carried out without any connection with the following
projection onto the lower-dimensional subspace. For this reason, Kambhatla
and Leen [111] introduce a VQ algorithm based on a reconstruction error
metric. Given an observation Xi from the sample X, a global reconstruction
error for each observation can be defined as:

GRE( Xi , X̄(k) ) = ||Xi − Xi,q|| = ||Xi − ( X̄(k) + Zi,q Aq(k)′ )||    (4.42)


where X̄(k) is the kth cluster centroid, Xi,q is the rank-q approximation of Xi ,
Zi,q is the ith row of the truncated set of PCs, Zq , and Aq(k) is the matrix
obtained by retaining only the first q eigenvectors of the covariance matrix, S(k) ,
associated with the kth cluster. In the context of reacting systems, Eq. (4.42)
needs to be modified to take into account the differences in size and units of
the state variables. In fact, a clustering based on GRE would lead to an op-
timization with respect to temperature only. Therefore, the original LPCA
algorithm from Kambhatla and Leen [111] was modified [92] to include data
preprocessing (Section 4.2.2) in the quantization scheme. A very stable algo-
rithm is obtained by using a global scaled reconstruction error metric, GSRE ,
defined as:
 
GSRE( Xi , X̄(k) , D ) = || X̃i − X̃i,q ||    (4.43)

where X̃i is the ith observation of the sample scaled by D, the diagonal matrix
whose jth diagonal element is the scaling factor dj associated with xj . The
proposed LPCA algorithm, briefly referred to as VQPCA, can be summarized as
follows:
1. Initialization: the cluster centroids, X̄(k) , are randomly chosen from the
data set and S(k) is initialized to the identity matrix for each cluster.

2. Partition: each observation from the sample is assigned to a cluster using
the squared reconstruction distance given by GSRE .

3. Update: the cluster centroids are updated on the basis of the partitioning
carried out at step 2.

4. Local PCA: PCA is performed in each disjoint region of the sample.

5. Steps 2-4 are iterated until convergence is reached.
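The iteration can be sketched as follows. This is a minimal NumPy version;
initializing the local bases with identity columns and using a fixed iteration
count instead of the convergence criteria discussed below are simplifications of
the algorithm above:

```python
import numpy as np

def vqpca(X, q, k, n_iter=30, seed=0):
    """VQPCA sketch: alternate assignment by scaled reconstruction error
    (Eq. 4.43) and a local PCA of rank q in every cluster."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)       # scaling by D (auto scaling)
    n, p = Xs.shape
    rng = np.random.default_rng(seed)
    centroids = Xs[rng.choice(n, size=k, replace=False)]  # step 1
    bases = [np.eye(p)[:, :q] for _ in range(k)]          # trivial initial bases
    for _ in range(n_iter):
        errs = np.empty((n, k))                     # step 2: partition
        for c in range(k):
            Dc = Xs - centroids[c]
            rec = Dc @ bases[c] @ bases[c].T        # rank-q reconstruction
            errs[:, c] = ((Dc - rec) ** 2).sum(axis=1)
        labels = errs.argmin(axis=1)
        for c in range(k):                          # steps 3-4: update + local PCA
            members = Xs[labels == c]
            if len(members) > q:                    # enough points for rank q
                centroids[c] = members.mean(axis=0)
                _, _, Vt = np.linalg.svd(members - centroids[c],
                                         full_matrices=False)
                bases[c] = Vt[:q].T
    gsre_n = np.sqrt(errs.min(axis=1)).mean()       # Eq. (4.45), auto scaling
    return labels, gsre_n
```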

The VQPCA algorithm is illustrated in Figure 4.5. The vector quantization step
partitions the data into clusters, trying to follow the curvature of the manifold
in the low-dimensional space. Then, the points are assigned to the clusters
depending on their low-dimensional projection onto each of the identified clusters.
The goodness of reconstruction given by VQPCA is measured with respect
to the mean variance in the data as:

GSRE,n = E(GSRE) / E[var(x̃j)]    (4.44)

where E denotes the expectation operator and x̃j is the scaled jth variable
from X. If auto scaling is employed in data preprocessing, Eq. (4.44) reduces
to:

GSRE,n = E(GSRE) .    (4.45)


Figure 4.5: Schematic illustration of the VQPCA algorithm [92] .

Convergence can be judged using the following criteria:

1. The normalized global scaled reconstruction error, GSRE,n , is below a
specified threshold, ε∗GSRE,n .

2. The relative change in the cluster centroids between two successive iterations
is below a fixed threshold, e.g. 10⁻⁸.

3. The relative change in GSRE,n between two successive iterations is below
a fixed threshold, e.g. 10⁻⁸.
Requirements 2 and 3 are particularly useful if an exploratory analysis of
the performance of VQPCA in terms of GSRE,n is of interest. In this case,
requirement 1 can be relaxed and the variation of GSRE,n as a function of the
number of eigenvalues and clusters can be analyzed by enforcing requirements
2 and 3. Otherwise, all three conditions can be used and an iterative
procedure for the determination of the number of eigenvalues required to achieve
a fixed GSRE,n can be employed. Starting with q = 1, the number of eigenvalues
can be increased progressively until the desired error level is reached.
The VQPCA approach is based on the unsupervised partitioning of the data
into clusters, based on the minimization of the reconstruction error. Therefore,
the approach is optimal from the point of view of error minimization; however,
since the partitioning is iterative, the approach can become computationally
intensive for very large data sets (i.e. data from DNS). A viable alternative
to VQPCA is therefore represented by a supervised partitioning of the data
into clusters, based on a priori knowledge of a conditioning variable. In the
context of turbulent non-premixed combustion, the obvious choice is the
mixture fraction. A Mixture fraction PCA (FPCA) algorithm can therefore
be proposed as follows [92]:

1. Partition: the data are partitioned into bins of mixture fraction.

2. Local PCA: PCA is performed in each mixture fraction bin.
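A minimal sketch of the FPCA partitioning, assuming uniformly spaced mixture
fraction bins (illustrative NumPy code; the bin layout and function name are
assumptions of this sketch):

```python
import numpy as np

def fpca(X, mixture_fraction, q, n_bins):
    """FPCA sketch: bin the observations by mixture fraction, then run a
    rank-q PCA independently in every bin."""
    edges = np.linspace(mixture_fraction.min(), mixture_fraction.max(),
                        n_bins + 1)
    bins = np.digitize(mixture_fraction, edges[1:-1])     # labels 0..n_bins-1
    local_bases = {}
    for b in range(n_bins):
        members = X[bins == b]
        if len(members) > q:                              # enough samples in bin
            Xs = (members - members.mean(axis=0)) / members.std(axis=0)
            _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
            local_bases[b] = Vt[:q].T                     # local eigenvector matrix
    return bins, local_bases
```

Unlike VQPCA, no iteration is needed: the partition is fixed once by the
conditioning variable.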


The FPCA approach is schematized in Figure 4.6. The data are partitioned
into two mixture fraction bins, i.e. rich and lean regions, and a one-dimensional
coordinate system is identified in each cluster. With respect to the VQPCA
approach, FPCA allows a very fast clustering. However, it is not possible to
state a priori that the choice of the mixture fraction as conditioning variable
is the best available.

Figure 4.6: Schematic illustration of the FPCA algorithm [92] for a CO/H2
flame [112].

In the following, the local approaches will be compared to the classic approach,
consisting in the application of PCA to the complete data set, i.e. taking
k = 1, and denoted as Global PCA (GPCA).

4.4 Data sets for model validation

A prerequisite for the application of the PCA methodology described in the
previous Sections is the availability of data sets for the extraction of the PCs.
Both experimental and numerical data were used in the present study. The
data to be analyzed with PCA have been organized in (n x p) matrices, X,
whose rows represent instantaneous spatial snapshots of the reacting species
concentrations and temperature.

4.4.1 Experimental data

High fidelity experimental data provided under the framework of the Workshop
on Measurement and Computation of Turbulent Non-premixed Flames (TNF
workshop) [80] have been used to assess the PCA methodology.
The first flame investigated in the present study is a turbulent non-premixed
CO/H2 /N2 (0.4/0.3/0.3 by vol.) jet flame [112], hereafter called simply jet
flame, selected as base case for the analysis due to its favorable properties. In
fact, the flame does not experience any liftoff or localized extinction and retains
the simple flow geometry of the hydrogen jet flames [80], while adding a modest
level of chemical kinetic complexity. Moreover, the flame is fully characterized
in terms of scalar data. Simultaneous Raman/Rayleigh/LIF measurements
of temperature and species concentrations were conducted at Sandia National
Laboratories, California. About 800 to 1000 measurements were taken at dif-
ferent spatial locations, for a total of 66.275 data points of nine different state
variables (T, N2 , O2 , H2 O, H2 , CO, CO2 , OH, NO). The mixture fraction
calculated following Bilger [75] is also available in the experimental results.
The stoichiometric value of the mixture fraction for this flame is 0.295. The
experimental uncertainties can be obtained from [112].

The second flame is a CH4 flame, part of a series of four piloted jet flames,
C, D, E and F, investigated by Barlow and Frank [99]. Starting from Flame
D, the velocity, and thus the Reynolds number, associated with the main jet
is raised, thereby increasing the probability of local extinction phenomena. The
flames of interest for the present study are Flame D and, in particular, Flame F.
Flame F shows severe non-equilibrium effects and is close to global extinction
in the downstream part of the flame; therefore, it represents a challenging
system for judging PCA capabilities in terms of chemical manifold identification
and parametrization. As for the jet flame, Raman/LIF measurements of
temperature and species concentrations (T, N2 , O2 , H2 O, H2 , CH4 , CO, CO2 ,
OH, NO) are provided at different spatial locations, for a total of 62.766 data
points. The mixture fraction is defined according to Bilger [75], except that
only the elemental mass fractions of hydrogen and carbon are included. The
stoichiometric value of the mixture fraction for this flame is 0.351. The
experimental uncertainties can be obtained from [99, 113].

The third system investigated is the jet in hot co-flow (JHC) burner [46,
49, 50], hereafter denoted as JHC, designed to emulate the flameless combustion
regime (Section 1). It consists of a central fuel jet (80% CH4 and 20%
H2 ) within an annular co-flow of hot exhaust products from a secondary burner
mounted upstream of the jet exit plane. The O2 level of the co-flow is controlled
at three different levels, i.e. 3, 6 and 9% (by vol.), while the temperature and
exit velocity are kept constant. Similarly to the other flames, around 56.000
observations are provided for temperature and species concentrations (T, N2 ,
O2 , H2 O, H2 , CH4 , CO, CO2 , OH, NO). The mixture fraction is defined
according to Bilger [75]. The experimental uncertainties can be obtained from [46].
The availability of experimental data for the JHC system has been particularly
important for the present Thesis, as it has provided insights for the CFD
analysis of the combustion systems investigated in Chapters 5-7. In particular,
information regarding turbulence/chemistry interactions in the flameless
combustion regime has been extracted from the PCA analysis.


4.4.2 Numerical data

In conjunction with high fidelity experimental data, numerical results from the
Direct Numerical Simulation (DNS) of CO/H2 oxidation with detailed chemistry
[114] have also been considered. Details about the DNS simulations and
code can be found in Sutherland et al. [88]. Two DNS data sets have been
considered: a spatially evolving and a temporally evolving jet, characterized by a
significant degree of extinction. For the first data set, indicated as DNS1, three
temporal slices, each consisting of approximately 1.500.000 scalar observations
(T, H2 , O2 , O, OH, H2 O, H, HO2 , H2 O2 , CO, CO2 , HCO), are available; the
second data set, DNS2, consists of twelve temporal slices, each one comprising
around 700.000 observations of the same variables.
The advantage of DNS data with respect to experimental data lies in the
large amount of data accessible. Moreover, DNS simulations give access to
many additional variables, besides scalar values, which are not provided by any
experimental campaign. In particular, the scalar source terms can be extracted
from DNS simulations, thus making it possible to judge the capabilities of the
extracted PCs to parametrize not only the original variables, but also their
source terms. Of course, in the perspective of adopting PCA as a predictive
model, the generation of data with DNS for PC extraction does not represent
a viable solution, and other approaches, such as One Dimensional Turbulence
(ODT) [115], could be pursued.

4.5 Results

The results of the PCA methodology applied to the experimental and numerical
data sets described in Section 4.4 are presented in this section. First, the
capabilities of PCA for the identification of low-dimensional manifolds in
turbulent reacting systems are investigated. In particular, the effect of the
preprocessing strategies and modeling approaches (i.e. GPCA vs. LPCA)
on the manifold dimensionality is thoroughly discussed, also trying to provide
a physical interpretation for the extracted PCs.
Then, the feasibility of a PCA-based combustion model is discussed. The
PCA model is validated a priori using the DNS data sets and its performance
is compared to that of an ideal flamelet parametrization (Chapter 2).

4.5.1 PCA for the identification of low-dimensional manifolds

The objective of the present Section is to provide a methodology i) to investigate
the existence of low-dimensional manifolds in turbulent flames, ii) to find
the most compact representation for them and iii) to guide the selection of
“optimal” reaction variables able to accurately reproduce the state space of a
reacting system. PCA has been previously applied to combustion. Frouzakis
et al. [116] applied PCA for data reduction of two-dimensional DNS data of
opposed jet flames. The analysis was aimed at identifying the number of
components required to accurately approximate the original data. To
this purpose, the correlations among velocities, pressure and species concentra-
tions at different times were taken into account, thus leading to eigenvectors
which are linear combinations of the temporal snapshots considered. Similarly,
Danby and Echekki [117] implemented PCA for the analysis of an unsteady
two-dimensional direct numerical simulation of auto-ignition in homogeneous
hydrogen-air mixtures, with the main purpose of determining the requirements
to reproduce passive and reactive scalars during the process of auto-ignition.
The approach presented here is quite different from the ones described above.
The main purpose of the developed PCA methodology is to find correlations
among the state variables (temperature and species concentration) to allow
an optimal approximation of the system in a low-dimensional space. Such an
approach leads to the determination of eigenvectors which are linear combina-
tions of the original variables in a way that allows reducing the dimension of the
system. A similar method was proposed by Maas and Thévenin [118] for the
analysis of DNS data. However, they only considered a very small sampling in
state space. The current study provides significantly more depth in its analysis,
and applies PCA to both experimental and numerical data sets.

4.5.1.1 GPCA of experimental data sets

Figure 4.7 shows the magnitude of the eigenvalues associated with the PCA
reduction of the jet flame data set, together with the contribution of the q
largest eigenvalues to the amount of variance explained by the new basis vectors.
The eigenvalue distribution reflects the covariance structure of the data set,
shown in Table 4.1, and obtained by applying the auto scaling criterion. It is
clear that the first two eigenvalues alone account for more than the 92% of the
total variance in the data. On the other hand, the last four smallest eigenvalues
are very close to zero; therefore, they contain no useful information and only
explain linear dependencies among the original variables. Therefore, a strong
size reduction, from 9 to 2 or 3, can be accomplished by using PCA, through
the identification of the most active directions in the original data. The total,
tq , and individual variance, tq,j , accounted for the jet flame by the first two
or three eigenvalues are listed in the first two columns of Table 4.2. It can be
observed that, by choosing q = 2, it is possible to capture more than 90% of the
individual variances of all the main species and temperature, while the minor
species, OH and NO, require an additional component, q = 3, to reach levels
of approximation comparable to the other state variables.
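The eigenvalue-based size-reduction argument above can be reproduced with a
few lines of NumPy (illustrative sketch; tq as defined above for an auto-scaled
sample):

```python
import numpy as np

def variance_explained(X, q):
    """Fraction t_q of the total variance captured by the first q PCs of
    an auto-scaled sample (the quantity reported in Table 4.2)."""
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    eigvals = np.sort(np.linalg.eigvalsh(np.cov(Xs, rowvar=False)))[::-1]
    return eigvals[:q].sum() / eigvals.sum()
```

A near-zero trailing eigenvalue signals a linear dependency among the original
variables, exactly as observed for the four smallest eigenvalues of the jet flame
data set.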
This is confirmed by the analysis of the parity plots of temperature and
species mass fractions given by the PCA reconstruction for the cases q = 2
(Figure 4.8) and q = 3 (Figure 4.9). It can be observed that the addition of a
component has a small effect on temperature and main species, whose variation
is mainly explained by the first two components (Table 4.2).

Figure 4.7: Scree-graph and histograms of the q largest eigenvalues for the jet
flame data set, preprocessed with auto scaling.

Moreover, the parity plots of temperature (Figures 4.8 and 4.9 (a)), H2O mass fraction (Figures
4.8 and 4.9 (d)) and minor species such as OH and NO (Figures 4.8 and 4.9
(e, f)) point out the existence of non-linear deviations in the recovered data,
which can probably be ascribed to non-linear dependencies among the original
variables. This result suggests that the low-dimensional projection of the
thermochemical state shows significant non-linearities which cannot be taken
into account with a global linear approach. Therefore, specific algorithms
performing PCA in locally linear regions of the data (Section 4.3) could be
adopted to improve the accuracy of the parametrization.
Figure 4.10 shows the eigenvalue size distribution and the contribution of
the q largest eigenvalues to the total explained variance, tq , for the Flame D,
Flame F and JHC data sets. The covariance matrices for these data sets are
shown in Tables 4.3-4.5. Similarly to the jet flame, a significant size reduction
can be achieved for Flames D and F, although an additional component is
required, q = 3 or q = 4, due to the higher complexity of the piloted flames
(Section 4.4.1). On the other hand, the JHC data set shows a higher
dimensionality and at least 4 components are needed to explain as much as 90%
of the total variance in the original data. The number of required PCs, q,
increases to 5 if an individual variance, tq,j , above 90% is desired for all the
variables, as indicated in Table 4.6. This result is particularly interesting for
the present Thesis, as it confirms the complexity of the numerical modeling of
the flameless combustion regime [45, 78, 79], caused by the overlap between
the chemical and mixing scales and, thus, by the need for optimal progress
variables for the description of the complex interactions which take place in
such a regime.
Table 4.6 lists the values of tq and tq,j accounted for Flame D, Flame F and
the JHC. It is interesting to observe the very strong similarities between Flame
D and F, confirmed by the analysis of their covariance structure (Tables 4.3


Table 4.1: Covariance matrix for the jet flame data set. Scaling criterion adopted: auto scaling.

        T       YO2     YN2     YH2     YH2O    YCO     YCO2    YOH     YNO
T       1.000  −0.825  −0.512   0.005   0.938   0.117   0.984   0.771   0.815
YO2             1.000   0.887  −0.541  −0.909  −0.646  −0.767  −0.562  −0.558
YN2                     1.000  −0.835  −0.667  −0.902  −0.438  −0.266  −0.256
YH2                             1.000   0.196   0.973  −0.082  −0.168  −0.170
YH2O                                    1.000   0.329   0.892   0.725   0.678
YCO                                             1.000   0.024  −0.081  −0.113
YCO2                                                    1.000   0.793   0.855
YOH                                                             1.000   0.639
YNO                                                                     1.000
Table 4.2: Total, tq , and individual variance, tq,j , accounted for the jet flame data set, as a function of the number of retained PCs, q, and the preprocessing criterion.

              auto           range          max            vast           level
tq,j (%)   q=2    q=3     q=2    q=3     q=2    q=3     q=2    q=3     q=2    q=3
T          0.971  0.973   0.983  0.991   0.979  0.990   0.992  0.992   0.896  0.943
YO2        0.986  0.986   0.994  0.994   0.997  0.997   0.975  0.978   0.942  0.961
YN2        0.986  0.986   0.981  0.981   0.971  0.971   1.000  1.000   0.965  0.970
YH2        0.968  0.969   0.962  0.963   0.957  0.960   0.945  0.947   0.991  0.991
YH2O       0.930  0.936   0.945  0.945   0.944  0.944   0.940  0.978   0.870  0.884
YCO        0.994  0.994   0.995  0.997   0.990  0.994   0.979  0.980   0.987  0.987
YCO2       0.973  0.977   0.979  0.987   0.977  0.988   0.981  0.985   0.908  0.959
YOH        0.738  0.940   0.731  0.991   0.745  0.992   0.660  0.687   0.870  0.993
YNO        0.772  0.930   0.728  0.795   0.729  0.802   0.744  0.970   0.701  0.926
tq (%)     0.924  0.966   0.946  0.975   0.942  0.975   0.992  0.996   0.949  0.980
Figure 4.8: Parity plots of temperature (a), H2O (b), H2 (c), CO (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 2) reduction of the jet
flame data set. Scaling criterion adopted: auto scaling.


Figure 4.9: Parity plots of temperature (a), H2O (b), H2 (c), CO (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of the jet
flame data set. Scaling criterion adopted: auto scaling.


and 4.4), thus indicating that the relations between the state variables are not
strongly affected by the increase in Reynolds number from one flame to the
other.
A closer look at the structure of the covariance matrices indicates that, with
the exception of the JHC data set, there is always a strong correlation between
temperature, the oxidation products (CO2 , H2 O), OH and NO (Tables 4.1,
4.3 and 4.4), as expected for a turbulent non-premixed flame. The covariance
matrix for the JHC data set still shows a strong correlation between temperature
and the product mass fractions; however, the covariance between temperature
and the minor species, i.e. OH and NO, is lower. Once again, this indicates
the existence of a more complex flame structure, arising from a balance between
turbulent mixing and chemical kinetics.
Figure 4.11 and Figure 4.12 show the GPCA reconstruction of Flame F,
with q = 3 and q = 4, respectively. Similarly to the jet flame, the addition
of a PC barely affects the accuracy in the prediction of the major species, as
it mainly acts on the prediction of the minor species, i.e. OH and NO (Table
4.6). Very similar results are observed for Flame D.
With regard to the JHC system, very large (non linear) deviations are observed
for temperature (Figure 4.13 (a)), CO (Figure 4.13 (c)) and OH (Figure
4.13 (e)) for the case q = 4. Increasing the number of PCs to q = 5
strongly improves the prediction of CO (Figure 4.14 (c)) and OH (Figure 4.14
(e)), but not of temperature (Figure 4.14 (a)) and other species, i.e. CO2. It is
noteworthy that NO is very well captured, even with q = 4. This result suggests
that one of the retained PCs is highly correlated with NO, thus leading
to the observed result.
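The GPCA reduction behind these parity plots can be sketched in a few lines: project the auto-scaled data onto the q leading eigenvectors of the covariance matrix, then map the scores back to the original space. The data below are synthetic (an almost two-dimensional manifold embedded in five variables), so the numbers are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, q = 2000, 5, 2
Z_true = rng.normal(size=(n, q))                    # hidden driving variables
A_true = np.array([[1.0, 0.5, 0.0, 0.8, 0.3],
                   [0.0, 0.5, 1.0, -0.2, 0.7]])
X = Z_true @ A_true + 0.01 * rng.normal(size=(n, p))

mu, sigma = X.mean(axis=0), X.std(axis=0)
Xs = (X - mu) / sigma                               # auto scaling
evals, evecs = np.linalg.eigh(Xs.T @ Xs / n)
order = np.argsort(evals)[::-1]
A_q = evecs[:, order[:q]]                           # retained eigenvectors

Z = Xs @ A_q                                        # principal component scores
X_rec = (Z @ A_q.T) * sigma + mu                    # rank-q reconstruction

t_q = evals[order[:q]].sum() / evals.sum()          # explained variance
print(f"t_q = {t_q:.3f}")                           # close to one here
```

Plotting `X_rec` against `X` column by column gives exactly the kind of parity plot shown in the figures; departures from the diagonal measure what the q retained PCs cannot capture.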

PCs interpretation and rotation It is interesting to provide an interpretation
of the results described above by looking at the structure of the eigenvector
matrices for the different experimental data sets. Tables 4.7-4.10 report the
weights of the original variables on the retained principal components, before
(a) and after (b) applying VARIMAX rotation, for the jet flame, Flame D, Flame F
and JHC, respectively. As pointed out in Section 4.2.4, the PC weights
are determined to maximize the explained variance, not physical interpretability.
However, PC rotation can help overcome this difficulty, through the determination
of a simpler structure for the eigenvectors.
The analysis of the rotated eigenvector matrices shows a common pattern
for the different systems, again with the exception of the JHC data set. The
first (rotated) PC is always an ensemble component, consisting of temperature,
oxidizer, product species and NO. This component captures as much as possible
of the variance in the original data, trying to explain the (non linear) relations
among the state variables with a single parameter. The other PCs differ from
one data set to the other. For the jet flame, the second PC consists of the
reactants (CO, H2, air), while the third is basically OH, which is decisive
for capturing the reaction zone



Figure 4.10: Scree-graph and histograms of the q largest eigenvalues for Flame
D (a), Flame F (b) and JHC (c). Scaling criterion adopted: auto scaling.


Table 4.3: Covariance matrix for Flame D data set. Scaling criterion adopted: auto scaling.

          T     YO2     YN2     YH2    YH2O    YCH4     YCO    YCO2     YOH     YNO
T     1.000  −0.960  −0.134   0.418   0.979  −0.295   0.535   0.984   0.681   0.912
YO2          1.000    0.323  −0.589  −0.977   0.093  −0.688  −0.932  −0.645  −0.859
YN2                   1.000  −0.473  −0.194  −0.867  −0.451  −0.109  −0.061  −0.052
YH2                           1.000   0.548   0.194   0.919   0.320   0.056   0.240
YH2O                                  1.000  −0.257   0.658   0.949   0.666   0.883
YCH4                                          1.000   0.102  −0.312  −0.221  −0.329
YCO                                                   1.000   0.442   0.213   0.372
YCO2                                                          1.000   0.708   0.933
YOH                                                                   1.000   0.688
YNO                                                                           1.000

Table 4.4: Covariance matrix for Flame F data set. Scaling criterion adopted: auto scaling.

          T     YO2     YN2     YH2    YH2O    YCH4     YCO    YCO2     YOH     YNO
T     1.000  −0.968  −0.073   0.418   0.984  −0.312   0.543   0.981   0.745   0.824
YO2          1.000    0.241  −0.545  −0.976   0.128  −0.660  −0.940  −0.748  −0.790
YN2                   1.000  −0.378  −0.109  −0.882  −0.349  −0.053  −0.057  −0.026
YH2                           1.000   0.512   0.124   0.926   0.305   0.189   0.229
YH2O                                  1.000  −0.296   0.636   0.956   0.754   0.816
YCH4                                          1.000   0.041  −0.327  −0.232  −0.297
YCO                                                   1.000   0.432   0.262   0.331
YCO2                                                          1.000   0.767   0.851
YOH                                                                   1.000   0.633
YNO                                                                           1.000

Table 4.5: Covariance matrix for JHC data set. Scaling criterion adopted: auto scaling.

          T     YO2     YN2     YH2    YH2O    YCH4     YCO    YCO2     YOH     YNO
T     1.000  −0.476   0.616  −0.534   0.892  −0.534   0.292   0.913   0.427   0.388
YO2          1.000    0.306  −0.420  −0.619  −0.418  −0.378  −0.614  −0.150  −0.126
YN2                   1.000  −0.990   0.483  −0.991   0.139   0.465   0.216   0.253
YH2                           1.000  −0.392   0.998  −0.085  −0.376  −0.195  −0.240
YH2O                                  1.000  −0.398   0.516   0.927   0.340   0.389
YCH4                                          1.000  −0.089  −0.381  −0.196  −0.241
YCO                                                   1.000   0.266  −0.072   0.123
YCO2                                                          1.000   0.362   0.376
YOH                                                                   1.000   0.214
YNO                                                                           1.000


Figure 4.11: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of Flame
F. Scaling criterion adopted: auto scaling.



Figure 4.12: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of Flame
F. Scaling criterion adopted: auto scaling.



Figure 4.13: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of JHC
data set. Scaling criterion adopted: auto scaling.



Figure 4.14: Parity plots of temperature (a), H2 O (b), H2 (c), CO (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 5) reduction of JHC
data set. Scaling criterion adopted: auto scaling.


Table 4.6: Total, tq , and individual variance, tq,j , accounted for Flame D, F
and JHC data sets by the GPCA reduction, as a function of the number of
retained PCs, q.

tq,j (%)
Flame D Flame F JHC
q=3 q=4 q=3 q=4 q=4 q=5
T 0.971 0.985 0.967 0.971 0.932 0.948
YO2 0.982 0.986 0.978 0.979 0.961 0.974
YN2 0.979 0.981 0.979 0.980 0.991 0.991
YH2 0.959 0.966 0.969 0.970 0.998 0.998
YH2 O 0.987 0.988 0.983 0.984 0.966 0.966
YCH4 0.984 0.986 0.984 0.984 0.999 0.999
YCO 0.940 0.965 0.961 0.969 0.757 0.998
YCO2 0.965 0.985 0.969 0.974 0.911 0.970
YOH 0.743 1.000 0.711 0.978 0.735 1.000
YN O 0.902 0.932 0.792 0.892 0.999 1.000
tq (%) 0.941 0.977 0.946 0.968 0.925 0.984

correctly.
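The VARIMAX rotation used above drives each eigenvector toward a "simple structure" in which every variable loads on as few components as possible. A minimal, brute-force sketch for the two-component case (production codes use Kaiser's closed-form pairwise angle; the loading matrix here is synthetic, not one of the thesis eigenvector matrices):

```python
import numpy as np

def varimax_criterion(L):
    """Sum over columns of the variance of the squared loadings."""
    L2 = L ** 2
    return ((L2 - L2.mean(axis=0)) ** 2).mean(axis=0).sum()

def rotate2(L, phi):
    c, s = np.cos(phi), np.sin(phi)
    return L @ np.array([[c, -s], [s, c]])

def varimax_2col(L, n_grid=3601):
    # Scan the rotation angle (the criterion is pi/2-periodic) and keep the
    # angle maximizing the VARIMAX criterion.
    angles = np.linspace(0.0, np.pi / 2, n_grid)
    best = max(angles, key=lambda a: varimax_criterion(rotate2(L, a)))
    return rotate2(L, best)

# A simple structure blurred by a 30-degree rotation: VARIMAX recovers it up
# to column order and sign, so each row again loads on a single component.
simple = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])
c, s = np.cos(np.pi / 6), np.sin(np.pi / 6)
recovered = varimax_2col(simple @ np.array([[c, -s], [s, c]]))
print(np.round(np.abs(recovered), 2))
```

Since the rotation is orthogonal, the communalities (row norms of the loading matrix) are preserved; only the distribution of the weights among the components changes, which is exactly what makes the rotated Tables 4.7-4.10 easier to read.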
Moving on to Flame D and Flame F, the second PC observed for the jet flame is
split into two components, one representative of a mixture fraction (both N2
and CH4 are very highly correlated with mixture fraction) and one representative
of the intermediate product species (CO, H2). The last component is again OH,
the flame marker. The eigenvector structures of Flame D and Flame F are very
similar; only one significant difference is worth highlighting, namely the NO
weights on the fourth component. For Flame D, NO does not carry a relevant
weight on the last PC, whereas its weight is not negligible on the fourth
component of Flame F, reflecting the lower correlation between the two
variables (Table 4.4), probably determined by the higher physical complexity
of the system.

Table 4.7: Retained (a) and rotated (b) eigenvectors for the jet flame data set.

(a)
          a1      a2      a3
T       0.40    0.18   -0.07
YO2    -0.41    0.15    0.00
YN2    -0.33    0.38    0.01
YH2     0.14   -0.55   -0.05
YH2O    0.41    0.05    0.12
YCO     0.18   -0.53    0.02
YCO2    0.39    0.24   -0.11
YOH     0.31    0.27    0.74
YNO     0.31    0.29   -0.65

(b)
          a1,r    a2,r    a3,r
T       0.44    0.00    0.03
YO2    -0.30    0.31   -0.07
YN2    -0.13    0.48   -0.02
YH2    -0.10   -0.56   -0.07
YH2O    0.36   -0.13    0.20
YCO    -0.06   -0.56    0.01
YCO2    0.46    0.05    0.00
YOH     0.22    0.11    0.81
YNO     0.54    0.13   -0.54

Table 4.8: Retained (a) and rotated (b) eigenvectors for Flame D data set.

(a)
          a1      a2      a3      a4
T       0.40   -0.08    0.07    0.11
YO2    -0.40   -0.06   -0.07   -0.04
YN2    -0.06   -0.59   -0.38   -0.05
YH2     0.23    0.39   -0.55    0.04
YH2O    0.41   -0.03    0.00    0.05
YCH4   -0.10    0.57    0.41    0.01
YCO     0.28    0.33   -0.49   -0.14
YCO2    0.39   -0.13    0.18    0.11
YOH     0.32   -0.11    0.25   -0.83
YNO     0.34   -0.15    0.22    0.51

(b)
          a1,r    a2,r    a3,r    a4,r
T       0.46    0.02    0.00    0.02
YO2    -0.39    0.08    0.13    0.03
YN2    -0.07    0.69    0.05    0.02
YH2     0.00   -0.01   -0.69    0.08
YH2O    0.38    0.04   -0.15   -0.05
YCH4   -0.07   -0.72    0.06    0.02
YCO     0.01    0.02   -0.67   -0.07
YCO2    0.48    0.00    0.09    0.01
YOH     0.00    0.00    0.01   -0.99
YNO     0.50    0.01    0.16    0.04

Table 4.9: Retained (a) and rotated (b) eigenvectors for Flame F data set.

(a)
          a1      a2      a3      a4
T       0.40    0.10    0.04    0.20
YO2    -0.40    0.06   -0.03   -0.11
YN2    -0.10    0.56   -0.39   -0.08
YH2     0.23   -0.40   -0.49   -0.14
YH2O    0.41    0.03   -0.05    0.07
YCH4   -0.08   -0.55    0.45    0.08
YCO     0.28   -0.34   -0.44   -0.26
YCO2    0.39    0.14    0.13    0.24
YOH     0.29    0.19    0.40   -0.84
YNO     0.37    0.18    0.16    0.29

(b)
          a1,r    a2,r    a3,r    a4,r
T       0.43   -0.03   -0.02   -0.01
YO2    -0.39   -0.08    0.11    0.07
YN2    -0.07   -0.70    0.04    0.00
YH2     0.03    0.01   -0.70    0.12
YH2O    0.39   -0.04   -0.11   -0.05
YCH4   -0.08    0.70    0.04    0.00
YCO     0.04   -0.01   -0.66   -0.09
YCO2    0.45   -0.01    0.09   -0.03
YOH     0.13    0.00    0.06   -0.92
YNO     0.53    0.02    0.19    0.35
Finally, regarding the eigenvectors of the JHC system, it can be observed
that the first rotated component does not show a large influence of NO,
differently from all the other systems. This can be explained by taking into
account that the first PC tries to explain as much as possible of the data
variability. It is well known [3, 4, 2, 45] that NO formation in flameless
combustion is more homogeneous than in traditional non premixed combustion,
due to the smoother temperature gradients; therefore, NO is characterized by
less variability and disappears from the first PC. The second and third PCs
are, again, representative of reactants and intermediate combustion products
(Table 4.5), reflecting a pattern similar to that observed for Flame F (and
Flame D). Differently from the piloted flames, the fourth component is
exclusively NO, meaning that none of the previous components can account for
NO formation and a specific PC is needed. Finally, the OH component, present
in all the other systems, is also found here as the last PC.

Table 4.10: Retained (a) and rotated (b) eigenvectors for JHC data set.

(a)
          a1      a2      a3      a4      a5
T       0.42   -0.16    0.08    0.12    0.16
YO2    -0.11    0.59    0.02   -0.11   -0.15
YN2     0.38    0.34   -0.10    0.07    0.03
YH2    -0.35   -0.39    0.08   -0.04    0.00
YH2O    0.40   -0.27   -0.10    0.06    0.03
YCH4   -0.35   -0.39    0.08   -0.04    0.00
YCO     0.16   -0.23   -0.67   -0.10   -0.64
YCO2    0.39   -0.26    0.08    0.11    0.31
YOH     0.19   -0.07    0.68    0.20   -0.67
YNO     0.22   -0.07    0.21   -0.95    0.00

(b)
          a1,r    a2,r    a3,r    a4,r    a5,r
T       0.47    0.14    0.06    0.00   -0.06
YO2    -0.49    0.37    0.08   -0.06   -0.04
YN2     0.09    0.51   -0.02    0.02    0.01
YH2    -0.02   -0.53    0.00    0.00    0.00
YH2O    0.46    0.06   -0.18   -0.03   -0.01
YCH4   -0.02   -0.53    0.01    0.00    0.00
YCO    -0.02    0.02   -0.97    0.00    0.00
YCO2    0.56    0.04    0.14   -0.01    0.05
YOH     0.01   -0.02    0.00    0.00   -1.00
YNO     0.01   -0.01    0.00   -1.00    0.00

Principal variables As pointed out in the previous Section, the original
variables do not contribute equally to the determination of the PCs, and
rotation can improve eigenvector interpretability, transforming them to meet
a simpler structure. Another way to aid interpretation is to extract the
so-called Principal Variables (PVs), described in Section 4.2.3.5.
Table 4.11 lists the PVs determined using the methods outlined in Section
4.2.3.5 for the jet flame. At first glance, the results may appear to vary
widely from one method to the other. However, a more careful analysis shows
the existence of many similarities. In particular, methods B4, B2, M3, MC2
and MC3 lead to very similar results, identifying a major variable (T, CO2
or H2O), a fuel species (CO or H2) and OH as PVs. In fact, T, CO2 and H2O
are highly correlated (Table 4.1): cov(T, CO2) = 0.984, cov(T, H2O) = 0.938
and cov(H2O, CO2) = 0.892. Similarly, H2 and CO are also exchangeable
variables, being cov(H2, CO) = 0.973. Method MC1 replaces T (or CO2
or H2O) with NO, which shows a strong correlation with T, although weaker
than that of CO2 and H2O.

Table 4.11: Principal variables for the jet flame data set, as provided by the
different methods described in Section 4.2.3.5.

Method   Principal Variables
B4       H2O, H2, OH
B2       CO2, H2, OH
M3       T, CO, OH
MC1      NO, CO, OH
MC2      CO2, CO, OH
MC3      CO2, CO, OH
PF       T, O2, H2

Table 4.12: Principal variables for Flame D, F and the JHC data set. PV
method: MC2 (Section 4.2.3.5).

Data set   Principal Variables
Flame D    CH4, CO, CO2, OH
Flame F    CH4, CO, CO2, OH
JHC        CH4, CO, CO2, NO, OH

Finally, the PF method provides a different solution, neglecting OH as a PV
and replacing it with O2. However, this solution was considered unreliable,
being very far from the pattern identified by all the other methods.
On the basis of the results obtained for the jet flame case, the MC2 method
was chosen for the extraction of the PVs, as it provides results comparable
to most of the other methods and satisfies a very appealing property of PCA,
the minimization of the reconstruction error. Applying the MC2 method to the
other data sets, we obtain the results in Table 4.12. It is very interesting
to observe that the same considerations derived from the analysis of the
rotated PCs apply here, with a clearer physical interpretation. The PVs
selected for Flame D and F reflect the pattern of the PCs, as they include
a mixture fraction variable, an intermediate species, a product species and
OH. Finally, for the JHC system, the same set of PVs obtained for Flame D
(and F) is recovered, although augmented with NO, thus confirming the need
to take the formation of such a pollutant species explicitly into account.
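To make the PV extraction concrete, here is a sketch of a B4-type selection (one of the methods compared in Table 4.11): for each of the q leading eigenvectors of the correlation matrix, retain the variable carrying the largest absolute weight. The MC-type methods used above instead search for the subset minimizing a reconstruction error, which is more expensive; the data below are synthetic:

```python
import numpy as np

def principal_variables_b4(X, q, names):
    # B4-style sketch: one representative variable per leading eigenvector,
    # chosen as the variable with the heaviest absolute weight not yet taken.
    S = np.corrcoef(X, rowvar=False)
    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    pvs, chosen = [], set()
    for j in order[:q]:
        for i in np.argsort(-np.abs(evecs[:, j])):   # heaviest weight first
            if i not in chosen:
                chosen.add(i)
                pvs.append(names[i])
                break
    return pvs

# Two highly correlated pairs: one representative per pair should be selected,
# mirroring how T, CO2 and H2O (or H2 and CO) are exchangeable in Table 4.11.
rng = np.random.default_rng(4)
a, b = rng.normal(size=5000), rng.normal(size=5000)
X = np.column_stack([a, a + 0.1 * rng.normal(size=5000),
                     b, b + 0.2 * rng.normal(size=5000)])
pvs = principal_variables_b4(X, 2, ["x1", "x2", "x3", "x4"])
print(pvs)
```

Because strongly correlated variables load on the same eigenvector, whichever member of a correlated group is picked is largely a matter of sampling noise, which is exactly why the different methods in Table 4.11 return interchangeable variables.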

Effect of preprocessing strategies on the manifold dimensionality In
this paragraph, the effect of preprocessing strategies on the PCA reduction is
presented, focusing on the jet flame data set. The performance of auto scaling
is compared to that of the other scaling criteria presented in Section 4.2.2.
Figure 4.15 shows the eigenvalue size distribution and the contribution of the
q largest eigenvalues to the total variance explained when applying scaling
criteria other than auto scaling, namely range (a), vast (b), level (c) and
max (d) scaling. If Figure 4.15 is compared to Figure 4.7, it is clear that all
the methods identify a manifold dimensionality equal to three, with
the exception of vast scaling (Figure 4.15 (b)), which identifies only two PCs,
with a very dominant first PC. Columns 3-10 in Table 4.2 show the values of
tq and tq,j obtained by applying range, max, vast and level scaling to the jet
flame data set. The results confirm that auto scaling is the only criterion
able to provide a uniform reconstruction of the state variables, leading to
comparable values of tq,j for all of them. Range and max scaling (columns 3-6,
Table 4.2), very similar to each other as expected, perform slightly better
than auto scaling for most of the major species and temperature. However, they
prove unable to properly capture the NO variation, even with q = 3. Similarly,
vast scaling (columns 7-8, Table 4.2) concentrates on extremely stable
variables, i.e. N2, but completely fails to recover OH properly. The higher
values of tq given by range, max and vast scaling, compared to auto scaling,
are thus due to the higher variance explained for the major variables;
however, these approaches miss very important features, such as the
parametrization of NO and OH. The variance explained by auto scaling for OH
and NO is up to 16% and 25% higher, respectively, than that explained by the
other scaling methods.
On the contrary, level scaling (columns 9-10, Table 4.2) focuses on variables
characterized by large relative changes and leads to an overestimation of the
role of minor species in the PCA reduction. Therefore, the prediction of minor
species such as OH and NO is very accurate, but major species such as H2O
are poorly recovered.
On the basis of the described sensitivity, auto scaling was adopted as the
default preprocessing criterion for the analysis. Obviously, in applications
which do not require the accurate parametrization of minor species,
other options could provide better results than auto scaling.
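The scaling criteria compared above can be summarized in a few lines. Definitions follow common chemometrics usage (and Section 4.2.2); each centered variable is divided by a different factor d_j:

```python
import numpy as np

def scale(X, method="auto"):
    # auto: std        range: max - min     vast: std^2 / mean
    # level: mean      max: max
    d = {
        "auto":  X.std(axis=0),
        "range": X.max(axis=0) - X.min(axis=0),
        "vast":  X.std(axis=0) ** 2 / X.mean(axis=0),
        "level": X.mean(axis=0),
        "max":   X.max(axis=0),
    }[method]
    return (X - X.mean(axis=0)) / d

X = np.random.default_rng(5).uniform(1.0, 5.0, size=(1000, 4))
print(scale(X, "auto").std(axis=0))   # unit variance per column
```

Auto scaling gives every variable unit variance, which is why it treats major and minor species evenly; vast and level scaling instead weight variables by their stability or mean level, reproducing the biases discussed above.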

4.5.1.2 LPCA of experimental and numerical data sets


The GPCA analysis presented in Section 4.5.1.1 has shown the existence of
severe non linearities in the parity plots of observed and predicted state
variables. Therefore, the determination of the manifold dimensionality with
GPCA can be somewhat biased, as a globally linear approach is adopted to model
complex non linear interactions. In this context, LPCA (Section 4.3) can
provide locally linear models, able to follow the non linear development of
the thermochemical manifold in low-dimensional space.
Table 4.13 lists the values of the error metric, GSRE,n (Section 4.3), given
by GPCA, VQPCA and FPCA for the jet flame, Flame F and the JHC data
set, as a function of the number of clusters, k, and retained PCs, q.
It is interesting to observe (Table 4.13) that, when the reconstruction error
is evaluated on a scaled basis, all the state variables become relevant and
the goodness of the reconstruction can be properly judged. It should be recalled



Figure 4.15: Scree-graph and histograms of the q largest eigenvalues for the
jet flame data set, preprocessed with range (a), vast (b), level (c) and max (d)
scaling.

Table 4.13: Values of GSRE,n associated with the GPCA, VQPCA and FPCA
reconstructions of the jet flame, flame F and JHC data set, as a function of the
number of clusters, k, and retained PCs, q.

k Jet flame Flame F JHC


q=2 q=3 q=3 q=4 q=3 q=4
GPCA 1 0.681 0.309 0.707 0.320 1.552 0.752
VQPCA 2 0.208 0.106 0.205 0.119 0.214 0.093
4 0.112 0.056 0.131 0.076 0.099 0.050
6 0.091 0.046 0.095 0.052 0.078 0.037
8 0.079 0.034 0.090 0.040 0.058 0.028
FPCA 2 0.214 0.084 0.263 0.147 0.410 0.105
4 0.121 0.066 0.158 0.087 0.123 0.059
6 0.103 0.051 0.134 0.067 0.112 0.052
8 0.092 0.045 0.122 0.063 0.093 0.044


here that (Section 4.3) GSRE,n is a mean global scaled reconstruction error,
normalized by the mean variance present in the original data. So, for example,
the value of GSRE,n associated with the GPCA reduction of the jet flame
with q = 2 is fairly large, GSRE,n = 0.68, reflecting the large deviations
observed in Figure 4.8 for some state variables. Even when q is increased to
three, a significant error, GSRE,n = 0.309, is obtained, confirming the
persistence of mainly non linear departures from the original data. If lower
values of GSRE,n are desired, i.e. < 0.1, the value of q should be increased
to 4 or 5 (GSRE,n = 0.098 and GSRE,n = 0.046, respectively). However, the
dimensionality so obtained cannot be regarded as the true manifold
dimensionality: it can be argued that spurious components need to be added to
account for non linear interactions, because of the globally linear nature of
the adopted model. However, if a locally linear model is employed, i.e.
VQPCA, much better performance in terms of GSRE,n is obtained, even for
smaller values of q, i.e. q = 2 or q = 3. Figure 4.16 shows the VQPCA
reconstruction of temperature (a), H2O (b), CO (c), H2 (d), OH (e) and NO (f),
with k = 8 and q = 2. A much better agreement between original and
reconstructed data is observed, as confirmed by the value of 0.08 obtained for
GSRE,n. A similar value of GSRE,n would require q = 5 if GPCA were applied.
Moreover, the parity plots for the state variables in Figure 4.16 show how,
after partitioning, the relationships between the original and reconstructed
data are mainly linear.
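A minimal sketch of a normalized, scaled reconstruction error in the spirit of GSRE,n (the exact definition is given in Section 4.3; here it is taken as the mean squared residual of the auto-scaled variables, whose mean variance is one by construction):

```python
import numpy as np

def gsre_n(X, X_rec):
    # Scaled residuals make minor species count as much as major ones.
    sigma = X.std(axis=0)
    scaled_residuals = (X - X_rec) / sigma
    return np.mean(scaled_residuals ** 2)

# Perfect reconstruction gives 0; reconstructing every observation with the
# column means gives 1, i.e. no variance explained at all.
rng = np.random.default_rng(6)
X = rng.normal(size=(500, 3)) * np.array([10.0, 1.0, 0.01])
print(gsre_n(X, X), gsre_n(X, np.tile(X.mean(axis=0), (500, 1))))
```

With this normalization the values in Table 4.13 can be read on an absolute scale: 1 means the reduction explains nothing, and values below 0.1 mean that, on average, more than 90% of the scaled variance is recovered.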
The values of GSRE,n for Flame F, as provided by the different approaches,
are also shown in Table 4.13. Similarly to the jet flame, the reconstruction
errors, GSRE,n, associated with the GPCA reductions (Figure 4.11 and Figure
4.12) obtained by choosing q = 3 and q = 4 are very high, GSRE,n = 0.71 and
GSRE,n = 0.32, respectively, thus confirming the inability of GPCA to
determine the most compact description of the data in a lower-dimensional
manifold. Figure 4.17 shows the VQPCA reconstruction of temperature (a), H2O
(b), CO (c), H2 (d), OH (e) and NO (f), with k = 8 and q = 3. The value of
GSRE,n obtained is 0.09, almost eight and four times smaller than the values
given by GPCA with q = 3 and q = 4, respectively. Also, GPCA would require 6
PCs to give a value of GSRE,n as small as 0.09.
Finally, the same considerations hold for the JHC data set. In fact, the
effect of data partitioning is even more evident for this case than for the
jet flame and Flame F. Table 4.13 clearly points out that GSRE,n dramatically
decreases when VQPCA with k = 2 is employed, going from 1.552 and 0.752
to 0.214 and 0.093, for q = 3 and q = 4, respectively. This corresponds to a
reduction of around 85% in both cases, which is not observed for any of the
other investigated data sets. Such a result is extremely interesting and
suggests the existence of different flame structures within the JHC system.
Figure 4.18 shows the VQPCA reconstruction of temperature (a), H2O (b), CO
(c), H2 (d), OH (e) and NO (f), with k = 6 and q = 3. With respect to the jet
flame and Flame F, it is possible to reach values of GSRE,n below 0.1 with a
smaller number of clusters, i.e. k = 6.
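The VQPCA idea behind these results can be sketched as an alternation between local PCA fits and reconstruction-error-driven reassignment. This is a simplified sketch of the algorithm of Section 4.3 (random initialization and a fixed iteration count stand in for the actual initialization and stopping rules):

```python
import numpy as np

def vqpca(Xs, k, q, n_iter=30, seed=0):
    # (i) fit a local q-dimensional PCA in each cluster; (ii) reassign every
    # observation to the cluster whose local basis reconstructs it best.
    rng = np.random.default_rng(seed)
    labels = rng.integers(0, k, Xs.shape[0])
    for _ in range(n_iter):
        centers, bases = [], []
        for c in range(k):
            Xc = Xs[labels == c]
            if len(Xc) < q + 1:          # guard against (nearly) empty clusters
                Xc = Xs
            mu = Xc.mean(axis=0)
            _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
            centers.append(mu)
            bases.append(Vt[:q].T)       # local eigenvector matrix, p x q
        err = np.stack([(((Xs - mu) - (Xs - mu) @ A @ A.T) ** 2).sum(axis=1)
                        for mu, A in zip(centers, bases)])
        labels = err.argmin(axis=0)
    return labels, err.min(axis=0).mean()

# Points lying exactly on a one-dimensional subspace are reconstructed with
# (numerically) zero error by any local q = 1 basis.
t = np.linspace(-1.0, 1.0, 400)
line = np.column_stack([t, 2.0 * t, -t])
labels, e = vqpca(line, k=2, q=1)
print(f"error = {e:.2e}")
```

The key design choice, as in the text, is that the partition is driven by the reconstruction error itself rather than by a prescribed conditioning variable, which is what lets the local bases follow the curvature of the manifold.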



Figure 4.16: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 2, k = 8) reduction of
the jet flame data set. GSRE,n = 0.08.



Figure 4.17: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 8) reduction of
Flame F data set. GSRE,n = 0.08.



Figure 4.18: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 6) reduction of
JHC data set. GSRE,n = 0.08.


Table 4.14: Values of GSRE,n associated with the GPCA, VQPCA and FPCA
reconstructions of the DNS1 and DNS2 data sets, as a function of the number of
clusters, k, and retained PCs, q.

k DNS1 DNS2
q=2 q=3 q=3 q=4
GPCA 1 3.130 1.830 1.800 1.130
VQPCA 2 0.816 0.176 0.734 0.369
4 0.307 0.065 0.235 0.076
6 0.116 0.025 0.141 0.046
8 0.043 0.010 0.114 0.038
10 0.036 0.009 0.096 0.033
FPCA 2 0.625 0.216 0.773 0.417
4 0.243 0.052 0.263 0.081
6 0.122 0.030 0.204 0.066
8 0.062 0.020 0.167 0.054
10 0.046 0.015 0.140 0.043

VQPCA has also been exploited for the analysis of the DNS data sets, DNS1
and DNS2, described in Section 4.4.2. Regarding the DNS2 data set, multiple
time steps have been merged before analyzing the data, namely t = 1.5e − 03 s,
t = 2.0e − 03 s, t = 2.5e − 03 s and t = 3.0e − 03 s. However, the resulting
data set (∼3,800,000 data points) has been conditioned in mixture fraction
space, between f = 0.1 and f = 0.8, to overcome memory issues (Figure 4.19).
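The conditioning step above amounts to a boolean mask on the flattened fields. A minimal sketch with synthetic stand-ins for the DNS arrays (names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
f = rng.uniform(0.0, 1.0, 100_000)        # mixture fraction field, flattened
X = rng.normal(size=(100_000, 10))        # corresponding state variables

# Retain only the observations with 0.1 <= f <= 0.8 before the analysis.
mask = (f >= 0.1) & (f <= 0.8)
f_cond, X_cond = f[mask], X[mask]
print(f"kept {mask.mean():.0%} of the samples")
```

Dropping the nearly pure oxidizer and fuel streams in this way removes a large fraction of chemically inert points, which is what makes the merged multi-snapshot data set tractable in memory.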
Table 4.14 lists the values of GSRE,n given by GPCA, VQPCA and FPCA for the
DNS data sets. Similarly to the JHC case, the first partition is characterized
by a dramatic reduction of GSRE,n also for the DNS data sets. This indicates,
once again, that a global approach would lead to a misleading estimation of
the manifold dimensionality. Table 4.14 also shows that DNS2 requires an
additional PC with respect to DNS1 to reach acceptable levels of accuracy.
This is determined by the higher complexity of the DNS2 data set,
characterized by a significant degree of extinction.
Figure 4.20 shows the contour plots of the original and recovered temperature
and OH mass fraction distributions for the DNS1 data set. It can be observed
that a VQPCA approach with q = 3 and k = 8 captures the flame features with
great accuracy, resulting in a very small reconstruction error,
GSRE,n = 0.01. This is a very appealing result, indicating that VQPCA could
be effectively exploited for the compression of DNS data sets, characterized
by very large storage requirements, for visualization and post-processing
purposes. Very strong compression could be achieved, as shown here, by
prescribing the desired accuracy of the recovered data. For a given manifold
dimensionality, the dimensions of the reduced data sets are independent of the
number of clusters; therefore, the parameters q and k can be varied to
optimize the accuracy and



Figure 4.19: Original (a) and conditioned (b) temperature field for DNS2 data
set at time step t = 1.5e − 03 s.


the disk space required for any data set.
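A back-of-the-envelope estimate makes the storage argument concrete. The numbers below are illustrative (the thesis does not report actual file sizes): a compressed VQPCA representation stores q scores and one cluster index per observation plus, per cluster, the local eigenvectors and centroid.

```python
n, p = 3_800_000, 10      # observations and state variables
q, k = 3, 8               # retained PCs and clusters

full = n * p                              # values in the raw data set
compressed = (
    n * q                                 # q scores per observation
    + n                                   # one cluster label per observation
    + k * (p * q + p)                     # per-cluster eigenvectors + centroid
)
ratio = full / compressed
print(f"compression ratio ~ {ratio:.1f}x")
```

Note that the per-cluster term k(pq + p) is negligible next to the per-observation terms, which is the arithmetic behind the statement that, for a given manifold dimensionality, the size of the reduced data set is essentially independent of the number of clusters.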


The quality of the VQPCA reconstruction can also be assessed qualitatively by
means of parity plots of recovered and observed state variables (Figure
4.21). As pointed out for the experimental data, after the VQPCA reduction
the parity plots show a linear behavior throughout the domain of each state
variable, confirming the capability of VQPCA to follow the curvature of the
low-dimensional manifold in the reduced state space.
Figure 4.22 shows the contour plots of the (conditioned) original and
recovered temperature distribution at time steps t = 1.5e−03 s (a, a’) and
t = 2.0e−03 s (b, b’). As for Flame F (Figure 4.17), VQPCA is able to capture
flame extinction and re-ignition remarkably well, resulting in the
quantitative agreement between observed and recovered variables shown in
Figure 4.23.

Comparison of VQPCA and FPCA Rows 6-9 in Table 4.13 and rows 7-11
in Table 4.14 list the values of GSRE,n given by FPCA for the experimental
and numerical data sets, respectively, as a function of the number of clusters,
k, and retained PCs, q. These values can be weighed against those obtained with
the VQPCA algorithm (rows 2-5 in Table 4.13 and rows 2-6 in Table 4.14) for
the same values of q and k. The comparison is illustrated graphically in Figure
4.24 for the experimental data, and in Figure 4.25 for the DNS data sets.
For the jet flame (Figure 4.24 (a)), VQPCA generally performs better than
FPCA. The values of GSRE,n given by VQPCA are 6-25% lower than those provided
by FPCA, with the exception of the cases corresponding to k = 2 and q = 3.
However, the performance of FPCA can be considered very satisfactory and
promising, especially from the point of view of model implementation. In
fact, the FPCA partitioning is much simpler and more straightforward than the
one underlying the VQPCA algorithm. Moreover, it is interesting to investigate
how the VQPCA partitioning is reflected in mixture fraction space. To this
purpose, the indexes of the observations with respect to the original data
matrix, X, were stored and used to reconstruct the mixture fraction vectors in
each cluster identified by VQPCA. Figure 4.26 shows the temperature as a
function of mixture fraction for the two clusters selected by VQPCA, with
q = 2. It can be observed that the data are almost clustered into two regions
corresponding to the rich and lean zones of the flame, the stoichiometric
mixture fraction of the jet flame being equal to 0.295. This result suggests
that, for the jet flame, the mixture fraction can be considered an optimal
variable for the parametrization of the thermochemical state of the system,
as is generally assumed in many models for non premixed combustion. The
information added here is that the mixture fraction is an optimal variable
from the point of view of reconstruction error minimization.
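The FPCA partition referred to here can be sketched as a binning of the observations in mixture fraction space, with a local PCA fitted per bin; in contrast, VQPCA lets the reconstruction error itself drive the partition. Uniform bin edges are an assumption of this sketch (the thesis's exact conditioning strategy is described in Section 4.3):

```python
import numpy as np

def fpca_partition(f, k):
    # Assign each observation to one of k uniform bins in mixture fraction
    # space; a local PCA would then be fitted separately inside each bin.
    edges = np.linspace(f.min(), f.max(), k + 1)
    return np.clip(np.digitize(f, edges) - 1, 0, k - 1)

f = np.random.default_rng(7).uniform(0.0, 1.0, 10_000)   # mixture fraction
labels = fpca_partition(f, 4)
print(np.bincount(labels))   # roughly even occupation of the four bins
```

Because the bins are fixed a priori, this partition cannot adapt to features such as local extinction that cut across mixture fraction space, which is the structural reason for the gap between FPCA and VQPCA observed for Flame F and the JHC data set.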
Figure 4.20: Contour plots of original and recovered temperature (a, a’) and
OH mass fraction (b, b’) distribution for DNS1. VQPCA reduction with q = 3
and k = 8. GSRE,n = 0.01.

Figure 4.21: Parity plots of original and recovered temperature (a) and OH
mass fraction (b) for DNS1. VQPCA reduction with q = 3 and k = 8.
GSRE,n = 0.04.

As far as Flame F is concerned, Table 4.13 and Figure 4.24 (b) point out
that VQPCA outperforms FPCA in all cases, providing values of GSRE,n
13-46% lower than those given by FPCA. These results confirm the notorious
complexity of this flame, characterized by significant local extinction and
re-ignition. In the context of the Conditional Moment Closure [73], for example,
it has been recognized [119] that conditioning on mixture fraction is not
sufficient for Flame F and a second conditioning variable should be used.
Figure 4.27 shows the temperature as a function of mixture fraction for the
two clusters selected by VQPCA with q = 3. It can be observed that,
differently from the jet flame, the VQPCA algorithm extracts features from
the whole mixture fraction space in order to achieve the best q-dimensional
representation of the thermochemical state of the system. It can then be
concluded that, for Flame F, the mixture fraction does not represent an
optimal reaction variable. Therefore, VQPCA could provide an appealing
alternative to guide the selection of the most compact subset of reaction
variables needed to properly describe the thermochemical state of such a
reacting system.
Figure 4.24 (c) shows the comparison between the reconstructions provided
by VQPCA and FPCA for the JHC data set. Similarly to Flame F, the VQPCA
algorithm provides values of GSRE,n 10-50% lower than those obtained with
FPCA. The mixture fraction partitioning does not optimally follow the
curvature of the manifold in state space, indicating the complexity of the
turbulence-chemistry interactions for this system. This is further confirmed
by Figure 4.28, which shows the partition of temperature into the two clusters
selected by VQPCA with q = 3, in mixture fraction space. The algorithm selects
a first cluster characterized by a lean branch and part of a rich region, with
an aspect characteristic of a non premixed flame. On the other hand, the
second cluster shows important non-equilibrium phenomena, such as extinction,
similarly to cluster 2 for Flame F (Figure 4.27 (b)).
To better understand the underlying mechanism of the VQ partitioning
algorithm, it is possible to analyze the structure of the rotated eigenvectors in



Figure 4.22: Contour plots of original (a, b) and recovered (a’, b’) temperature
distribution for DNS2, at two different time steps, i.e. t = 1.5e − 03 s (a,
a’) and t = 2.0e − 03 s (b, b’). VQPCA reduction with q = 4 and k = 8.
GSRE,n = 0.04.


Figure 4.23: Parity plots of temperature (a) and OH (b) mass fraction illustrat-
ing the VQPCA (q = 4, k = 8) reduction of DNS2 data set. GSRE,n = 0.04.


Figure 4.24: Values of GSRE,n as a function of the number of clusters, k, and retained PCs, q, for the jet flame, Flame F and
JHC data sets.

Figure 4.25: Values of GSRE,n as a function of the number of clusters, k, and
retained PCs, q, for the DNS1 and DNS2 data sets.


Figure 4.26: Temperature as a function of mixture fraction in the two clusters
selected by VQPCA for the jet flame. q = 2 and GSRE,n = 0.21.



Figure 4.27: Temperature as a function of mixture fraction in the two clusters
selected by VQPCA for Flame F. q = 3 and GSRE,n = 0.21.


Figure 4.28: Temperature as a function of mixture fraction in the two clusters
selected by VQPCA for the JHC data set. q = 3 and GSRE,n = 0.21.


Table 4.15: Rotated eigenvectors in the first (a) and second (b) cluster
identified by VQPCA for Flame F. q = 3 and GSRE,n = 0.21.

(a)
          a1,r    a2,r
T      -0.01    0.55
YO2    -0.09   -0.44
YN2    -0.70   -0.07
YH2    -0.01   -0.04
YH2O   -0.03    0.44
YCH4    0.71   -0.10
YCO     0.00    0.07
YCO2    0.00    0.51
YOH     0.00    0.05
YNO    -0.02    0.18

(b)
          a1,r    a2,r
T       0.30   -0.01
YO2    -0.34   -0.09
YN2    -0.12   -0.12
YH2     0.05    0.64
YH2O    0.33   -0.09
YCH4   -0.01    0.02
YCO     0.12    0.70
YCO2    0.35   -0.11
YOH     0.60   -0.17
YNO     0.42   -0.13

the two clusters identified by VQPCA for Flame F (Table 4.15) and for the
JHC data set (Table 4.16). For Flame F, the eigenvectors associated with the
first cluster (Table 4.15 (a)) are a mixture fraction³ and a linear combination
of major species and temperature, respectively. This supports the graphical
observation provided by Figure 4.27 (a), which shows the first cluster to be
characterized by the lean and rich branches of the flame. On the other hand,
the reaction region identified by Figure 4.27 (b) needs to be described by means
of parameters with a strong contribution of intermediate and minor species, as
shown in Table 4.15 (b).
With regard to the JHC data set, the structure of the rotated eigenvectors
prompts very interesting considerations. In particular, the second cluster
(Table 4.16 (b)) is parametrized by a first component with significant weights
on the fuel species, intermediate species and temperature, whereas the second
PC reduces to OH. Thus, VQPCA is able to extract the subset of the data set
dominated by finite-rate chemistry effects by means of progress variables able to
capture the ignition process. In the context of the numerical modeling of flame-
less combustion, this result confirms the need for combustion models suited to
the description of turbulence-chemistry interactions in such a combustion regime.
As far as the numerical data are concerned, the VQPCA and FPCA reduc-
tions appear comparable for the DNS1 data set (Table 4.14), while VQPCA
outperforms FPCA for DNS2 (Table 4.14). This confirms that mixture fraction
is not optimal from the point of view of error minimization when the physics
under investigation becomes too complex. This is somehow expected, as mixture
fraction is only a measure of the local system stoichiometry and, therefore, it
can only capture the relatively fast scales.
The small discrepancies between FPCA and VQPCA for DNS1 (and for
³ The denomination mixture fraction is used here because the variables which define the
first PC are highly correlated with f : cov (f, N2) = 0.97 and cov (f, CH4) = 0.90.


Table 4.16: Rotated eigenvectors in the first (a) and second (b) cluster identified
by VQPCA for the JHC data set. q = 3 and GSRE,n = 0.21.

  (a)        a1      a2          (b)        a1,r    a2,r
  T          0.48    0.01        T         -0.25    0.21
  YO2       -0.51    0.03        YO2       -0.13   -0.03
  YN2        0.07   -0.02        YN2       -0.50    0.00
  YH2        0.00    0.00        YH2        0.48   -0.06
  YH2O       0.46    0.04        YH2O      -0.23    0.06
  YCH4       0.00    0.00        YCH4       0.49    0.00
  YCO        0.05   -0.03        YCO        0.15   -0.07
  YCO2       0.53    0.04        YCO2      -0.32    0.05
  YOH        0.09    0.01        YOH        0.10    0.96
  YNO       -0.03    1.00        YNO       -0.05    0.14

the jet flame) suggest that VQPCA actually tends to FPCA when dealing with
relatively simple systems, characterized by fast chemistry and a small degree
of extinction. This is confirmed by Figure 4.29, showing the VQPCA (a-d) and
FPCA (a’-d’) partitions of the DNS1 data. Both approaches identify a rich
and a lean region, together with rich and lean reacting layers.

Computational cost of the analysis In the above discussion, VQPCA
has been shown to be generally superior to FPCA from the point of view
of reconstruction error minimization. However, it should be recalled that
VQPCA is an iterative algorithm, whereas FPCA is based on the supervised
partitioning of the data into bins of mixture fraction (Section 4.3). Therefore,
the CPU time associated with VQPCA is certainly higher than that of FPCA;
moreover, it increases with k, as shown in Figure 4.30 for an experimental
and a numerical data set. The CPU time associated with VQPCA can
reach values of the order of minutes for the experimental data sets (Figure
4.30 (a)) and hours for the numerical data sets (Figure 4.30 (b)), whereas
the corresponding CPU time of FPCA is of the order of seconds and minutes,
respectively. Therefore, FPCA certainly represents a valid solution for applications
similar to the jet or the DNS1 flame, as it optimizes both CPU time and accuracy
of predictions.
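The cost gap can be made concrete with a minimal numpy sketch of one VQPCA iteration, i.e. the unsupervised step described above: fit a local PCA in each cluster, then reassign every observation to the cluster whose local basis reconstructs it best. All names and the test data are illustrative assumptions, not the code used for this analysis:

```python
import numpy as np

def vqpca_step(X, labels, k, q):
    """One VQPCA iteration: fit a local q-dimensional PCA in each cluster,
    then reassign every observation to the cluster whose local basis
    reconstructs it with the smallest squared error."""
    n, p = X.shape
    err = np.full((n, k), np.inf)
    for c in range(k):
        Xc = X[labels == c]
        if len(Xc) <= q:                       # skip degenerate clusters
            continue
        mu = Xc.mean(axis=0)
        # local PCA basis: leading q right singular vectors of the cluster
        _, _, Vt = np.linalg.svd(Xc - mu, full_matrices=False)
        Aq = Vt[:q].T
        rec = (X - mu) @ Aq @ Aq.T + mu        # local reconstruction of all points
        err[:, c] = np.sum((X - rec) ** 2, axis=1)
    return np.argmin(err, axis=1)              # new cluster assignment
```

Iterating this step until the assignment stops changing is what makes VQPCA more expensive than the one-shot mixture-fraction binning of FPCA.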

4.6 Development of a PCA based combustion model


In the previous Section, a methodology based on Principal Components Anal-
ysis (PCA) has been proposed for the identification of low-dimensional manifolds
in turbulent flames, the estimation of their dimensionality and the selection
of optimal reaction variables. The reduced representation given by PCA has
great potential, especially in its local formulations, i.e. VQPCA and FPCA.



Figure 4.29: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 6) reduction of
JHC data set. GSRE,n = 0.08.

Figure 4.30: CPU time associated with the FPCA and VQPCA reductions as a
function of the number of clusters, k, and retained PCs, q, for the experimental
(a) and numerical (b) data sets.


In fact, the selection of optimal variables for turbulent reacting systems could
be exploited for the development of turbulence-chemistry interaction models.
In this context, the linearity of the PCA method is extremely appealing: if a
set of reaction variables is selected, only a few linear combinations of the
original variables need to be transported in a numerical simulation.
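As an illustration of this linearity, the encoding and decoding steps can be sketched with numpy. The random data, the choice of auto scaling and all names below are illustrative assumptions, not the data sets of this chapter:

```python
import numpy as np

# X: (n_obs, p) matrix of state variables (e.g. T and species mass fractions).
rng = np.random.default_rng(0)
X = rng.random((500, 4))

x_mean = X.mean(axis=0)                  # centering factors
d = X.std(axis=0)                        # scaling factors d_k (auto scaling)
X_scaled = (X - x_mean) / d

# Eigenvectors of the covariance matrix give the PC weights a_ki.
S = np.cov(X_scaled, rowvar=False)
eigval, A = np.linalg.eigh(S)
order = np.argsort(eigval)[::-1]         # sort PCs by decreasing variance
A = A[:, order]

q = 2                                    # number of retained PCs
Z = X_scaled @ A[:, :q]                  # the few transported scalars z_i

# Reduced-order reconstruction of the state:
X_rec = (Z @ A[:, :q].T) * d + x_mean
```

Only the q columns of Z need to be transported; the full state is recovered (approximately) through the constant matrix A.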

4.6.1 Transport equations for the PCs


Recalling the conservation equation for a reacting species (Chapter 2),
\[
\frac{\partial \rho Y_k}{\partial t} + \frac{\partial \rho u_j Y_k}{\partial x_j} = \frac{\partial}{\partial x_j}\left(\rho D_k \frac{\partial Y_k}{\partial x_j}\right) + \dot{\omega}_k ,
\]
it is possible to derive transport equations for the PCs. Introducing the material
derivative and the Lewis number, and assuming the density and the species
diffusivity to be constant, the species equation becomes:
\[
\rho \frac{D Y_k}{D t} = \frac{\lambda}{c_p\,\mathrm{Le}_k} \frac{\partial^2 Y_k}{\partial x_j^2} + \dot{\omega}_k . \qquad (4.46)
\]
If the species mean, \(\overline{Y}_k\), is subtracted and a scaling factor, \(d_k\), is applied to
the centered variable, we get:
\[
\frac{D}{Dt}\!\left(\frac{Y_k - \overline{Y}_k}{d_k}\right) = \frac{\lambda}{\rho\, c_p\,\mathrm{Le}_k} \frac{\partial^2}{\partial x_j^2}\!\left[\frac{Y_k - \overline{Y}_k}{d_k}\right] + \frac{\dot{\omega}_k}{\rho\, d_k} . \qquad (4.47)
\]

Indicating with \(a_{ki}\) the weight of the kth variable on the ith PC, the following
equation is obtained:
\[
\frac{D}{Dt}\!\left(a_{ki}\,\frac{Y_k - \overline{Y}_k}{d_k}\right) = \frac{\lambda}{\rho\, c_p\,\mathrm{Le}_k} \frac{\partial^2}{\partial x_j^2}\!\left[a_{ki}\,\frac{Y_k - \overline{Y}_k}{d_k}\right] + \frac{\dot{\omega}_k\, a_{ki}}{\rho\, d_k} . \qquad (4.48)
\]

Summing over all the variables, and assuming equal Lewis numbers for all
species (\(\mathrm{Le}_k = \mathrm{Le}_i\)), we have:
\[
\frac{D}{Dt}\!\left(\sum_{k=1}^{p} \frac{Y_k - \overline{Y}_k}{d_k}\, a_{ki}\right) = \frac{\lambda}{\rho\, c_p\,\mathrm{Le}_i} \frac{\partial^2}{\partial x_j^2}\!\left[\sum_{k=1}^{p} a_{ki}\,\frac{Y_k - \overline{Y}_k}{d_k}\right] + \frac{1}{\rho}\sum_{k=1}^{p} \frac{\dot{\omega}_k\, a_{ki}}{d_k} . \qquad (4.49)
\]


But \(\sum_{k=1}^{p} \frac{Y_k - \overline{Y}_k}{d_k}\, a_{ki}\) is simply the definition of the ith PC, \(z_i\). Therefore, Eq.
(4.49) can be rewritten as:
\[
\frac{D z_i}{D t} = \frac{\lambda}{\rho\, c_p\,\mathrm{Le}_i} \frac{\partial^2 z_i}{\partial x_j^2} + \dot{\omega}_{z_i} \qquad (4.50)
\]
where \(\dot{\omega}_{z_i}\) is the source term for \(z_i\):
\[
\dot{\omega}_{z_i} = \frac{1}{\rho} \sum_{k=1}^{p} \frac{\dot{\omega}_k\, a_{ki}}{d_k} . \qquad (4.51)
\]

If temperature is also included in the definition of the PCs, Eq. (4.51) becomes:
\[
\dot{\omega}_{z_i} = \frac{Q_r\, a_{Ti}}{\rho\, c_p\, d_T} + \frac{1}{\rho} \sum_{k=1}^{p} \frac{\dot{\omega}_k\, a_{ki}}{d_k} \qquad (4.52)
\]
where \(Q_r\) is the heat released by reaction, \(a_{Ti}\) is the weight of temperature on
the ith PC and \(d_T\) is the scaling factor for temperature.
A prerequisite for the validity of the above equations is that the matrix of
PC coefficients, A, is constant. Since the proposed PCA analysis is based on the
processing of a multitude of observations in both space and time, A is constant
by construction.
More compactly, Eq. (4.50) can be expressed as:
\[
\frac{D \mathbf{Z}}{D t} = -\nabla \cdot \mathbf{j}_{\mathbf{Z}} + \dot{\boldsymbol{\omega}}_{\mathbf{Z}} , \qquad (4.53)
\]
where \(\mathbf{j}_{\mathbf{Z}}\) is the mass diffusive flux of \(\mathbf{Z}\). In Eq. (4.53), the source terms of
temperature and all species contribute to the source term for each PC.
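A hedged numerical sketch of the source-term projection of Eqs. (4.51)-(4.52) follows. The matrix A, the scaling factors, the source-term values and the thermodynamic constants are random or made-up placeholders; by convention here, row 0 of A holds the temperature weights a_Ti:

```python
import numpy as np

rng = np.random.default_rng(1)
p, q = 5, 2                              # number of species, retained PCs
A = np.linalg.qr(rng.standard_normal((p + 1, p + 1)))[0][:, :q]  # weights a_ki
d = rng.random(p + 1) + 0.5              # scaling factors; d[0] = d_T
omega_k = rng.standard_normal(p)         # species chemical source terms (placeholder)
Q_r = 1.8e6                              # heat released by reaction (illustrative)
rho, c_p = 1.1, 1200.0                   # density, specific heat (illustrative)

# Eq. (4.52): omega_z_i = Q_r a_Ti / (rho c_p d_T) + (1/rho) sum_k omega_k a_ki / d_k
omega_z = Q_r * A[0, :] / (rho * c_p * d[0]) + (omega_k / d[1:]) @ A[1:, :] / rho
```

Every species source term (and the heat release) contributes to each of the q PC source terms through the constant weights.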

4.6.2 PCA Modeling Approach


A complete PCA modeling approach requires several ingredients. First, the PCs
must be identified using the procedure outlined in Section 4.1. This identifica-
tion requires high-fidelity, fully-resolved data including source terms. Once the
PCs are selected, transport equations may be derived for each PC as described
in Section 4.6.1.
Second, the initial conditions (ICs) and boundary conditions (BCs) on the
PCs must be defined using the transformation matrix A. For Dirichlet BCs
on all the original variables, we obtain Dirichlet conditions on the PCs (ICs
are analogously defined). Likewise, Neumann conditions on X yield Neumann
conditions on Z. Mixed conditions on X yield Robin boundary conditions on
Z.
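For instance, a Dirichlet boundary state on the original variables maps to Dirichlet values of the PCs through the same centering, scaling and projection used in the PCA. The means, scaling factors and PC weights below are made-up numbers, not values from this chapter:

```python
import numpy as np

x_mean = np.array([300.0, 0.23, 0.0])    # centering factors (e.g. T, Y_O2, Y_fuel)
d = np.array([500.0, 0.1, 0.05])         # scaling factors
A_q = np.array([[0.7, 0.1],
                [-0.5, 0.6],
                [0.5, 0.8]])             # retained PC weights (illustrative)

x_bc = np.array([800.0, 0.13, 0.02])     # Dirichlet values of the original variables
z_bc = ((x_bc - x_mean) / d) @ A_q       # corresponding Dirichlet values of the PCs
```

A zero-gradient (Neumann) condition on X maps through the same linear transformation to a zero-gradient condition on Z.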
Diffusion terms in the transport equations for Z require evaluation of the
diffusive fluxes for each component of X. In turbulent flow calculations, the
molecular diffusion term is typically augmented by a “turbulent diffusion” term


arising from closure of the convective term. In many cases, and particularly
at high Reynolds number, the molecular diffusion term is small relative to the
turbulent diffusion term and it is neglected. However, even when one wishes
to retain the full description of molecular diffusion, the treatment with PCA is
straightforward. First, X is approximated from Z. Next, the diffusive terms for
X, jX , are constructed. Finally, the diffusive fluxes for the PCs are calculated
as jZ = jX A.
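The three steps just described can be sketched as follows. All names are illustrative, and a single constant diffusivity replaces the full molecular-diffusion treatment for brevity:

```python
import numpy as np

rng = np.random.default_rng(2)
n, p, q = 200, 6, 2
A = np.linalg.qr(rng.standard_normal((p, p)))[0]
A_q = A[:, :q]

Z = rng.standard_normal((n, q))          # transported PCs at n grid points
X_approx = Z @ A_q.T                     # step 1: approximate (scaled) X from Z

grad_X = rng.standard_normal((n, p))     # placeholder for the spatial gradients of X
D = 1.5e-5                               # constant diffusivity (assumption)
j_X = -D * grad_X                        # step 2: diffusive fluxes of X (Fick-like)
j_Z = j_X @ A_q                          # step 3: fluxes of the PCs, j_Z = j_X A
```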
Source terms for the PCs, ω̇Z , can be parametrized by Z and tabulated
a priori to avoid run-time calculation. The accurate parametrization of the
source terms is crucial for the successful application of PCA as a modeling
approach. Therefore, the data adopted for PCs extraction must have source
terms for all X, which is currently impossible to obtain from experimental
data. The decisive step for moving from data analysis (Section 4.5.1) to
predictive modeling is therefore the availability of computational data generated
from reliable chemical mechanisms, using methods such as DNS or ODT. Further-
more, the reliability of PCA as a modeling approach also hinges on the relative
invariance of the PCs from one data set to another that is nearby in parameter
space.

4.6.3 Parametrizing the State Variables


This section presents the results of PCA applied to two DNS data sets of
non-premixed CO/H2 combustion. The DNS data sets (Cases A and B) have
been obtained using a code with 8th-order spatial and 4th-order temporal dis-
cretization. Detailed kinetics of CO/H2 oxidation have been used [114], along
with mixture-averaged transport approximations. The fuel stream is composed
of CO, H2 and N2, with mole fractions of 0.45, 0.05 and 0.50, respectively,
giving a stoichiometric mixture fraction of fst = 0.4375; both the fuel and air
streams are at 300 K.
Case A is a spatially-evolving jet with an initial χmax = 25 s⁻¹, while Case
B is a temporally-evolving jet with an initial χmax = 125 s⁻¹. The primary
difference between the two data sets is the initial scalar dissipation rate (χ)
and turbulence intensity, which affect the degree of extinction observed; Case
A exhibits virtually no extinction, while Case B exhibits moderate extinction.
The existence of moderate extinction in Case B is shown qualitatively in Figure
4.31 (a), which shows T versus χ at fst⁴. Additional details of the DNS code
and simulation configuration may be found elsewhere [120, 88].
To quantify the error in representing the data in the low-dimensional space
parametrized by Z, we calculate the R² value:
\[
R^2 = 1 - \left[\sum_{i=1}^{n} \left(x_{ij} - x^{*}_{ij}\right)^2\right]\left[\sum_{i=1}^{n} \left(x_{ij} - \bar{x}_j\right)^2\right]^{-1} \qquad (4.54)
\]

⁴ The results shown in this Section refer to data conditioned on mixture fraction, f, since
this is a convenient variable to “force” as the first component.



Figure 4.31: Parametrization of temperature at fst by χ (a) and z1 (b) for Case
B. Solid lines are the doubly-conditional mean temperature. R² is calculated
from Eq. (4.54).

where \(x_{ij}\) is the ith observation of the jth variable, \(x^{*}_{ij}\) is its parametrized ap-
proximation, and \(\bar{x}_j\) is the mean of \(x_j\). For the state variables, R² is equivalent
to the parameter \(t_{q,j}\) introduced in Section 4.2.3.1. For the source terms, however,
such a parameter is not available, and R² is calculated directly.
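Eq. (4.54) translates directly into a short routine; the function name is illustrative:

```python
import numpy as np

def r_squared(x, x_star):
    """R^2 of the parametrized approximation x_star of the observations x,
    as defined by Eq. (4.54) for one variable."""
    x = np.asarray(x, dtype=float)
    x_star = np.asarray(x_star, dtype=float)
    ss_res = np.sum((x - x_star) ** 2)       # residual sum of squares
    ss_tot = np.sum((x - x.mean()) ** 2)     # total variance about the mean
    return 1.0 - ss_res / ss_tot
```

A perfect parametrization gives R² = 1, while simply predicting the mean gives R² = 0.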
Figures 4.31 (a) and 4.31 (b) show the parametrization (at fst ) of T by χ
and the first PC, z1 , respectively, for Case B. Examining Figure 4.31 (b), we
see that z1 acts as a progress variable, capturing the extinction process remark-
ably well. This has also been observed for other choices of progress variables
such as CO2 [88]. Comparing the two-parameter PCA approach with the (f, χ)
parametrization is reasonable since both are two-parameter models, although
the second parameter (χ versus z1 ) represents different physical phenomena
(gradient versus chemical state). Figure 4.32 shows the parametrization of the
OH mass fraction by the common (f, χ) and the proposed (f, z1 ) parametriza-
tions. This demonstrates that the PCA approach can be used to represent a
wide range of the state variables, not temperature alone.
Also shown in Figures 4.31 (a) and 4.31 (b) is the R² value as calculated by
Eq. (4.54). Table 4.17 lists R² values for the reconstruction of the temperature
and all species mass fractions as a function of the number of parameters adopted,
q. These values are a concise, quantitative representation of the information
presented graphically in Figures 4.31 and 4.32. For example, for Case B with
q = 1, we obtain R² = 0.967 for temperature, corresponding to Figure 4.31 (b).
For comparison, Table 4.17 also lists the R² values given by the (f, χ)
parametrization, R² = 0.801 (Figure 4.31 (a)). Clearly, the two-parameter
(f, z1) parametrization reconstructs the temperature and most other state
variables much more accurately than the (f, χ) parametrization. It should be
noted that the results for the (f, χ) parametrization represent the best possible
performance of a model based on (f, χ); the steady laminar flamelet model
typically does not perform ideally [88].


Table 4.17: R² values defined by Eq. (4.54). Also shown are results for the χ
parametrization. All results are at f = fst = 0.4375.

     q    T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
  A  χ    0.789  0.344  0.811  0.718  0.165  0.085  0.695  0.839  0.816  0.803  0.827  0.828
     1    0.983  0.259  0.976  0.930  0.240  0.178  0.823  0.986  0.916  0.978  0.956  0.980
     2    0.983  0.936  0.968  0.958  0.963  0.924  0.964  0.980  0.985  0.969  0.976  0.980
  B  χ    0.801  0.509  0.807  0.697  0.426  0.186  0.648  0.665  0.729  0.810  0.058  0.817
     1    0.967  0.370  0.910  0.614  0.736  0.531  0.524  0.940  0.849  0.907  0.094  0.901
     2    0.996  0.845  0.982  0.882  0.931  0.990  0.858  0.974  0.941  0.981  0.378  0.984
     3    0.990  0.904  0.982  0.984  0.979  0.991  0.985  0.977  0.933  0.981  0.854  0.980


Figure 4.32: Parametrization of OH mass fraction at fst by χ (a) and z1 (b)
for Case B. Solid lines are the doubly-conditional mean OH mass fraction. R²
is calculated from Eq. (4.54).

Table 4.17 also demonstrates that increasing the number of retained PCs
increases the accuracy with which the state variables are represented. This
indicates that one may select a desired error threshold and then determine the
minimum number of PCs required to achieve that accuracy. Conversely, one
may choose the number of PCs and estimate a priori the associated error.
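The selection rule suggested above (pick the smallest q meeting an error target) can be sketched as follows, assuming centered and scaled data X_scaled and a full PC matrix A; the function name and the test data are illustrative:

```python
import numpy as np

def min_q_for_threshold(X_scaled, A, threshold=0.95):
    """Smallest number of retained PCs giving R^2 >= threshold
    for every state variable."""
    n, p = X_scaled.shape
    ss_tot = np.sum((X_scaled - X_scaled.mean(axis=0)) ** 2, axis=0)
    for q in range(1, p + 1):
        # reconstruction with q retained PCs
        X_rec = (X_scaled @ A[:, :q]) @ A[:, :q].T
        ss_res = np.sum((X_scaled - X_rec) ** 2, axis=0)
        if np.all(1.0 - ss_res / ss_tot >= threshold):
            return q
    return p
```

Conversely, fixing q and evaluating the per-variable R² values gives the a priori error estimate mentioned in the text.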

4.6.4 Parametrizing Source Terms

The PCs are not conserved variables, and their source terms must be parametrized
by the PCs. In this section we explore the ability of PCA to parametrize
source terms. Any function of X may be approximated from the reduced
representation as F(X) ≈ F(X*), where X* is the reconstruction of X from the
retained PCs. However, it is more accurate to calculate F(X) directly from the data in p-
dimensional space and then project it onto Z by calculating the conditional
mean ⟨F(X) | Z⟩. Thus, source terms are calculated directly from the original
observables, X, and their conditional means are projected onto Z. Figure 4.33
illustrates this for the two-dimensional (f, z1) parametrization of ω̇z1.
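The conditional-mean projection ⟨F(X) | z1⟩ can be sketched with a simple binning estimate; the bin count, the function name and the test data are assumptions for illustration:

```python
import numpy as np

def conditional_mean(z1, f_values, n_bins=20):
    """Binned estimate of the conditional mean <F(X) | z1>:
    average the directly computed source term in bins of z1."""
    edges = np.linspace(z1.min(), z1.max(), n_bins + 1)
    idx = np.clip(np.digitize(z1, edges) - 1, 0, n_bins - 1)
    centers = 0.5 * (edges[:-1] + edges[1:])
    means = np.full(n_bins, np.nan)
    for b in range(n_bins):
        mask = idx == b
        if mask.any():
            means[b] = f_values[mask].mean()
    return centers, means
```

The same idea extends to two conditioning variables, e.g. the (f, z1) parametrization of the figure, by binning in both directions.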
Table 4.18 summarizes the ability of a q-dimensional PCA to parametrize
the source terms of the PCs. We first consider the columns describing the
results at fst. For Case A, a two-dimensional parametrization (f, z1) captures
ω̇z1 with R² = 0.978. For Case B, 3 PCs are required to parametrize ω̇z1 to
a similar degree of accuracy. Comparing the dimensionality requirements for
parametrizing ω̇Z with those for parametrizing the state variables (Table 4.17),
we see that parametrizing the source terms does not require more PCs than the
parametrization of the state variables themselves, an encouraging result.



Figure 4.33: Parametrization of ω̇z1 at fst by z1 for Case B. Solid line: doubly-
conditional mean value of ω̇z1. R² is calculated from Eq. (4.54).

Table 4.18: R² values defined by Eq. (4.54) for the PC source terms, ω̇Z.

              f = 0.2                 f = fst = 0.4375        f = 0.6
     q        1      2      3         1      2      3         1      2      3
  A  ω̇z1    0.993  0.985    -       0.978  0.985    -       0.923  0.934    -
     ω̇z2      -    0.996    -         -    0.922    -         -    0.876    -
  B  ω̇z1    0.270  0.844  0.967     0.815  0.932  0.958     0.809  0.852  0.902
     ω̇z2      -    0.835  0.955       -    0.951  0.961       -    0.883  0.909
     ω̇z3      -      -    0.976       -      -    0.731       -      -    0.831


4.6.5 Global versus Semi-Local PCA


The results presented thus far have been obtained “locally” at fst. One may
consider whether a PCA performed at fst is applicable at other f. We term
this a “semi-local” PCA. If the PCA is highly dependent on mixture fraction,
then one of two options must be considered:
• Eliminate the mixture fraction as a parameter and seek a global PCA on
the entire data set. This approach typically requires more PCs than a
PCA obtained at fst (Section 4.5.1).
• Perform a local PCA (Section 4.5.1) in f -space and derive transport equa-
tions for Z|f . These equations would have exchange terms representing
transport in mixture fraction space. This approach is further complicated
by the fact that the definition of the PCs would vary with f .
If the PCA obtained at fst reasonably represents the data at other f, then
the transport equations derived in Section 4.6.1 may be used directly at all f,
eliminating the need for conditional equations in f-space.
Tables 4.19 and 4.20 provide parametrization errors for the state variables
at f = 0.2 and f = 0.6, respectively. Table 4.18 shows the parametrization
errors for ω̇Z at f = 0.2 and f = 0.6. Interestingly, the parametrizations do
not perform well at lean conditions (especially for Case B); the same is true for
the (f, χ) parametrization. A posteriori testing is necessary to fully determine
the parametrization accuracy required. However, these results show promise
for the ability to use a PCA obtained at fst globally.

4.7 Summary
In the first part of the present Chapter, a novel methodology based on Principal
Components Analysis (PCA) has been proposed for the identification of low-
dimensional manifolds in turbulent flames, the estimation of their dimensional-
ity and the selection of optimal reaction variables. To this purpose, high fidelity
experimental and numerical data sets have been investigated. Three different
PCA approaches have been proposed. A global PCA analysis, GPCA, has been com-
pared to two local PCA models, VQPCA and FPCA, based on the partitioning
of the data into separate clusters where PCA is performed locally.
The partitioning algorithm used by VQPCA is unsupervised and based
on reconstruction error minimization, while FPCA conditions the data a priori
on the mixture fraction. Results show that the local PCA approaches (VQPCA
and FPCA) outperform the global approach in all cases. Indeed, GPCA is un-
able to provide a compact representation of the data in a low-dimensional space
due to the highly non-linear relationships existing among the state variables.
Regarding the local approaches, the performances of VQPCA and FPCA are
comparable for simple jet flames, while FPCA proves unable to capture im-
portant features of systems characterized by complex non-equilibrium phenomena


Table 4.19: R² values at f = 0.2 using the PCA obtained at fst. Also shown are
results for the χ parametrization.

     q    T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
  A  χ    0.097  0.798  0.169  0.774  0.736  0.245  0.827  0.812  0.811  0.580  0.432  0.881
     1    0.500  0.413  0.816  0.212  0.188  0.134  0.319  0.433  0.398  0.666  0.555  0.619
     2    0.968  0.910  0.881  0.868  0.859  0.940  0.888  0.838  0.855  0.867  0.940  0.934
  B  χ    0.497  0.542  0.390  0.303  0.329  0.269  0.558  0.537  0.390  0.417  0.206  0.689
     1    0.979  0.741  0.866  0.337  0.219  0.749  0.127  0.805  0.858  0.859  0.513  0.403
     2    0.996  0.877  0.945  0.819  0.822  0.994  0.806  0.970  0.960  0.958  0.737  0.860
     3    0.990  0.963  0.958  0.989  0.977  0.994  0.978  0.984  0.982  0.968  0.808  0.955

Table 4.20: R² values at f = 0.6 using the PCA obtained at fst. Also shown are
results for the χ parametrization.

     q    T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
  A  χ    0.676  0.190  0.740  0.642  0.548  0.073  0.434  0.741  0.727  0.467  0.572  0.555
     1    0.956  0.287  0.958  0.887  0.587  0.076  0.542  0.966  0.867  0.836  0.868  0.751
     2    0.959  0.962  0.949  0.923  0.775  0.826  0.768  0.955  0.898  0.911  0.919  0.889
  B  χ    0.628  0.081  0.593  0.662  0.112  0.246  0.365  0.508  0.616  0.521  0.268  0.570
     1    0.964  0.134  0.904  0.804  0.197  0.755  0.721  0.938  0.650  0.844  0.442  0.896
     2    0.984  0.612  0.928  0.836  0.373  0.986  0.822  0.960  0.791  0.873  0.542  0.930
     3    0.986  0.769  0.948  0.888  0.543  0.991  0.913  0.967  0.839  0.909  0.841  0.941

(e.g. Flame F, JHC, DNS2), resulting in a higher reconstruction error with
respect to VQPCA.
In the second part of the Chapter, a modeling approach based on PCA has
been proposed and tested a priori using DNS data. This modeling approach
is complete, with the exception of a turbulent closure model which would be
required if this model were used in a LES or RANS context. The model is based
on a rotation of the thermochemical state basis from one based on temperature,
pressure, and Ns − 1 species mass fractions to one which best represents the
variance in the data. Implementation of the model requires transport equa-
tions for the principal components, which are reacting scalars. Results from
a quantitative a priori analysis of this approach using DNS data show great
promise. State variables and source terms both are parametrized well by a
two-parameter (f, z1 ) model, and adding additional parameters provides a
significant increase in accuracy for all state variables and their reaction rates.
Results also indicate a uniformly better representation of the DNS data using
an (f, z1 ) parametrization over the commonly used (f, χ). There are many
potential applications of this modeling approach. For example, laminar flame
calculations could benefit from PCA modeling approaches to provide rapid
solutions using a reduced set of equations. Once a full calculation has been
performed, subsequent calculations may be performed using PCs rather than
the full set of species and energy equations. The number of PCs retained can be
chosen by the desired accuracy. This could be particularly useful for stochastic
models such as the Linear Eddy Model (LEM) [121] and the one-dimensional
turbulence (ODT) model [115, 122], which require many realizations of a flow
field. The first realization could employ full chemistry while subsequent realiza-
tions utilize a reduced set of equations defined by PCA. Another application is
in modeling turbulent flows, where a compact parametrization of the thermo-
chemical state is key to achieving affordable simulations. Additionally, while
the analysis presented herein has been applied to non-premixed combustion, the
PCA approach applies in principle to all combustion regimes from premixed to
non-premixed. For application to turbulent flows, additional closure models
are required for the unresolved convective and source terms. In the context of
transported PDF methods, a PCA modeling approach could drastically reduce
the computational cost by significantly reducing the thermochemical dimen-
sionality while maintaining a quantified error bound on the thermochemical
reduction.
Future work will focus on examining the feasibility of PCA with various
fuels and on exploring the universality of the PCA, i.e. the applicability of a PCA
obtained under one set of conditions to a simulation at another set of conditions.
Also, a posteriori tests will be conducted to determine the effect of the nonlinear
propagation of errors in the source term parametrization.
