• Reduction of the kinetic mechanism [84, 85, 86]. This approach is based
on the analysis of the dominant reaction rates at the conditions of inter-
est and proceeds through the elimination of species and reactions in the
original kinetic mechanism, ultimately leading to a reduced set of species
equations to be solved.
Chapter 4. Principal Components Analysis for turbulence-chemistry
interaction modeling
The present Chapter focuses on the second of these approaches, the parametriza-
tion of the thermochemical state by a small number of parameters based on the
existence of a low-dimensional manifold. Most of the existing models exploiting
the existence of such manifolds are based on the a priori prescription of the
manifold dimensionality. For example, the flamelet model [76, 89] assumes that
the system can be satisfactorily described by means of two parameters. However, such an approach restricts the subspace that thermochemistry may access,
without providing any quantitative error analysis. Indeed, as mixing and reac-
tion timescales increasingly overlap, the dimensionality of a manifold increases,
as does the error associated with a parametrization of fixed dimensionality [88].
This consideration has prompted our interest in the development of a
methodology to automate the selection of the optimal basis for the representa-
tion of the manifolds in thermochemical space. Principal Components Analysis
(PCA) [90, 91] offers this potential, as it provides a rigorous mathematical
formalism for reducing the dimensionality of a data set consisting of a large
number of correlated variables, while retaining most of the variation present in
the original data. The reduction is achieved by transforming to a new set of
variables, called the principal components (PCs), which are uncorrelated and
ordered so that the first few account for most of the variation present in all the
original variables. PCA provides an optimal representation of the system based on q optimal variables, the PCs, which are linear combinations of the $N_s + 1$ primitive variables $T$, $p$, and $Y_i$. The linearity of the method is a particularly
appealing aspect since, once the reaction variables are identified, a few linear
combinations of the original variables could be transported in a numerical simulation, provided that a proper closure for the source terms is employed. Nevertheless, since the reaction variables provided by PCA are not conserved scalars, an a priori assessment of the ability of the principal components to parametrize their source terms is a mandatory requirement for the generated manifold method. One of the main advantages of PCA lies in the potential to obtain the principal components from a target system and to apply them to a similar system. This potential could remove one of the main drawbacks which, to date, affect PCA: the derivation of the manifold model via PCA requires the availability of a data set for the extraction of the principal components.
4.1. Definition and derivation of Principal Components
\[ z_1 = X a_1. \qquad (4.1) \]
To determine $z_1$, a vector $a_1$ is sought so that $\mathrm{var}(z_1) = a_1' \Sigma a_1$ is maximized, subject to the constraint $a_1' a_1 = 1$. If we adopt the standard approach of Lagrange multipliers to solve this constrained problem, we need to maximize:
\[ a_1' \Sigma a_1 - \lambda \left( a_1' a_1 - 1 \right) \qquad (4.2) \]
where $\lambda$ is a Lagrange multiplier. Differentiating with respect to $a_1$ gives:
\[ a_1' \Sigma a_1 = a_1' \lambda a_1 = \lambda a_1' a_1 = \lambda = \lambda_1. \qquad (4.4) \]
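As a numerical illustration of Eqs. (4.1)-(4.4), the following minimal sketch searches for the unit vector $a_1$ that maximizes $a_1' \Sigma a_1$ and checks that the attained variance coincides with the largest eigenvalue. The covariance values and the function name `leading_eigenpair` are hypothetical, and power iteration is used only to keep the example self-contained; it is not a method prescribed by the text.

```python
import math

def leading_eigenpair(sigma, iters=200):
    """Power iteration for the dominant eigenpair of a symmetric PSD matrix."""
    n = len(sigma)
    a = [1.0 / math.sqrt(n)] * n              # unit-norm starting vector
    for _ in range(iters):
        # Multiply by sigma, then renormalize so that a'a = 1 holds.
        w = [sum(sigma[i][j] * a[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        a = [x / norm for x in w]
    # Rayleigh quotient a' sigma a, i.e. var(z1) for the unit vector a.
    lam = sum(a[i] * sigma[i][j] * a[j] for i in range(n) for j in range(n))
    return lam, a

# Hypothetical 2 x 2 covariance matrix, eigenvalues (5 +/- sqrt(5)) / 2.
sigma = [[3.0, 1.0], [1.0, 2.0]]
lam1, a1 = leading_eigenpair(sigma)
```

Here `lam1` plays the role of $\lambda_1$ in Eq. (4.4) and `a1` that of the first vector of coefficients.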
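The second PC, $z_2 = X a_2$, maximizes the variance $\mathrm{var}(z_2) = a_2' \Sigma a_2$ subject to the constraints $\mathrm{cov}(X a_1, X a_2) = 0$ ($z_1$ and $z_2$ uncorrelated) and $a_2' a_2 = 1$. Since
\[ \mathrm{cov}(X a_1, X a_2) = a_1' \Sigma a_2 = a_2' \Sigma a_1 = \lambda_1 a_1' a_2 = \lambda_1 a_2' a_1, \qquad (4.5) \]
\[ a_1' \Sigma a_2 = 0, \quad a_2' \Sigma a_1 = 0, \quad a_1' a_2 = 0, \quad a_2' a_1 = 0 \qquad (4.6) \]
\[ a_2' \Sigma a_2 - \lambda \left( a_2' a_2 - 1 \right) - \phi a_2' a_1 \qquad (4.7) \]
\[ a_1' \Sigma a_2 - \lambda a_1' a_2 - \phi a_1' a_1 = 0 \qquad (4.8) \]
which reduces to
\[ \phi = 0, \qquad (4.9) \]
since
\[ a_1' \Sigma a_2 = \lambda a_1' a_2 \qquad (4.10) \]
due to the constraint of $z_1$ and $z_2$ being uncorrelated. Then, Eq. (4.8) reduces to: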
4.2. Sample PCA
\[ z_{i1} = X_i a_1, \quad i = 1, 2, \ldots, n \qquad (4.12) \]
where the vector of coefficients $a_1$ is chosen to maximize the variance
\[ \frac{1}{n-1} \sum_{i=1}^{n} \left( z_{i1} - \bar{z}_1 \right)^2, \qquad (4.13) \]
subject to the constraint $a_1' a_1 = 1$. Then, for the second PC:
\[ z_{i2} = X_i a_2, \quad i = 1, 2, \ldots, n \qquad (4.14) \]
where $a_2$ is chosen to maximize the sample variance of $z_{i2}$, subject to the constraints $a_2' a_2 = 1$ and $\mathrm{cov}(X a_1, X a_2) = 0$.
Continuing the process in an obvious manner, zk = Xak is the kth sam-
ple PC (k = 1, 2, . . . , p) and zik is the score for the ith observation on the
kth sample PC. If the derivation of Section 4.1 is followed, but replacing popu-
lation quantities with sample variances and covariances, then it turns out that
the sample variance of the kth sample PC is lk , the kth largest eigenvalue of
the sample covariance matrix S, and that ak is the corresponding eigenvector
of S.
Let Z be the (n x p) matrix of PC scores, with (i, k)th element equal to $z_{ik}$; then, Z and X are related by
\[ Z = XA \qquad (4.15) \]
\[ S = \frac{1}{n-1} X' X. \qquad (4.16) \]
Recalling the eigenvector decomposition of a symmetric, non-singular matrix,
S can be decomposed as:
\[ S = A L A' \qquad (4.17) \]
where L is a (p x p) diagonal matrix containing the eigenvalues of S in descending order, $l_1 > l_2 > \ldots > l_p$.
¹ The matrix S represents the approximation of Σ for a finite population, i.e. the random sample consisting of n observations for p variables.
The linear transformation given by Eq. (4.15) simply recasts the original variables into a set of new uncorrelated variables, whose coordinate axes are described by A. Then, the original variables can be stated as a function of the PCs as:
\[ X = Z A' \qquad (4.18) \]
since A is orthonormal and, hence, $A^{-1} = A'$. This means that, given Z, the values of the original variables can be uniquely recovered. However, the main objective of PCA is to replace the p elements of X with a much smaller number, q, of PCs, which nevertheless discard only a small fraction of the variance originally contained in the data. If a subset of size $q \ll p$ is used, the truncated PCs are defined as:
\[ Z_q = X A_q. \qquad (4.19) \]
Eq. (4.19) can be inverted to obtain:
\[ X_q = Z_q A_q'. \qquad (4.20) \]
The linear transformation provided by Eq. (4.20) is particularly appealing
for size reduction in multivariate data analysis due to some optimal properties
of the PCA transformation, described in the following text.
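The truncation and reconstruction of Eqs. (4.19)-(4.20) can be sketched for a two-variable sample as follows. The data values and the helper names `pca_2d` and `reconstruct` are hypothetical, and the closed-form eigen-decomposition is valid only for the 2 x 2 case assumed here. With q = 1 the mean squared residual (normalized by n - 1) should equal the discarded eigenvalue, while q = 2 recovers the data exactly.

```python
import math

def pca_2d(X):
    """Eigen-decomposition of the sample covariance of centered n x 2 data."""
    n = len(X)
    a = sum(x[0] * x[0] for x in X) / (n - 1)
    b = sum(x[0] * x[1] for x in X) / (n - 1)
    c = sum(x[1] * x[1] for x in X) / (n - 1)
    theta = 0.5 * math.atan2(2.0 * b, a - c)   # direction of largest variance
    cs, sn = math.cos(theta), math.sin(theta)
    A = [[cs, -sn], [sn, cs]]                  # columns are the eigenvectors
    mean, half = 0.5 * (a + c), math.hypot(0.5 * (a - c), b)
    return [mean + half, mean - half], A

def reconstruct(X, A, q):
    """Z_q = X A_q followed by X_q = Z_q A_q' (Eqs. 4.19-4.20)."""
    Xq = []
    for x in X:
        z = [x[0] * A[0][k] + x[1] * A[1][k] for k in range(q)]
        Xq.append([sum(z[k] * A[j][k] for k in range(q)) for j in range(2)])
    return Xq

# Hypothetical centered sample (each column has zero mean).
X = [[2.0, 1.0], [-2.0, -1.0], [1.0, 1.0], [-1.0, -1.0], [0.5, -0.5], [-0.5, 0.5]]
l, A = pca_2d(X)
X1 = reconstruct(X, A, q=1)
X2 = reconstruct(X, A, q=2)
# Squared residual of the rank-1 approximation, normalized by n - 1.
resid = sum((x[j] - y[j]) ** 2 for x, y in zip(X, X1) for j in range(2)) / (len(X) - 1)
```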
This result shows that we can decompose the whole covariance matrix into decreasing contributions due to each PC.
\[ \tilde{x}_j = \frac{x_j - \bar{x}_j}{d_j} \qquad (4.22) \]
\[ \tilde{X} = \left( X - \bar{X} \right) D^{-1} \qquad (4.23) \]
where D is the diagonal matrix containing the scaling parameters, dj . When
scaling is applied, Eqs. (4.15)-(4.19) are modified as:
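\[ \begin{aligned} S &= \frac{1}{n-1} D^{-1} \left( X - \bar{X} \right)' \left( X - \bar{X} \right) D^{-1} \\ Z &= \left( X - \bar{X} \right) D^{-1} A \\ Z_q &= \left( X - \bar{X} \right) D^{-1} A_q \end{aligned} \qquad (4.24) \]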
The choice of the scaling parameters is very important, and has a potentially
strong impact on the resulting eigenvectors. The following choices are available:
1. Auto scaling, also called unit variance scaling. It is commonly applied and uses the standard deviation, $s_j$, as the scaling factor. After auto scaling, all the variables in X have a standard deviation equal to 1 and therefore the data is analyzed on the basis of correlations instead of covariances.
2. Vast scaling [97]. Vast is an acronym of variable stability scaling and it is an extension of auto scaling. It focuses on stable variables, i.e. the variables that do not show strong variation, using the standard deviation and the so-called coefficient of variation as scaling factors. The use of the coefficient of variation, defined as the ratio of the standard deviation to the mean, $s_j / \bar{x}_j$, results in a higher importance for variables with a small relative standard deviation.
3. Range scaling. Range scaling adopts the difference between the minimal
and the maximal value, (xj,max − xj,min ), as scaling factor. A disadvan-
tage of range scaling with respect to other scaling methods is that only
two values are used to estimate the range, while for the standard devia-
tion all measurements are taken into account. This makes range scaling
more sensitive to outliers. To increase the robustness of range scaling,
the range could also be determined by using robust range estimators or
after the outliers have been removed.
4. Level scaling. The mean values of the variables, $\bar{x}_j$, are used as scaling factors. Level scaling converts deviations from the mean (the mean is always subtracted) into percentages of the mean values. Like range scaling, level scaling can be affected by outliers; a more robust estimator of the mean, the median, could then be used, or the mean could be determined after outlier removal. Level scaling can be used when large relative changes are of specific interest. However, in the case of the thermochemical state of a system, this could lead to an overestimation of the role of chemical species which appear in very small concentrations, i.e. radicals.
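The scaling options listed above can be sketched for a single variable as follows. The function names and the temperature values are hypothetical. The vast factor is written here as the standard deviation multiplied by the coefficient of variation, which is one reading of the description given above and should be treated as an assumption.

```python
import statistics

def scaling_factor(column, method):
    """Return the scaling factor d_j for one variable (one column of X)."""
    mean = statistics.fmean(column)
    std = statistics.stdev(column)             # sample standard deviation (n - 1)
    if method == "auto":
        return std
    if method == "range":
        return max(column) - min(column)
    if method == "level":
        return mean
    if method == "vast":
        # Assumed form: standard deviation times coefficient of variation.
        return std * (std / mean)
    raise ValueError(f"unknown scaling method: {method}")

def center_and_scale(column, method):
    """Apply Eq. (4.22): subtract the mean, then divide by d_j."""
    mean = statistics.fmean(column)
    d = scaling_factor(column, method)
    return [(x - mean) / d for x in column]

temperature = [300.0, 900.0, 1500.0, 2100.0]   # hypothetical values, K
scaled = center_and_scale(temperature, "auto")
```

After auto scaling the sample standard deviation of `scaled` is 1, which is why the analysis is then effectively carried out on correlations.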
As pointed out in the above discussion, range, level and max scaling can be affected by the presence of outliers in the sample X. This problem can be overcome by means of robust estimators of the quantities of interest (e.g. the median in place of the sample average); alternatively, a procedure for the detection and removal of outlier observations could provide a viable solution to the problem.
PCA can be effectively employed for outlier detection in large data sets.
The first few principal components have large variances and explain most of
the variation in X. Therefore, these major components are strongly affected
by variables with relatively large variances and covariances. Consequently, the
observations that are outliers with respect to the first few components usually
correspond to outliers on one or more of the original variables. On the other
hand, the last few principal components represent linear functions of the orig-
inal variables with minimal variance. These components are sensitive to the
observations that are inconsistent with the correlation structure of the data but
are not outliers with respect to the original individual variables.
Based on the above considerations, the following detection scheme can be
proposed [98]:
1. Multivariate trimming. A fraction γ (0.005-0.01) of the data points characterized by the largest value of $D_M$ is classified as outliers and removed. New trimmed estimators for $\bar{X}$ and S are then computed from the remaining observations. The trimming process can be iterated to ensure that $\bar{X}$ and S are resistant to outliers.
2. Principal components classifier (PCC). The PCC consists of two functions, one from the major PCs, $\sum_{k=1}^{q} z_{ik}^2 / l_k$, and one from the minor PCs, $\sum_{k=p-r+1}^{p} z_{ik}^2 / l_k$. The first function can easily detect observations with large values on some of the original variables; in addition, the second function helps detect the observations that do not conform to the correlation structure of the sample. The number of major components, q, is determined to explain about 50% of the original data variance, while r is chosen so that the minor components used for the definition of the PCC are those whose variance is less than 0.20, thus indicating the existence of almost linear relations among the variables. Based on the PCC definition, an observation $X_i$ is classified as outlier if:
\[ \sum_{k=1}^{q} \frac{z_{ik}^2}{l_k} > c_1 \quad \text{or} \quad \sum_{k=p-r+1}^{p} \frac{z_{ik}^2}{l_k} > c_2 \qquad (4.27) \]
where $c_1$ and $c_2$ are chosen as the 0.99 quantile of the empirical distributions of $\sum_{k=1}^{q} z_{ik}^2 / l_k$ and $\sum_{k=p-r+1}^{p} z_{ik}^2 / l_k$.
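The PCC test of Eq. (4.27) can be sketched as below, assuming that an observation is flagged when either of the two sums exceeds its threshold. The function name `pcc_outliers`, the simple order-statistic quantile and the scores are hypothetical; a quantile of 0.8 is used in the example only because the illustrative sample has ten observations.

```python
def pcc_outliers(Z, l, q, r, quantile=0.99):
    """Flag observations using the two PCC sums of z_ik^2 / l_k."""
    p = len(l)
    # Major PCs: k = 1..q; minor PCs: k = p-r+1..p (zero-based slices below).
    major = [sum(z[k] ** 2 / l[k] for k in range(q)) for z in Z]
    minor = [sum(z[k] ** 2 / l[k] for k in range(p - r, p)) for z in Z]

    def empirical_quantile(values):
        # Simple order-statistic estimate; other quantile definitions exist.
        ordered = sorted(values)
        index = min(len(ordered) - 1, int(quantile * len(ordered)))
        return ordered[index]

    c1, c2 = empirical_quantile(major), empirical_quantile(minor)
    return [a > c1 or b > c2 for a, b in zip(major, minor)]

# Hypothetical scores on p = 3 PCs with eigenvalues l.
l = [4.0, 1.0, 0.1]
Z = [[0.5, 0.0, 0.1], [-0.5, 0.0, -0.1], [1.0, 0.0, 0.05], [-1.0, 0.0, -0.05],
     [1.5, 0.0, 0.2], [-1.5, 0.0, -0.2], [2.0, 0.0, 0.15], [-2.0, 0.0, -0.15],
     [8.0, 0.0, 0.0],      # large score on the first (major) PC
     [0.2, 0.0, 1.5]]      # large score on the last (minor) PC
flags = pcc_outliers(Z, l, q=1, r=1, quantile=0.8)
```

In this example only the last two observations are flagged, one by each of the two functions.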
Figure 4.1: Principal components scores with (a) and without (b) outliers.
Figure 4.2: Eigenvalues size with (a) and without (b) outliers.
\[ t_{q,j} = \sum_{k=1}^{q} \left( \frac{a_{jk} \sqrt{l_k}}{s_j} \right)^2 \qquad (4.30) \]
where ajk is the weight of the jth variable on the kth eigenvector and sj is the
standard deviation of variable xj .
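A short sketch of Eq. (4.30) follows. The covariance matrix, its eigenpairs and the function name `individual_variance` are hypothetical and chosen so that the result can be checked by hand: retaining all PCs must recover the full variance of every variable.

```python
import math

def individual_variance(A, l, s, q):
    """t_{q,j} of Eq. (4.30) for every variable j, using the first q PCs."""
    p = len(s)
    return [sum((A[j][k] * math.sqrt(l[k]) / s[j]) ** 2 for k in range(q))
            for j in range(p)]

# Hypothetical covariance matrix S = [[2, 1], [1, 2]]:
# eigenvalues 3 and 1, eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2).
r = 1.0 / math.sqrt(2.0)
A = [[r, r], [r, -r]]                      # columns are the eigenvectors
l = [3.0, 1.0]
s = [math.sqrt(2.0), math.sqrt(2.0)]       # standard deviations, sqrt(diag(S))
t1 = individual_variance(A, l, s, q=1)     # 0.75 for both variables
t2 = individual_variance(A, l, s, q=2)     # 1.0 for both variables
```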
\[ l_q^* = \frac{1}{p} \sum_{k=1}^{q} \frac{1}{k}. \qquad (4.31) \]
This method actually compares the eigenvalues from the observed sample
with the eigenvalues from random data. Based on Eq. (4.31), the observed
eigenvalues are considered interpretable if they exceed lq∗ .
\[ Q = V U', \qquad \hat{Z}' Z_q = U \Sigma V' \qquad (4.33) \]
where Σ is the matrix of singular values from the SVD of $\hat{Z}' Z_q$.
² In statistics, Procrustes analysis is a form of statistical shape analysis used to analyze the distribution of a set of shapes.
• McCabe criteria [106]. This approach originates from the observation that
the PCs satisfy a certain number of optimality criteria (Section 4.2.1). A
subset of the original variables that optimizes one of these criteria is
called a set of principal variables by McCabe [106]. Suppose that the
set of variables of X is partitioned into subsets X (1) and X (2) . The
covariance matrix of X can be partitioned as:
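\[ S = \begin{pmatrix} S_{11} & S_{12} \\ S_{21} & S_{22} \end{pmatrix}. \qquad (4.34) \]
Then, the partial covariance matrix for $X^{(2)}$ given $X^{(1)}$ is:
\[ S_{22,1} = S_{22} - S_{21} S_{11}^{-1} S_{12}. \qquad (4.35) \]
The criteria proposed by McCabe [106] for the definition of the principal variables are:
\[ \begin{array}{ll} \text{MC1} & \max |S_{11}| = \min |S_{22,1}| = \min \prod_{k=1}^{m} \delta_k \\ \text{MC2} & \min \mathrm{tr}\left( S_{22,1} \right) = \min \sum_{k=1}^{m} \delta_k \\ \text{MC3} & \min \| S_{22,1} \|^2 = \min \sum_{k=1}^{m} \delta_k^2 \\ \text{MC4} & \max \sum_{k=1}^{r} \rho_k^2, \text{ with } r = \min (m, p - m) \end{array} \qquad (4.36) \]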
where δk are the eigenvalues of S 22,1 and ρk are the canonical correla-
tions between the selected and not selected variables. As McCabe [106]
points out, after the selection of the PVs, S 22,1 represents the information
left in the remaining unselected variables and, then, it is quite plausible
that three of the optimality criteria should be functions of this matrix.
McCabe's criteria [106] are very appealing as they satisfy well-defined properties. For instance, criterion MC1 maximizes the variance of the data explained by the subset of variables, while MC2 and MC3 both minimize the reconstruction error. However, the criteria rapidly become computationally infeasible for very large data sets.
Bq = Aq T . (4.37)
Kaiser [110] refers to this as raw VARIMAX, but it is the version that has
become most popular. Verbally, this is simply the sum of the column-wise
variances of the squared elements of Aq . In other words, a criterion is defined
to maximize the amount of variance explained for any of the original variables
on single PCs. After VARIMAX rotation, Aq will generally have fewer large
loadings in its columns, thereby making the columns more easily interpretable.
A simple analytical solution for the maximization of the criterion in Eq. (4.38) exists for the two-dimensional case [110]. Indicating the columns of $A_q$ with k and l, the two-dimensional solution is:
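\[ \begin{aligned} t &= 2p \sum_{i=1}^{p} \left( a_{ik}^2 - a_{il}^2 \right) \left( 2 a_{ik} a_{il} \right) - 2 \sum_{i=1}^{p} \left( a_{ik}^2 - a_{il}^2 \right) \sum_{i=1}^{p} \left( 2 a_{ik} a_{il} \right), \\ b &= p \sum_{i=1}^{p} \left[ \left( a_{ik}^2 - a_{il}^2 \right)^2 - \left( 2 a_{ik} a_{il} \right)^2 \right] - \left\{ \left[ \sum_{i=1}^{p} \left( a_{ik}^2 - a_{il}^2 \right) \right]^2 - \left[ \sum_{i=1}^{p} \left( 2 a_{ik} a_{il} \right) \right]^2 \right\}, \end{aligned} \qquad (4.39) \]
\[ \phi = \frac{1}{4} \arctan (t, b). \qquad (4.40) \]
The optimal rotation matrix is then given by:
\[ B_2 = \begin{pmatrix} \cos (\phi) & -\sin (\phi) \\ \sin (\phi) & \cos (\phi) \end{pmatrix}. \qquad (4.41) \]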
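The two-dimensional rotation of Eqs. (4.39)-(4.41) can be sketched as follows, reading arctan(t, b) as the two-argument arctangent. The loadings are hypothetical: a perfectly simple structure (the identity) deliberately rotated by 0.3 rad, so that the expected result is a rotation back by the same angle. The function names are illustrative, and `raw_varimax` assumes the verbal definition given above (sum of the column-wise variances of the squared loadings).

```python
import math

def varimax_angle(A):
    """Rotation angle phi for a p x 2 loading matrix (columns k and l)."""
    p = len(A)
    u = [a[0] ** 2 - a[1] ** 2 for a in A]
    v = [2.0 * a[0] * a[1] for a in A]
    t = 2.0 * p * sum(ui * vi for ui, vi in zip(u, v)) - 2.0 * sum(u) * sum(v)
    b = (p * sum(ui ** 2 - vi ** 2 for ui, vi in zip(u, v))
         - (sum(u) ** 2 - sum(v) ** 2))
    return 0.25 * math.atan2(t, b)

def rotate(A, phi):
    """Return A B_2, with B_2 as in Eq. (4.41)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[a[0] * c + a[1] * s, -a[0] * s + a[1] * c] for a in A]

def raw_varimax(A):
    """Sum over the two columns of the variance of the squared loadings."""
    p = len(A)
    total = 0.0
    for k in range(2):
        sq = [a[k] ** 2 for a in A]
        mean = sum(sq) / p
        total += sum((x - mean) ** 2 for x in sq) / p
    return total

# Hypothetical loadings: identity structure rotated by theta = 0.3 rad.
theta = 0.3
A = [[math.cos(theta), -math.sin(theta)], [math.sin(theta), math.cos(theta)]]
phi = varimax_angle(A)
B = rotate(A, phi)
```

For this input the computed angle is -0.3 rad and the rotated loadings return to the identity, where each variable loads on a single column.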
4.3. Local Principal Components Analysis
where $\bar{X}^{(k)}$ is the kth cluster centroid, $X_{i,q}$ is the rank-q approximation of $X_i$, $Z_{i,q}$ is the ith value of the truncated set of PCs, $Z_q$, and $A_q^{(k)}$ is the matrix obtained by retaining only the first q eigenvectors of the covariance matrix, $S^{(k)}$, associated to the kth cluster. In the context of reacting systems Eq. (4.42)
needs to be modified to take into account the differences in size and units of
the state variables. In fact, a clustering based on GRE would lead to an op-
timization with respect to temperature only. Therefore, the original LPCA
algorithm from Kambhatla and Leen [111] was modified [92] to include data
preprocessing (Section 4.2.2) in the quantization scheme. A very stable algo-
rithm is obtained by using a global scaled reconstruction error metric, GSRE ,
defined as:
\[ \mathrm{GSRE}\left( X_i, \bar{X}^{(k)}, D \right) = \left\| \tilde{X}_i - \tilde{X}_{i,q} \right\| \qquad (4.43) \]
where $\tilde{X}_i$ is the ith observation of the sample scaled by D, the diagonal matrix whose jth diagonal element is the scaling factor $d_j$ associated to $x_j$. The proposed LPCA algorithm, referred to briefly as VQPCA, can be summarized as follows:
1. Initialization: the cluster centroids, $\bar{X}^{(k)}$, are randomly chosen from the data set and $S^{(k)}$ is initialized to the identity matrix for each cluster.
The VQPCA algorithm is illustrated in Figure 4.5. The vector quantization step partitions the data into clusters, trying to follow the curvature of the manifold in the low-dimensional space. Then, the points are assigned to the clusters depending on their low-dimensional projection on each of the identified clusters.
The goodness of reconstruction given by VQPCA is measured with respect to the mean variance in the data as:
\[ \mathrm{GSRE,n} = \frac{E \left( \mathrm{GSRE} \right)}{E \left[ \mathrm{var} \left( \tilde{x}_j \right) \right]} \qquad (4.44) \]
where E denotes the expectation operator and $\tilde{x}_j$ is the scaled jth variable from X. If auto scaling is employed in data preprocessing, Eq. (4.44) reduces to:
Figure 4.6: Schematic illustration of the FPCA algorithm [92] for a CO/H2 flame [112].
4.4. Data sets for model validation
fact, the flame does not experience any liftoff or localized extinction and retains
the simple flow geometry of the hydrogen jet flames [80], while adding a modest
level of chemical kinetic complexity. Moreover, the flame is fully characterized
in terms of scalar data. Simultaneous Raman/Rayleigh/LIF measurements
of temperature and species concentrations were conducted at Sandia National
Laboratories, California. About 800 to 1000 measurements were taken at different spatial locations, for a total of 66,275 data points of nine different state
variables (T, N2 , O2 , H2 O, H2 , CO, CO2 , OH, NO). The mixture fraction
calculated following Bilger [75] is also available in the experimental results.
The stoichiometric value of the mixture fraction for this flame is 0.295. The
experimental uncertainties can be obtained from [112].
The second flame is a CH4 flame, part of a series of four piloted jet flames, C, D, E and F, investigated by Barlow and Frank [99]. Starting from Flame D, the velocity, and hence the Reynolds number, of the main jet is raised, thus increasing the probability of local extinction phenomena. The flames of interest for the present study are Flame D and, in particular, Flame F. Flame F shows severe non-equilibrium effects and is close to global extinction in the downstream part of the flame; therefore, it represents a challenging system to judge the capabilities of PCA in terms of identification and parametrization of chemical manifolds. As for the jet flame, Raman/LIF measurements of temperature and species concentrations (T, N2, O2, H2O, H2, CH4, CO, CO2, OH, NO) are provided at different spatial locations, for a total of 62,766 data points. The mixture fraction is defined according to Bilger [75], except that only the elemental mass fractions of hydrogen and carbon are included. The stoichiometric mixture fraction for this flame is 0.351. The experimental uncertainties can be obtained from [99, 113].
The third system investigated is the jet in hot co-flow (JHC) burner [46, 49, 50], hereafter denoted JHC, designed to emulate the flameless combustion regime (Section 1). It consists of a central fuel jet (80% CH4 and 20% H2) within an annular co-flow of hot exhaust products from a secondary burner mounted upstream of the jet exit plane. The O2 level of the co-flow is controlled at three different levels, i.e. 3, 6 and 9% (by vol.), while the temperature and exit velocity are kept constant. Similarly to the other flames, around 56,000 observations are provided for temperature and species concentrations (T, N2, O2, H2O, H2, CH4, CO, CO2, OH, NO). The mixture fraction is defined according to Bilger [75]. The experimental uncertainties can be obtained from [46]. The availability of experimental data for the JHC system has been particularly important for the present Thesis, as it has provided insight for the CFD analysis of the combustion systems investigated in Chapters 5-7. In particular, information regarding turbulence/chemistry interactions in the flameless combustion regime has been extracted from the PCA analysis.
4.5 Results
This section presents the results of the PCA methodology applied to the experimental and numerical data sets described in Section 4.4. First, the capability of PCA to identify low-dimensional manifolds in turbulent reacting systems is investigated. In particular, the effect of the preprocessing strategies and modeling approaches (i.e. GPCA vs. LPCA) on the manifold dimensionality is thoroughly discussed, together with an attempt to provide a physical interpretation of the extracted PCs.
Then, the feasibility of a PCA-based combustion model is discussed. The PCA model is validated a priori using the DNS data sets and its performance is compared to that of an ideal flamelet parametrization (Chapter 2).
data of opposed jet flames. The analysis was aimed at identifying the number of components required to accurately approximate the original data. For this purpose, the correlations among velocities, pressure and species concentrations at different times were taken into account, thus leading to eigenvectors which are linear combinations of the temporal snapshots considered. Similarly,
Danby and Echekki [117] implemented PCA for the analysis of an unsteady
two-dimensional direct numerical simulation of auto ignition in homogeneous
hydrogen air mixtures, with the main purpose of determining the requirements
to reproduce passive and reactive scalars during the process of auto ignition.
The approach presented here is quite different from the ones described above.
The main purpose of the developed PCA methodology is to find correlations
among the state variables (temperature and species concentration) to allow
an optimal approximation of the system in a low-dimensional space. Such an
approach leads to the determination of eigenvectors which are linear combina-
tions of the original variables in a way that allows reducing the dimension of the
system. A similar method was proposed by Maas and Thévenin [118] for the
analysis of DNS data. However, they only considered a very small sampling in
state space. The current study provides significantly more depth in its analysis,
and applies PCA to both experimental and numerical data sets.
Figure 4.7 shows the magnitude of the eigenvalues associated with the PCA reduction of the jet flame data set, together with the contribution of the q largest eigenvalues to the amount of variance explained by the new basis vectors. The eigenvalue distribution reflects the covariance structure of the data set, shown in Table 4.1 and obtained by applying the auto scaling criterion. It is clear that the first two eigenvalues alone account for more than 92% of the total variance in the data. On the other hand, the four smallest eigenvalues are very close to zero; they contain no useful information and only explain linear dependencies among the original variables. Therefore, a strong size reduction, from 9 to 2 or 3, can be accomplished by using PCA, through the identification of the most active directions in the original data. The total variance, $t_q$, and the individual variances, $t_{q,j}$, accounted for by the first two or three eigenvalues for the jet flame are listed in the first two columns of Table 4.2. It can be observed that, by choosing q = 2, it is possible to capture more than 90% of the individual variances of all the main species and temperature, while the minor species, OH and NO, require an additional component, q = 3, to reach levels of approximation comparable to the other state variables.
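The selection of q from the eigenvalue spectrum can be sketched as a cumulative sum. The eigenvalues below are hypothetical (they are not those of the jet flame data set) and the function names are illustrative; the threshold of 90% mirrors the level discussed above.

```python
def explained_variance(eigenvalues):
    """Cumulative fraction t_q of the total variance for q = 1, ..., p."""
    total = sum(eigenvalues)
    running, fractions = 0.0, []
    for l in sorted(eigenvalues, reverse=True):
        running += l
        fractions.append(running / total)
    return fractions

def components_needed(eigenvalues, threshold):
    """Smallest q whose cumulative fraction reaches the threshold."""
    for q, t in enumerate(explained_variance(eigenvalues), start=1):
        if t >= threshold:
            return q
    return len(eigenvalues)

# Hypothetical eigenvalues of a 4-variable correlation matrix (they sum to 4).
eigs = [2.8, 0.9, 0.2, 0.1]
t = explained_variance(eigs)           # approximately [0.7, 0.925, 0.975, 1.0]
q90 = components_needed(eigs, 0.90)    # 2
```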
This is confirmed by the analysis of the parity plots of temperature and species mass fractions given by the PCA reconstruction for the cases q = 2 (Figure 4.8) and q = 3 (Figure 4.9). It can be observed that the addition of a component has a small effect on temperature and main species, whose variation is mainly explained by the first two components (Table 4.2). Moreover, the parity plots of temperature (Figures 4.8 and 4.9 (a)), H2O mass fraction (Figures 4.8 and 4.9 (d)) and minor species such as OH and NO (Figures 4.8 and 4.9 (e, f)) point out the existence of nonlinear deviations in the recovered data, which can probably be ascribed to nonlinear dependencies among the original variables. This result suggests that the low-dimensional projection of the thermochemical state shows significant nonlinearities which cannot be taken into account with a global linear approach. Therefore, specific algorithms performing PCA in locally linear regions of the data (Section 4.3) could be considered to improve the accuracy of the parametrization.
Figure 4.7: Scree-graph and histograms of the q largest eigenvalues for the jet flame data set, preprocessed with auto scaling.
Figure 4.10 shows the eigenvalue size distribution and the contribution of the q largest eigenvalues to the total explained variance, $t_q$, for the Flame D, Flame F and JHC data sets. The covariance matrices for the data sets are shown in Tables 4.3-4.5. Similarly to the jet flame, a significant size reduction can be achieved for Flames D and F, although an additional component is required, q = 3 or q = 4, due to the higher complexity of the piloted flames (Section 4.4.1). On the other hand, the JHC data set shows a higher dimensionality and at least 4 components are needed to explain as much as 90% of the total variance in the original data. The number of required PCs, q, increases to 5 if an individual variance, $t_{q,j}$, above 90% is desired for all the variables, as indicated in Table 4.6. This result is particularly interesting for the present Thesis, as it confirms the complexity of the numerical modeling of the flameless combustion regime [45, 78, 79], caused by the overlap between chemical and mixing scales and, thus, by the need for optimal progress variables for the description of the complex interactions which take place in such a regime.
Table 4.6 lists the values of $t_q$ and $t_{q,j}$ for Flame D, Flame F and JHC. It is interesting to observe the very strong similarities between Flames D and F, confirmed by the analysis of their covariance structure (Tables 4.3 and 4.4), thus indicating that the relations between the state variables are not strongly affected by the increase in Reynolds number from one flame to the other.
Table 4.1: Covariance matrix for the jet flame data set. Scaling criterion adopted: auto scaling.
             T    Y_O2    Y_N2    Y_H2   Y_H2O    Y_CO   Y_CO2    Y_OH    Y_NO
T        1.000  −0.825  −0.512   0.005   0.938   0.117   0.984   0.771   0.815
Y_O2             1.000   0.887  −0.541  −0.909  −0.646  −0.767  −0.562  −0.558
Y_N2                     1.000  −0.835  −0.667  −0.902  −0.438  −0.266  −0.256
Y_H2                             1.000   0.196   0.973  −0.082  −0.168  −0.170
Y_H2O                                    1.000   0.329   0.892   0.725   0.678
Y_CO                                             1.000   0.024  −0.081  −0.113
Y_CO2                                                    1.000   0.793   0.855
Y_OH                                                             1.000   0.639
Y_NO                                                                     1.000
Table 4.2: Total, $t_q$, and individual variance, $t_{q,j}$, accounted for the jet flame data set, as a function of the number of retained PCs, q, and the preprocessing criterion. Values are expressed as fractions.
            auto           range            max             vast           level
           q=2     q=3     q=2     q=3     q=2     q=3     q=2     q=3     q=2     q=3
T        0.971   0.973   0.983   0.991   0.979   0.990   0.992   0.992   0.896   0.943
Y_O2     0.986   0.986   0.994   0.994   0.997   0.997   0.975   0.978   0.942   0.961
Y_N2     0.986   0.986   0.981   0.981   0.971   0.971   1.000   1.000   0.965   0.970
Y_H2     0.968   0.969   0.962   0.963   0.957   0.960   0.945   0.947   0.991   0.991
Y_H2O    0.930   0.936   0.945   0.945   0.944   0.944   0.940   0.978   0.870   0.884
Y_CO     0.994   0.994   0.995   0.997   0.990   0.994   0.979   0.980   0.987   0.987
Y_CO2    0.973   0.977   0.979   0.987   0.977   0.988   0.981   0.985   0.908   0.959
Y_OH     0.738   0.940   0.731   0.991   0.745   0.992   0.660   0.687   0.870   0.993
Y_NO     0.772   0.930   0.728   0.795   0.729   0.802   0.744   0.970   0.701   0.926
t_q      0.924   0.966   0.946   0.975   0.942   0.975   0.992   0.996   0.949   0.980
Figure 4.8: Parity plots of temperature (a), H2O (b), H2 (c), CO (d), OH (e) and NO (f) mass fractions illustrating the GPCA (q = 2) reduction of the jet flame data set. Scaling criterion adopted: auto scaling.
Figure 4.9: Parity plots of temperature (a), H2O (b), H2 (c), CO (d), OH (e) and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of the jet flame data set. Scaling criterion adopted: auto scaling.
A closer look at the structure of the covariance matrices indicates that, with the exception of the JHC data set, there is always a strong correlation between temperature, oxidation products (CO2, H2O), OH and NO (Table 4.1, Tables 4.3-4.4), as expected for a turbulent non-premixed flame. The covariance matrix for the JHC data set still shows a strong correlation between temperature and product mass fractions; however, the covariance between temperature and the minor species, i.e. OH and NO, is lower. Once again, this indicates the existence of a more complex flame structure, arising from a balance between turbulent mixing and chemical kinetics.
Figure 4.11 and Figure 4.12 show the GPCA reconstruction of Flame F,
with q = 3 and q = 4, respectively. Similarly to the jet flame, the addition
of a PC barely affects the accuracy in the prediction of the major species, as
it mainly acts on the prediction of the minor species, i.e. OH and NO (Table
4.6). Very similar results are observed for Flame D.
With regard to the JHC system, very large (non linear) deviations are ob-
served for temperature (Figure 4.13 (a)), CO (Figure 4.13 (c)) and OH (Figure
4.13 (e)), for the case q = 4. The increase of the number of PCs to q = 5
strongly improves the prediction of CO (Figure 4.14 (c)) and OH (Figure 4.14
(e)), but not temperature (Figure 4.13 (a)) and other species, i.e. CO2 . It is
noteworthy that NO is very well captured, even with q = 4. This result suggests that one of the retained PCs is highly correlated with NO, thus leading to the observed result.
Figure 4.10: Scree-graph and histograms of the q largest eigenvalues for Flame
D (a), Flame F (b) and JHC (c). Scaling criterion adopted: auto scaling.
Table 4.3: Covariance matrix for Flame D data set. Scaling criterion adopted: auto scaling.
             T    Y_O2    Y_N2    Y_H2   Y_H2O   Y_CH4    Y_CO   Y_CO2    Y_OH    Y_NO
T        1.000  −0.960  −0.134   0.418   0.979  −0.295   0.535   0.984   0.681   0.912
Y_O2             1.000   0.323  −0.589  −0.977   0.093  −0.688  −0.932  −0.645  −0.859
Y_N2                     1.000  −0.473  −0.194  −0.867  −0.451  −0.109  −0.061  −0.052
Y_H2                             1.000   0.548   0.194   0.919   0.320   0.056   0.240
Y_H2O                                    1.000  −0.257   0.658   0.949   0.666   0.883
Y_CH4                                            1.000   0.102  −0.312  −0.221  −0.329
Y_CO                                                     1.000   0.442   0.213   0.372
Y_CO2                                                            1.000   0.708   0.933
Y_OH                                                                     1.000   0.688
Y_NO                                                                             1.000
Chapter 4. Principal Components Analysis for turbulence-chemistry
Table 4.4: Covariance matrix for Flame F data set. Scaling criterion adopted: auto scaling.
            T     YO2     YN2     YH2    YH2O    YCH4     YCO    YCO2     YOH     YNO
T       1.000  −0.968  −0.073   0.418   0.984  −0.312   0.543   0.981   0.745   0.824
YO2             1.000   0.241  −0.545  −0.976   0.128  −0.660  −0.940  −0.748  −0.790
YN2                     1.000  −0.378  −0.109  −0.882  −0.349  −0.053  −0.057  −0.026
YH2                             1.000   0.512   0.124   0.926   0.305   0.189   0.229
YH2O                                    1.000  −0.296   0.636   0.956   0.754   0.816
YCH4                                            1.000   0.041  −0.327  −0.232  −0.297
YCO                                                     1.000   0.432   0.262   0.331
YCO2                                                            1.000   0.767   0.851
YOH                                                                     1.000   0.633
YNO                                                                             1.000
Table 4.5: Covariance matrix for JHC data set. Scaling criterion adopted: auto scaling.
            T     YO2     YN2     YH2    YH2O    YCH4     YCO    YCO2     YOH     YNO
T       1.000  −0.476   0.616  −0.534   0.892  −0.534   0.292   0.913   0.427   0.388
YO2             1.000   0.306  −0.420  −0.619  −0.418  −0.378  −0.614  −0.150  −0.126
YN2                     1.000  −0.990   0.483  −0.991   0.139   0.465   0.216   0.253
YH2                             1.000  −0.392   0.998  −0.085  −0.376  −0.195  −0.240
YH2O                                    1.000  −0.398   0.516   0.927   0.340   0.389
YCH4                                            1.000  −0.089  −0.381  −0.196  −0.241
YCO                                                     1.000   0.266  −0.072   0.123
YCO2                                                            1.000   0.362   0.376
YOH                                                                     1.000   0.214
YNO                                                                             1.000
Figure 4.11: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 3) reduction of Flame
F. Scaling criterion adopted: auto scaling.
Figure 4.12: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of Flame
F. Scaling criterion adopted: auto scaling.
Figure 4.13: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 4) reduction of JHC
data set. Scaling criterion adopted: auto scaling.
Figure 4.14: Parity plots of temperature (a), H2 O (b), H2 (c), CO (d), OH (e)
and NO (f) mass fractions illustrating the GPCA (q = 5) reduction of JHC
data set. Scaling criterion adopted: auto scaling.
Table 4.6: Total, tq , and individual variance, tq,j , accounted for Flame D, F
and JHC data sets by the GPCA reduction, as a function of the number of
retained PCs, q.
tq,j (%)      Flame D          Flame F          JHC
              q=3     q=4      q=3     q=4      q=4     q=5
T             0.971   0.985    0.967   0.971    0.932   0.948
YO2           0.982   0.986    0.978   0.979    0.961   0.974
YN2           0.979   0.981    0.979   0.980    0.991   0.991
YH2           0.959   0.966    0.969   0.970    0.998   0.998
YH2O          0.987   0.988    0.983   0.984    0.966   0.966
YCH4          0.984   0.986    0.984   0.984    0.999   0.999
YCO           0.940   0.965    0.961   0.969    0.757   0.998
YCO2          0.965   0.985    0.969   0.974    0.911   0.970
YOH           0.743   1.000    0.711   0.978    0.735   1.000
YNO           0.902   0.932    0.792   0.892    0.999   1.000
tq (%)        0.941   0.977    0.946   0.968    0.925   0.984
correctly.
Moving on to Flame D and F, the second PC observed for the jet flame is
in effect split into two components, one representative of a mixture fraction
(both N2 and CH4 are very highly correlated with mixture fraction) and one
representative of the intermediate product species (CO, H2). The last
component is again OH, the flame marker. The eigenvector structures of Flame D
and F are very similar. Only one significant difference is worth highlighting,
namely the NO weights on the fourth component. For Flame D, NO does not appear
as a relevant weight on the last PC, whereas it is not negligible for the
fourth component of Flame F, thus reflecting the lower correlation between the
two variables (Table 4.4), probably determined by the higher physical
complexity of the system.
Table 4.7: Retained (a) and rotated (b) eigenvectors for the jet flame data set.
Table 4.8: Retained (a) and rotated (b) eigenvectors for Flame D data set.
Table 4.9: Retained (a) and rotated (b) eigenvectors for Flame F data set.
Finally, regarding the eigenvectors of the JHC system, it can be observed
that the first rotated component does not show a large influence of NO, unlike
all the other systems. This can be explained by considering that the first PC
tries to explain as much as possible of the data variability. It is well known
[3, 4, 2, 45] that NO formation in flameless combustion is more homogeneous
than in traditional non-premixed combustion, due to the smoother temperature
gradients; therefore, NO is characterized by less variability and disappears
from the first PC. The second and third PCs are, again, representative of
reactant and intermediate combustion products (Table 4.5), reflecting a pattern
similar to that observed for Flame F (and D). Unlike the piloted flames, the
fourth component is exclusively NO, meaning that none of the previous
components can account for NO formation and a specific PC is needed. Then, the
OH component, present in all the other
Table 4.10: Retained (a) and rotated (b) eigenvectors for JHC data set.
(a)        a1      a2      a3      a4      a5
T        0.42   -0.16    0.08    0.12    0.16
YO2     -0.11    0.59    0.02   -0.11   -0.15
YN2      0.38    0.34   -0.10    0.07    0.03
YH2     -0.35   -0.39    0.08   -0.04    0.00
YH2O     0.40   -0.27   -0.10    0.06    0.03
YCH4    -0.35   -0.39    0.08   -0.04    0.00
YCO      0.16   -0.23   -0.67   -0.10   -0.64
YCO2     0.39   -0.26    0.08    0.11    0.31
YOH      0.19   -0.07    0.68    0.20   -0.67
YNO      0.22   -0.07    0.21   -0.95    0.00
Principal variables As it was pointed out in the previous Section, the
original variables do not contribute equally to the determination of the PCs,
and rotation can improve the interpretability of the eigenvectors by
transforming them to a simpler structure. Another way to help interpretation
is to extract the so-called Principal Variables (PVs), described in Section
4.2.3.5.
Table 4.11 lists the PVs determined using the methods outlined in Section
4.2.3.5 for the jet flame. At first glance, the results may appear to vary
widely from one method to the other. However, a more careful analysis shows
the existence of many similarities. In particular, methods B4, B2, M2, MC2
and MC3 lead to very similar results, identifying a major variable (T, CO2
or H2O), a fuel species (CO or H2) and OH as PVs. In fact, T, CO2 and H2O
are highly correlated (Table 4.1): cov(T, CO2) = 0.984, cov(T, H2O) = 0.938
and cov(H2O, CO2) = 0.892. Similarly, H2 and CO are also exchangeable
variables, since cov(H2, CO) = 0.973. Method MC1 replaces T (or CO2 or H2O)
with NO, which shows a strong correlation with T, although weaker than that of
CO2 and H2O. Finally, the PF method provides a different solution, neglecting
OH as PV and replacing it with O2. However, this solution was considered
unreliable, being very far from the pattern identified by all the other
methods.
Table 4.11: Principal variables for the jet flame data set, as provided by the
different methods described in Section 4.2.3.5.
Table 4.12: Principal variables for Flame D, F and the JHC data set. PV
method: MC2 (Section 4.2.3.5).
On the basis of the results obtained for the jet flame case, the MC2 method
was adopted for the extraction of the PVs, as it provides results comparable
to most of the other methods and satisfies a very appealing property of PCA,
the minimization of the reconstruction error. Applying the MC2 method to the
other data sets, we get the results in Table 4.12. It is very interesting to
observe that the same considerations derived from the analysis of the rotated
PCs can be made here, with a clearer physical interpretation. The PVs selected
for Flame D and F reflect the patterns of the PCs, as they include a mixture
fraction variable, an intermediate and a product species and OH. Finally, for
the JHC system, the same set of PVs obtained for Flame D (and F) is recovered,
although augmented with NO, thus confirming the need to take explicitly into
account the formation of such pollutant species.
criterion different than auto scaling, namely range (a), vast (b), level (c) and
max (d) scaling. If we compare Figure 4.15 to Figure 4.7, it is clear that all
the methods identify the manifold dimensionality to be equal to three, with
the exception of vast scaling (Figure 4.15 (b)), which identifies only two PCs,
with a very dominant first PC. Columns 3-10 in Table 4.2 show the values of
tq and tq,j obtained by applying range, max, vast and level scaling to the jet
flame data set. Results confirm that auto scaling is the only criterion able to
provide a uniform reconstruction of the state variables, leading to comparable
values of tq,j for all of them. Range and max scaling (columns 3-6, Table
4.2), very similar as expected, perform slightly better than auto scaling for
most of the main species and temperature. However, they prove to be unable
to properly capture NO variation, even with q = 3. Similarly, vast scaling
(columns 7-8, Table 4.2) concentrates on extremely stable variables, i.e. N2 ,
but completely fails in recovering OH properly. Then, the higher values of tq
given by range, max and vast scaling, compared to auto scaling, are due to the
higher variance explained for the major variables; however, these approaches
miss very important features, such as the parametrization of NO and OH. The
variance explained by auto scaling for OH and NO is up to 16% and 25% higher,
respectively, than that explained by the other scaling methods.
In contrast, level scaling (columns 9-10, Table 4.2) focuses on variables
characterized by large relative changes and leads to an overestimation of the
role of minor species in the PCA reduction. Therefore, the prediction of minor
species such as OH and NO is very accurate, but major species such as H2O
are poorly recovered.
On the basis of the described sensitivity, auto scaling was adopted as the
default preprocessing criterion for the analysis. Obviously, in applications
which do not require the accurate parametrization of minor species, other
options could provide better results than auto scaling.
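The scaling criteria compared above can be sketched as follows. The definitions of the scaling factors are the ones commonly used in the PCA literature (standard deviation for auto, max minus min for range, maximum for max, variance over mean for vast, mean for level); they are assumed here to match those introduced earlier in the Chapter, and the function names are hypothetical.

```python
import numpy as np

def scaling_factors(X, criterion="auto"):
    """Per-variable scaling factor d_j (assumed, commonly used definitions)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    if criterion == "auto":
        return std                            # unit variance for all variables
    if criterion == "range":
        return X.max(axis=0) - X.min(axis=0)  # unit range for all variables
    if criterion == "max":
        return X.max(axis=0)
    if criterion == "vast":
        return std ** 2 / mean                # favours stable variables
    if criterion == "level":
        return mean                           # favours large relative changes
    raise ValueError(f"unknown scaling criterion: {criterion}")

def center_and_scale(X, criterion="auto"):
    """Subtract the mean of each variable and divide by its scaling factor."""
    return (X - X.mean(axis=0)) / scaling_factors(X, criterion)

# Synthetic, strictly positive data (assumption): a temperature-like variable,
# a major species and a minor species with very different magnitudes.
rng = np.random.default_rng(0)
u = rng.uniform(size=(200, 3))
X = np.column_stack([300.0 + 1700.0 * u[:, 0], 0.2 * u[:, 1], 1.0e-3 * u[:, 2]])

X_auto = center_and_scale(X, "auto")
X_range = center_and_scale(X, "range")
X_level = center_and_scale(X, "level")
```

With auto scaling every variable enters the analysis with unit variance, which is consistent with the uniform reconstruction of the state variables reported above.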
Figure 4.15: Scree-graph and histograms of the q largest eigenvalues for the
jet flame data set, preprocessed with range (a), vast (b), level (c) and max (d)
scaling.
Table 4.13: Values of GSRE,n associated with the GPCA, VQPCA and FPCA
reconstructions of the jet flame, flame F and JHC data set, as a function of the
number of clusters, k, and retained PCs, q.
here that (Section 4.3) GSRE,n is a mean global scaled reconstruction error,
normalized by the mean variance present in the original data. So, for example,
the value of GSRE,n associated with the GPCA reduction of the jet flame
with q = 2 is fairly large, GSRE,n = 0.68, thus reflecting the large deviations
observed in Figure 4.8 for some state variables. Even when q is increased to
three, a significant error, GSRE,n = 0.309, is obtained, confirming the
persistence of mainly non-linear departures from the original data. If lower
values of GSRE,n are desired, e.g. < 0.1, the value of q should be increased
to 4 or 5 (GSRE,n = 0.098 and GSRE,n = 0.046, respectively). However, the
manifold dimensionality so obtained could not be regarded as a true manifold
dimensionality. It can be argued that spurious components need to be added to
account for non-linear interactions because of the globally linear nature of
the adopted model. If, instead, a locally linear model such as VQPCA is
employed, much better performance in terms of GSRE,n is obtained, even for
smaller values of q, i.e. q = 2 or q = 3. Figure 4.16 shows the VQPCA
reconstruction of temperature (a), H2O (b), CO (c), H2 (d), OH (e) and NO (f),
with k = 8 and q = 2. A much better agreement between original and
reconstructed data is observed, as confirmed by the value of 0.08 obtained for
GSRE,n. A similar value of GSRE,n would require q = 5 if GPCA is applied.
Moreover, the parity plots for the state variables in Figure 4.16 show how,
after partitioning, the relationships between the original and reconstructed
data are mainly linear.
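The difference between a globally and a locally linear model can be illustrated with a minimal sketch of a vector-quantization PCA iteration. It assumes the usual scheme in which each observation is assigned to the cluster whose local q-dimensional PCA model reconstructs it with the smallest squared error, and the local models are then refitted until the partition no longer changes. The error measure `gsre_n` is only an assumed stand-in for the definition of Section 4.3 (mean squared reconstruction error divided by the mean variance of the data); all names and the synthetic data are hypothetical.

```python
import numpy as np

def fit_local_model(block, q):
    """Centroid and leading q eigenvectors (via SVD) of one cluster."""
    mu = block.mean(axis=0)
    _, _, vt = np.linalg.svd(block - mu, full_matrices=False)
    return mu, vt[:q].T

def squared_error(Xs, mu, A):
    """Squared distance of each row of Xs from the affine subspace (mu, A)."""
    R = Xs - mu
    return ((R - R @ A @ A.T) ** 2).sum(axis=1)

def vqpca(Xs, k, q, n_iter=100, seed=0):
    """Sketch of VQPCA: returns cluster labels and per-sample squared errors."""
    rng = np.random.default_rng(seed)
    n = Xs.shape[0]
    labels = rng.integers(k, size=n)               # random initial partition
    for _ in range(n_iter):
        err = np.empty((n, k))
        for c in range(k):
            block = Xs[labels == c]
            if block.shape[0] <= q:                # re-seed a degenerate cluster
                block = Xs[rng.choice(n, size=q + 1, replace=False)]
            mu, A = fit_local_model(block, q)
            err[:, c] = squared_error(Xs, mu, A)
        new_labels = err.argmin(axis=1)            # reassign to best local model
        converged = np.array_equal(new_labels, labels)
        labels = new_labels
        if converged:
            break
    return labels, err[np.arange(n), labels]

def gsre_n(Xs, sq_err):
    """Assumed error measure: mean squared error over mean variance."""
    return sq_err.mean() / Xs.var(axis=0, ddof=1).mean()

# Synthetic V-shaped manifold (assumption): two linear branches in 3-D.
rng = np.random.default_rng(1)
t = rng.uniform(-1.0, 1.0, size=400)
X = np.column_stack([t, np.abs(t), np.zeros_like(t)])
X += 0.01 * rng.normal(size=X.shape)
Xs = X - X.mean(axis=0)

_, err_global = vqpca(Xs, k=1, q=1)        # k = 1 reduces to global PCA
labels, err_local = vqpca(Xs, k=2, q=1)    # two local one-dimensional models
gsre_global = gsre_n(Xs, err_global)
gsre_local = gsre_n(Xs, err_local)
```

On this toy manifold the error obtained with two local one-dimensional models is lower than that of the single global model, which mirrors qualitatively the behaviour discussed above; the numbers themselves have no relation to those of Table 4.13.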
The values of GSRE,n for Flame F, as provided by the different approaches,
are also shown in Table 4.13. Similarly to the jet flame, the reconstruction
errors, GSRE,n, associated with the GPCA reductions (Figure 4.11 and Figure
4.12) obtained by choosing q = 3 and q = 4 are very high, GSRE,n = 0.71 and
GSRE,n = 0.32, respectively, thus confirming the inability of GPCA to
determine the most compact description of the data in a lower dimensional
manifold.
Figure 4.17 shows the VQPCA reconstruction of temperature (a), H2 O (b), CO
(c), H2 (d), OH (e) and NO (f), with k = 8 and q = 3. The value of GSRE,n
obtained is 0.09, almost eight and four times smaller than the values given by
GPCA with q = 3 and q = 4, respectively. Also, GPCA would require 6 PCs
to give a value of GSRE,n as small as 0.09.
Finally, the same considerations hold for the JHC data set. In fact, the
effect of data partitioning is even more evident for this case than for the
jet flame and Flame F. Table 4.13 clearly shows that GSRE,n dramatically
decreases when VQPCA with k = 2 is employed, going from 1.552 and 0.752
to 0.241 and 0.093, for q = 3 and q = 4, respectively. This corresponds to a
reduction of around 85% for both cases, which is not observed for any of the
other investigated data sets. Such a result is extremely interesting and
suggests the existence of different flame structures within the JHC system.
Figure 4.18 shows the VQPCA reconstruction of temperature (a), H2O (b), CO
(c), H2 (d), OH (e) and NO (f), with k = 6 and q = 3. With respect to the jet
flame and Flame F, it is possible to reach values of GSRE,n below 0.1 with a
smaller number of clusters, i.e. k = 6.
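The reduction quoted above can be checked directly from the four values of GSRE,n reported in the text for the JHC data set. The short sketch below only repeats that arithmetic; the variable names are hypothetical.

```python
# GSRE,n for the JHC data set, as quoted in the text (keys are q).
gsre_gpca = {3: 1.552, 4: 0.752}
gsre_vqpca_k2 = {3: 0.241, 4: 0.093}

# Relative reduction, in percent, obtained with the first partition (k = 2).
reduction = {
    q: 100.0 * (1.0 - gsre_vqpca_k2[q] / gsre_gpca[q]) for q in gsre_gpca
}
# reduction[3] is about 84.5 and reduction[4] is about 87.6
```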
Figure 4.16: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 2, k = 8) reduction of
the jet flame data set. GSRE,n = 0.08.
Figure 4.17: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 8) reduction of
Flame F data set. GSRE,n = 0.08.
Figure 4.18: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 6) reduction of
JHC data set. GSRE,n = 0.08.
Table 4.14: Values of GSRE,n associated with the GPCA, VQPCA and FPCA
reconstructions of the DNS1 and DNS2 data sets, as a function of the number of
clusters, k, and retained PCs, q.
           k     DNS1              DNS2
                 q=2      q=3      q=3      q=4
GPCA       1     3.130    1.830    1.800    1.130
VQPCA      2     0.816    0.176    0.734    0.369
           4     0.307    0.065    0.235    0.076
           6     0.116    0.025    0.141    0.046
           8     0.043    0.010    0.114    0.038
          10     0.036    0.009    0.096    0.033
FPCA       2     0.625    0.216    0.773    0.417
           4     0.243    0.052    0.263    0.081
           6     0.122    0.030    0.204    0.066
           8     0.062    0.020    0.167    0.054
          10     0.046    0.015    0.140    0.043
VQPCA has also been exploited for the analysis of the DNS data sets, DNS1
and DNS2, described in Section 4.4.2. Regarding the DNS2 data set, multiple
time steps have been merged before analyzing the data, namely t = 1.5e-03 s,
t = 2.0e-03 s, t = 2.5e-03 s and t = 3.0e-03 s. However, the resulting data
set (∼3,800,000 data points) has been conditioned in mixture fraction space,
between f = 0.1 and f = 0.8, to overcome memory issues (Figure 4.19).
Table 4.14 lists the values of GSRE,n given by GPCA, VQPCA and FPCA
for the DNS data sets. Similarly to the JHC case, the first partition is charac-
terized by a dramatic reduction of GSRE,n also for the DNS data sets. This
indicates, once again, that a global approach would lead to misleading estima-
tion of the manifold dimensionality. Table 4.14 also shows that DNS2 requires
an additional PC with respect to DNS1 to reach acceptable levels of accuracy.
This is determined by the higher complexity of the DNS2 data set, characterized
by a significant degree of extinction.
Figure 4.20 shows the contour plots of original and recovered temperature
and OH mass fraction distribution for the DNS1 data set. It can be observed
that a VQPCA approach with q = 3 and k = 8 captures the flame features with
great accuracy, resulting in a very small reconstruction error,
GSRE,n = 0.01. This is a very appealing result, indicating that VQPCA could
be effectively exploited for the compression of DNS data sets, characterized by
very large storage requirements, for visualization and post-processing purposes.
Very strong compressions could be achieved, as shown here, prescribing the de-
sired accuracy of the recovered data. For a given manifold dimensionality, the
dimensions of the reduced data sets are independent of the number of clusters;
therefore, the parameters q and k can be varied to optimize the accuracy and
Figure 4.19: Original (a) and conditioned (b) temperature field for DNS2 data
set at time step t = 1.5e − 03 s.
Comparison of VQPCA and FPCA Rows 6-9 in Table 4.13 and rows 7-11
in Table 4.14 list the values of GSRE,n given by FPCA for the experimental
and numerical data sets, respectively, as a function of the number of clusters, k,
and retained PCs, q. These values can be weighed against those obtained with
the VQPCA algorithm (rows 2-5 in Table 4.13 and rows 2-6 in Table 4.14) for
the same values of q and k. The comparison is illustrated graphically in Figure
4.24 for the experimental data, and in Figure 4.25 for the DNS data sets.
For the jet flame (Figure 4.24 (a)), VQPCA generally performs better than
FPCA. The values of GSRE,n given by VQPCA are 6-25% lower than those provided
by FPCA, with the exception of the cases corresponding to k = 2 and q = 3.
However, the performance of FPCA can be considered very satisfactory and
promising, especially from the point of view of model implementation. In fact,
FPCA partitioning is much simpler and more straightforward than the one
underlying the VQPCA algorithm. Moreover, it is interesting to investigate how
the VQPCA partitioning is reflected in the mixture fraction space. To this
purpose, the indexes of the observations with respect to the original data
matrix, X, were stored and used to reconstruct the mixture fraction vectors in
each cluster identified by VQPCA. Figure 4.26 shows the temperature as a
function of mixture fraction for the two clusters selected by VQPCA, with
q = 2. It can be observed that the data are almost clustered into two regions
corresponding to the rich and lean zones of the flame, since the
stoichiometric mixture fraction of the jet flame is equal to 0.295. This
result suggests that, for the jet flame, the mixture fraction can be
considered an optimal variable for the parametrization of the thermochemical
state of the system, as is generally assumed in many models for non-premixed
combustion. The information added here is that mixture fraction is an optimal
variable from the point of view of reconstruction error minimization.
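The idea of partitioning the data in mixture fraction space before applying PCA locally can be sketched as below. The binning rule (a single edge placed at the stoichiometric mixture fraction, 0.295 for the jet flame) and the idealized, piecewise-linear flame structure are assumptions made purely for illustration; they are not the FPCA settings or the data used in this work, and the function name is hypothetical.

```python
import numpy as np

def local_pca_sq_error(Xs, labels, q):
    """Squared reconstruction error of a q-dimensional PCA fitted per cluster."""
    sq_err = np.zeros(Xs.shape[0])
    for c in np.unique(labels):
        idx = labels == c
        R = Xs[idx] - Xs[idx].mean(axis=0)          # centre the cluster
        _, _, vt = np.linalg.svd(R, full_matrices=False)
        A = vt[:q].T                                # local eigenvectors
        sq_err[idx] = ((R - R @ A @ A.T) ** 2).sum(axis=1)
    return sq_err

# Idealized flame (assumption): every variable is piecewise linear in f,
# with a change of slope at the stoichiometric mixture fraction.
f_st = 0.295
f = np.linspace(0.0, 1.0, 201)
progress = np.minimum(f / f_st, (1.0 - f) / (1.0 - f_st))
X = np.column_stack([
    300.0 + 1700.0 * progress,                     # temperature-like variable
    np.maximum(0.0, (f - f_st) / (1.0 - f_st)),    # fuel-like variable
    np.maximum(0.0, 1.0 - f / f_st),               # oxidizer-like variable
    0.2 * progress,                                # product-like variable
])
Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # auto scaling

one_bin = np.zeros(f.size, dtype=int)              # no partition: global PCA
two_bins = np.digitize(f, [f_st])                  # lean (0) and rich (1) bins

err_global = local_pca_sq_error(Xs, one_bin, q=1)
err_fpca = local_pca_sq_error(Xs, two_bins, q=1)
```

For this idealized structure a single PC per bin reproduces the data almost exactly, whereas a single global PC does not, which is the situation in which a partition in mixture fraction is close to optimal.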
As far as Flame F is concerned, Table 4.13 and Figure 4.24 (b) show that
VQPCA outperforms FPCA in all cases, providing values of GSRE,n 13-46% lower
than those given by FPCA. These results confirm the well-known complexity of
this flame, characterized by significant local extinction and re-ignition. In
the context of the Conditional Moment Closure [73], for example, it has been
recognized [119] that conditioning on mixture fraction is not sufficient for
Flame F and a second conditioning variable should be used. Figure 4.27 shows
the temperature as a function of mixture fraction for the two clusters
selected by VQPCA with q = 3. It can be observed that, unlike the jet flame,
the VQPCA algorithm extracts features from the whole mixture fraction space in
order to achieve the best q-dimensional representation of the thermochemical
state of the system. It can then be concluded that, for Flame F, mixture
fraction does not represent an optimal reaction variable. Therefore, VQPCA
could provide an appealing alternative to guide the selection of the most
compact subset of reaction variables needed to properly describe the
thermochemical state of such a reacting system.
Figure 4.20: Contour plots of original and recovered temperature (a, a’) and
OH mass fraction (b, b’) distribution for DNS1. VQPCA reduction with q = 3
and k = 8. GSRE,n = 0.01.
Figure 4.21: Parity plots of original and recovered temperature (a, a’) and OH
mass fraction (b, b’) distribution for DNS1. VQPCA reduction with q = 3 and
k = 8. GSRE,n = 0.04.
Figure 4.24 (c) shows the comparison between the reconstructions provided
by VQPCA and FPCA for the JHC data set. Similarly to Flame F, the VQPCA
algorithm provides values of GSRE,n 10-50% lower than those obtained with
FPCA. The mixture fraction partitioning does not optimally follow the
curvature of the manifold in state space, indicating the complexity of
turbulence/chemistry interactions for the system. This is further confirmed by
Figure 4.28, which shows the partition of temperature in the two clusters
selected by VQPCA with q = 3, in mixture fraction space. The algorithm selects
a first cluster characterized by a lean branch and part of a rich region, with
an aspect characteristic of a non-premixed flame. On the other hand, the
second cluster shows important non-equilibrium phenomena, such as extinction,
similarly to cluster 2 for Flame F (Figure 4.27 (b)).
To better understand the underlying mechanism of the VQ partitioning
algorithm, it is possible to analyze the structure of the rotated eigenvectors in
Figure 4.22: Contour plots of original (a, b) and recovered (a’, b’) temperature
distribution for DNS2, at two different time steps, i.e. t = 1.5e − 03 s (a,
a’) and t = 2.0e − 03 s (b, b’). VQPCA reduction with q = 4 and k = 8.
GSRE,n = 0.04.
Figure 4.23: Parity plots of temperature (a) and OH (b) mass fraction illustrat-
ing the VQPCA (q = 4, k = 8) reduction of DNS2 data set. GSRE,n = 0.04.
Figure 4.24: Values of GSRE,n as a function of the number of clusters, k, and retained PCs, q, for the jet flame, Flame F and
JHC data sets.
Table 4.15: Rotated eigenvector in the first (a) and second (b) cluster identified
by VQPCA for Flame F. q = 3 and GSRE,n = 0.21
(a)      a1,r    a2,r        (b)        a1      a2
T       -0.01    0.55        T        0.30   -0.01
YO2     -0.09   -0.44        YO2     -0.34   -0.09
YN2     -0.70   -0.07        YN2     -0.12   -0.12
YH2     -0.01   -0.04        YH2      0.05    0.64
YH2O    -0.03    0.44        YH2O     0.33   -0.09
YCH4     0.71   -0.10        YCH4    -0.01    0.02
YCO      0.00    0.07        YCO      0.12    0.70
YCO2     0.00    0.51        YCO2     0.35   -0.11
YOH      0.00    0.05        YOH      0.60   -0.17
YNO     -0.02    0.18        YNO      0.42   -0.13
the two clusters identified by VQPCA for Flame F (Table 4.15) and for the
JHC data set (Table 4.16). For Flame F, the eigenvectors associated with the
first cluster (Table 4.15 (a)) are a mixture fraction³ and a linear combination
of major species and temperature, respectively. This supports the graphical
observation provided by Figure 4.27 (a), which shows the first cluster to be
characterized by the lean and rich branches of the flame. On the other hand,
the reaction region identified in Figure 4.27 (b) needs to be described by
means of parameters with a strong contribution of intermediate and minor
species, as shown in Table 4.15 (b).
With regard to the JHC data set, the structure of the rotated eigenvectors
prompts very interesting considerations. In particular, the second cluster
(Table 4.16 (b)) is parametrized by a first component with significant weights
on the fuel species, intermediate species and temperature, whereas the second
PC reduces to OH. Thus, VQPCA is able to extract the subset of the data set
dominated by finite rate chemistry effects by means of progress variables able
to capture the ignition process. In the context of the numerical modeling of
flameless combustion, such a result confirms the need for combustion models
suited to the description of turbulence-chemistry interactions in such a
combustion regime.
As far as the numerical data are concerned, the VQPCA and FPCA reductions
appear comparable for the DNS1 data set (Table 4.14), while VQPCA
outperforms FPCA for DNS2 (Table 4.14). This confirms that mixture fraction
is not optimal from the point of view of error minimization when the physics
under investigation become too complex. This is somewhat expected, as mixture
fraction is only a measure of the local system stoichiometry and, therefore,
it can only cover relatively fast scales.
The small discrepancies between FPCA and VQPCA for DNS1 (and for
³ The denomination mixture fraction is used here because the variables which define the
first PC are highly correlated with f, cov(f, N2) = 0.97 and cov(f, CH4) = 0.90.
Table 4.16: Rotated eigenvector in the first (a) and second (b) cluster identified
by VQPCA for the JHC data set. q = 3 and GSRE,n = 0.21
(a)        a1      a2        (b)      a1,r    a2,r
T        0.48    0.01        T       -0.25    0.21
YO2     -0.51    0.03        YO2     -0.13   -0.03
YN2      0.07   -0.02        YN2     -0.50    0.00
YH2      0.00    0.00        YH2      0.48   -0.06
YH2O     0.46    0.04        YH2O    -0.23    0.06
YCH4     0.00    0.00        YCH4     0.49    0.00
YCO      0.05   -0.03        YCO      0.15   -0.07
YCO2     0.53    0.04        YCO2    -0.32    0.05
YOH      0.09    0.01        YOH      0.10    0.96
YNO     -0.03    1.00        YNO     -0.05    0.14
the jet flame) suggest that VQPCA actually tends to FPCA when dealing with
relatively simple systems, characterized by fast chemistry and a small degree
of extinction. This is confirmed by Figure 4.29, showing the VQPCA (a-d) and
FPCA (a’-d’) partition of the DNS1 data. Both approaches identify a rich
and a lean region, together with a rich and a lean reacting layer.
4.6. Development of a PCA based combustion model
Figure 4.29: Parity plots of temperature (a), H2 O (b), CO (c), H2 (d), OH (e)
and NO (f) mass fractions illustrating the VQPCA (q = 3, k = 6) reduction of
JHC data set. GSRE,n = 0.08.
Figure 4.30: CPU time associated with the FPCA and VQPCA reductions, as a
function of the number of clusters, k, and retained PCs, q, for the experimental
(a) and numerical (b) data sets.
In fact, the selection of optimal variables for turbulent reacting systems could
be exploited for the development of turbulence-chemistry interaction models.
In this context, the linearity of the PCA method is extremely appealing. In
fact, if a set of reaction variables is selected, only few linear combinations of
the original variables need to be transported in a numerical simulation.
$$\rho \frac{D Y_k}{Dt} = \frac{\lambda}{c_p Le_k} \frac{\partial^2 Y_k}{\partial x_j^2} + \dot{\omega}_k \tag{4.46}$$
where we have assumed the density and species diffusivity to be constant.
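If the species mean, $\bar{Y}_k$, is subtracted and a scaling factor, $d_k$, is applied to
the centered variable, we get:
$$\frac{D}{Dt}\left(\frac{Y_k - \bar{Y}_k}{d_k}\right) = \frac{\lambda}{\rho c_p Le_k} \frac{\partial^2}{\partial x_j^2}\left[\frac{Y_k - \bar{Y}_k}{d_k}\right] + \frac{\dot{\omega}_k}{\rho d_k}. \tag{4.47}$$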
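Indicating with $a_{ki}$ the weight of the kth variable on the ith PC, the following
equation is obtained:
$$\frac{D}{Dt}\left(\frac{Y_k - \bar{Y}_k}{d_k} a_{ki}\right) = \frac{\lambda}{\rho c_p Le_k} \frac{\partial^2}{\partial x_j^2}\left[a_{ki} \frac{Y_k - \bar{Y}_k}{d_k}\right] + \frac{\dot{\omega}_k a_{ki}}{\rho d_k}. \tag{4.48}$$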
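$$\frac{D}{Dt}\left(\sum_{k=1}^{p} \frac{Y_k - \bar{Y}_k}{d_k} a_{ki}\right) = \frac{\lambda}{\rho c_p Le_k} \frac{\partial^2}{\partial x_j^2}\left[\sum_{k=1}^{p} a_{ki} \frac{Y_k - \bar{Y}_k}{d_k}\right] + \frac{1}{\rho} \sum_{k=1}^{p} \frac{\dot{\omega}_k a_{ki}}{d_k}. \tag{4.49}$$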
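But $\sum_{k=1}^{p} \frac{Y_k - \bar{Y}_k}{d_k} a_{ki}$ is simply the definition of the ith PC, $z_i$. Therefore, Eq.
(4.49) can be rewritten as:
$$\frac{D z_i}{Dt} = \frac{\lambda}{\rho c_p Le_i} \frac{\partial^2 z_i}{\partial x_j^2} + \dot{\omega}_{z_i} \tag{4.50}$$
where $\dot{\omega}_{z_i}$ is the source term for $z_i$:
$$\dot{\omega}_{z_i} = \frac{1}{\rho} \sum_{k=1}^{p} \frac{\dot{\omega}_k a_{ki}}{d_k}. \tag{4.51}$$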
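Eq. (4.51) is a weighted sum of the species source terms, so it can be evaluated with a single matrix product. The sketch below assumes that the source terms are stored row-wise (one row per observation) and that the weights a_ki are collected in a p x q matrix A; the function name and the numbers are hypothetical.

```python
import numpy as np

def pc_source_terms(omega, A, d, rho):
    """Source terms of the PCs following Eq. (4.51).

    omega: (n, p) source terms of the original variables
    A:     (p, q) weights a_ki of the kth variable on the ith PC
    d:     (p,)   scaling factors d_k
    rho:   (n,)   density at each observation
    """
    return (omega / d) @ A / rho[:, None]

# One observation, two variables, one PC (illustrative numbers only).
omega = np.array([[4.0, 8.0]])
A = np.array([[0.6], [0.8]])
d = np.array([2.0, 4.0])
rho = np.array([2.0])

omega_z = pc_source_terms(omega, A, d, rho)
# (4.0 / 2.0 * 0.6 + 8.0 / 4.0 * 0.8) / 2.0 = 1.4
```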
$$\frac{D}{Dt}(Z) = -\nabla \cdot (j_Z) + (\dot{\omega}_Z), \tag{4.53}$$
where $j_Z$ is the mass diffusive flux of Z. In Eq. (4.53), the source terms of
temperature and all species contribute to the source term for each PC.
arising from closure of the convective term. In many cases, and particularly
at high Reynolds number, the molecular diffusion term is small relative to the
turbulent diffusion term and it is neglected. However, even when one wishes
to retain the full description of molecular diffusion, the treatment with PCA is
straightforward. First, X is approximated from Z. Next, the diffusive terms for
X, jX , are constructed. Finally, the diffusive fluxes for the PCs are calculated
as jZ = jX A.
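The three steps above can be sketched for a one-dimensional profile. The sketch makes several assumptions that are not stated in the text: X denotes the centered and scaled variables, the diffusive flux of each original variable follows a Fickian law with a constant product of density and diffusivity, and gradients are approximated with finite differences. All names are hypothetical.

```python
import numpy as np

def pc_diffusive_flux(Z, A, mean, d, x, rho_D):
    """Diffusive fluxes of the PCs on a 1-D grid x (Fickian fluxes assumed)."""
    X_scaled = Z @ A.T                           # step 1: approximate X from Z
    Y = X_scaled * d + mean                      # undo scaling and centering
    j_Y = -rho_D * np.gradient(Y, x, axis=0)     # step 2: fluxes of the variables
    j_X = j_Y / d                                # fluxes of the scaled variables
    return j_X @ A                               # step 3: j_Z = j_X A

# Illustrative setup: p = 5 variables, q = 2 PCs, orthonormal basis from QR.
rng = np.random.default_rng(0)
p, q = 5, 2
A, _ = np.linalg.qr(rng.normal(size=(p, q)))     # p x q, orthonormal columns
mean = rng.uniform(0.1, 1.0, size=p)
d = rng.uniform(0.5, 2.0, size=p)
x = np.linspace(0.0, 1.0, 101)
Z = np.column_stack([np.sin(np.pi * x), x ** 2])
rho_D = 2.0e-5                                   # same value for every variable

j_Z = pc_diffusive_flux(Z, A, mean, d, x, rho_D)
```

When all variables share the same diffusivity, as in this example, the result coincides with the Fickian flux computed directly from the gradients of the PCs, because the columns of A are orthonormal. With differential diffusion the two differ, which is why the fluxes are built in the original variables first.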
Source terms for the PCs, ω̇Z , can be parametrized by Z and tabulated
a priori to avoid run-time calculation. The accurate parametrization of the
source terms is crucial for the successful application of PCA as a modeling
approach. Therefore, the data adopted for PCs extraction must have source
terms for all X, which is currently impossible to obtain from experimental
data. Then, the decisive step for moving on from data analysis (Section 4.5.1)
to predictive modeling is the availability of computational data generated from
reliable chemical mechanisms, using methods such as DNS or ODT. Further-
more, the reliability of PCA as a modeling approach also hinges on the relative
invariance of PCs from one data set to another which is nearby in parameter
space.
$$R^2 = 1 - \left[\sum_{i=1}^{n} \left(x_{ij} - x^{*}_{ij}\right)^2\right] \left[\sum_{i=1}^{n} \left(x_{ij} - \bar{x}_j\right)^2\right]^{-1} \tag{4.54}$$
⁴ The results shown in this Section refer to data conditioned on mixture fraction, f , since
this is a convenient variable to “force” as the first component.
Figure 4.31: Parametrization of temperature at fst by χ (a) and z1 (b) for case
B. Solid lines are the doubly-conditional mean temperature. R2 is calculated
from Eq. (4.54).
where xij is the ith observation of the jth variable, x∗ij is its parametrized
approximation, and x̄j is the mean of xj . For the state variables, R2 is
equivalent to the parameter tq,j , introduced in Section 4.2.3.1. However, for
the source terms, such a parameter is not available and R2 is calculated
directly.
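Eq. (4.54) can be evaluated column by column for any set of variables and their parametrized approximations. A minimal sketch follows; the function name and the numbers are hypothetical and serve only to show the calculation.

```python
import numpy as np

def r_squared(x, x_star):
    """R2 of Eq. (4.54) for each column (variable) of x.

    x:      (n, p) observations x_ij
    x_star: (n, p) parametrized approximations x*_ij
    """
    ss_res = ((x - x_star) ** 2).sum(axis=0)          # first bracket
    ss_tot = ((x - x.mean(axis=0)) ** 2).sum(axis=0)  # second bracket
    return 1.0 - ss_res / ss_tot

# Illustrative data: one variable, four observations.
x = np.array([[1.0], [2.0], [3.0], [4.0]])
x_star = np.array([[1.5], [1.5], [3.5], [3.5]])

r2 = r_squared(x, x_star)                             # 1 - 1.0 / 5.0 = 0.8
r2_perfect = r_squared(x, x)                          # exact recovery
r2_mean = r_squared(x, np.full_like(x, x.mean()))     # mean as the only model
```

An exact approximation gives R2 = 1, while replacing every observation with the mean of the variable gives R2 = 0.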
Figures 4.31 (a) and 4.31 (b) show the parametrization (at fst ) of T by χ
and the first PC, z1 , respectively, for Case B. Examining Figure 4.31 (b), we
see that z1 acts as a progress variable, capturing the extinction process remark-
ably well. This has also been observed for other choices of progress variables
such as CO2 [88]. Comparing the two-parameter PCA approach with the (f, χ)
parametrization is reasonable since both are two-parameter models, although
the second parameter (χ versus z1 ) represents different physical phenomena
(gradient versus chemical state). Figure 4.32 shows the parametrization of the
OH mass fraction by the common (f, χ) and the proposed (f, z1 ) parametriza-
tions. This demonstrates that the PCA approach can be used to represent a
wide range of the state variables, not temperature alone.
Also shown in Figures 4.31 (a) and 4.31 (b) is the R2 value as calculated by
Eq. (4.54). Table 4.17 lists R2 values for the reconstruction of the
temperature and all species mass fractions as a function of the number of
parameters adopted, q. These values are a concise, quantitative representation
of the information presented graphically in Figs. 4.31 and 4.32. For example,
for Case B with q = 1, we obtain R2 = 0.967 for temperature, corresponding to
Figure 4.31 (b). For comparison, Table 4.17 also lists the R2 values given by
the (f, χ) parametrization, for which R2 = 0.801 for temperature (Figure 4.31
(a)). Clearly, the two-parameter (f, z1 ) parametrization reconstructs the
temperature and most other state variables much more accurately than the
(f, χ) parametrization. It should be noted that the results for the (f, χ)
parametrization represent the best possible performance of a model based on
(f, χ); the steady laminar flamelet model typically does not perform ideally
[88].
Table 4.17: R^2 values defined by Eq. (4.54). Also shown are results for the χ parametrization. All results are at f = fst = 0.4375.
Case  q  T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
A     χ  0.789  0.344  0.811  0.718  0.165  0.085  0.695  0.839  0.816  0.803  0.827  0.828
A     1  0.983  0.259  0.976  0.930  0.240  0.178  0.823  0.986  0.916  0.978  0.956  0.980
A     2  0.983  0.936  0.968  0.958  0.963  0.924  0.964  0.980  0.985  0.969  0.976  0.980
B     χ  0.801  0.509  0.807  0.697  0.426  0.186  0.648  0.665  0.729  0.810  0.058  0.817
B     1  0.967  0.370  0.910  0.614  0.736  0.531  0.524  0.940  0.849  0.907  0.094  0.901
B     2  0.996  0.845  0.982  0.882  0.931  0.990  0.858  0.974  0.941  0.981  0.378  0.984
B     3  0.990  0.904  0.982  0.984  0.979  0.991  0.985  0.977  0.933  0.981  0.854  0.980
Table 4.17 also demonstrates that increasing the number of retained PCs
increases the accuracy with which the state variables are represented. This
indicates that one may select a desired error threshold and then determine the
minimum number of PCs required to achieve that accuracy. Conversely, one
may choose the number of PCs and estimate a priori the associated error.
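The threshold-based selection described above can be sketched as a simple lookup over the Case B rows of Table 4.17. The helper `min_pcs` and the thresholds used below are hypothetical illustrations, not part of the proposed methodology; only the R^2 values are taken from the table.

```python
# R^2 values for Case B at f = fst, copied from Table 4.17 (q = number of PCs).
species = ["T", "H2", "O2", "O", "OH", "H2O", "H", "HO2", "H2O2", "CO", "CO2", "HCO"]
r2_case_b = {
    1: [0.967, 0.370, 0.910, 0.614, 0.736, 0.531, 0.524, 0.940, 0.849, 0.907, 0.094, 0.901],
    2: [0.996, 0.845, 0.982, 0.882, 0.931, 0.990, 0.858, 0.974, 0.941, 0.981, 0.378, 0.984],
    3: [0.990, 0.904, 0.982, 0.984, 0.979, 0.991, 0.985, 0.977, 0.933, 0.981, 0.854, 0.980],
}

def min_pcs(r2_by_q, variables, names, threshold):
    """Smallest q for which every requested variable reaches the R^2 threshold.

    Returns None if no tabulated q satisfies the threshold.
    """
    idx = [names.index(v) for v in variables]
    for q in sorted(r2_by_q):
        if all(r2_by_q[q][i] >= threshold for i in idx):
            return q
    return None

print(min_pcs(r2_case_b, ["T"], species, 0.95))        # temperature only
print(min_pcs(r2_case_b, ["T", "OH"], species, 0.90))  # temperature and OH
print(min_pcs(r2_case_b, species, species, 0.85))      # all tabulated variables
```

For these tabulated values, temperature alone is captured with one PC at the 0.95 level, whereas requiring all variables to reach 0.85 calls for three PCs, the limiting variable being CO2.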
The PCs are not conserved variables, so their source terms must themselves be
parametrized by the PCs. In this section we explore the ability of PCA to
parametrize source terms. Any function of X may be approximated by
F(X) ≈ F(XA_q). However, it is more accurate to calculate F(X) directly from the
data in p-dimensional space and then project it onto Z by calculating the
conditional mean ⟨F(X)|Z⟩. Thus, source terms are calculated directly from the
original observables, X, and their conditional means are projected onto Z.
Figure 4.33 illustrates this for the two-dimensional (f, z1) parametrization of
ω̇z1.
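The projection step can be illustrated with a short sketch that evaluates a conditional mean by binning. It is deliberately simplified: it conditions on a single variable z (the results in this section are doubly conditional, on f and z1), it uses equal-width bins, and both the synthetic data and the Gaussian-shaped "source term" are hypothetical stand-ins for the observables X and F(X).

```python
import numpy as np

def conditional_mean(z, f_vals, n_bins=20):
    """Per-sample conditional mean <F|z>, estimated on equal-width bins in z."""
    z = np.asarray(z, dtype=float)
    f_vals = np.asarray(f_vals, dtype=float)
    edges = np.linspace(z.min(), z.max(), n_bins + 1)
    # Interior edges only, so that bin indices run from 0 to n_bins - 1.
    bin_idx = np.digitize(z, edges[1:-1])
    sums = np.bincount(bin_idx, weights=f_vals, minlength=n_bins)
    counts = np.bincount(bin_idx, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return means[bin_idx]  # conditional mean evaluated at each sample

# Hypothetical data: a nonlinear source term of z with scatter.
rng = np.random.default_rng(0)
z = rng.uniform(-1.0, 1.0, 5000)
source = np.exp(-4.0 * z**2) + 0.05 * rng.normal(size=z.size)

cond = conditional_mean(z, source)
# Fraction of the variance of the source term captured by <F|z>.
r2 = 1.0 - np.sum((source - cond) ** 2) / np.sum((source - source.mean()) ** 2)
print(round(r2, 3))
```

Because the function is evaluated on the full data before averaging, the scatter about the conditional mean measures directly what a parametrization by z cannot represent.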
Table 4.18 summarizes the ability of a q-dimensional PCA to parametrize
the source terms of the PCs. We first consider the columns describing the
results at fst. For case A, a two-dimensional parametrization (f, z1) captures
ω̇z1 with R^2_{ω̇z1} = 0.978. For case B, 3 PCs are required to parametrize ω̇z1
to a similar degree of accuracy. Comparing the dimensionality requirements for
parametrizing ω̇Z with those for parametrizing the state variables (Table 4.17),
we see that parametrizing the source terms does not require more PCs than
parametrizing the state variables themselves, an encouraging result.
Figure 4.33: Parametrization of ω̇z1 at fst by z1 for case B. Solid line: doubly-
conditional mean value of ω̇z1 . R2 is calculated from Eq. (4.54).
4.7 Summary
In the first part of the present Chapter, a novel methodology based on Principal
Components Analysis (PCA) has been proposed for the identification of
low-dimensional manifolds in turbulent flames, the estimation of their
dimensionality, and the selection of optimal reaction variables. To this
purpose, high-fidelity experimental and numerical data sets have been
investigated. Three different PCA approaches have been proposed. A global PCA,
GPCA, has been compared to two local PCA models, VQPCA and FPCA, which partition
the experimental data into separate clusters where PCA is performed locally.
The two local models differ in how the clusters are defined: the partitioning
algorithm used by VQPCA is unsupervised and based on reconstruction error
minimization, while FPCA conditions the data a priori on the mixture fraction.
Results show that the local PCA approaches (VQPCA and FPCA) outperform the
global approach in all cases. Indeed, GPCA is unable to provide a compact
representation of the data in a low-dimensional space due to the highly
non-linear relationships existing among the state variables. Regarding the
local approaches, the performances of VQPCA and FPCA are comparable for simple
jet flames, while FPCA proves unable to capture important features for systems
characterized by complex equilibrium phenomena
Table 4.19: R^2 values at f = 0.2 using the PCA obtained at fst. Also shown are results for the χ parametrization.
Case  q  T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
A     χ  0.097  0.798  0.169  0.774  0.736  0.245  0.827  0.812  0.811  0.580  0.432  0.881
A     1  0.500  0.413  0.816  0.212  0.188  0.134  0.319  0.433  0.398  0.666  0.555  0.619
A     2  0.968  0.910  0.881  0.868  0.859  0.940  0.888  0.838  0.855  0.867  0.940  0.934
B     χ  0.497  0.542  0.390  0.303  0.329  0.269  0.558  0.537  0.390  0.417  0.206  0.689
B     1  0.979  0.741  0.866  0.337  0.219  0.749  0.127  0.805  0.858  0.859  0.513  0.403
B     2  0.996  0.877  0.945  0.819  0.822  0.994  0.806  0.970  0.960  0.958  0.737  0.860
B     3  0.990  0.963  0.958  0.989  0.977  0.994  0.978  0.984  0.982  0.968  0.808  0.955
Table 4.20: R^2 values at f = 0.6 using the PCA obtained at fst. Also shown are results for the χ parametrization.
Case  q  T      H2     O2     O      OH     H2O    H      HO2    H2O2   CO     CO2    HCO
B     χ  0.628  0.081  0.593  0.662  0.112  0.246  0.365  0.508  0.616  0.521  0.268  0.570
B     1  0.964  0.134  0.904  0.804  0.197  0.755  0.721  0.938  0.650  0.844  0.442  0.896
B     2  0.984  0.612  0.928  0.836  0.373  0.986  0.822  0.960  0.791  0.873  0.542  0.930
B     3  0.986  0.769  0.948  0.888  0.543  0.991  0.913  0.967  0.839  0.909  0.841  0.941