
Analytica Chimica Acta 531 (2005) 209–216

Analysis of lipoproteins using 2D diffusion-edited NMR spectroscopy and multi-way chemometrics


Marianne Dyrby a, Martin Petersen b, Andrew K. Whittaker c, Lynette Lambert c, Lars Nørgaard a, Rasmus Bro a, Søren Balling Engelsen a,*

a The Royal Veterinary and Agricultural University, Centre for Advanced Food Studies, Rolighedsvej 30, DK-1958 Frederiksberg, Denmark
b The Royal Veterinary and Agricultural University, Department of Human Nutrition, Frederiksberg, Denmark
c Centre for Magnetic Resonance, University of Queensland, Brisbane, Australia

Received 9 May 2004; received in revised form 13 October 2004; accepted 13 October 2004
Available online 24 December 2004

Abstract

This study represents the first application of multi-way calibration by N-PLS and multi-way curve resolution by PARAFAC to 2D diffusion-edited ¹H NMR spectra. The aim of the analysis was to evaluate the potential for quantification of lipoprotein main- and subfractions in human plasma samples. Multi-way N-PLS calibrations relating the methyl and methylene peaks of lipoprotein lipids to concentrations of the four main lipoprotein fractions as well as 11 subfractions were developed with high correlations (R = 0.75–0.98). Furthermore, a PARAFAC model with four chemically meaningful components was calculated from the 2D diffusion-edited spectra of the methylene peak of lipids. Although the four extracted PARAFAC components represent molecules of sizes that correspond to the four main fractions of lipoproteins, the corresponding concentrations of the four PARAFAC components proved not to be correlated to the reference concentrations of these four fractions in the plasma samples as determined by ultracentrifugation. These results indicate that NMR provides complementary information on the classification of lipoprotein fractions compared to ultracentrifugation.

© 2004 Elsevier B.V. All rights reserved.

Keywords: Lipoproteins; Subfractions; Diffusion-edited ¹H NMR; DOSY; PARAFAC; N-PLS

* Corresponding author. Tel.: +45 35 28 32 05; fax: +45 35 28 32 45. E-mail address: se@kvl.dk (S.B. Engelsen).
0003-2670/$ - see front matter © 2004 Elsevier B.V. All rights reserved. doi:10.1016/j.aca.2004.10.052

1. Introduction

People with high concentrations of plasma cholesterol and triglyceride have an increased risk of coronary heart disease. The risk of coronary heart disease has been shown to be related to the distribution of cholesterol and triglyceride in different types of lipoproteins [1,2]. To assess a patient's risk of coronary heart disease, it is therefore of great importance to be able to measure the lipoprotein profile.

Lipoproteins are micellar lipid-transporting vehicles found in blood, consisting of a core of triglyceride and cholesterol ester inside a monolayer of phospholipid, free cholesterol and protein. They can be divided into subgroups on the basis of their density and hence their size (diameters ranging from 80 to 5 nm). The main fractions are very low density lipoproteins (VLDL), intermediate density lipoproteins (IDL), low density lipoproteins (LDL) and high density lipoproteins (HDL). These four main fractions have similar chemical composition but differ in relative lipid and protein composition and have distinct physiological properties [3]. The definition of the main fractions is partly empirical, as they do not represent four strictly distinct types of particles but rather particles with densities within certain ranges. Each of these main fractions can be further divided into a number of subfractions simply by dividing the density range of the main fraction into smaller ranges.

Ultracentrifugation is the established standard reference method for the separation and analysis of lipoproteins. The disadvantage of lipoprotein quantification using ultracentrifugation is that the method is time and labour consuming and requires a large amount of plasma. Other reference methods include gel electrophoresis, which is less time-consuming but based on a slightly different principle of separation (size and charge as opposed to density), and precipitation using immunoassay kits, which is characterised by high uncertainty and can only be used for determination of the main fractions.

Alternatively, lipoproteins in blood plasma can be measured using high-resolution ¹H nuclear magnetic resonance (NMR) spectroscopy, which requires only small amounts of plasma (100 µL) and practically no sample preparation. Furthermore, data can be acquired in a relatively short time, from a few minutes for simple one-dimensional (1D) spectra up to a few hours for more complex two-dimensional (2D) spectra. Due to their similar chemical composition, the different types of lipoproteins have very similar and overlapping NMR signals. Quantification of the main- and subfractions from NMR spectra therefore cannot be done using standard integration, but must be done using curve fitting or multivariate data analysis. Several studies have attempted to quantify the main fractions using 1D ¹H NMR spectra with good correlations to chemical reference methods [4–7], while very few attempts have been made to quantify subfractions [8,9].

The quantification of lipoprotein subfractions from NMR spectra is made difficult by the heavily overlapping peaks, and methods that can produce a further separation of the lipoproteins are therefore desired. Directly coupled HPLC and NMR has been suggested [10], but the disadvantages of this method are the amount of time needed for the HPLC separation (90 min) and the fact that plasma proteins co-elute with HDL, making the quantification of HDL uncertain. Furthermore, HPLC only separates the lipoproteins into main fractions, and thus the quantification of subfractions still relies on the mathematical separation of the NMR peaks.
An alternative approach for separation of lipoprotein signals is to use 2D diffusion-edited NMR spectroscopy, also known as diffusion-ordered NMR spectroscopy (DOSY). Extensive reviews on the theory and the use of diffusion-edited NMR can be found in the literature [11,12]. The use of diffusion-edited NMR for the quantification of lipoproteins relies upon the fact that lipoproteins have different sizes, which according to the Stokes–Einstein relationship will lead to different diffusion properties. Some studies have been made on blood plasma using diffusion-editing of standard 2D NMR techniques, such as TOCSY [13,14] and double-quantum experiments [15], but the aim of these studies was to determine diffusion coefficients and no attempt was made to separate or quantify lipoproteins. Recently, a study was published on 2D diffusion-edited NMR of a single plasma sample with the aim of resolving and quantifying lipoprotein main fractions using curve resolution, but the fractions were not fully resolved [16].

The separation of different sized components and the calculation of diffusion coefficients from a 2D diffusion-edited NMR spectrum is simple in the case of well-resolved peaks and large differences in molecular size, but more complicated in the case of broad or overlapping signals, for which a number of processing methods have been presented (reviewed in [11]). None of the proposed methods work well on lipoprotein peaks due to the broad and overlapping signals and very small differences in diffusion coefficients. Hence, the quantification of lipoprotein main- and subfractions from 2D diffusion-edited NMR spectra cannot be based on full separation, but must rely on methods that perform a mathematical separation of the lipoprotein signals.

Multi-way chemometric methods such as PARAllel FACtor (PARAFAC) analysis have the potential to perform the desired "mathematical chromatography" [17] of complex matrices. The PARAFAC model [18] is similar to principal component analysis (PCA) [19]. PCA provides a bilinear model of two-way data (i.e. an outer product of scores and loadings). Likewise, for three-way data, PARAFAC provides a tri-linear model which consists of an outer product in three directions of scores (A), loadings type I (B) and loadings type II (C), as shown in Fig. 1. Like PCA, the PARAFAC model is fitted in a least-squares sense, but unlike PCA, no orthogonality or requirements of maximum variance per component are used. Such constraints are needed for PCA in order to obtain an identified solution, but the PARAFAC model is unique in itself [20]. Hence, if the data follow the model, i.e. are approximately low-rank tri-linear, the analysis will directly provide physically meaningful loadings (e.g. pure spectra). These aspects and basic properties of PARAFAC have been elaborated on in several places in the literature [21,22].

Fig. 1. Schematic representation of a two component PARAFAC model.

The multi-way partial least-squares (N-PLS) regression model is, similarly to PARAFAC, an extension of the two-way PLS to higher orders [23]. The two-way PLS uses a bilinear, PCA-like, model, which is found in such a way that each score successively has maximal covariance with the variation in the dependent variables [24].
The N-PLS model does exactly the same but uses a tri-linear, PARAFAC-like, model instead. Thus, it provides a solution that is directly interpretable in terms of the individual modes in the three-way array.

For the purpose of quantifying lipoprotein main- and subfractions from 2D diffusion-edited NMR spectra, the most direct approach is to use N-PLS. In the present case, this method employs a calibration between the 2D spectra of a set of samples and known reference values for the concentrations of lipoprotein main- and subfractions. Subsequently, the concentrations of lipoprotein subfractions in a plasma sample of unknown composition can be predicted from its NMR spectrum. Alternatively, PARAFAC can be applied to the 2D diffusion-edited NMR spectra of a set of samples, and optimally the decomposition directly results in the spectra and diffusion profiles of the pure components as well as their relative concentrations in the set of samples. Thus far, no applications of PARAFAC to 2D diffusion-edited NMR data have been published, but several successful applications to other types of 2D data, e.g. fluorescence spectroscopy data and coupled liquid chromatography and mass spectrometry (LC–MS) data, exist in the literature [25–27].

In this study, we investigate the potential use of 2D diffusion-edited NMR on a series of plasma samples to quantify lipoprotein main- and subfractions using the multi-way chemometric techniques N-PLS and PARAFAC.
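The tri-linear structure described above (Fig. 1) can be illustrated with a minimal Python sketch that rebuilds a three-way array from scores A and loadings B and C, i.e. x_ijk = Σ_f a_if b_jf c_kf. The function name and the toy numbers are hypothetical and serve only as an illustration; fitting the model to data (for example by alternating least squares) is not shown.

```python
def parafac_reconstruct(A, B, C):
    """Rebuild X[i][j][k] = sum_f A[i][f] * B[j][f] * C[k][f]."""
    n_comp = len(A[0])
    if len(B[0]) != n_comp or len(C[0]) != n_comp:
        raise ValueError("A, B and C must have the same number of components")
    return [
        [
            [sum(a[f] * b[f] * c[f] for f in range(n_comp)) for c in C]
            for b in B
        ]
        for a in A
    ]


# Toy example (made-up numbers): 2 samples, 3 spectral points,
# 2 gradient steps and 2 components.
A = [[1.0, 0.5], [2.0, 0.0]]              # scores (relative concentrations)
B = [[1.0, 0.0], [0.5, 1.0], [0.0, 2.0]]  # loadings type I (e.g. spectra)
C = [[1.0, 1.0], [0.5, 0.1]]              # loadings type II (e.g. diffusion decays)
X = parafac_reconstruct(A, B, C)
```

Each column of A, B and C belongs to one component, so a sample's contribution to the array is simply its score times the outer product of the two loading vectors, summed over components.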

2. Experimental

2.1. Plasma samples

Human volunteers (n = 17) with a broad range of lipid levels in terms of total triglyceride were chosen in order to span a wide range of lipoprotein subfractions. Both males and females were included, and the subjects included obese (n = 2), obese after a 10-week weight loss period (n = 13), and lean controls (n = 2). Written informed consent was obtained from the volunteers. Fasting EDTA blood was centrifuged at 2500 × g for 15 min and stored at −80 °C until further analysis. The samples were subjected to ultracentrifugation and 2D diffusion-edited NMR analysis.

2.2. Ultracentrifugation (UC)

The lipoproteins were subfractionated by serial ultracentrifugation as described in Baumstark et al. [28]. In short, the main fractions, VLDL, IDL, LDL and HDL, were separated from the plasma one at a time, and LDL was subsequently separated into six subfractions (LDL1 to LDL6) and HDL into three subfractions (HDL1 to HDL3). Furthermore, VLDL was separated into two subfractions (VLDL1 and VLDL2) as described in [29]. In plasma and each main (n = 4) and subfraction (n = 11), total cholesterol and triglyceride were determined by standard enzymatic colorimetric methods.

2.3. ¹H NMR

The samples were prepared in 5 mm NMR tubes as 225 µL plasma and 225 µL 0.9% NaCl in H2O with 50 µL D2O added for the deuterium field/frequency lock. Proton spectra were measured at 318 K on a Bruker DRX500 operating at 11.7 T. This temperature was chosen to obtain full signal from the lipids in lipoproteins (especially LDL), which are fully melted at this temperature and thus NMR visible [4]. The pulse program used was the double stimulated echo (DSTE) experiment (see Fig. 2) including bipolar gradient pulses and longitudinal eddy current delay (LED) [30]. This pulse experiment efficiently compensates for convection currents that typically develop at elevated temperatures. The binomial solvent suppression sequence WATERGATE [31] was added after the LED. To further minimise the occurrence of convection currents, the flow of the heating air was set to 800 L/h.

Fig. 2. Pulse sequence of the double stimulated echo experiment [30].

A spectral window of 10,000 Hz was accumulated in an acquisition time of 1.6 s. The relaxation delay was 0.2 ms, the FIDs were collected into 16,384 complex data points and 256 scans were acquired on each sample. The gradient pulse strength was increased from 5 to 95% of the maximum strength of 0.475 T/m in 24 steps, which were equidistant in squared gradient pulse strength. A diffusion time of 0.2 s and bipolar sine shaped gradient pulses of length 6 ms were applied to obtain a reasonable amount of diffusion of the lipoprotein signals. The total experiment time was 3 h per sample.

Following acquisition, the FIDs were Fourier transformed applying zero-filling to 32,768 data points and an exponential window function with line broadening factor 0.3 Hz. The spectra were manually phase-corrected, referenced to the CaEDTA peak at 2.555 ppm (only partially suppressed by the diffusion experiment) and corrected for baseline offset errors using the flat regions at both ends of the spectrum. No normalisation or further averaging was applied.

2.4. Data analysis

All chemometric models were computed in MATLAB Version 6.5 (MathWorks Inc., Natick, MA) equipped with The N-way Toolbox for MATLAB Version 2.10 found at http://www.models.kvl.dk/ [32].
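The gradient ramp described in Section 2.3 (24 steps from 5 to 95% of 0.475 T/m, equidistant in squared gradient strength) can be written out as a short Python sketch. The function and variable names are hypothetical; the acquisition-time line assumes the usual relation of complex points divided by spectral width, which gives about 1.64 s, in line with the 1.6 s stated above.

```python
import math


def gradient_ramp(g_max=0.475, first=0.05, last=0.95, n_steps=24):
    """Gradient strengths (T/m) that are equidistant in squared strength."""
    g2_first = (first * g_max) ** 2
    g2_last = (last * g_max) ** 2
    step = (g2_last - g2_first) / (n_steps - 1)
    return [math.sqrt(g2_first + i * step) for i in range(n_steps)]


gradients = gradient_ramp()

# Assumed relation: acquisition time = complex points / spectral width (Hz).
acquisition_time = 16384 / 10000.0  # seconds
```

Spacing the steps evenly in g² rather than in g places the points evenly along the axis on which an ideal diffusion decay is a straight line in a log(intensity) plot.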

3. Results and discussion

3.1. Evaluation of the spectra

Representative 2D diffusion-edited NMR data of a plasma sample are shown in Fig. 3. The data are shown as a line plot of the 24 1D spectra that make up the 2D spectrum and are limited to the region 2.5–0.6 ppm. This part of the spectrum shows several broad peaks from lipoprotein lipids and some residual peaks from small metabolites, e.g. the lactate doublet at 1.32 ppm. The sharp signals of these molecules are fully attenuated during the first couple of spectra due to their large self-diffusion coefficients, while the signals from lipoproteins are only attenuated to about 33% of the original intensity through all 24 spectra.

Fig. 3. An example of 2D diffusion-edited NMR data of a plasma sample. Only the range 2.5–0.6 ppm used for the N-PLS analysis is shown.

The data shown do not show typical signs of convection current disturbances that arise when performing diffusion experiments at temperatures different from room temperature [12]. Fig. 4 illustrates the effects of convection currents, which show up as deviations from the theoretical exponential attenuation of the peaks and which can even lead to the inversion of the signals (negative intensities) [33]. The data are from three repetitions of each of two experiments, the normal stimulated echo with LED (Fig. 4A) and the double stimulated echo with LED (Fig. 4B), both performed at 318 K on the same sample. The figure shows the logarithm of the intensity of a specific peak for the six diffusion experiments. In the absence of convection currents the intensity will decrease exponentially and thus the depicted lines should be linear. With the double stimulated echo experiment, the triplicate measurements are nearly identical and the lines are close to linear, whereas for the simple stimulated echo, convection currents of variable size make the decay of the intensity deviate from exponential behaviour and show a slightly oscillating behaviour, which is typical for data influenced by convection currents.

Some of the spectra did show one aberrant feature, an example of which is illustrated in Fig. 4C. A shift in intensity occurs from spectrum number two to spectrum number five in the diffusion direction, and for some spectra also a slight shift between spectrum numbers 15 and 16. The nature of these shifts has not been thoroughly investigated in this study.

3.2. Quantification using N-PLS

N-PLS calibrations were developed between the 2D diffusion-edited NMR data and reference concentrations of four main lipoprotein fractions and 11 subfractions as determined by ultracentrifugation. Centering of the data was done as is common practice in N-PLS and two-way PLS, namely by so-called centering across samples. Due to the small number of samples in the calibration set, the models were validated using full cross-validation [34]. The spectral range was the same as shown in Fig. 3, i.e. 2.5–0.6 ppm; however, models based on only the two main lipoprotein peaks, methylene (1.31–1.20 ppm) and methyl (0.91–0.77 ppm), were also tested. Furthermore, spectra numbers 2–5 in the diffusion direction were not included in the analysis due to the intensity shifts mentioned above. There were no signs of outliers, i.e. individual samples that incorrectly influenced the models significantly more than the bulk of samples. The best model for each lipoprotein main- and subfraction is shown in Table 1, which lists the root mean squared error of cross-validation (RMSECV), the number of components found optimal for the calibration, the correlation coefficient (R), the relative RMSECV (calculated as RMSECV divided by the mean reference value) and the range of reference values in the calibration set. Modelling the sum of LDL2–LDL5 instead of the individual subfractions yielded a slightly better relative prediction error than modelling the four subfractions separately, and therefore this model is listed in the table. The relative errors of the main fraction prediction models are between 12 and 18% and have correlations between 0.82 and 0.96. Plots of predicted values versus reference concentrations are shown in Fig. 5. For the subfraction models the relative errors are 12–45% and the correlations 0.58–0.97.
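The validation quantities named above (centering across samples, full cross-validation, RMSECV, relative RMSECV and R) can be sketched in plain Python as follows. This is only an outline under stated assumptions: full cross-validation is taken to mean leave-one-out, and the `fit_mean`/`predict_mean` pair is a hypothetical stand-in that predicts the training mean, not an N-PLS model. The toy numbers are made up.

```python
import math


def center_across_samples(X):
    """Subtract from every (j, k) element its mean over the sample mode."""
    n, J, K = len(X), len(X[0]), len(X[0][0])
    mean = [[sum(X[i][j][k] for i in range(n)) / n for k in range(K)]
            for j in range(J)]
    centered = [[[X[i][j][k] - mean[j][k] for k in range(K)]
                 for j in range(J)] for i in range(n)]
    return centered, mean


def leave_one_out(X, y, fit, predict):
    """Predict each sample from a model built on all the other samples."""
    y_pred = []
    for i in range(len(y)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        y_pred.append(predict(model, X[i]))
    return y_pred


def rmsecv(y, y_pred):
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(y_pred, y)) / len(y))


def relative_rmsecv(y, y_pred):
    """RMSECV as a percentage of the mean reference value."""
    return 100.0 * rmsecv(y, y_pred) / (sum(y) / len(y))


def pearson_r(y, y_pred):
    my, mp = sum(y) / len(y), sum(y_pred) / len(y_pred)
    sxy = sum((t - my) * (p - mp) for t, p in zip(y, y_pred))
    sxx = sum((t - my) ** 2 for t in y)
    syy = sum((p - mp) ** 2 for p in y_pred)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical stand-in for a calibration model: predicts the training mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, x_new):
    return model


X_toy = [[[1.0, 2.0]], [[3.0, 6.0]], [[2.0, 1.0]], [[0.0, 3.0]]]
y_toy = [1.0, 2.0, 3.0, 4.0]
y_cv = leave_one_out(X_toy, y_toy, fit_mean, predict_mean)
```

Replacing the stand-in pair with a real calibration routine leaves the cross-validation loop and the three summary statistics unchanged.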
The only earlier applications of chemometric methods for the quantification of lipoproteins from 1D NMR spectra were by Bathen et al. [7], who obtained correlations of 0.74–0.97 and relative prediction errors of 19–46% for the main fractions, and by Dyrby et al. [9], who achieved relative prediction errors of 10–19% and correlations of 0.87–0.98 for main fractions and relative prediction errors of 17–39% and correlations of 0.54–0.97 for subfractions. Thus, the models presented here are equal to or slightly better than the best models presented earlier.

Fig. 4. Diffusion curves for a specific peak as log(intensity): (A) three repetitions using the normal stimulated echo with LED; (B) three repetitions using the double stimulated echo with LED; (C) two examples showing abnormal shifts in intensity using the final settings.

Table 1
Prediction results of lipoprotein subfractions using N-PLS

Fraction   RMSECV (a)   #PC (b)   R (c)   Relative RMSECV (d) (%)   Reference value interval
VLDL       9.8          5         0.96    15                        13.0–167.5
VLDL1      6.8          3         0.97    19                        3.9–122.9
VLDL2      3.7          7         0.64    30                        3.3–23.1
IDL        1.6          9         0.90    12                        3.9–19.6
LDL        19.0         7         0.82    18                        50.0–179.6
LDL1       6.4          5         0.58    29                        9.7–39.1
LDL2-5     23.6         7         0.59    34                        26.4–135.4
LDL6       5.8          5         0.67    36                        7.0–44.4
HDL        6.7          5         0.91    12                        32.1–77.6
HDL1       7.3          5         0.63    45                        3.7–35.6
HDL2       3.3          5         0.91    19                        4.3–27.8
HDL3       2.6          8         0.78    12                        14.6–29.6

The unit for VLDL and IDL calibrations is mg/dL triglyceride and for LDL and HDL mg/dL cholesterol.
(a) Root mean square error of cross validation (mg/dL).
(b) Number of N-PLS components used.
(c) Correlation coefficient.
(d) Relative RMSECV is calculated as RMSECV/(mean reference value).

When evaluating the prediction errors of N-PLS models, it is important to assess the error of the reference method and of the spectral measurement, as the errors of both these measurements contribute to the RMSECV. In this study, the error of the UC method was evaluated by dividing a blood sample into six parts and performing subfractionation on each of them, and the coefficient of variation between the six determinations was calculated. Furthermore, the week-to-week variation in the cholesterol and triglyceride measurements was taken into account, also calculated in terms of coefficients of variation. These standard errors of repeated measurements add up to between 3 and 9% for the UC determinations. Furthermore, an error in the VLDL subfractionation can be detected, where the sum of VLDL1 and VLDL2 does not equal the total VLDL, but shows a loss of up to 75% with an average of 20% loss. For HDL the error is much smaller, approximately 4%. For LDL a similar error can be expected, but could not be directly assessed due to a dialysing step in the LDL subfraction preparation.

Fig. 5. Predicted vs. reference concentrations for the four main lipoprotein fractions, VLDL, IDL, LDL and HDL. The unit for VLDL and IDL calibrations is mg/dL triglyceride and for LDL and HDL mg/dL cholesterol.

Usually, NMR data have very low repeatability error. However, due to the large amount of signal lost when using the double stimulated echo (50% plus loss due to relaxation compared with the single stimulated echo), there is a relatively large amount of noise in these data. Furthermore, some error may arise from the previously mentioned intensity shifts (cf. Fig. 4C) as well as from the effects of convection that are not compensated for by the DSTE experiment [35]. Three 2D diffusion-edited NMR spectra were acquired for each of four samples to evaluate the standard error of repeated spectral measurements. The average difference between the predictions for each sample, expressed as coefficient of variation, was between 2 and 14% for different models.

By comparing the standard errors of the reference measurement and the spectral measurement with the relative prediction error of each calibration model, it appears that these errors account for between 1/6 and 9/10 of the relative RMSECV for the predictions of different lipoprotein main- and subfractions. Given this significant error from the reference measurement and the spectral measurement, as well as the small number of samples for building the calibration model, the obtained prediction errors are quite good. By using more samples and possibly reducing the uncertainty arising from the NMR measurement, e.g. by averaging more scans, it is anticipated that the prediction errors can be lowered even further.

While the N-PLS method represents the supervised data analytical approach, the tri-linear PARAFAC model represents the unsupervised approach. This method has the advantage of yielding directly interpretable estimates of the underlying physical features, which can then be evaluated.
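The two error measures used in this section, the coefficient of variation of repeated determinations and the relative loss when subfraction concentrations do not add up to the main fraction, can be sketched in Python as follows. The function names and the numbers are hypothetical; the sketch assumes the sample standard deviation, which the text does not specify.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the mean."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def subfraction_loss(total, subfractions):
    """Percentage of the main fraction not recovered in its subfractions."""
    return 100.0 * (total - sum(subfractions)) / total


# Made-up replicate determinations and concentrations (mg/dL).
cv_percent = coefficient_of_variation([9.0, 10.0, 11.0])
loss_percent = subfraction_loss(100.0, [50.0, 30.0])
```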

3.3. PARAFAC analysis

For this analysis, the NMR spectral range considered was limited to the methylene (1.31–1.20 ppm) and methyl (0.91–0.77 ppm) signals to avoid the residual small metabolite signals. PARAFAC was performed on the entire spectral ensemble using different numbers of components to establish the most informative model. No centering was applied to the data before the PARAFAC analysis. Models were fitted on the methylene and methyl signals together or separately and with non-negativity constraints on all modes. As for the N-PLS models, spectra 2–5 in the diffusion direction were not considered because of the intensity shift problem discussed briefly above (cf. Fig. 4C).

PARAFAC models using two to four components were generally informative and provided a good fit to the data. Fig. 6 shows the best result, which is based on the methylene signal only and uses four PARAFAC components. An example of a 2D diffusion-edited spectrum in the same range is shown in Fig. 7 for comparison.

Fig. 6. Result of a PARAFAC model with four components on the 2D diffusion-edited NMR spectrum of the methylene peak of lipoprotein lipids: (A) spectral loadings and (B) diffusion loadings.

Fig. 7. A 2D diffusion-edited NMR spectrum of the methylene peak of which the deconvolution is shown in Fig. 6.

Fig. 6A shows the smooth spectral loadings, which are very similar in shape but have different chemical shifts. It is well known that lipoproteins of different sizes have similar signals that are shifted in position according to their size, due to a difference in magnetic susceptibility between the core and the shell of the lipoproteins [36]. This effect results in a larger chemical shift for the largest particles (i.e. VLDL) than for the smallest particles (i.e. HDL). The four loadings thus represent four different sizes of lipoproteins, where the three with the larger chemical shifts (I–III) are relatively close together in size while the one at smaller chemical shift (IV) is much smaller. Furthermore, component IV appears to be broader than the other signals, which is in accordance with the fact that the smaller lipoproteins, e.g. HDL, have lower internal mobility and therefore smaller T2 values, leading to broader peaks [37].

The four diffusion loadings shown in Fig. 6B are smooth and seemingly exponential. Component IV has the diffusion loading with the fastest decay rate, which is consistent with it representing the smallest particles. In accordance with the above discussion on the spectral loadings, components I–III have diffusion curves that decrease at a similar rate, while component IV has a somewhat larger decay rate. From the diffusion curves, the self-diffusion coefficients of the particles giving rise to the signal can be calculated using exponential fitting, and from the diffusion coefficients the hydrodynamic radii can be calculated [12]. The calculated diffusion coefficients and radii for the four PARAFAC components can be seen in Table 2, which also shows the radii of the four main lipoprotein fractions [3]. The results show that the four PARAFAC components with radii 29, 18, 12 and 6 nm could very well represent averages of VLDL, IDL, LDL and HDL, respectively.

Table 2
Calculation of diffusion coefficients (D), and hydrodynamic radii (rH) of the four PARAFAC components

Component          I         II          III      IV
D (10⁻¹¹ m² s⁻¹)   1.00      1.61        2.36     5.01
rH (nm)            28.7      17.9        12.2     5.7
REF rH (nm)        40–17.5   17.5–12.5   12.5–9   6–2.5

REF rH are the ranges that characterise the four main lipoprotein fractions, VLDL, IDL, LDL and HDL (from left to right) [3].

However, direct correlations between PARAFAC-derived concentrations (scores) and main lipoprotein concentrations as determined by ultracentrifugation were high for VLDL and HDL (0.93 and 0.82), but very low for IDL and LDL (0.41 and 0.34). There are several possible reasons for this. Lipoprotein main fractions as defined by ultracentrifugation are distributions in density and size, while the PARAFAC components are discrete. Thus, the PARAFAC components represent four average lipoprotein sizes, the concentrations of which are not necessarily directly correlated to the concentrations of the four distributions. Furthermore, the spectral residuals of the four-component PARAFAC model are not random, but contain a fair amount of structured variation. Due to the large difference in lipoprotein profiles of the samples in this study, the remaining variation is different for each sample and can therefore not be modelled by a common fifth component. Adding more samples would most likely result in a PARAFAC model with more extractable components that could better explain the variation and that could then possibly be used quantitatively as well, in accordance with the good quantitative performance of N-PLS.

An important difference between the PARAFAC and N-PLS analyses is that the N-PLS models for the main fractions listed in Table 1 are all based on the entire spectral region, 2.5–0.6 ppm, while N-PLS models built on the methylene region alone did not perform well. Furthermore, the N-PLS main lipoprotein models use between five and nine components, emphasizing the need to extract more components in order to quantify lipoprotein fractions.

An intrinsic problem of this dataset, which limits the application of both N-PLS and PARAFAC, is the fact that lipoprotein subfractions are distributions in density and thus the data are not low-rank tri-linear, but high-rank tri-linear. This problem actually applies to 1D NMR data of lipoproteins as well. The fact that both PLS on 1D NMR data and N-PLS on 2D diffusion-edited NMR data produce reasonable calibrations shows that the data can be approximated with low-rank bi-linear and tri-linear models, respectively. Nevertheless, the high-rank bi- or tri-linearity of the NMR data limits the appropriateness of fitting low-rank models and might set a limit for the performance of calibration models of lipoprotein main- and subfraction contents. Furthermore, this problem emphasizes the need for a higher number of samples in the models in order to be able to reliably increase the estimated rank of the models.
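The two calculation steps mentioned in this section, exponential fitting of a diffusion loading and conversion of a diffusion coefficient to a hydrodynamic radius via the Stokes–Einstein relationship r = kT/(6πηD), can be sketched in Python as below. Several points are assumptions rather than the authors' procedure: the fit is a simple log-linear least-squares fit against squared gradient strength, the conversion of the fitted decay rate into D (which depends on the pulse-sequence timings and gradient shape) is not shown, and the viscosity of 0.81 mPa·s is not given in the text but was chosen here because it approximately reproduces the radii in Table 2 at 318 K.

```python
import math

BOLTZMANN = 1.380649e-23  # J/K


def decay_rate(g_squared, intensity):
    """Slope magnitude of ln(intensity) versus squared gradient strength."""
    logs = [math.log(v) for v in intensity]
    n = len(logs)
    mx, my = sum(g_squared) / n, sum(logs) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(g_squared, logs))
    sxx = sum((x - mx) ** 2 for x in g_squared)
    return -sxy / sxx


def hydrodynamic_radius(D, temperature=318.0, viscosity=8.1e-4):
    """Stokes-Einstein radius in metres.

    D in m^2/s, temperature in K, viscosity in Pa s (assumed value,
    not stated in the source).
    """
    return BOLTZMANN * temperature / (6.0 * math.pi * viscosity * D)


# Diffusion coefficients of components I-IV from Table 2 (m^2/s).
radii_nm = [hydrodynamic_radius(D) * 1e9
            for D in (1.00e-11, 1.61e-11, 2.36e-11, 5.01e-11)]
```

Because the radius is inversely proportional to D, the ratio between the radii of two components does not depend on the assumed viscosity.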

4. Conclusions

This work shows that the analysis of 2D diffusion-edited NMR spectra of human blood plasma using multi-way chemometric methods is feasible. The application of N-PLS to the data led to calibration models with relative prediction errors between 12 and 45% and correlations between 0.58 and 0.97, which is equal to or slightly better than the best results presented hitherto. Nevertheless, somewhat reduced prediction errors are required in order to make the method applicable for investigations where small changes in lipoprotein subfractions are of interest.

Using PARAFAC we were able to extract four chemically meaningful components from the methylene peak range of the 2D diffusion-edited NMR spectra. Investigation of the spectral loadings and radii calculated from the diffusion loadings shows that the four components correspond well to the four main lipoprotein fractions VLDL, IDL, LDL and HDL. However, the correlations between PARAFAC scores and main lipoprotein concentrations were not good, which is ascribed mainly to the continuous density profiles of lipoproteins. A larger number of samples will probably give models with higher dimensionality due to a better description of the underlying components representing the subfractions and thus lead to models with better quantitative abilities. Although N-PLS was clearly superior for quantification, the analysis of 2D diffusion-edited NMR data using PARAFAC is promising. It suggests that PARAFAC could be applied to 2D diffusion-edited NMR data from other types of samples where there is a large overlap between the spectra, and in cases where the components have similar diffusion coefficients, circumstances under which other methods often fail.

Acknowledgements

MD would like to thank the Centre for Magnetic Resonance at the University of Queensland (Brisbane, Australia) for hosting her for six months as part of her PhD study. RB and SBE gratefully acknowledge the Danish Research Council (SJVF) for financial support to the project entitled "Bromatonomics: exploring functional factors in food by high-resolution NMR and chemometrics".

References

[1] W.J. Mack, R.M. Krauss, H.N. Hodis, Arterioscler. Thromb. Vasc. Biol. 16 (1996) 697.
[2] B. Lamarche, A. Tchernof, S. Moorjani, B. Cantin, G.R. Dagenais, P.J. Lupien, J.-P. Després, Circulation 95 (1997) 69.
[3] P.B. Duell, D.R. Illingworth, W.E. Connor, in: P. Felig, L.A. Frohman (Eds.), Endocrinology and Metabolism, 4th ed., McGraw-Hill, New York, 2001, p. 993.
[4] J.D. Otvos, E.J. Jeyarajah, D.W. Bennett, Clin. Chem. 37 (1991) 377.
[5] M. Ala-Korpela, Y. Hiltunen, J. Jokisaari, S. Eskelinen, K. Kiviniitty, M.J. Savolainen, Y.A. Kesäniemi, NMR Biomed. 6 (1993) 225.
[6] H. Serrai, L. Nadal, G. Leray, B. Leroy, B. Delplanque, J.D. de Certaines, NMR Biomed. 11 (1998) 273.
[7] T.F. Bathen, J. Krane, T. Engan, K.S. Bjerve, D. Axelson, NMR Biomed. 13 (2000) 271.
[8] J.D. Otvos, Clin. Cardiol. 22 (1999) II21.
[9] M. Dyrby, M. Petersen, S.B. Engelsen, L. Nørgaard, U.G. Sidelmann, in: G.A. Webb, P.S. Belton, A.M. Gil, D.N. Rutledge (Eds.), Magnetic Resonance in Food Science: Latest Developments, The Royal Society of Chemistry, Cambridge, 2002, p. 101.
[10] C.A. Daykin, O. Corcoran, S.H. Hansen, I. Bjørnsdottir, C. Cornett, S.C. Connor, J.C. Lindon, J.K. Nicholson, Anal. Chem. 73 (2001) 1084.
[11] C.S. Johnson, Prog. Nucl. Magn. Reson. Spectrosc. 34 (1999) 203.
[12] B. Antalek, Concepts Magn. Reson. 14 (2002) 225.
[13] M.L. Liu, J.K. Nicholson, J.C. Lindon, Anal. Chem. 68 (1996) 3370.
[14] M.L. Liu, J.K. Nicholson, J.A. Parkinson, J.C. Lindon, Anal. Chem. 69 (1997) 1504.
[15] C. Dalvit, J.M. Böhlen, NMR Biomed. 10 (1997) 285.
[16] M.L. Liu, H. Tang, J.K. Nicholson, J.C. Lindon, Magn. Reson. Chem. 40 (2002) 83.
[17] L. Munck, L. Nørgaard, S.B. Engelsen, R. Bro, C.A. Andersson, Chemometr. Intell. Lab. Syst. 44 (1998) 31.
[18] R.A. Harshman, UCLA Working Papers Phonetics 16 (1970) 1.
[19] S. Wold, K. Esbensen, P. Geladi, Chemometr. Intell. Lab. Syst. 2 (1987) 37.
[20] N.D. Sidiropoulos, R. Bro, J. Chemometr. 14 (2000) 229.
[21] R.A. Harshman, M.E. Lundy, Comput. Stat. Data Anal. 18 (1994) 39.
[22] R. Bro, Chemometr. Intell. Lab. Syst. 38 (1997) 149.
[23] R. Bro, J. Chemometr. 10 (1996) 47.
[24] S. Wold, H. Martens, H. Wold, Lect. Notes Math. 973 (1983) 286.
[25] R. Bro, Chemometr. Intell. Lab. Syst. 46 (1999) 133.
[26] D.K. Pedersen, L. Munck, S.B. Engelsen, J. Chemometr. 16 (2002) 451.
[27] D. Bylund, R. Danielsson, G. Malmquist, K.E. Markides, J. Chromatogr. A 961 (2002) 237.
[28] M.W. Baumstark, I. Frey, L. Berg, J. Keul, Clin. Biochem. 25 (1992) 338.
[29] M.J. Caslake, C.J. Packard, in: N. Rifai, G.R. Warnick, M.H. Dominiczak (Eds.), Handbook of Lipoprotein Testing, AACC Press, Washington, DC, 1997, p. 509.
[30] A. Jerschow, N. Müller, J. Magn. Reson. 125 (1997) 372.
[31] M. Piotto, V. Saudek, V. Sklenar, J. Biomol. NMR 2 (1992) 661.
[32] C.A. Andersson, R. Bro, Chemometr. Intell. Lab. Syst. 52 (2000) 1.
[33] G.H. Sørland, D. Aksnes, Magn. Reson. Chem. 40 (2002) 139.
[34] H. Martens, P. Dardenne, Chemometr. Intell. Lab. Syst. 44 (1998) 99.
[35] A. Jerschow, N. Müller, J. Magn. Reson. 132 (1998) 13.
[36] J. Lounila, M. Ala-Korpela, J. Jokisaari, Phys. Rev. Lett. 72 (1994) 4049.
[37] J.K. Nicholson, P.J.D. Foxall, M. Spraul, R.D. Farrant, J.C. Lindon, Anal. Chem. 67 (1995) 793.
