
International Journal of Food Microbiology 72 (2002) 19-30

www.elsevier.com/locate/ijfoodmicro

Improving artificial neural networks with a pruning methodology and genetic algorithms for their application in microbial growth prediction in food
Rosa María García-Gimeno a,*, César Hervás-Martínez b, María Isabel de Silóniz c
a Department of Food Science and Technology, University of Córdoba, Campus Rabanales, Edif. C-1, 14014 Córdoba, Spain
b Department of Computer Science and Numerical Analysis, University of Córdoba, Campus Rabanales, Edif. C-2, 14014 Córdoba, Spain
c Department of Microbiology III, Faculty of Biology, Complutense University of Madrid, Ciudad Universitaria, 28040 Madrid, Spain

Received 31 March 2001; received in revised form 31 May 2001; accepted 29 June 2001

Abstract

The application of Artificial Neural Networks (ANN) in predictive microbiology is presented in this paper. This technique was used to build a predictive model of the joint effect of NaCl concentration, pH level and storage temperature on the kinetic parameters of the growth curve of Lactobacillus plantarum, using an ANN and a Response Surface Model (RSM). Sigmoid functions were fitted to the data, and the kinetic parameters were estimated and used to build the models, in which the independent variables were the factors mentioned above (NaCl, pH, temperature); in some models, the values of the optical density (OD) vs. time of the growth curve were also included in order to reduce the estimation error. The determination of the proper size of an ANN was the first step of the estimation. This study shows the usefulness of an ANN pruning methodology. Pruning of the network is a process consisting of removing unnecessary parameters (weights) and nodes during the training process of the network without losing its generalization capacity. The best architecture was sought using genetic algorithms (GA) in conjunction with pruning algorithms and regularization methods in which the initial distribution of the parameters (weights) of the network is not uniform. The ANN model was compared with the response surface model by means of the Standard Error of Prediction (SEP). The best values obtained were 14.04% SEP for the growth rate and 14.84% for the lag estimation by the best ANN model, which were much better than those obtained by the RSM, 35.63% and 39.30%, respectively. These are very promising results that, in our opinion, open up an extremely important field of research. © 2002 Elsevier Science B.V. All rights reserved.
Keywords: Computational neural networks; Genetic algorithms; Pruning; Regularization; Microbe growth; Response surface model; Lactobacillus plantarum

1. Introduction

It is well known that the factors that most affect the growth rate or the lag-time of micro-organisms are, among others, pH, storage temperature, water activity, preservatives and the modification of the atmosphere during packaging (Gibson et al., 1988).

* Corresponding author. Tel.: +34-957-218-691; fax: +34-957-212-000. E-mail address: bt1gagir@uco.es (R.M. García-Gimeno).



Given an adequate database, the response of many microbes in food could be predicted from knowledge of the food's formulation, processing and storage conditions. In food microbiology, during the past few years, much effort has been directed to developing models describing the combined effects of these factors on microbe growth (Davey, 1991; Baranyi, 1992; Van Impe et al., 1992; Zwietering et al., 1994; Zaika et al., 1994; Baranyi and Roberts, 1995; Devlieghere et al., 1998). Response surface models are the techniques most frequently used to describe the relationships between the combinations of factors and the growth curve parameters (Buchanan and Phillips, 1990; Hudson, 1992; Little et al., 1992; Buchanan et al., 1993a,b; Buchanan and Bagi, 1994; Bhaduri et al., 1995; Kalathenos et al., 1995; Devlieghere et al., 1998), but new methods are being introduced, such as the application of artificial neural networks (Hajmeer et al., 1997; Najjar et al., 1997; Geeraerd et al., 1998). An ANN is a highly interconnected network consisting of many simple processing elements capable of massively parallel computation for data processing, inspired by the elementary principles of the nervous system. An ANN imposes no restrictions on the type of relationship governing the dependence of the growth parameters on the various running conditions (Hajmeer et al., 1997). Regression-based response surface models require the order of the model to be stated (i.e., second, third or fourth order), while an ANN tends to implicitly match the input vector (i.e., growth conditions) to the output vector (i.e., Gompertz parameters). Once an ANN has been trained on appropriate data, it can be used to predict growth curves for different microbe growth conditions within the experimental range without conducting any further experimental investigation. A detailed description of artificial neural networks is given by Najjar et al. (1997), and their application in predictive microbiology is reported by Hajmeer et al. (1997) and Geeraerd et al. (1998).

There is a characteristic neural network terminology worth mentioning. The network is formed by a series of interconnected neurons, each neuron in the first layer being associated with a variable or characteristic of the problem (pH, temperature, salt).

Each neuron in the hidden layers carries out a double transformation, first adding its inputs and then transferring the result through a sigmoidal, arc-tangent, hyperbolic tangent or similar function, which gives the models their non-linearity; the neurons in the output layer give the target values. Between the different layers there are connections, weights or parameters, which are the coefficients of the model; they start with initial values generated at random and are later estimated by iterative procedures using a backpropagation algorithm (in our case, the extended delta-bar-delta, EDBD). This process is what is called the training of the net, and it recalls the iterative procedures used in the estimation of non-linear regression coefficients.

One of the most important aspects of this modelling methodology is its ability to reduce the complexity of the network during training. Other authors have developed artificial neural networks (ANN), showing that growth predictions from neural networks agreed better with observations than the predictions obtained by regression equations (Hajmeer et al., 1997), but the number of parameters to be estimated was too great (142), more than recommended. This demonstrates the need to develop methods that permit a reduction in the number of parameters without diminishing the model's estimation capacity (Hervás et al., 2000, in press). An ANN can be overparameterized, in which case it produces an overtrained net, which leads to a worse generalization capacity. That is the reason why, in order to eliminate unnecessary weights or parameters, pruning algorithms are used. There are different pruning methods, such as pruning as a function of an ANN accuracy index (Le Cun et al., 1990; Hervás et al., 1998), the simple weight decay or weight elimination method, which reduces the network's size and improves generalization (Krogh and Hertz, 1992; Williams, 1995; Bebis et al., 1997), or designing the ANN using genetic algorithms (Miller et al., 1989; Bebis et al., 1997; Yao, 1997, 1999; Hervás et al., 2000, in press, 2001b).

Once the best model has been obtained by the training process, it is tested with a data set obtained in a similar way, called the generalization data set, to evaluate the prediction capacity of the proposed model.


This could be equivalent to what other authors in predictive microbiology have called mathematical validation (Van Impe et al., 1996). The generalization capacity is evaluated by means of the Standard Error of Prediction percentage (% SEP), which is a relative standard deviation of the mean prediction values and has the advantage, compared to other error measures, of not depending on the magnitude of the measurements.

The objective of the present study was to analyse what possibilities and improvements ANN optimized by genetic algorithms and pruning methods might provide, as compared with the response surface model methodology usually used in predictive microbiology, for the study of the growth or inhibition of micro-organisms in food. The organism studied was Lactobacillus plantarum, grown under different temperature, pH and salt conditions.

2. Material and methods

2.1. Sample preparation

An automatic method for the measurement of optical density with an iEMS Reader MF (Labsystem, Finland) was used. To determine with some accuracy the number of cells injected into the samples, a calibration line was drawn from three previous calibrations made with the same instrument, with readings at 600 nm and at an optimal temperature of 30 °C, with the L. plantarum strain CECT 220 (Spanish Type Culture Collection, Valencia). For this, double serial dilutions were made in BHI (Brain Heart Infusion, Difco) at different initial micro-organism concentrations. At the same time, they were plated on MRS Agar (Oxoid, CM361) and incubated at 30 °C for 48 h. For the preparation of the inoculum of L. plantarum, the strain was inoculated in flasks with 50 ml of BHI and incubated at 30 °C for 24 h, this being repeated twice. In the culture used to prepare the inoculum, the cells were found to be at the stationary stage of growth at 18 h. The number of micro-organisms present in the culture was calculated from the calibration line. Subsequently, the necessary dilutions were made in BHI to obtain a total of 10^5 cfu/ml in each well.

The parameters chosen to build the model were all the possible combinations of pH (4, 5, 6, 7), NaCl (%) (0, 2, 4, 6) and temperature (20 °C and 28 °C). The different pH values were adjusted with 0.1 N HCl and 1 N NaOH in final volumes of 50 ml of BHI, to which the inoculum was added at a final concentration of 10^5 cfu/ml. From this, four wells per condition were filled with 250 μl of medium containing the inoculum.

2.1.1. Data collection

The experiments were carried out twice, at 20 °C and 28 °C, with four replicates of each growth condition. The optical density measurements were made every hour for 72 h, enough time for most conditions to show growth and to obtain enough data for our purpose, which was to compare two types of modelling. All the wells were topped up with 100 μl of sterile paraffin, with the twofold purpose of favouring the growth of this micro-organism and of preventing evaporation of the medium, which would have altered the results.

2.2. Curve fitting

The DMFit curve-fitting program (Baranyi, 1998) was used to fit the optical density (OD) data, applying the modified Gompertz function (McClure et al., 1993) and the Baranyi function (Baranyi and Roberts, 1995), and to estimate the growth rate and lag-time. The modified Gompertz function was the equation finally used, after an analysis of variance between the results of both equations. It can be expressed as

L(t) = A + C exp{-exp[-B(t - M)]}

where L(t) is the log10 OD at time t, A is the asymptotic log OD as t decreases indefinitely, C is the asymptotic amount of growth that occurs as t increases indefinitely, and B is the relative growth rate at M, where M is the time at which the absolute growth rate is maximum. These parameters were used to derive the growth rate and lag-time:

growth rate (log10 OD/h) = BC/e;  lag-time (h) = M - 1/B.
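The same kind of fit can be reproduced outside DMFit. The sketch below uses SciPy's curve_fit on simulated hourly log10 OD readings (the data, starting values and variable names are illustrative assumptions, not the experimental curves of this study) and derives the growth rate and lag-time from the fitted Gompertz parameters as defined above.

```python
# Sketch only (not the authors' DMFit program): fit the modified Gompertz
# function L(t) = A + C*exp(-exp(-B*(t - M))) to log10(OD) readings with
# SciPy, then derive growth rate = B*C/e and lag-time = M - 1/B.
import numpy as np
from scipy.optimize import curve_fit

def gompertz(t, A, C, B, M):
    """Modified Gompertz curve: log10 OD as a function of time (h)."""
    return A + C * np.exp(-np.exp(-B * (t - M)))

# Illustrative data: hourly readings over 72 h (replace with a measured curve).
t = np.arange(0.0, 72.0, 1.0)
rng = np.random.default_rng(0)
log_od = gompertz(t, A=-1.5, C=1.2, B=0.15, M=20.0) + rng.normal(0.0, 0.01, t.size)

p0 = [log_od.min(), np.ptp(log_od), 0.1, t.mean()]   # rough starting values
(A, C, B, M), _ = curve_fit(gompertz, t, log_od, p0=p0, maxfev=10000)

growth_rate = B * C / np.e    # log10 OD per hour
lag_time = M - 1.0 / B        # hours
print(f"Gr = {growth_rate:.4f} log10 OD/h, lag = {lag_time:.2f} h")
```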


2.3. Response surface model

The kinetic parameters obtained from the primary model were used for the response surface analysis on a total of 256 optical density vs. time curves. The independent variables considered were temperature, pH and salt concentration, and the dependent variables were the growth rate (Gr) and the lag-time (lag). For the estimation of the parameters of the fitting function, the SPSS version 9.0 (SPSS) software was used, with the Levenberg-Marquardt algorithm as the optimizer of the error function.

2.4. ANN feedforward pruning models

Some neural network terms were clarified in the Introduction; here we explain how the models were built. The methodology, the nomenclature and the best parameters for the genetic algorithm (GA), obtained by a heuristic method, were those described by Hervás et al. (1998, 2000, in press, 2001b). The algorithms were implemented in the C programming language and the training process was carried out on a Silicon Graphics Origin 2000.

First, an initial population of nets was created by choosing 100 nets with random parameters (weights), all with the same architecture (the same number of inputs, one hidden layer, the same number of nodes and the same number of outputs), following a Laplace distribution to assist the parameter (weight) regularization method (Williams, 1995). A neural network has to learn what output (in our case, growth rate and lag-time) it should produce for given inputs (in our case, pH, temperature, salt). This estimated output is compared with the target value and, if the error is larger than a predetermined threshold, a training algorithm adjusts the parameters (weights). In a completely random way, 75% of the 256 optical density vs. time curves were used for the training set and 25% for the generalization set, taking five replicates of each condition for training and three for generalization. The variables (temperature, pH, % NaCl, optical density vs. time, growth rate and lag-time) were scaled to the range [0.1, 0.9], because of their different measurement ranges and to avoid saturation problems in the sigmoidal functions of the network. The new scaled variables were named T*, pH*, NaCl*, St_i*, Gr* and lag*, respectively, and were obtained as follows:

X* = 0.8 (X - Xmin)/(Xmax - Xmin) + 0.1.    (1)
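A minimal sketch of the scaling of Eq. (1) and of its inverse (Eq. (4), used later for de-scaling) is given below; the function names and the example range are illustrative assumptions, not part of the original implementation.

```python
# Sketch of the [0.1, 0.9] scaling of Eq. (1) and its inverse, Eq. (4).
# In the paper this is applied to T, pH, NaCl, the OD readings and the two
# outputs (Gr, lag); the names below are illustrative.
def scale(x, x_min, x_max):
    """Map x from [x_min, x_max] to [0.1, 0.9] (Eq. (1))."""
    return 0.8 * (x - x_min) / (x_max - x_min) + 0.1

def descale(x_scaled, x_min, x_max):
    """Inverse mapping back to the original range (Eq. (4))."""
    return (x_scaled - 0.1) * (x_max - x_min) / 0.8 + x_min

t_star = scale(28.0, 20.0, 28.0)            # T* = 0.9 for the 20-28 degC range
assert abs(descale(t_star, 20.0, 28.0) - 28.0) < 1e-12
```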

The output value of node h_i in the hidden layer was obtained using a sigmoidal function:

h_i = 1 / (1 + exp[-(w_{i,b1} + w_{i,T} T* + w_{i,pH} pH* + w_{i,NaCl} NaCl* + w_{i,St0} St_0* + ... + w_{i,St24} St_24*)]).    (2)

The scaled output values Gr* and lag* were estimated using a linear function:

Gr* = w_{g,b2} + w_{g,1} h_1 + ... + w_{g,i} h_i + ... + w_{g,m} h_m    (3)

lag* = w_{l,b2} + w_{l,1} h_1 + ... + w_{l,i} h_i + ... + w_{l,m} h_m.

These values were de-scaled by:

Y = (Y* - 0.1)(max Y - min Y)/0.8 + min Y.    (4)

The theoretical architecture of the artificial neural network can be seen in Fig. 1. The standard error of prediction (SEP) is a relative standard error and was obtained from the following expression:

SEP = (100 / Gr_mean) sqrt[ (1/n) sum_{i=1..n} (Gr_i,obs - Gr_i,pred)^2 ]    (6)

where Gr_i,obs and Gr_i,pred are the observed and predicted values of the growth rate, respectively; Gr_mean is the average observed value; and n is the number of data points used. The same expression was applied to the lag-time.
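For illustration, a minimal sketch of the forward pass of Eqs. (2)-(3) and of the SEP of Eq. (6) is given below; the weights are random placeholders rather than the pruned weights of this study, and the function names are assumptions.

```python
# Sketch of the forward pass of a 3:m(s):1(l) net (Eqs. (2)-(3)) and of the
# SEP of Eq. (6). The weights are random placeholders, not the pruned weights
# reported in the paper; a pruned connection simply has weight zero.
import numpy as np

def forward(x_scaled, w_hidden, b_hidden, w_out, b_out):
    """x_scaled: inputs already scaled to [0.1, 0.9].
    Hidden layer: sigmoid (Eq. (2)); output layer: linear (Eq. (3))."""
    h = 1.0 / (1.0 + np.exp(-(w_hidden @ x_scaled + b_hidden)))
    return float(w_out @ h + b_out)                # scaled Gr* (or lag*)

def sep_percent(observed, predicted):
    """Standard error of prediction, Eq. (6), relative to the observed mean."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmse = np.sqrt(np.mean((observed - predicted) ** 2))
    return 100.0 * rmse / observed.mean()

# Example: a 3:3s:1l net (3 inputs, 3 sigmoid hidden nodes, 1 linear output).
rng = np.random.default_rng(0)
w_hidden, b_hidden = rng.normal(size=(3, 3)), rng.normal(size=3)
w_out, b_out = rng.normal(size=3), rng.normal()
gr_star = forward(np.array([0.5, 0.7, 0.2]), w_hidden, b_hidden, w_out, b_out)
# SEP for two observed/predicted pairs taken from Table 1 (illustration only)
print(sep_percent([0.0116, 0.0044], [0.0110, 0.0065]))
```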


Fig. 1. Theoretical architecture of the artificial neural network for the estimation of the growth rate (Gr) and the lag-time (lag), marked with an asterisk when scaled.

To design the network architecture, a genetic algorithm is used, with a selection procedure and crossover/mutation operators that introduce diversity into the search (Hervás et al., 1998, 2000, in press, 2001b). The fitness function used in the genetic algorithm has two objectives: first, to minimise the residual sum of squares and, second, to reduce the number of unnecessary parameters. The second term penalizes more complex models (models with a larger number of parameters) as compared with less complex models with the same generalization capacity. Thus, models with a smaller number of parameters to be estimated, while keeping the mean square error of the generalization set, are favoured; this is what is called weight decay (Krogh and Hertz, 1992). Techniques known as pruning reduce the network size by modifying not only the connection parameters (weights) but also the network structure during training, beginning with a network design with an excessive number of nodes and gradually eliminating the unnecessary nodes or connections (Le Cun et al., 1990; Hassibi and Stork, 1993). In short, the procedure is as follows (Hervás et al., 2000, in press, 2001b); a simplified sketch of this loop is given after the list:

(1) A random initial population of 100 neural networks with an identical number of neurons and layers and a different connectivity (different weights) is created. A percentage of these connections are frozen so as to start the process with nets that are not fully connected, which helps the pruning process.
(2) A genetic algorithm searches for the near-optimal network architecture among those generated in the previous stage, by selection, crossover and mutation of individual networks from the population of nets.
(3) Each network is trained by a standard backpropagation algorithm using the extended delta-bar-delta (EDBD) rule.
(4) Connection-elimination pruning is performed twice by the algorithm, and the best individual network is selected from the new population.
(5) The procedure ends when a stopping criterion is reached: a maximum number of generations, or no further improvement in the average fitness (aptness) of the population.
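The following is a highly simplified sketch of that evolutionary loop. It is not the authors' C/EDBD implementation: there is no backpropagation step, the fitness is the residual sum of squares plus a penalty on the number of non-zero weights, crossover averages parent weights, and pruning zeroes weights whose magnitude falls below a threshold; population size, rates and thresholds are illustrative assumptions.

```python
# Highly simplified sketch of steps (1)-(5); not the authors' implementation.
import numpy as np

rng = np.random.default_rng(1)

def init_net(n_in, n_hidden, frozen_frac=0.3):
    # Laplace-distributed weights; a fraction of connections start frozen (zero).
    w = rng.laplace(scale=0.5, size=(n_hidden, n_in + 1)) * (rng.random((n_hidden, n_in + 1)) > frozen_frac)
    v = rng.laplace(scale=0.5, size=n_hidden + 1) * (rng.random(n_hidden + 1) > frozen_frac)
    return w, v

def predict(net, X):
    w, v = net
    h = 1.0 / (1.0 + np.exp(-(X @ w[:, :-1].T + w[:, -1])))   # sigmoid hidden layer
    return h @ v[:-1] + v[-1]                                  # linear output

def fitness(net, X, y, penalty=1e-3):
    rss = float(np.sum((y - predict(net, X)) ** 2))
    n_params = sum(int(np.count_nonzero(p)) for p in net)
    return rss + penalty * n_params        # smaller is better (complexity penalized)

def prune(net, threshold=0.05):
    return tuple(np.where(np.abs(p) < threshold, 0.0, p) for p in net)

def evolve(X, y, pop_size=100, generations=50, n_hidden=3):
    pop = [init_net(X.shape[1], n_hidden) for _ in range(pop_size)]     # step (1)
    for _ in range(generations):                                        # step (5): max generations
        pop.sort(key=lambda net: fitness(net, X, y))                    # step (2): selection
        parents, children = pop[: pop_size // 2], []
        while len(parents) + len(children) < pop_size:
            i, j = rng.choice(len(parents), size=2, replace=False)
            child = tuple((a + b) / 2 + rng.normal(0.0, 0.05, a.shape)  # crossover + mutation
                          for a, b in zip(parents[i], parents[j]))
            children.append(prune(child))                               # step (4): pruning
        pop = parents + children
    return min(pop, key=lambda net: fitness(net, X, y))
```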

3. Results and discussion

The first step in modelling the data was to obtain the kinetic parameters, growth rate and lag-time, from the growth curves by means of the modified Gompertz function (Table 1, Obs.).


After estimating the growth rate and lag-time, different methods were applied to build the model.

3.1. Response surface model

Table 1 shows the estimations obtained by this method. Only those terms that were statistically significant were considered; they were analysed by the stepwise method, one by one, eliminating first those that were less significant. The best full response surface models obtained for the growth rate (Gr) and the lag-time (lag) are described in Table 2. Different transformations of the data, such as the application of logarithms, were also studied but did not improve the model.

The resulting equations were:

Gr = 0.00059 T + 0.00192 pH - 0.00290 NaCl    (R^2 = 0.5827)    (7)

lag = 15.52 T - 58.15 pH + 12.46 NaCl - 0.39 T^2 + 4.37 pH^2 + 0.29 NaCl^2 + 0.41 T pH - 0.30 T NaCl - 0.81 pH NaCl    (R^2 = 0.7975)    (8)
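As a quick illustration of how these fitted surfaces are used for prediction, the polynomials of Eqs. (7) and (8) can be evaluated directly; the helper functions below are a sketch (names are illustrative), valid only inside the experimental region. Because the published coefficients are rounded to two decimals, re-evaluating Eq. (8), where large terms nearly cancel, reproduces the tabulated RSM lag values only approximately, whereas Eq. (7) matches the Gr column of Table 1 closely.

```python
# Sketch: evaluate the fitted response surface polynomials of Eqs. (7)-(8).
# Only meaningful inside the experimental region (T 20-28 degC, pH 4-7,
# NaCl 0-6%); function names are illustrative.
def rsm_growth_rate(T, pH, NaCl):
    """Growth rate (log10 OD/h), Eq. (7)."""
    return 0.00059 * T + 0.00192 * pH - 0.00290 * NaCl

def rsm_lag_time(T, pH, NaCl):
    """Lag-time (h), Eq. (8)."""
    return (15.52 * T - 58.15 * pH + 12.46 * NaCl
            - 0.39 * T**2 + 4.37 * pH**2 + 0.29 * NaCl**2
            + 0.41 * T * pH - 0.30 * T * NaCl - 0.81 * pH * NaCl)

print(rsm_growth_rate(28, 6, 2))   # ~0.022, close to the RSM column of Table 1
print(rsm_lag_time(28, 6, 2))      # coefficient rounding makes this approximate
```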

Table 1. Average observed (Obs) and estimated growth rate and lag-time by neural networks (ANN) and response surface model (RSM) of L. plantarum.

T (°C)  pH  NaCl (%)   Growth rate (h^-1)           Lag-time (h)
                       Obs      ANN      RSM        Obs     ANN     RSM
20      4   0          0.0116   0.0110   0.0195     20.55   18.93   21.84
20      4   2          0.0044   0.0065   0.0137     28.21   24.48   29.53
20      4   4          0.0025   0.0029   0.0079     31.41   37.90   39.53
20      4   6          0.0001   0.0001   0.0021     72.00   71.12   51.83
20      5   0          0.0215   0.0209   0.0214     10.45   11.97   11.11
20      5   2          0.0162   0.0163   0.0156     13.04   15.17   17.19
20      5   4          0.0153   0.0125   0.0098     19.44   20.08   25.57
20      5   6          0.0109   0.0096   0.0040     31.70   29.76   36.26
20      6   0          0.0236   0.0240   0.0233     11.32   11.56    9.11
20      6   2          0.0212   0.0191   0.0175     13.51   15.05   13.58
20      6   4          0.0169   0.0149   0.0117     18.99   19.19   20.35
20      6   6          0.0054   0.0113   0.0059     22.45   24.47   29.42
20      7   0          0.0251   0.0216   0.0252     16.51   17.13   15.84
20      7   2          0.0117   0.0153   0.0194     27.04   21.39   18.69
20      7   4          0.0108   0.0093   0.0137     24.32   25.97   23.85
20      7   6          0.0056   0.0033   0.0079     31.78   30.86   31.31
28      4   0          0.0212   0.0186   0.0242      6.70    9.33    7.02
28      4   2          0.0105   0.0148   0.0184      4.68   10.76    9.93
28      4   4          0.0072   0.0101   0.0126      7.06   12.50   15.16
28      4   6          0.0037   0.0047   0.0068      8.18   12.30   17.31
28      5   0          0.0298   0.0258   0.0261      0.01    3.19    0.47
28      5   2          0.0223   0.0218   0.0203      0.07    4.64    0.84
28      5   4          0.0180   0.0168   0.0145      3.58    6.35    4.45
28      5   6          0.0133   0.0109   0.0087      9.49    8.43   10.36
28      6   0          0.0323   0.0358   0.0281      0.01    1.48    0.78
28      6   2          0.0313   0.0308   0.0223      0.67    3.30    0.48
28      6   4          0.0281   0.0245   0.0165      3.66    5.46    2.47
28      6   6          0.0209   0.0169   0.0107      7.02    8.02    6.77
28      7   0          0.0318   0.0344   0.0300      0.01    4.04   10.77
28      7   2          0.0259   0.0261   0.0242      4.78    6.65    8.85
28      7   4          0.0172   0.0155   0.0184     16.49    9.69    9.23
28      7   6          0.0017   0.0026   0.0126      6.53   13.16   11.92


Table 2. Regression coefficients, their standard errors and the 95% confidence intervals of the response surface model for growth rate and lag-time of L. plantarum.

Growth parameter  Term      Coefficient  Standard error  95% confidence interval
Gr                T           0.00059     0.00009          0.00042 to  0.00076
Gr                pH          0.00191     0.00037          0.00118 to  0.00264
Gr                NaCl       -0.00289     0.00023         -0.00335 to -0.00244
lag               T          15.52        1.48             12.60 to  18.54
lag               pH        -58.15        6.22            -70.44 to -45.86
lag               NaCl       12.46        1.92              8.67 to  16.26
lag               T^2        -0.39        0.04             -0.47 to  -0.32
lag               pH^2        4.37        0.50              3.37 to   5.36
lag               NaCl^2      0.29        0.13              0.04 to   0.54
lag               T·pH        0.41        0.11              0.18 to   0.63
lag               T·NaCl     -0.30        0.06             -0.41 to  -0.19
lag               pH·NaCl    -0.81        0.20             -1.21 to  -0.41

The application of the response surface for the growth rate estimation showed significant coefficient values (p < 0.05) and a standard error of prediction (SEP) of 35.63%. The response surface model statistics for the lag-time showed a SEP of 39.30%. Table 2 gives the estimated regression coefficient values, their standard errors and the 95% confidence intervals, using a Student's t distribution with 120 - 6 degrees of freedom. The results support a second-order response surface model, since the third-order model displayed regression coefficients whose differences were not significant (p > 0.05).

3.2. ANN feedforward pruning models

Three different types of nets have been designed (Table 3): the first type of network was set up to predict only the growth rate values (Models I and II), the second type predicts only the lag-time values (Models III and IV), and the third type was designed to predict the growth rate and lag-time in combination (Models V and VI). Also, in some nets, the 25 OD-time data were considered, so that in those cases the number of input parameters increased to 28 (Models II, IV and VI). The nomenclature of the network architecture is as follows: no. of input neurons : no. of hidden neurons s : no. of output neurons l, output, where:
s, the transfer function for the neurons in the hidden layer, is sigmoid;
l, the transfer function for the units in the output layer, is linear;
output is the kinetic parameter estimated: lag and/or Gr.

Table 3 shows the average values and their standard deviations for the standard error of prediction (SEP) of the growth rate and lag-time parameters, for the training (SEPt) and generalization (SEPg) sets, over five runs or executions of the whole process, as a function of the number of initial nodes in the hidden layer. Also shown are the average values and their standard deviations for the SEP of the growth rate and lag-time parameters when the network simultaneously estimates both parameters (Models V and VI). In this case, it can be observed that the SEP is, in general, higher than in those models where the kinetic parameters are estimated separately. The next step was to ascertain the statistical significance of the differences observed between the estimation errors of the different models by an analysis of variance (ANOVA), assuming that the SEP values obtained had a normal distribution. This analysis showed that there was a statistically significant effect due to the network architectures (p < 0.05). In order to decide which was the most suitable network architecture based on the smallest error value, the equality of the means was contrasted, using the multiple-comparison Student-Newman-Keuls (SNK) test as a post hoc method.


Table 3. Standard errors of prediction for different neural network architectures, for growth rate (Gr) and lag-time (lag), for the training and generalization of the L. plantarum growth data over five executions of the algorithm. Values are means (standard deviation).

Model  Network architecture    SEPt (%) (Gr)   SEPg (%) (Gr)   SEPt (%) (lag)  SEPg (%) (lag)  No. of connections
I      3:3s:1l, Gr             22.42 (1.29)    15.55 (1.81)    -               -               13.40 (1.14)
II     (3+25):2s:1l, Gr        11.49 (0.26)    11.68 (0.24)    -               -               38.00 (16.64)
III    3:3s:1l, lag            -               -               15.06 (0.36)    15.05 (0.17)    15.00 (1.00)
IV     (3+25):3s:1l, lag       -               -               11.01 (0.44)    10.97 (0.85)    34.60 (7.54)
V      3:4s:2l, Gr, lag        19.63 (0.54)    15.98 (0.73)    19.23 (0.29)    16.76 (0.30)    24.20 (1.30)
VI     (3+25):3s:2l, Gr, lag   13.60 (0.47)    12.38 (0.36)    13.20 (0.21)    13.97 (0.38)    42.20 (22.91)

SEPt: standard error of prediction of training runs. SEPg: standard error of prediction of generalization runs.

From this analysis, we deduced that the best of all the network topologies studied, for training and generalization, were those nets which had incorporated the 25 OD-time data. A larger error was observed in those models that considered only the three environmental factors (15.55% and 15.05%, Models I and III, respectively) than when the OD-time data were introduced (11.68% and 10.97%, Models II and IV, respectively) (Table 3). This lower SEP value could indicate that the best network architectures obtained were those of Models II and IV.

However, this slight improvement of the generalization error has to be considered together with the number of parameters (23 and 22, respectively) and also with the application of the model to real food. We consider these models (II and IV) useful for situations where, with only a few data, the lag and Gr can be obtained within 24 h. This could be the case for an industry that needs to determine the microbial contamination, lag-time and shelf life of a product in a minimum amount of time after the product is produced.

Fig. 2. Architecture of the selected artificial neural network for the estimation of the growth rate (Gr) of L. plantarum, marked with an asterisk when scaled.


Fig. 3. Architecture of the selected artificial neural network for the estimation of the lag-time (lag) of L. plantarum, marked with an asterisk when scaled.

By using a rapid instrumental technique such as the iEMS, with software into which the predictive model can be incorporated, the lag and Gr, and therefore the estimated shelf life of the product, or any abnormally high concentration of bacteria, could be calculated automatically after a few hours of measuring the optical density of a sample. For the purpose of incorporation into a general prediction database, however, it is not necessary to include the OD-time data; since the next models with the significantly smallest error were 3:3:1 Gr and 3:3:1 lag, these were the two we selected. The matrices of the best nets chosen can be seen in Figs. 2 and 3, and the estimated values are given in Table 1.

The results obtained in the multiple-comparison contrasts answered the question of whether or not it is necessary and useful to prune the network connections. Table 4 shows the number of connections or parameters associated with each type of net: those of a completely connected network and those estimated after it had been pruned (the average value and the best network value obtained). It can be seen that the number of connections of the more complex nets, those that included the OD-time data, was reduced substantially. For example, Model IV was reduced from 91 parameters to 22 in the best network estimated.

Table 4. Number of connections associated with each type of net, when completely connected and after pruning (average and best case), and standard error of prediction (SEPg) of the best models for the L. plantarum growth data.

Model  Network architecture   Completely connected  Pruned, average  Pruned, best case  SEPg (%) best model, Gr  SEPg (%) best model, lag
I      3:3s:1l                16                    13.40            13                 14.04                    -
II     (3+25):2s:1l           61                    38.00            23                 11.82                    -
III    3:3s:1l                16                    15.00            14                 -                        14.84
IV     (3+25):3s:1l           91                    34.60            22                 -                        10.96
V      3:4s:2l                26                    24.20            25                 15.15                    15.84
VI     (3+25):3s:2l           95                    42.20            32                 13.75                    15.52


Also noticeable is the stability of the simplest nets, Models I and III, shown by the small difference between the average and best-case numbers of connections (13.4 vs. 13 and 15 vs. 14, respectively).

Fig. 5. Graphic representation of the observed (lag OBS) and estimated, by neural networks (lag ANN) and by the response surface model (lag RSM), lag-time at (a) 28 °C and (b) 20 °C of L. plantarum.

Fig. 4. Graphic representation of the observed (Gr OBS) and estimated, by neural networks (Gr ANN) and by the response surface model (Gr RSM), growth rate at (a) 28 °C and (b) 20 °C of L. plantarum.

It is, in general, more convenient to have a small number of parameters, as this results in a lower experimentation cost and a simpler model, although the estimation error must not be forgotten.


Comparing this method with the RSM, the SEP for Gr is 35.63% for the RSM but 14.04% for the best ANN model, and for the lag it rises from 14.84% for the network to 39.30% for the RSM. It is necessary to point out the differences in computational time and in complexity between the two types of models: the RSM contains only three parameters to be estimated for Gr and nine for lag, whereas the ANN contains 13 parameters for Gr and 14 for lag; however, the difference in SEP is so great that this seemed to us reason enough to choose the ANN, a more complex but more accurate model. Figs. 4 and 5 show the values observed and estimated by both methods at 20 °C and 28 °C, where it can be seen that the values estimated by the ANN are closer to the observed values than those estimated by the RSM. It is clear that the ANN model gives a great improvement in SEP over the response surface model, although further work is needed to simplify it and to reduce its computational cost.

4. Conclusions

The architecture of the network obtained indicates that, to improve the generalization error, it is necessary to introduce some OD-time values into the network besides the values associated with pH, salt and storage temperature. In this study, it was observed that the incorporation of the first 25 OD-time values provided enough additional input information. The % SEPg of the best models (3:3:1), in which the OD-time values were not introduced, remained around acceptable values of 14.04% for Gr and 14.84% for lag-time, and these models kept a good generalization ability. This is most suitable for the purpose of incorporating them into a prediction database, and it was for this reason that these models were selected.

The possibility of following the development of spoilage bacteria in foods by predictive microbiology, and of relating the spoilage of the product to a certain level of micro-organisms, would allow us to estimate the shelf life of different products. Of course, the model should include micro-organism behaviour data throughout the general shelf life of that type of product, so as to estimate a realistic duration of the shelf life of any one of the products. The more accurate the models, the more accurate our predictions, and this will be an advantage for their practical application.

The results obtained in this study show the need for, and the effective use of, pruning the connections of the network by genetic algorithms. ANN models pruned with GA provided a good architecture design for generalization and required a smaller number of connections than a totally connected net. This decrease in the number of connections implies a lower hardware implementation cost and a more efficient estimation of the parameters, because fewer parameters were estimated with the same number of data points in the training set, without significantly losing any capacity for generalization. Also, as can be seen, an ANN permits the estimation of both kinetic parameters, growth rate and lag-time, in the same model; although the error increases, this is an advance in modelling compared with the RSM.

Acknowledgements

This work has been partly financed by CICYT projects ALI98-0676-C02-01 and ALI98-0676-C02-02, and by the Research Groups AGR-170 and TIC-148 (AYRNA).

References
Baranyi, J., 1992. Notes on reparameterization of bacterial growth curves. Food Microbiol. 9, 169-174.
Baranyi, J., 1998. Draft Manual for the DMFit Curve-Fitting Program. Institute of Food Research, Norwich.
Baranyi, J., Roberts, T., 1995. Mathematics of predictive food microbiology. Int. J. Food Microbiol. 26, 199-218.
Bebis, G., Georgiopoulos, M., Kasparis, T., 1997. Coupling weight elimination with genetic algorithms to reduce network size and preserve generalization. Neurocomputing 17, 167-194.
Bhaduri, S., Buchanan, R., Phillips, J.G., 1995. Expanded response surface model for predicting the effects of temperature, pH, sodium chloride contents and sodium nitrite concentrations on the growth rate of Yersinia enterocolitica. J. Appl. Bacteriol. 79, 163-170.
Buchanan, R., Bagi, L., 1994. Expansion of response surface analysis for the growth of Escherichia coli O157:H7 to include sodium nitrite as a variable. Int. J. Food Microbiol. 23, 317-332.
Buchanan, R., Phillips, J.G., 1990. Response surface model for predicting the effects of temperature, pH, sodium chloride content, sodium nitrate concentration and atmosphere on the growth of Listeria monocytogenes. J. Food Prot. 53, 370-376.
Buchanan, R., Bagi, L.K., Goins, R., Phillips, J., 1993a. Response surface analysis for the growth kinetics of Escherichia coli O157:H7. Food Microbiol. 10, 303-315.
Buchanan, R., Smith, J.L., McColgan, C., Marmer, B.S., Golden, M., Dell, B., 1993b. Response surface analysis for the effects of temperature, pH, sodium chloride and sodium nitrite on the aerobic and anaerobic growth of Staphylococcus aureus 196E. J. Food Saf. 13, 159-175.
Davey, K., 1991. Applicability of the Davey (linear Arrhenius) predictive model to the lag phase of microbial growth. J. Appl. Bacteriol. 70, 253-257.
Devlieghere, F., Debevere, J., Van Impe, J., 1998. Effect of dissolved carbon dioxide and temperature on the growth of Lactobacillus sake in modified atmospheres. Int. J. Food Microbiol. 41, 231-238.
Geeraerd, A., Herremans, C., Cenens, C., Van Impe, J., 1998. Applications of artificial neural networks as a non-linear modular modelling technique to describe bacterial growth in chilled food products. Int. J. Food Microbiol. 44, 49-68.
Gibson, A.M., Bratchell, N., Roberts, T.A., 1988. Predicting microbe growth: growth responses of salmonellae in a laboratory medium as affected by pH, sodium chloride and storage temperature. Int. J. Food Microbiol. 6, 155-178.
Hajmeer, M., Basheer, I., Najjar, Y., 1997. Computational neural networks for predictive microbiology: II. Application to microbe growth. Int. J. Food Microbiol. 34, 51-66.
Hassibi, B., Stork, D., 1993. Second order derivatives for network pruning: optimal brain surgeon. In: Hanson, S.J., Cowan, J.D., Giles, C.L. (Eds.), Advances in Neural Information-Processing Systems, vol. 5. Morgan Kaufmann, San Mateo, CA, pp. 164-171.
Hervás, C., Ventura, S., Silva, M., Pérez, D., 1998. Computational neural networks for resolving nonlinear multicomponent systems based on chemiluminescence methods. J. Chem. Inf. Comput. Sci. 38, 1119-1124.
Hervás, C., Algar, J.A., Silva, M., 2000. Correction for temperature variations in kinetic-based determinations with pruning computational neural networks by using genetic algorithms. J. Chem. Inf. Comput. Sci. 40, 724-731.
Hervás, C., Toledo, R., Silva, M., 2001a. Use of pruned computational neural networks for processing the response of oscillating chemical reactions with a view to analyzing nonlinear multicomponent mixtures. J. Chem. Inf. Comput. Sci. 41, 1083-1092.
Hervás, C., Zurera, G., García, R.M., Martínez, J.A., 2001b. Optimisation of computational neural network for its application to the prediction of microbial growth in foods. Food Sci. Technol. Int. 7 (2), 159-163.
Hudson, J., 1992. Construction of and comparison between response surface analysis for Aeromonas hydrophila ATCC 7966 and food isolate under aerobic conditions. J. Food Prot. 55, 968-972.
Kalathenos, P., Baranyi, J., Sutherland, J., Roberts, T., 1995. A response surface study on the role of some environmental factors affecting the growth of Saccharomyces cerevisiae. Int. J. Food Microbiol. 25, 63-74.
Krogh, A., Hertz, J., 1992. A simple weight decay can improve generalization. In: Touretzky, D.S. (Ed.), Advances in Neural Information-Processing Systems, vol. 4. Morgan Kaufmann, San Mateo, CA, pp. 950-957.
Le Cun, Y., Denker, J., Solla, S., 1990. Optimal brain damage. In: Touretzky, D.S. (Ed.), Advances in Neural Information-Processing Systems, vol. 2. Morgan Kaufmann, San Mateo, CA, pp. 598-605.
Little, C., Adams, M., Anderson, W., Cole, M., 1992. Comparison of a quadratic response surface model and a square root model for predicting the growth rate of Yersinia enterocolitica. Lett. Appl. Microbiol. 15, 63-68.
McClure, P.J., Baranyi, J., Boogard, E., Kelly, T.M., Roberts, T.A., 1993. A predictive model for the combined effect of pH, sodium chloride and storage temperature on the growth of Brochothrix thermosphacta. Int. J. Food Microbiol. 19, 161-178.
Miller, G., Todd, P., Hedge, S., 1989. Designing neural networks using genetic algorithms. In: Schaffer, J. (Ed.), Proceedings of the 3rd International Conference on Genetic Algorithms and Their Applications. Morgan Kaufmann, San Mateo, pp. 379-384.
Najjar, Y., Basheer, I., Hajmeer, M., 1997. Computational neural networks for predictive microbiology: I. Methodology. Int. J. Food Microbiol. 34, 27-49.
Van Impe, J.F., Nicolaï, B.M., Martens, T., Baerdemaeker, J., Vandewalle, J., 1992. Dynamic mathematical model to predict microbe growth and inactivation during food processing. Appl. Environ. Microbiol. 58, 2901-2909.
Van Impe, J.F., Versyck, K., Geeraerd, A., 1996. Validation of predictive models: definitions and concepts. In: Roberts, T.A. (Ed.), COST 914: Predictive Modelling of Microbial Growth and Survival in Foods. European Commission, Brussels, pp. 31-38.
Williams, P.M., 1995. Bayesian regularization and pruning using a Laplace prior. Neural Comput. 7, 117-143.
Yao, X., 1997. Evolutionary system for evolving artificial neural networks. IEEE Trans. Neural Networks 8, 694-713.
Yao, X., 1999. Evolving artificial neural networks. Proc. IEEE 87, 1423-1447.
Zaika, L.L., Moulden, E., Weiner, L., Phillips, J.G., Buchanan, R.L., 1994. Model for the combined effects of temperature, initial pH, sodium chloride and sodium nitrite concentrations on anaerobic growth of Shigella flexneri. Int. J. Food Microbiol. 23, 345-358.
Zwietering, M., de Wit, J., Cuppers, H., Van't Riet, K., 1994. Modeling of bacterial growth with shifts in temperature. Appl. Environ. Microbiol. 60, 204-213.
