www.elsevier.com/locate/engappai
Abstract
The integration of neural networks and optimization provides a tool for designing network parameters and improving
network performance. In this paper, the Taguchi method and the Design of Experiment (DOE) methodology are used to
optimize network parameters. The users have to recognize the application problems and choose a suitable Artificial Neural
Network model. Optimization problems can then be defined according to the model. The Taguchi method is first applied to a
problem to find out the more important factors, then the DOE methodology is used for further analysis and forecasting. A
Learning Vector Quantization example is shown for an application to bicycle derailleur systems. © 2000 Elsevier Science Ltd.
All rights reserved.
Keywords: Neural networks; Optimization; Taguchi method; Design of experiments; Bicycle derailleur systems
0952-1976/00/$ - see front matter © 2000 Elsevier Science Ltd. All rights reserved.
PII: S0952-1976(99)00045-7
T.Y. Lin, C.H. Tseng / Engineering Applications of Artificial Intelligence 13 (2000) 3-14
2. Optimization process
Fig. 1. Optimization process.
g_i(x) <= 0,  i = 1, 2, ..., m.   (3)
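The inequality constraints of the form g_i(x) <= 0 above can be sketched as a simple feasibility check. The two constraint functions below are illustrative assumptions only, not constraints from the paper:

```python
# Feasibility test for inequality constraints g_i(x) <= 0, i = 1..m.
# A small tolerance guards against floating-point noise at the boundary.
def feasible(x, constraints, tol=1e-9):
    return all(g(x) <= tol for g in constraints)

# Illustrative constraints (not from the paper):
g1 = lambda x: x[0] ** 2 + x[1] ** 2 - 1.0  # stay inside the unit circle
g2 = lambda x: -x[0]                        # require x[0] >= 0

print(feasible([0.5, 0.5], [g1, g2]))  # True
```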
using other kinds of optimization methods, such as the Taguchi method and DOE methodology.

2.4. The Taguchi method

Dr. Genichi Taguchi's methods were developed after World War II in Japan. His most notable contributions lie in quality improvement, but in recent years the basic concepts of the Taguchi method have been widely applied in solving optimization problems, especially zero-order problems. Because the Taguchi method is a kind of fractional factorial DOE, the simulation or experiment times can be reduced compared to DOE. For example, if there are seven two-level factors in a design problem, only eight simulations have to be done in the Taguchi method. In DOE, however, there are 2^7 = 128 simulations that have to be done.
Fig. 2 shows the process of the Taguchi method. The engineers must recognize the application problem well and choose a suitable ANN model. In the selected model, the design parameters (factors) which need to be optimized have to be determined. Using orthogonal arrays, simulations can be executed in a systematic way. From the simulation results, the responses can be analyzed by level average analysis and the signal-to-noise (S/N) ratio in the Taguchi method (Taguchi, 1986).

2.5. DOE methodology

DOE is a test or series of tests in which the designer may observe and identify the reasons for changes in the output response from the changes in the input parameters. Fig. 2 also shows the process of DOE. Unlike the Taguchi method, a statistical model is constructed for the simulations and the experiment. Therefore, some assumptions and validations of the model (model adequacy checking) have to be made, both before and after the experiment. The experimental strategy is to change one parameter and keep the rest of the parameters constant in each step. Therefore, the experiment and simulation times are much longer than in the Taguchi method, as mentioned before. The experimental response, such as the training error and convergence speed, can be analyzed and forecast by Analysis of Variance (ANOVA) and other statistical techniques (Montgomery, 1991).

2.6. The Taguchi method vs. DOE methodology

In the optimization process shown in Fig. 1, the Taguchi method is treated as a pre-running of the design parameters. For some engineering applications it is quite sufficient to use the Taguchi method alone. There are many reasons to do so. In design problems there are sometimes a large number of design parameters, and it is not efficient to use DOE methodologies at this stage because of too many training cases. Therefore, the Taguchi method is used to reduce the training cases, and to find the more important parameters that affect the response of the neural network. Afterwards, the DOE methodology can be easily completed using a smaller number of important parameters, keeping
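The run-count comparison above (eight Taguchi runs versus 128 full-factorial runs for seven two-level factors) can be reproduced with a short script. The L8 array construction below, with each column a mod-2 combination of three base bits, is a standard textbook construction assumed here, not one taken from the paper:

```python
from itertools import product

# Build an L8 orthogonal array: 8 runs accommodating seven two-level
# factors, versus 2**7 = 128 runs for the full factorial design.
def l8():
    # Each column is a mod-2 linear combination of the three base bits.
    col_masks = [(1, 0, 0), (0, 1, 0), (1, 1, 0),
                 (0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
    return [[(a * u + b * v + c * w) % 2 for u, v, w in col_masks]
            for a, b, c in product((0, 1), repeat=3)]

runs = l8()
print(len(runs), 2 ** 7)  # 8 128
```

Every pair of columns in this array is balanced (each combination of levels appears equally often), which is the orthogonality property the Taguchi method relies on when assigning one factor per column.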
Fig. 6. Vibration signals in the time domain: (a) Type I and (b) Type II.

from the accelerometer. The data fed to the network are transferred from the time domain to the frequency domain by the FFT technique (John and Dimitris, 1996). In gathering training data, if the tooth numbers of two adjacent sprockets are both even and the tooth number of the chainwheel sprocket is also even, only one type of chain engagement will occur. If only one chain link is shifted from the previous situation, another type will occur. In this example, 80 items are used to train the network, and 40 items are applied to test the trained network.

3.2. Choosing an ANN model

In supervised learning models, an LVQ example like that shown in Fig. 7 is selected because of its fast training speed, no local minimum traps and better performance in classification (Patterson, 1996). The LVQ is the transformation from the input vector x of dimension n to known target output classifications t(x) = t, where each class is represented by a codeword

3. Update the weight vectors according to

   w_c(new) = w_c(old) + α+ (x - w_c)   if C(x) = t,
   w_c(new) = w_c(old) - α- (x - w_c)   if C(x) ≠ t,
   w_i(new) = w_i(old)   for all i ≠ c,   (6)

   where α+ > 0 and α- > 0 are the learning coefficients of the two different cases.
4. Repeat steps 2 and 3 until the weights stabilize.

3.3. Define the optimization problem

The optimal physical problem can be covered by a mathematical model of design optimization involving the procedures below.

1. Choose design variables: Based on the requirements of the physical problem, users can choose some factors as design variables, which can be varied during the optimization iteration process, and some factors as fixed constants. In this example, the chosen de-
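The weight update of Eq. (6) can be sketched in code. This is a hedged illustration: the function name is ours, NumPy is assumed, and the winning codeword c is taken to be the codeword nearest to x (the algorithm steps defining c are not shown in this excerpt):

```python
import numpy as np

# One LVQ weight update following Eq. (6): the winning codeword w_c is
# pulled toward x when its class label matches the target t, pushed
# away otherwise; every other codeword is left unchanged.
def lvq_update(W, labels, x, t, alpha_pos=0.05, alpha_neg=0.05):
    c = int(np.argmin(np.linalg.norm(W - x, axis=1)))  # winning codeword
    step = alpha_pos if labels[c] == t else -alpha_neg
    W[c] = W[c] + step * (x - W[c])
    return W

W = np.array([[0.0, 0.0], [1.0, 1.0]])
labels = [0, 1]
W = lvq_update(W, labels, np.array([0.2, 0.0]), t=0, alpha_pos=0.5)
print(W[0])  # codeword 0 moved halfway toward x: [0.1, 0.0]
```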
Table 2
Training results

Factor                  Level   Error
Input units             256     16
                        128     19
                        32      37
η+                      0.05    18
                        0.1     30
                        0.3     34
η-                      0.05    16
                        0.1     22
                        0.3     34
Weight initial range    ±0.1    23
                        ±0.3    18
                        ±0.5    32

affected by different initial designs. Therefore, replicated training is necessary for the following S/N analysis.
The last column in Table 2 is the signal-to-noise ratio (S/N). The equation for calculating the S/N for the smaller-the-better quality characteristic is Eq. (A1). In this example there is only one replicate; therefore, the physical meaning of S/N is similar to the grouping errors in Table 2. The grouping errors are used here instead of S/N for easier understanding.
The next step in the Taguchi method is Level Average Analysis. The goal is to identify the strongest effects and to determine the combination of factors and levels that can produce the most desired results. Table 3 is the response table, which shows the average experiment result for each factor level. The total effect of the 256 input units is 16, the average grouping error of the first three rows in Table 2. Other response values can be calculated by a similar method. For the number of input units, 256 units gets a smaller grouping error than the other levels. The same principle can be used to make η+ and η- equal to 0.05, and the weight initialization range to be between +0.3 and -0.3. Fig. 8 shows the response curves for the four factors. It shows that the four factors do have a strong effect on the grouping errors. Therefore, the recommended factor levels are: 256 input units, η+ = 0.05, η- = 0.05 and a weight initialization range of ±0.3.

From the Taguchi method, an improved design of the LVQ network is obtained. Fig. 2 shows the next step of the optimization process, the DOE methodology. The main purpose of this step is to further analyse the results of the Taguchi method, and to get more accurate settings for the factors. The DOE theories and principles (Montgomery, 1991) will not be described here; only the key points and results are shown below.

4.1. Choosing an experimental design

For the number of input units, it is obvious that a larger number will cause smaller grouping errors in Fig. 5, and that 256 units is the maximum. Therefore, the number of input units will remain at 256 in this step. The weight initialization range ±0.3 is not located at the variable boundaries and seems to be a local minimum in Fig. 8. Therefore, this parameter will also remain constant here. For η+ and η-, the minimum response is located at the lower bound of the variables; therefore, an expansion of the boundaries is required in further analysis. In summary, only the two continuous parameters, η+ and η-, are treated as design variables in the DOE methodology, and the other two parameters are kept constant.
The new parameter sets are shown in Table 4. The upper and lower bounds of η+ and η- are changed from 0.05 and 0.3 to 0.01 and 0.1. Therefore, three levels (0.1, 0.05 and 0.01) are designed for training, while the other two parameters are kept constant. Nine training cases have to be simulated in this example.

4.2. Conducting the experiment

The same 80 items of training data used in the Taguchi method are also applied here. The results of grouping errors are also shown in Table 4. Upon completion of the training experiments, DOE analysis techniques can be executed.
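The level-selection step above can be expressed compactly: for each factor, take the level with the smallest grouping error. The values below are the factor-level errors as read from Table 2; the dictionary layout and key names are ours:

```python
# Grouping errors per factor level, as read from Table 2.
response = {
    "input units":  {256: 16, 128: 19, 32: 37},
    "eta_plus":     {0.05: 18, 0.1: 30, 0.3: 34},
    "eta_minus":    {0.05: 16, 0.1: 22, 0.3: 34},
    "weight range": {0.1: 23, 0.3: 18, 0.5: 32},  # range is +/- the value
}

# Pick, for every factor, the level with the smallest average error.
best = {factor: min(levels, key=levels.get)
        for factor, levels in response.items()}
print(best)
```

This reproduces the recommended factor levels quoted in the text: 256 input units, η+ = 0.05, η- = 0.05, and a weight initialization range of ±0.3.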
Table 4
DOE array and training results
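The nine DOE training cases follow from crossing the three levels of η+ with the three levels of η- chosen in Section 4.1 (a 3 x 3 full factorial); a minimal sketch:

```python
from itertools import product

# Three levels for each learning coefficient, as chosen in Section 4.1.
levels = (0.1, 0.05, 0.01)

# Cross the two factors: nine (eta_plus, eta_minus) training cases.
cases = list(product(levels, repeat=2))
print(len(cases))  # 9
```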
y_ijk = μ + τ_i + β_j + (τβ)_ij + ε_ijk,

where
  y_ijk is the ijkth observation (grouping error),
  μ is the overall mean,
  τ_i is the ith η+ effect (fixed effect),
  β_j is the jth η- effect (fixed effect),
  (τβ)_ij is the interaction between η+ and η-,
  ε_ijk is the random error component, ε_ijk ~ NID(0, σ²).

There are three levels in each factor and only one replicate is done; therefore, a = 3, b = 3 and k = 1. With only one replicate, there are no error estimations. One approach applied in the following analysis is to assume a negligible higher-order interaction between η+ and η-, combining it with the error degrees of freedom.

4.4. Analysis of variance (ANOVA)

Using the statistical model and the training results, the ANOVA technique can be executed. Table 5 is the ANOVA table from the SAS software, and the error estimation is taken from the interaction between η+ and η-. If a 95 percent confidence interval is assumed (i.e., α = 0.05, the normal setting for applications), F(α; ν1, ν2) = F(0.05; 2, 2) = 19.0. For the null hypotheses of τ_i = 0 and β_j = 0, F0(η+) = 0.42 < F(0.05; 2, 2) = 19.0 and ... is not very high. In some cases the conclusion may be drawn that there are significant differences between the levels. This means that the factor is very sensitive to the output response.

4.5. Model adequacy checking

In the statistical model of this problem, three assumptions are made: normality, independence and equal variance. Some tests have to be executed to verify these assumptions. For checking the normality, the Kruskal-Wallis Test (Montgomery, 1991) is used. In Table 4, the values in the brackets are the data ranks, R_ijk, for the experiment.

S² = (1/(N-1)) [ Σ_{i=1..a} Σ_{j=1..b} Σ_{k=1..n} R_ijk² - N(N+1)²/4 ]
   = (1/8) [ 284.5 - 9(10)²/4 ] = 7.4375

Fig. 9. Plot of residuals vs. predicted grouping error.
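The Kruskal-Wallis variance term used for the normality check can be verified numerically from the totals quoted in the text (sum of squared ranks 284.5 over N = 9 observations); the helper name is ours:

```python
# S^2 = (sum_ijk R_ijk^2 - N*(N+1)^2/4) / (N - 1): the Kruskal-Wallis
# variance term computed from the data ranks R_ijk of Table 4.
def kw_s2(sum_rank_sq, n_obs):
    return (sum_rank_sq - n_obs * (n_obs + 1) ** 2 / 4) / (n_obs - 1)

print(kw_s2(284.5, 9))  # 7.4375, matching the value in the text
```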
Fig. 9 plots the residuals vs. the predicted values, ŷ, for the grouping errors. There is no obvious pattern apparent; therefore, the independence and equal variance assumptions are justified.

4.6. Forecasting

In Table 4, the training results show that two cases of η+ and η-, the combinations (0.01, 0.01) and (0.05, 0.05), will get a better grouping error. Only one grouping error occurs among 80 training data. In order to get more precise results, the regression method is used. Using the General Regression Model in the SAS software (Montgomery, 1991), the highest order Fitting Response Surface is

z = -13.89 + 1284.54 η+ + 800.09 η-
    - 8422.84 (η+)² - 3867.28 (η-)² - 68236 η+ η-
    + 561574 (η+)² η- + 589352 η+ (η-)²
    - 5262346 (η+)² (η-)².   (13)

The surface is shown in Fig. 10. Using the partial differential method, the minimum z value and the corresponding η+ and η- can be obtained: η+ = 0.046 and η- = 0.05.

4.7. Confirmation experiment

Using the recommended factor levels:

  number of input units: 256,
  η+: 0.046,
  η-: 0.05,
  weight initialization range: ±0.3,

the grouping error after training becomes zero, i.e., all the training data are classified successfully.

5. Conclusion

Optimization techniques have been widely used in many applications. In this paper, two major categories, the Taguchi method and the DOE methodology, are applied to improve upon the original designs of ANNs. The users have to recognize the design problem and choose a suitable ANN model. Then the optimization problems can be defined according to the model. The Taguchi method is first applied to find the more important factors, and to simplify the design problems. DOE methodologies are then used to find the sensitivity and a more precise combination of design parameters. The final results of the examples introduced in this study indeed improve the initial designs and get a better performance.

Although only one ANN model, LVQ, is demonstrated in this paper, other models, such as ADALINE, MADALINE, Hopfield Networks, MLFF, Boltzmann Machines, Recurrent Neural Networks, Neocognitrons, etc., are also suitable. Many benefits can be mentioned. First, this is a systematic method to use for a neural network design. It means that the engineer, whether or not he or she is experienced in ANN, the Taguchi method and DOE, can follow this process easily. Many commercial software packages can be applied, such as SNNS in ANN and SAS in the DOE. Second, it will not take too much computational effort and time. The results of the demonstrated examples can be obtained within 5 min with a Pentium-150 PC. This detail was not emphasized in this paper because it is not the major concern here. Finally, in engineering applications, it is not necessary to get a global optimization of the problems, because that takes too much time or the algorithms may be very complicated. The improvement of the original designs to an acceptable region is helpful for engineers.

Acknowledgements
A.3. Choice of an experimental design

Designing the experiment means the construction of

Table A2
L9 (3^4) orthogonal array

A.4. Conducting the experiment

Conducting the experiment includes the execution and simulation of the experiment as developed in previous stages. Before the actual running of the experiment, the test plans (including the experimental order, repetition, randomization, preparation and coordination) have to be developed. These preliminary efforts are essential and important for the smooth and efficient execution of the experiment.

A.5. Analysis for DOE

The analysis phase of the experiment is employed to convert raw data into meaningful information and to interpret the results. The first step of an analysis for DOE is to assume a statistical model for the experiment. The model contains the effects of the factors, their interactions, and error estimations. Three assumptions (normality, independence and equal variance) are applied in the model. According to the statistical model, the Analysis of Variance (ANOVA) can be accomplished using the SAS software. The ANOVA is the sensitivity analysis for the levels in each factor. Therefore, the effects of different factors, levels, and interactions between factors can be realized. Finally, the model adequacy checking has to be performed to prove the three assumptions in the model. For checking the normality, the Kruskal-Wallis Test (Montgomery, 1991) is always used. For the independence and equal variance assumptions, residual plots (Montgomery, 1991) can be used. If the model is adequate, the General Regression Model in the SAS software (Montgomery, 1991) can be used to forecast the optimum combination of the experiment. The demonstration example and the detailed descriptions of data analysis for DOE are shown in Section 4 of this paper.

A.6. Analysis for the Taguchi method

Dr. Taguchi recommends analyzing the mean response for each experiment in the orthogonal array, and analyzing the variation using the signal-to-noise (S/N) ratio. The S/N ratios for three different objective functions are:

S/N = -10 log[(y_1² + y_2² + ... + y_n²)/n],   (A1)

for a smaller-the-better characteristic, and

S/N = -10 log[(1/y_1² + 1/y_2² + ... + 1/y_n²)/n],   (A2)

for a larger-the-better characteristic, where n is the number of replicates and y_n is the experimental response. Larger S/N values mean that strong signals and little noise (interference) exist during the experiment. Therefore, larger S/N values are desired in the Taguchi method. After the S/N calculations, the level average analysis can be performed to obtain the optimum solution. The demonstration example and the detailed descriptions of data analysis for the Taguchi method are shown in Section 3 of this paper.

A.7. Confirmatory experiment

After the analysis phase, the optimum combination of the levels in each factor can be obtained. A confirmatory experiment at these settings is vital for checking the reproducibility of the optimum combinations, and for confirming the assumptions used in planning and designing the experiment.

References

Arora, J.S., 1989. Introduction to Optimum Design. McGraw-Hill, New York.
Gill, P.E., Murray, W., Margaret, H.W., 1981. Practical Optimization. Academic Press, New York.
John, G.P., Dimitris, G.M., 1996. Digital Signal Processing. Prentice-Hall, Englewood Cliffs, NJ.
Khaw, B.S.L., John, F.C., Lennie, E.N.L., 1995. Optimum design of neural networks using the Taguchi method. Neurocomputing 7, 225-245.
Lin, T.Y., Tseng, C.H., 1998. Using fuzzy logic and neural network in bicycle derailleur system tests. In: Proceedings of the International Conference on Advances in Vehicle Control and Safety (AVCS'98), Amiens, France, 338-343.
Miller, G.F., Todd, P.M., Hedge, S.U., 1989. Designing neural networks using genetic algorithms. In: Proceedings of the Third International Conference on Genetic Algorithms, 379-384.
Montgomery, D.C., 1991. Design and Analysis of Experiments. Wiley, New York.
Patterson, D.W., 1996. Artificial Neural Networks, Theory and Applications. Prentice-Hall, Englewood Cliffs, NJ.
Peace, G.S., 1993. Taguchi Methods, A Hands-On Approach. Addison-Wesley, Reading, MA.
Taguchi, G., 1986. Introduction to Quality Engineering. Asian Productivity Organization, Tokyo.
Teo, M.Y., Sim, S.K., 1995. Training the neocognitron network using design of experiments. Artificial Intelligence in Engineering 9, 85-94.
Wang, C.C., Tseng, C.H., Fong, Z.H., 1996. A method for improving shifting performance. International Journal of Vehicle Design 18 (1), 100-117.
Zell, A., 1995. SNNS, Stuttgart Neural Network Simulator User Manual, Version 4.1, Report No. 6/95. University of Stuttgart, IPVR, Stuttgart, Germany.