IEEE TRANSACTIONS ON NEURAL NETWORKS, VOL. 10, NO. 4, JULY 1999

Design of Fuzzy Systems Using Neurofuzzy Networks


Maurício Figueiredo and Fernando Gomide, Member, IEEE
Abstract: This paper introduces a systematic approach for fuzzy system design based on a class of neural fuzzy networks built upon a general neuron model. The network structure is such that it encodes the knowledge learned in the form of if–then fuzzy rules and processes data following fuzzy reasoning principles. The technique provides a mechanism to obtain rules covering the whole input/output space as well as the membership functions (including their shapes) for each input variable. Such characteristics are of utmost importance in fuzzy systems design and application. In addition, after learning, it is very simple to extract fuzzy rules in linguistic form. The network has universal approximation capability, a property very useful in, e.g., modeling and control applications. Here we focus on function approximation problems as a vehicle to illustrate its usefulness and to evaluate its performance. Comparisons with alternative approaches are also included. Both nonnoisy and noisy data have been considered in the computational experiments. The neural fuzzy network developed here, and consequently the underlying approach, has been shown to provide good results from the accuracy, complexity, and system design points of view.

Index Terms: Fuzzy systems design, knowledge-based nets, learning, neurofuzzy nets.

I. INTRODUCTION

ONE of the most important problems when designing fuzzy rule-based systems is to derive the desired fuzzy rule base. A set of essential design issues, such as the number of if–then fuzzy rules, the partition of the universes, and the membership functions, inter alia, must be addressed [1], [2]. Although fuzzy systems have achieved recognized success in several relevant application areas, few systematic design procedures are available, especially when structure determination is of concern. Trial-and-error has been a natural choice for designing fuzzy systems since their origins. Two factors associated with fuzzy systems have contributed to this practice. First, fuzzy rules can be easily and directly formulated by experts in the form of linguistic rules; second, fuzzy system performance does not suffer critical degradation due to parameters defined in a nonoptimal way. These factors make trial-and-error very practical in certain circumstances. However, this approach is not suitable, or even feasible, when no linguistic knowledge is available or when expert knowledge must be tuned to data [3], [4].
Manuscript received December 10, 1997; revised February 26, 1998 and July 8, 1998. The work of M. Figueiredo was supported by FAPESP, the Research Foundation of the State of São Paulo, under Fellowship 93/3034-8. The work of F. Gomide was supported by CNPq, the Brazilian National Research Council, under Grant 300 729/86-3. The authors are with Unicamp-Feec-Dca, 13083-970 Campinas, SP, Brazil. Publisher Item Identifier S 1045-9227(99)05973-1.

Much research effort has been devoted to developing alternative design methods [5]. For instance, clustering techniques are considered a plausible approach for fuzzy rule derivation [4], [6]. However, methods based on clustering usually do not provide fuzzy systems described in linguistic form or tables. Moreover, the fuzzy rule base derived may be incomplete; that is, it may not cover the whole input space, or gaps may be found. Many design methods based on mathematical programming and optimization theory have been suggested [7]. Unfortunately, slow convergence and local solutions, very common in algorithms based on optimization theory, often impose practical constraints on the corresponding fuzzy design methods. To be effective, these methods in general require good designer insight to generate the desired solutions. Combinations of neural networks and fuzzy systems (neurofuzzy systems for short) have been recognized as a powerful alternative approach to developing fuzzy systems. Some neurofuzzy networks are capable of learning and of providing if–then fuzzy rules in linguistic or explicit form [2], [3], [8]–[11]. However, most current neurofuzzy approaches address parametric identification or learning only. In general, the designer chooses the membership function shapes and the respective parameters are adjusted. In many neurofuzzy design techniques, the fuzzy sets involved are defined in multidimensional spaces. Very often, this makes rule interpretation very difficult. In addition, some of them are not consistent with fuzzy set and fuzzy reasoning theory. See [12] for a recent overview of neurofuzzy models in the spirit of the one addressed here, and [13]–[15] for examples of approaches developed in this vein. As pointed out in [8], in addition to parametric learning, structure learning problems deal with the partition of the input–output universes, the number of membership functions for each input, the number of fuzzy if–then rules, and so on. Few results on structure determination are available in the open literature. For instance, in [16] (see also the references therein) a neurofuzzy system with fuzzy training data (fuzzy if–then rules) and supervised learning was proposed. It provides a mechanism to find the number of rules, assuming exponential rule membership functions. However, the designer still has to specify the input and output space partitions and to conveniently choose some network parameters. The case of noisy data was not considered. In this paper we propose a class of neurofuzzy networks whose aim is to design fuzzy systems via learning. The main ideas, as far as net topology is concerned, were already indicated in [5].

The underlying network structure and learning algorithm suggest a systematic approach to design. The main features of the proposed approach include: 1) it determines the membership functions for each input variable based on training data, so no a priori assumptions on membership functions are needed; 2) it processes data according to fuzzy reasoning mechanisms; 3) it is trivial to recover the encoded knowledge in the form of linguistic rules and to put them in the usual, intuitive format; 4) the desired number of rules to represent a fuzzy system is the only structural decision to be specified by the designer, the input space partition and parameters being automatically determined by the learning algorithm; 5) it learns rules covering the whole input space; and 6) it has universal approximation capability. Moreover, the new learning strategy introduced in this paper is robust; that is, learning can still be achieved when data is noisy.

The network learns in two main phases. In the first, the network self-organizes to determine the rules and the respective antecedent membership functions. In the second phase, it learns the consequent parameters using a supervised scheme. The network structure encodes if–then fuzzy rules in which the consequent is a function of the input variables [17]. The knowledge encoded in the net is transparent: it is easy to map fuzzy rules into the network topology and, vice versa, to extract fuzzy rules from the network. The neural fuzzy system is a universal approximator, an important characteristic in, e.g., modeling and control applications. Simulation results show that the neurofuzzy network introduced here provides a good compromise between accuracy and complexity; that is, it achieves good approximation capability with a small number of rules, and provides the essential design parameters to assemble rule-based fuzzy systems.

After this introduction, Section II describes the fuzzy neuron model and the network topology proposed. Section II also explains how fuzzy rules (knowledge) can be extracted from or inserted into the network architecture. The learning algorithm is addressed in Section III. Sections IV and V discuss the universal approximation capability and the convergence properties of the neural fuzzy network, respectively. Simulation results are included in Section VI, where the proposed network is compared with alternative approaches and its performance is evaluated. Section VII concludes the paper and presents suggestions for future work.

II. NEUROFUZZY NETWORK TOPOLOGY

The topology of the network addressed here was developed bearing in mind two essential points: the mapping between fuzzy rules and the network topology should be direct, and fuzzy inference and neural processing should fully agree.

These points are necessary conditions to guarantee that fuzzy systems have the desired dual nature. In other words, it is possible to see a fuzzy system either as a fuzzy rule-based system or as a neural network, both having identical qualitative and quantitative properties. An alternative view of the equivalence between fuzzy systems and neural networks has also been introduced in [18].

Based on the conditions above, the network proposed in this paper was designed to match the following fuzzy inference mechanism:

input: $x_1$ is $X_1$ and $x_2$ is $X_2$ and $\cdots$ and $x_M$ is $X_M$
rules: If $x_1$ is $A_1^i$ and $x_2$ is $A_2^i$ and $\cdots$ and $x_M$ is $A_M^i$ then $y$ is $g^i(x)$, for $i = 1, \dots, N$
output: $y$

where $x_j$ is the $j$th fuzzy variable, $X_j$ is the $j$th input fuzzy set, $A_j^i$ is the fuzzy set corresponding to the $j$th antecedent of the $i$th rule, $y$ is a real variable defined in the output space, and $g^i$ is a real-valued function of the input variables ($g^i$ will be defined in the next section).

The input space (universe) of the $j$th input is $U_j$. All the spaces are assumed to be discrete. That is, the interval of the $j$th universe is discretized into $Q_j$ disjoint intervals $I_j^k$, $k = 1, \dots, Q_j$. The membership function of a fuzzy set $X_j$ defined in the universe $U_j$ is denoted by $\mu_{X_j}$, piecewise constant over the intervals $I_j^k$. For the fuzzy sets $A_j^i$, the membership functions are denoted analogously, with $\mu_{A_j^i}(u) = w_{jk}^i$ if $u \in I_j^k$.

The numerical consequent $y$ is determined via a sequence of three reasoning stages ($t$ and $s$ mean $t$-norm and $s$-norm, respectively) [19].

1) Matching: For each rule $i$ and each antecedent $j$, compute the possibility measure $\pi_j^i$ for the fuzzy sets $A_j^i$ and $X_j$:

$\pi_j^i = \sup_u \, [\, \mu_{A_j^i}(u) \; t \; \mu_{X_j}(u) \,]$

where $\sup$ is taken over all $u \in U_j$.

2) Antecedent aggregation: For each rule $i$, compute its activation level as follows:

$H^i = \pi_1^i \; t \; \pi_2^i \; t \; \cdots \; t \; \pi_M^i.$

3) Rule aggregation: Compute the output using

$y = \dfrac{\sum_{i=1}^{N} H^i \, g^i(x)}{\sum_{i=1}^{N} H^i}.$
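To make the three stages concrete, here is a minimal sketch assuming the $t$-norm minimum and $s$-norm maximum (the operator choices later adopted in Section VI); the function and argument names are ours, not the paper's.

```python
import numpy as np

def nfn_inference(mu_X, w, g):
    """Three-stage inference sketch (min t-norm, max s-norm assumed).

    mu_X : list of M arrays; mu_X[j][k] is mu_{X_j} on interval I_j^k.
    w    : list of N lists of M arrays; w[i][j][k] is the antecedent
           weight w_jk^i, i.e., mu_{A_j^i} on interval I_j^k.
    g    : sequence of N numbers; g[i] is the consequent value g^i(x).
    """
    H = []
    for rule in w:
        # 1) Matching: possibility measure sup_u min(mu_A, mu_X), per antecedent.
        pis = [np.max(np.minimum(wa, mx)) for wa, mx in zip(rule, mu_X)]
        # 2) Antecedent aggregation: t-norm (min) of the M possibility measures.
        H.append(min(pis))
    H = np.asarray(H)
    # 3) Rule aggregation: activation-weighted average of the consequents.
    return float(H @ np.asarray(g) / H.sum())
```

Note that for a crisp input, $\mu_{X_j}$ is nonzero only on the interval containing $x_j$, and the matching stage reduces to reading off the corresponding weight $w_{jq}^i$.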


Fig. 1. The neuron model: $\phi_i$ and $\theta$ are the synaptic and the input aggregation operators, respectively, and $\varphi$ is the activation function.

Often, traditional neuron and network models are not general enough to compose neurofuzzy networks with the properties of interest here. This is because the synaptic operator, the aggregation operator, and the activation function are usually chosen from a very limited range of options. This limitation does not allow, for instance, modeling the three fuzzy reasoning stages described above. However, according to some known neurophysiological results, a biological neuron model can support a large number of functions and operators to represent synaptic and somatic activities, e.g., $t$-norms and $s$-norms [20]. For computational purposes, we may summarize the main characteristics of the biological neuron studied in [20] in the artificial neuron model depicted in Fig. 1. This neuron model may have a different synaptic operator $\phi_i$ for each synapse, an aggregation operator $\theta$, and an activation function $\varphi$. Therefore, its output is $y = \varphi(\theta(\phi_1(w_1, x_1), \dots, \phi_n(w_n, x_n)))$. The network architecture addressed here is constructed using instances of this general model. For simplicity, from now on we assume the same synaptic operator for all synapses.

The network conceived from that neuron model is shown in Fig. 2. It has a feedforward architecture with five layers. The first layer is divided into groups of neurons, and to each input group a single fuzzy variable is associated; thus, for $M$ variables we have $M$ groups of neurons in the input layer. The individual neurons of each group represent a discrete value of the corresponding input space. More precisely, an input neuron of the $j$th group receives a single signal $x_j$, decodes its value, and transmits it to the second layer. The signal is transmitted by the $k$th neuron of the $j$th group because, defining the synaptic and aggregation operators as identity functions, the output is given by the activation function; for an interval $I_j^k$, the activation function of the neuron associated with the $k$th interval is as depicted in Fig. 3.

The second layer comprises $N$ groups (representing rules), each of which with $M$ neurons (representing rule antecedents). This layer performs the first stage of inference, namely matching: the $j$th neuron of the $i$th group computes $\pi_j^i$. Neurons of each group in the second layer receive only inputs from neurons of the corresponding group in the first layer. Particularly, the $j$th neuron of the $i$th second-layer group connects with the $k$th neuron of the $j$th first-layer group through the synaptic weight $w_{jk}^i$. Thus, assuming the $s$-norm aggregation operator, the $t$-norm synaptic operator, and the identity activation function, the output of a second-layer neuron is the possibility measure $\pi_j^i$.

There are $N$ neurons in the third layer. The $i$th neuron receives only inputs from the neurons of the $i$th second-layer group, and the connections have unity weights. Neurons in the third layer compute the antecedent aggregation if we set the identity synaptic operator, the $t$-norm aggregation operator, and the identity activation function; thus, the output of the $i$th neuron is $H^i$ (see Fig. 2).

Two neurons compose the fourth layer. For both neurons, we assume the synaptic operator, the aggregation operator, and the activation function to be the algebraic product, the algebraic sum, and the identity function, respectively. One of them connects with all neurons of the third layer through synapses whose weights are $g^i(x)$; thus, its output constitutes the numerator of the fuzzy inference output, $\sum_i H^i g^i(x)$. The other neuron is also fully connected with all neurons of the third layer, through unity-weight connections; therefore, its output is just the denominator, $\sum_i H^i$.

The last layer has a single neuron to compute $y$. It is connected with the two neurons of the fourth layer through unity weights. The identity function, the algebraic division, and the identity function are assumed as its synaptic operator, aggregation operator, and activation function, respectively.

Clearly, the neurofuzzy network of Fig. 2 encodes a set of fuzzy if–then rules in its topology and processes information in a way that matches the fuzzy reasoning scheme adopted. Table I summarizes the network structure and neuron characteristics.

Fig. 2. The neural fuzzy network.

TABLE I: NETWORK STRUCTURE AND NEURON CHARACTERISTICS
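As an illustration of how the general neuron of Fig. 1 instantiates the layers of Table I, the sketch below (our naming, a sketch only) encodes the model $y = \varphi(\theta(\phi(w_1, x_1), \dots, \phi(w_n, x_n)))$ with pluggable operators, and builds the second- and third-layer instances with the operator choices described above.

```python
from typing import Callable, Sequence

class GeneralNeuron:
    """General neuron of Fig. 1: y = act(agg(syn(w_1, x_1), ..., syn(w_n, x_n)))."""
    def __init__(self,
                 syn: Callable[[float, float], float],     # synaptic operator
                 agg: Callable[[Sequence[float]], float],  # aggregation operator
                 act: Callable[[float], float]):           # activation function
        self.syn, self.agg, self.act = syn, agg, act

    def __call__(self, w: Sequence[float], x: Sequence[float]) -> float:
        return self.act(self.agg([self.syn(wi, xi) for wi, xi in zip(w, x)]))

def identity(v: float) -> float:
    return v

# Second layer (matching): synaptic t-norm = min, aggregation s-norm = max.
second_layer = GeneralNeuron(syn=min, agg=max, act=identity)
# Third layer (antecedent aggregation): unity weights, aggregation t-norm = min.
third_layer = GeneralNeuron(syn=lambda w, x: x, agg=min, act=identity)
```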
III. LEARNING STRATEGY AND ADAPTATION

The main idea of the learning strategy is, in general terms, similar to that used in counterpropagation networks [21]. In other words, the neurofuzzy network learns through a two-phase strategy. The first is a self-organizing phase that lets the second-layer neurons cluster and performs weight adaptation; this means learning the rules and the membership functions of each rule antecedent. The second phase uses a supervised scheme for rule consequent adaptation. The details are as follows.

Assume a data set $D$ composed of input–output pairs $(x^l, y^l)$, where $x^l = (x_1^l, \dots, x_M^l)^T$ and $T$ denotes transpose. Define, for each rule $i$, the center of area vector $c^i = (c_1^i, \dots, c_M^i)^T$, where the $j$th component is found by

$c_j^i = \dfrac{\sum_{k=1}^{Q_j} w_{jk}^i \, v_j^k}{\sum_{k=1}^{Q_j} w_{jk}^i}$

and $v_j^k$ is the center of the interval $I_j^k$ (see Fig. 3). If the parameter $t$ denotes a step of the learning algorithm, then $c^i(t)$ denotes the center of area vector for the $i$th rule at the $t$th step.

Fig. 3. Activation function of input neurons.

In the first, self-organizing phase, only the inputs are needed. For each step we choose a pair from $D$ and use the corresponding input data $x^l$. The groups of neurons in the second layer (each group is enclosed by a dotted rectangle in Fig. 2) have an important role in this learning phase. When an input $x^l$ is presented at the $t$th step, these groups compete. Group $i$ wins if the Euclidean distance from $c^i(t)$ to $x^l$ is the minimum. Only neuron groups that belong to a neighborhood of the winner group adjust their respective synaptic weights $w_{jk}^i$ (which are the discrete values of the corresponding membership functions $\mu_{A_j^i}$). To adjust $w_{jk}^i$, the intervals $I_j^p$ and $I_j^q$, to which the center of area $c_j^i$ and the current input $x_j^l$ belong, are determined; more formally, if $p$ and $q$ are the required indexes, then $c_j^i \in I_j^p$ and $x_j^l \in I_j^q$.

Adjustments of $w_{jk}^i(t)$, $k = 1, \dots, Q_j$, depend on the relative positions of $p$ and $q$ and are given by the case-wise update (1): two symmetric cases, $q \ge p$ and $q < p$, are distinguished, and in each case three sub-cases, (a), (b), and (c), apply according to the position of the index $k$ relative to $p$ and $q$. All adjustments are scaled by the factor $\beta(t) = \alpha(t)\,gn(t)$, $\alpha(t)$ and $gn(\cdot)$ being the learning rate and the neighborhood function, respectively.
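Under the definitions above, the competition step admits a short sketch (variable names are ours; the weights are assumed stored as NumPy arrays):

```python
import numpy as np

def centers_of_area(w, v):
    """c_j^i = sum_k w_jk^i * v_j^k / sum_k w_jk^i for every rule i.

    w[i][j] is the weight array (w_j1^i, ..., w_jQj^i); v[j] holds the
    interval centers v_j^k. Returns a list of center-of-area vectors c^i.
    """
    return [np.array([wij @ vj / wij.sum() for wij, vj in zip(wi, v)])
            for wi in w]

def winner(c, x):
    """Index of the group whose center of area is closest to the input x.
    Ties are resolved by the least index (np.argmin keeps the first minimum),
    matching the tie-breaking rule stated in Section VI."""
    return int(np.argmin([np.linalg.norm(ci - x) for ci in c]))
```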


Fig. 4. Adjustments of $w_{jk}^i(t)$, $k = 1, \dots, Q_j$, considering the intervals $I_j^p$, $I_j^q$, and $I_j^k$.

Fig. 4 illustrates how the cases (a), (b), and (c) defined in (1) are used to adjust $w_{jk}^i(t)$, considering the intervals $I_j^p$, $I_j^q$, and $I_j^k$. Thus, if the $t$th network input is $x^l$, then the first phase of the learning algorithm can be summarized as follows.

1) Initialization: Define the number of fuzzy rules $N$. Choose the initial values for the weights $w_{jk}^i$ randomly. Define a monotone decreasing function for the learning rate $\alpha(t)$ and a neighborhood function $gn(\cdot)$. Choose suitable threshold values to be used in the stop criterion.

2) Input: Choose a vector $x^l$ from $D$ as an input for the neurofuzzy network.

3) Competition: Compute the Euclidean distances $d(x^l, c^i(t))$, $i = 1, \dots, N$, and determine the winner group $i^*$ as the one for which the distance is minimum.

4) Adjustments: For all neuron groups within the neighborhood of the winner group $i^*$, adjust $w_{jk}^i$ according to (1). Update $\alpha(t)$ and $gn(t)$.

5) Stop criterion: If the learning rate has decayed below its threshold or the maximum number of steps has been reached, then go to 6); otherwise go to 2).

6) End.

In the second, supervised learning phase, input and output data are presented to find the weights $g^i$ which, in turn, determine the consequent of each rule. We assume the weights $g^i$ to be parameterized functions of the input variables; this assumption is similar to the one introduced in [17]. To define the consequent $g^i$ of rule $i$, we have to focus on the input space region where the respective neural group is the winner. This region is represented by the set of indexes $K^i$; that is, $K^i$ is a set of indexes such that $l \in K^i$ if and only if group $i$ wins the competition when $x^l$ is presented, and $card(K^i)$ is the cardinality of $K^i$. More specifically, we define, for each rule $i$,

$P^{i,\max} = (x^r, y^r)$, where $y^r = \max \{ y^l : l \in K^i \}$

$P^{i,\min} = (x^s, y^s)$, where $y^s = \min \{ y^l : l \in K^i \}$

$y^{i*} = \dfrac{1}{card(K^i)} \sum_{l \in K^i} y^l. \qquad (2)$

The pairs $P^{i,\max}$ and $P^{i,\min}$ correspond to the maximum and minimum points for the inputs $x^l$, $l \in K^i$, and the parameter $y^{i*}$ is the mean of all outputs $y^l$, $l \in K^i$. To find the maximum and minimum points and the $y^{i*}$ parameter for each rule, we take all the input–output pairs from the data set $D$, determine each set $K^i$, and consider the definitions given in (2). For example, to find $P^{i,\max}$, determine $r$ such that $y^r = \max \{ y^l : l \in K^i \}$; take the pair $(x^r, y^r)$ and set $P^{i,\max} = (x^r, y^r)$.

The parameters $a^i$ and $b^i$ of the consequents are the parameters to be adjusted by this learning phase. We adopt the gradient descent method to adjust them; therefore, the adjustment procedure reads

$\theta(t+1) = \theta(t) - \gamma \, \dfrac{\partial E}{\partial \theta} \qquad (3)$

where $\theta$ means either $a^i$ or $b^i$, $\gamma$ is the learning rate, and $E$ is any convenient performance index. Assuming the usual squared error index $E = \frac{1}{2} [y^l - \hat{y}(x^l)]^2$, with $(x^l, y^l)$ being the pair chosen at the $t$th step and $\hat{y}(x^l)$ the net output for the input $x^l$, we easily derive the expressions (4) for the adjustments of $a^i$ and $b^i$.
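A sketch of one step of (3) follows. Because the closed-form gradients (4) depend on the exact parameterization of $g^i$, which is not reproduced here, the sketch falls back on a numerical central-difference gradient; the signature net_output(x, theta) is a stand-in for the network forward pass, not the paper's notation.

```python
import numpy as np

def sgd_step(theta, x, y, net_output, gamma=0.05, h=1e-6):
    """One step of (3) on a parameter array theta for the squared error
    E = 0.5 * (y - yhat)**2, estimating dE/dtheta numerically.

    net_output(x, theta) -> yhat is the network forward pass; any
    differentiable consequent parameterization g^i(x; a^i, b^i) fits.
    """
    grad = np.zeros_like(theta)
    for n in range(len(theta)):
        tp, tm = theta.copy(), theta.copy()
        tp[n] += h
        tm[n] -= h
        Ep = 0.5 * (y - net_output(x, tp)) ** 2
        Em = 0.5 * (y - net_output(x, tm)) ** 2
        grad[n] = (Ep - Em) / (2 * h)   # central-difference dE/dtheta_n
    return theta - gamma * grad          # theta(t+1) = theta(t) - gamma dE/dtheta
```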


The second-phase learning algorithm can be summarized as follows.

1) Initialization: Determine the values of $P^{i,\max}$, $P^{i,\min}$, and $y^{i*}$ according to the definitions given in (2). Choose initial positive values for $a^i$ and $b^i$ randomly. Choose the learning rate $\gamma$ and suitable threshold values to be used in the stop criterion.

2) Input: Take an input–output pair $(x^l, y^l)$ from $D$.

3) Adjustments: Adjust the parameters $a^i$ and $b^i$ according to (3).

4) Stop criterion: Return to 2) until the error or the number of steps reaches its threshold.

5) End.

Summing up, during the self-organizing phase data is input and the net learns the synaptic weights $w_{jk}^i$; the membership functions, the partition of the input universes, and the rules are also determined in this phase. Next, the supervised phase uses input–output pairs to learn the parameters of the rule consequents $g^i$.

So far, only nonnoisy data have been discussed. In what follows, we assume $y^l$ disturbed by an additive, zero-mean noise. Note that, except for the additive hypothesis, there is no loss of generality in assuming zero-mean noise. Recalling that $y^{i*}$ is computed as an average (see (2)), the term $y^{i*}$ is not affected by the noise because it is of zero mean; therefore, the definition of $y^{i*}$ does not need to be changed. On the other hand, the maximum and minimum points are seriously affected because, as suggested by (2), they are taken directly from the data set. To reduce the noise effects, we propose a two-stage method to find the proper maximum and minimum points. For this purpose, we detail the algorithm to compute the maximum points $P^{i,\max}$ only; the minimum points $P^{i,\min}$ can be found by an algorithm which is just a minor modification (made explicit later) of the one used to find the maximum points. Before proceeding, let us recall that $K^i$ is a set of indexes such that $l \in K^i$ if and only if group $i$ is the winner when $x^l$ is presented, and that $card(K^i)$ is its cardinality. Thus, the algorithm to find the maximum points is as follows, with the subset cardinalities $R$ and $S$ established a priori.

First stage: Determination of the reference input $\bar{x}^i$.
1) Construct a subset $K_1^i \subseteq K^i$ whose cardinality is $R$. The elements of $K_1^i$ are the indexes of the $R$ greatest values $y^l$, $l \in K^i$; in other words, if $l_1 \in K_1^i$ and $l_2 \in K^i$ and $l_2 \notin K_1^i$, then $y^{l_1} \ge y^{l_2}$.
2) Determine $\bar{x}^i$ by computing the average of the vectors $x^l$, $l \in K_1^i$.

Second stage: Determination of $P^{i,\max}$.
1) Construct a subset $K_2^i \subseteq K_1^i$ whose cardinality is $S$. The elements of $K_2^i$ are the indexes of the $S$ vectors $x^l$ closest to $\bar{x}^i$; that is, if $l_1 \in K_2^i$ and $l_2 \in K_1^i$ and $l_2 \notin K_2^i$, then $d(x^{l_1}, \bar{x}^i) \le d(x^{l_2}, \bar{x}^i)$.
2) Compute $P^{i,\max}$ as the average of the pairs $(x^l, y^l)$, $l \in K_2^i$.
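The two-stage procedure translates directly into code; a sketch, with X and Y holding the data set and the a priori cardinalities passed as R and S:

```python
import numpy as np

def robust_max_point(X, Y, K_i, R, S):
    """Two-stage estimate of P_i^max from noisy data.

    X, Y : arrays of inputs (L x M) and outputs (L,); K_i : indexes of the
    pairs for which group i wins the competition; R > S : cardinalities
    fixed a priori.
    """
    K_i = np.asarray(K_i)
    # First stage: indexes of the R greatest outputs, then the average input.
    K1 = K_i[np.argsort(Y[K_i])[-R:]]
    x_bar = X[K1].mean(axis=0)
    # Second stage: the S inputs closest to x_bar; average the selected pairs.
    K2 = K1[np.argsort(np.linalg.norm(X[K1] - x_bar, axis=1))[:S]]
    return X[K2].mean(axis=0), float(Y[K2].mean())
```

For the minimum points, select the R least outputs in the first stage instead (the slice [-R:] becomes [:R]).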

The main idea behind the first stage is very intuitive: the noisy data pairs for which $y^l$ are the greatest are close to the real (nonnoisy) maximum point; therefore, the average $\bar{x}^i$ of the vectors $x^l$, $l \in K_1^i$, is a good approximation for the input coordinate of the maximum point. In computing the average of the second stage, we find $P^{i,\max}$, because this operation has the tendency to cancel the influence of a zero-mean noise. To find the minimum points $P^{i,\min}$, it is only necessary to modify the first stage of the algorithm above to construct the set $K_1^i$ with the indexes of the least values $y^l$, and, in the second stage, to replace the maximum by the minimum. In short, if the learning data is noisy, then only the $P^{i,\max}$ and $P^{i,\min}$ terms in (2) have to be computed properly, as suggested above; overall, the learning algorithm remains the same.

IV. APPROXIMATION

It has been proved that some types of fuzzy systems are universal approximators [22], [23]. In Section II we have highlighted that the proposed network has a dual character, because it can be seen either as a fuzzy system or as a neural network. Theorem 1 shows that the proposed neurofuzzy network is a universal approximator.

Theorem 1: Let $f$ be a continuous function defined on a compact set $U$. Consider the fuzzy rule type and the inference mechanism defined in Section II to construct the neurofuzzy network. Then, for any $\varepsilon > 0$, there exists a set of fuzzy rules or, alternatively, a network, such that for any $x \in U$ the following is satisfied:

$| f(x) - \hat{y}(x) | < \varepsilon \qquad (5)$

where $\hat{y}(x)$ is the output of the network for input $x$. The proof is omitted for brevity.

V. CONVERGENCE

The factor used in the unsupervised learning phase adjustments (1) guarantees the convergence of the synaptic weights $w_{jk}^i$. This is because it must be a decreasing function of the number of steps $t$, e.g., an exponentially decreasing function.


Obviously, the convergence rate of the supervised phase depends on the initial conditions, the data set presentation scheme, and the correct choice of the learning rate $\gamma$, as widely discussed in the literature, e.g., [24].

Note that there is no a priori assumption or restriction concerning the adjustments of the synaptic weights $w_{jk}^i$; in other words, it is not required to specify membership function shapes and forms. Nonetheless, it is possible to prove that, under specific circumstances, the membership functions do converge to triangular shapes, as summarized by Theorem 2 below.

Theorem 2: Assume that each component of the input $x$ is chosen according to a uniform probability density function during the unsupervised learning phase. Suppose that the centers of area $c_j^i$ do not change during the learning process and that the mean values of the adjustments converge. Then the membership functions $\mu_{A_j^i}$ converge to triangular functions. The proof is omitted for brevity.

VI. SIMULATION RESULTS

In this section we present samples of experimental results to illustrate the characteristics and capabilities of the neurofuzzy network, or NFN for short, proposed in this paper. In all experiments we assume 30 discrete intervals for all universes, i.e., $Q_j = 30$. The centers of area $c_j^i$ at $t = 0$ are chosen randomly, and the synaptic weights $w_{jk}^i$ are initialized accordingly, depending on whether the center $c_j^i(0)$ belongs to the interval $I_j^k$.

The neighborhood function $gn(\cdot)$ is defined piecewise: it takes its largest value at the winner group $i^*$ for the input $x^l$, decreases for the groups within the neighborhood of the winner, and is zero otherwise. The winning group is found by minimizing the Euclidean distance between the input and the centers of area, and the learning factor $\alpha(t)$ is updated by a monotone decreasing rule. During competition, ties are resolved by choosing the least index. The aggregation and synaptic operators of the second-layer neurons are the $s$-norm maximum and the $t$-norm minimum, respectively; the aggregation operator of the third-layer neurons is the $t$-norm minimum. The unsupervised learning phase stops when $\alpha(t) < 0.001$.
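The exact decay and neighborhood expressions used by the authors are not recoverable here; the placeholders below are assumptions consistent with the stated requirements (a monotonically, e.g., exponentially, decreasing learning rate and a neighborhood that vanishes away from the winner).

```python
import math

def alpha(t, alpha0=0.9, tau=200.0):
    """Exponentially decreasing learning rate (one admissible choice)."""
    return alpha0 * math.exp(-t / tau)

def gn(i, i_star, radius=2):
    """Rectangular neighborhood: full strength within the radius around the
    winner group i_star, zero outside (an assumed, illustrative form)."""
    return 1.0 if abs(i - i_star) <= radius else 0.0

# The unsupervised phase stops once alpha(t) < 0.001 (Section VI criterion).
```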


The first two experiments are intended to illustrate the effectiveness of the algorithm during membership function learning.

Experiment 1: Only one fuzzy rule, with a single antecedent, is considered. As stated above, here we focus on the unsupervised learning phase. The purpose is to show that the network is able to construct the membership function $\mu_{A_1^1}(x)$. Inputs are chosen within the unit interval according to a uniform probability density function. Fig. 5 shows $\mu_{A_1^1}(x)$ at three steps: 1) $t = 0$; 2) $t = 500$; and 3) the step $t$ for which $\alpha(t) < 0.001$. As we can easily note, the membership function approximates a triangular function, as anticipated by Theorem 2.

Fig. 5. Membership function $\mu_{A_1^1}(x)$ at three steps: (a) $t = 0$; (b) $t = 500$; and (c) the step $t$ for which $\alpha(t) < 0.001$.

Experiment 2: The neural fuzzy network is the same as in Experiment 1. However, the inputs are now chosen within the unit interval according to a Gaussian probability density with mean 0.5 and variance 0.025. Fig. 6 shows the shape of $\mu_{A_1^1}(x)$ after convergence. The membership function obtained associates higher membership grades to points within a short neighborhood of 0.5. Interestingly enough, Experiments 1 and 2 show that the learning algorithm captures membership functions based on evidence contained in the data, not on any a priori choice about their form or shape.

Fig. 6. Membership function $\mu_{A_1^1}(x)$ when inputs are chosen according to a Gaussian distribution.

In the next experiments, the network is used for function approximation purposes, because this is a relevant issue in many engineering and data processing tasks, especially in fuzzy modeling and control applications. The learning factor $\gamma$ is used to adjust the parameters $a^i$ and $b^i$ in (3), and their initial values are chosen positive. For performance characterization and comparison purposes, we assume the usual RMSE performance index

$\text{RMSE} = \sqrt{ \dfrac{1}{P} \sum_{l=1}^{P} \left( y^l - \hat{y}^l \right)^2 }$

where $\hat{y}^l$ is the net output, $y^l$ is the desired output, and $P$ is the number of input/output pairs. Obviously, the training and test data sets are distinct.
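In code, the index reads as follows (a direct transcription).

```python
import numpy as np

def rmse(y_desired, y_net):
    """Root mean squared error over P input/output pairs."""
    y_desired, y_net = np.asarray(y_desired), np.asarray(y_net)
    return float(np.sqrt(np.mean((y_desired - y_net) ** 2)))
```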

Experiment 3: Initially, we focus on the characteristics of the rule consequent $g^i(x)$. To see this, we take the simplest case, in which the network architecture encodes only one fuzzy rule with one antecedent; the single consequent $g^1(x)$ should approximate the example function $f(x)$ for all $x$. The inputs are chosen according to a uniform probability density function. After convergence of both the unsupervised and the supervised learning algorithms, the parameters of $g^1(x)$ were obtained; substituting these values gives the explicit form of $g^1(x)$.

The original function and the consequent $g^1(x)$ are depicted in Fig. 7(a). Note that the approximation is much better within the maximum and minimum points than in the extremes. Fig. 7(b) details the contribution of each of the three terms that compose $g^1(x)$.

The remaining experiments consider the same five functions of two variables, defined in the unit square, as suggested in [25]. They are taken here to compare the proposed network with alternative neurofuzzy and rule-based approaches developed for the same purpose: a) $f^1(x_1, x_2)$; b) $f^2(x_1, x_2)$; c) $f^3(x_1, x_2)$; d) $f^4(x_1, x_2)$; and e) $f^5(x_1, x_2)$ (see [25] for their definitions).

Experiment 4: The network is used to approximate each function $f^i(\cdot)$. The NFN structure encodes four fuzzy rules. A set with 2000 input–output pairs is used for learning, and each input component $x_1$ and $x_2$ is randomly chosen within the unit interval according to a uniform probability density function. The functions are normalized. Figs. 8–11 show in (a) the original functions $f^1(\cdot)$ to $f^4(\cdot)$ and in (b) the approximations provided by the network NFN.

Next, we choose two alternative approaches to compare with the proposed network, namely the backpropagation multilayer network (BACK) and the neurofuzzy network (NIE) introduced in [3]. The reason for this choice, among numerous alternatives, is that the multilayer net is a landmark in function approximation problems, whereas the NIE net is inspired, in general terms, by the same paradigm as the NFN. The BACK network has one hidden layer with 11 neurons; the MATLAB environment was used during the computations with BACK. The parameters of the NIE network, namely the shrinking factor (used to adjust the dispersion of each fuzzy set), the unsupervised and supervised learning rates, and the dispersion, were set appropriately; in NIE, a new neuron is added to the network whenever its output is greater than 0.25, and the membership functions of NIE are Gaussians.

The original function $f^5(x_1, x_2)$ is portrayed in Fig. 12(a), and the approximations provided by NFN, NIE, and BACK are depicted in (b), (c), and (d), respectively. Table II summarizes the results obtained by the NFN for all five functions, whereas Table III shows the RMSE performance of NFN, BACK, and NIE.


Fig. 7. (a) Function $f(x)$ (solid line) and consequent $g^1(x)$ (dashed line). (b) Terms of $g^1(x)$: constant term (dotted line) and the two mean-exponential terms (dashed and solid lines).

Fig. 8. (a) Function $f^1(x_1, x_2)$ and (b) NFN approximation (RMSE = 0.008).

Fig. 9. (a) Function $f^2(x_1, x_2)$ and (b) NFN approximation (RMSE = 0.043).


Fig. 10. (a) Function $f^3(x_1, x_2)$ and (b) NFN approximation (RMSE = 0.038).

Fig. 11. (a) Function $f^4(x_1, x_2)$ and (b) NFN approximation (RMSE = 0.071).

TABLE II: APPROXIMATION PERFORMANCE OF NFN FOR THE FUNCTIONS $f(\cdot)$

TABLE III: PERFORMANCE OF NFN, BACK, AND NIE IN APPROXIMATING THE FUNCTION $f^5(\cdot)$

We note that the performances of the BACK and NFN nets are equivalent, and that both are much better than the NIE net. However, we should point out that we cannot extract the knowledge acquired by BACK as easily as we do with the NFN and NIE networks. In addition, the number of rules encoded by NFN was four, while BACK derived 11 rules (see Theorem 1 of [26]) and NIE derived 142 rules. This shows that the NFN efficiently encodes rules within its structure. To illustrate how easy it is to get a set of fuzzy rules in a simple, intuitive linguistic and tabular format, let us consider the result obtained in the approximation of the function $f^5(x_1, x_2)$. Fig. 13 depicts the membership functions extracted from the NFN topology after learning, including the centers of area, in the unit square. The neural fuzzy network generated four membership functions for each dimension, but since some of them are superposed, it clearly indicates that each input space dimension needs to be partitioned by only two fuzzy sets, labeled SMALL and BIG, respectively.

To generate the fuzzy rules (in other words, to derive Table IV), we just assign the label of each fuzzy set found to each table dimension, and get the respective consequent in the net to fill the corresponding table entry. The parameters of each rule consequent are also trivially extracted from the NFN topology. For example, consider the consequent $g^1(x)$: the parameters found by NFN (the mean output $y^{1*}$, the maximum and minimum points, and the adjusted parameters $a^1$ and $b^1$) determine $g^1(x)$ explicitly.
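Deriving the table is mechanical once the labels and consequents are read off the network; the sketch below is purely illustrative, with hypothetical label assignments and string handles standing in for the actual consequent functions.

```python
# Hypothetical extraction of Table IV: one linguistic label per fuzzy set
# found on each input dimension, one consequent g^i per table entry.
labels_x1 = ["SMALL", "BIG"]     # labels assigned after inspecting Fig. 13
labels_x2 = ["SMALL", "BIG"]
consequents = {(r, c): f"g{2 * r + c + 1}(x)"   # handle to rule's consequent
               for r in range(2) for c in range(2)}

for r, l1 in enumerate(labels_x1):
    for c, l2 in enumerate(labels_x2):
        print(f"If x1 is {l1} and x2 is {l2} then y is {consequents[(r, c)]}")
```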

Fig. 13 and Table IV reveal a significant characteristic of the learning algorithm: for a given number of rules, it tends to partition the input space with a reduced number of linguistic labels.


Fig. 12. (a) Function $f^5(x_1, x_2)$. (b), (c), and (d) Approximations provided by NFN (RMSE = 0.035), NIE (RMSE = 0.121), and BACK (RMSE = 0.032), respectively.

Fig. 13. Membership functions $\mu_{A_j^i}(x)$ extracted from NFN after approximation of $f^5(x_1, x_2)$.

TABLE IV: FUZZY RULE TABLE TO APPROXIMATE THE FUNCTION $f^5(\cdot)$

Experiment 5: In this case we adopt the same function approximation task as in the previous experiment, except that now the learning data are perturbed by a zero-mean Gaussian noise whose variance is 0.05; that is, the data pairs are $(x^l, f(x^l) + e^l)$, where $e^l$ denotes the noise. The NFN encodes four fuzzy rules to approximate all the functions but one, for which it encodes six fuzzy rules. For brevity, we consider the function $f^5(\cdot)$ to compare the performance of NFN with the BACK and NIE nets, and with an additional, nonneural approach introduced in [25], called FSR. The results provided by NFN are portrayed in Fig. 14, and the RMSE performances of NFN and FSR are compared in Table V; Table V also shows the corresponding numbers of rules between brackets. Note that NFN performs better than FSR for all function approximation examples except one, and even in this case NFN achieves acceptable performance with a much smaller number of rules than FSR.

Fig. 15(a)–(d) shows the approximations of the function $f^5(\cdot)$ provided by NFN, FSR, BACK, and NIE, respectively, and Table VI summarizes their RMSE indexes. The best performance was obtained by NFN, which in addition encoded a small number of fuzzy rules (4) against NIE (144) and FSR (25). Therefore, this experiment reinforces the ability of the NFN to provide a good tradeoff as far as accuracy, complexity, and system design are concerned. Moreover, we may view the NFN approach as underlying a systematic design procedure for fuzzy rule-based systems.

Fig. 14. Approximations provided by NFN for the functions $f^1(\cdot)$, $f^2(\cdot)$, $f^3(\cdot)$, and $f^4(\cdot)$ are shown in (a), (b), (c), and (d). See performance indexes in Table V.

Fig. 15. Approximation results: (a) NFN (RMSE = 0.050); (b) FSR (RMSE = 0.130); (c) BACK (RMSE = 0.094); and (d) NIE (RMSE = 0.211).

TABLE V: PERFORMANCE OF THE NFN AND FSR APPROACHES CONSIDERING NOISY DATA

TABLE VI: PERFORMANCE OF NFN, BACK, NIE, AND FSR IN APPROXIMATING THE FUNCTION $f^5(\cdot)$

VII. CONCLUSION

A class of neural fuzzy networks and a learning and adaptation strategy were introduced in this paper as a systematic approach to the design of fuzzy rule-based systems. The main features of the approach developed are the following:




1) it finds the membership functions for each input variable, and no a priori assumptions on membership functions are needed; 2) the neural processing is a form of fuzzy reasoning mechanism; 3) the knowledge encoded can be trivially recovered in the form of linguistic rules; 4) the number of fuzzy rules is the only information needed from the designer, the neural fuzzy network determining the remaining essential fuzzy system parameters automatically through the learning procedure;



5) it learns rules covering the whole input space; 6) it has a universal approximation characteristic; and 7) it is robust when handling noisy data.

Simulation results for the function approximation problem have shown that the neurofuzzy network proposed here provides good accuracy with low complexity. As opposed to many alternative neurofuzzy approaches (with parameter or structure learning), once the number of rules is given, the approach provides algorithms to determine the partition of the input spaces, the membership functions of each rule antecedent, and the fuzzy rules as well. These are essential decisions when designing fuzzy systems and applications. However, as is the case with any fuzzy rule-based system, the scheme developed herein may, in principle, yield an exponential growth in the number of rules as inputs are added. Nevertheless, recent works, e.g., [27], indicate procedures to overcome this complexity. In addition, since the number of rules is assumed to be given in our approach, the designer may control the most appropriate number of rules during the design cycle. The complexity issue and questions on how to optimally find the number of fuzzy rules, how to choose the triangular norms and conorms, and how to handle more general rule forms, e.g., truth-qualified rules, are important research considerations left for future work.

ACKNOWLEDGMENT

The authors would like to thank the anonymous reviewers for their constructive comments and suggestions.

REFERENCES
[1] A. Bastian, "Toward a fuzzy system identification theory," in Proc. 6th IFSA World Congr., São Paulo, Brazil, 1995, vol. 2, pp. 69-72.
[2] W. Pedrycz, Fuzzy Sets Engineering. Boca Raton, FL: CRC, 1995.
[3] J. Nie, "Constructing fuzzy model by self-organizing counterpropagation network," IEEE Trans. Syst., Man, Cybern., vol. 25, pp. 963-970, 1995.
[4] W. Pedrycz, Fuzzy Control and Fuzzy Systems. New York: Wiley, 1993.
[5] M. Figueiredo and F. Gomide, "A neural fuzzy network: Structure and learning," in Fuzzy Logic and Its Applications, Information Sciences and Intelligent Systems, Z. Bien and K. Min, Eds. Amsterdam, The Netherlands: Kluwer, 1995, pp. 177-186.
[6] M. Delgado, A. Skarmeta, and F. Martin, "Using fuzzy clustering in a descriptive fuzzy modeling approach," in Proc. 6th Int. Conf. IPMU'96, Granada, Spain, 1996, vol. 1, pp. 563-568.
[7] T. Wakabayashi, K. Itoh, and A. Ohuchi, "A method for constructing system models by fuzzy flexible interpretative structural modeling," in Proc. Int. Joint Conf. 4th IEEE Int. Conf. Fuzzy Syst. 2nd Int. Fuzzy Eng. Symp., Yokohama, Japan, 1995, pp. 913-918.
[8] J. Jang and C. Sun, "Neuro-fuzzy modeling and control," Proc. IEEE, vol. 83, pp. 378-406, 1995.
[9] H. Ishibuchi, R. Fujioka, and H. Tanaka, "Neural networks that learn from fuzzy if-then rules," IEEE Trans. Fuzzy Syst., vol. 1, pp. 85-97, 1993.
[10] J. Keller and H. Tahani, "Implementation of conjunctive and disjunctive fuzzy logic rules with neural networks," Int. J. Approximate Reasoning, vol. 6, pp. 221-240, 1992.
[11] J. Keller, R. Yager, and H. Tahani, "Neural network implementation of fuzzy logic," Fuzzy Sets Syst., vol. 45, pp. 1-12, 1992.
[12] H. Bersini and G. Bontempi, "Now comes the time to defuzzify neurofuzzy models," Fuzzy Sets Syst., vol. 90, no. 2, pp. 161-169, 1997.

[13] H. Takagi, N. Suzuki, T. Koda, and Y. Kojima, "Neural networks designed on approximate reasoning architecture and their applications," IEEE Trans. Neural Networks, vol. 3, pp. 752-760, 1992.
[14] H. Takagi and I. Hayashi, "NN-driven fuzzy reasoning," Int. J. Approximate Reasoning, vol. 5, pp. 191-212, 1991.
[15] S. Mitra and L. Kuncheva, "Improving classification performance using fuzzy MLP and two-level selective partitioning of the feature space," Fuzzy Sets Syst., vol. 70, pp. 1-13, 1995.
[16] C. Lin and Y. Lu, "A neural fuzzy system with fuzzy supervised learning," IEEE Trans. Syst., Man, Cybern. B, vol. 26, pp. 744-763, 1996.
[17] T. Takagi and M. Sugeno, "Derivation of fuzzy control rules from human operators' control actions," in Proc. IFAC Symp. Fuzzy Inform., Knowledge Representation and Decision Anal., Marseilles, France, 1983, pp. 55-60.
[18] J. Buckley and Y. Hayashi, "Numerical relationship between neural networks, continuous functions, and fuzzy systems," Fuzzy Sets Syst., vol. 60, pp. 1-8, 1993.
[19] R. Yager and D. Filev, Essentials of Fuzzy Modeling and Control. New York: Wiley, 1994.
[20] A. Rocha, Neural Nets. Berlin, Germany: Springer-Verlag, 1992.
[21] R. Hecht-Nielsen, Neurocomputing. San Diego, CA: Addison-Wesley, 1989.
[22] J. Castro and M. Delgado, "Fuzzy systems with defuzzification are universal approximators," IEEE Trans. Syst., Man, Cybern., vol. 26, no. 1, pp. 149-152, 1996.
[23] B. Kosko, "Fuzzy systems as universal approximators," in Proc. IEEE Int. Conf. Fuzzy Syst., San Diego, CA, 1992, pp. 1153-1162.
[24] S. Haykin, Neural Networks: A Comprehensive Foundation. Englewood Cliffs, NJ: Prentice-Hall, 1994.
[25] R. Rovatti and R. Guerrieri, "Fuzzy sets of rules for system identification," IEEE Trans. Fuzzy Syst., vol. 4, pp. 89-102, 1996.
[26] J. Benitez, J. Castro, and I. Requena, "Are artificial neural networks black boxes?," IEEE Trans. Neural Networks, vol. 8, pp. 1156-1164, 1997.
[27] W. Combs and J. Andrews, "Combinatorial rule explosion eliminated by a fuzzy rule configuration," IEEE Trans. Fuzzy Syst., vol. 6, pp. 1-11, 1998.

Maurício Figueiredo received the M.Sc. and Ph.D. degrees in electrical engineering from the State University of Campinas, Campinas, São Paulo, Brazil, in 1991 and 1997, respectively. He is currently Professor of Computer Science at the State University of Maringá, Paraná, Brazil. His main research interest is the development of methodologies for the design of intelligent autonomous control systems for mobile vehicles using computational intelligence theories (neural networks, fuzzy systems, evolutionary computation).

Fernando Gomide (S'79, M'83) received the B.Sc. degree in electrical engineering from the Polytechnic Institute of the Catholic University of Minas Gerais (IPUC), Belo Horizonte, Brazil, the M.Sc. degree in electrical engineering from the State University of Campinas (Unicamp), Campinas, Brazil, and the Ph.D. degree from Case Western Reserve University (CWRU), Cleveland, OH. He was with the Center for Technology of Information (CTI), Campinas, and currently is a Professor of electrical and computer engineering at the State University of Campinas (Unicamp), Brazil. He has published numerous papers in the areas of system modeling, optimization and control, large-scale systems, fuzzy sets and systems, computational intelligence and semiotics, logistics, and applications. His current research interests include decision and system analysis, logistics, control and applications, formal and applied artificial intelligence, fuzzy sets, neural nets, and computational intelligence. Dr. Gomide serves on the editorial boards of Fuzzy Sets and Systems, the Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, Intelligent Automation and Soft Computing, the Journal of Advanced Computational Intelligence, and the SBA Journal Control and Automation.
