

1(3), DECEMBER 2011


Random Iterative Extreme Learning Machine for Classification of Electronic Nose Data
J. Siva Prakash and R. Rajesh
India Technologies, 211, TVS nagar, Edayarpalayam, Coimbatore -25 email:
Department of Computer Applications, Bharathiar University, Coimbatore - 641046, India email:

Abstract: Recently, Extreme Learning Machine (ELM) has been proposed, which significantly reduces the amount of time needed to train a neural network. The performance of ELM depends on the random input weights assigned. This paper reviews ELM and introduces Random Iterative ELM (RI-ELM), which iteratively assigns random weights and retains the best-performing weights found for ELM. A classification example on the Iris data set shows the effectiveness of RI-ELM. Finally, RI-ELM is used to classify electronic nose data.

Index Terms: Single Layer Neural Network, Extreme Learning Machine, Classification, Electronic Nose Data

I. INTRODUCTION
Neural networks have been extensively used in many fields due to their ability to approximate complex nonlinear mappings directly from input samples and to provide models for a large class of natural and artificial phenomena that are difficult to handle using classical parametric techniques. Many learning algorithms exist, such as back propagation [18], Support Vector Machine (SVM) [4], [10], and Hidden Markov Model (HMM) [14]. One of the disadvantages of neural networks is the learning time. Recently, Huang et al. [6], [17] proposed a new learning algorithm for the Single Layer Feedforward Neural Network architecture, called Extreme Learning Machine (ELM), which overcomes the problems caused by gradient descent based algorithms such as back propagation [18] applied in ANNs [9], [10]. ELM can significantly reduce the time needed to train a neural network, and a number of papers based on the ELM algorithm have appeared in the literature [11], [16], [19]. This paper deals with random iterative ELM, and its performance is compared with other methods.

This paper is organized as follows. Section II reviews ELM. Section III introduces RI-ELM. Section IV deals with classification of the electronic nose data set, and Section V concludes the paper.

II. EXTREME LEARNING MACHINE - A REVIEW
Extreme Learning Machine, proposed by Huang et al. [6], [7], uses the Single Layer Feedforward Neural Network (SLFN) architecture [1]. It randomly chooses the input weights and analytically determines the output weights of the SLFN. It offers much better generalization performance with much faster learning speed, requires less human intervention, and can run thousands of times faster than conventional methods. It determines all the network parameters analytically, which avoids trivial human intervention and makes it efficient in online and real-time applications.

A. A Note on Single Hidden Layer Feedforward Neural Network
A Single Hidden Layer Feedforward Neural Network (SLFN) with $L$ hidden nodes [8], [12], incorporating both additive and RBF hidden nodes in a unified way, can be described mathematically as

$$f_L(x) = \sum_{i=1}^{L} \beta_i G(a_i, b_i, x), \quad x \in R^n,\ a_i \in R^n,$$

where $a_i$ and $b_i$ are the learning parameters of the hidden nodes and $\beta_i$ is the weight connecting the $i$th hidden node to the output node.



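$G(a_i, b_i, x)$ is the output of the $i$th hidden node with respect to the input $x$. For an additive hidden node with the activation function $g(x): R \to R$ (e.g., sigmoid and threshold), $G(a_i, b_i, x)$ is given by

$$G(a_i, b_i, x) = g(a_i \cdot x + b_i), \quad b_i \in R.$$

$a_i$ is the weight vector connecting the input layer to the $i$th hidden node and $b_i$ is the bias of the $i$th hidden node. $a_i \cdot x$ denotes the inner product of the vectors $a_i$ and $x$ in $R^n$.

For an RBF hidden node with activation function $g(x): R \to R$ (e.g., Gaussian), $G(a_i, b_i, x)$ is given by

$$G(a_i, b_i, x) = g(b_i \| x - a_i \|), \quad b_i \in R^+.$$

$a_i$ and $b_i$ are the center and impact factor of the $i$th RBF node. $R^+$ indicates the set of all positive real values. The RBF network is a special case of SLFN with RBF nodes in its hidden layer.

Consider $N$ arbitrary distinct samples $(x_j, t_j) \in R^n \times R^m$. Here, $x_j$ is an $n \times 1$ input vector and $t_j$ is an $m \times 1$ target vector. If an SLFN with $L$ hidden nodes can approximate these $N$ samples with zero error, then there exist $\beta_i$, $a_i$ and $b_i$ such that

$$f_L(x_j) = \sum_{i=1}^{L} \beta_i G(a_i, b_i, x_j) = t_j, \quad j = 1, \dots, N. \qquad (1)$$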


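Equation (1) can be written compactly as

$$H\beta = T, \qquad (2)$$

where

$$H = \begin{bmatrix} G(a_1, b_1, x_1) & \cdots & G(a_L, b_L, x_1) \\ \vdots & & \vdots \\ G(a_1, b_1, x_N) & \cdots & G(a_L, b_L, x_N) \end{bmatrix}_{N \times L}, \quad \beta = \begin{bmatrix} \beta_1^T \\ \vdots \\ \beta_L^T \end{bmatrix}_{L \times m}, \quad T = \begin{bmatrix} t_1^T \\ \vdots \\ t_N^T \end{bmatrix}_{N \times m}.$$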
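As a minimal sketch of the two hidden node types and of the hidden layer output matrix $H$, the following Python code may help. The function names, the logistic sigmoid, and the Gaussian form $g(z) = e^{-z^2}$ are assumptions chosen for illustration; the paper does not prescribe them.

```python
import math
import random


def sigmoid(z):
    """Logistic sigmoid (assumed choice of g), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def gaussian(z):
    """Gaussian exp(-z^2) (assumed choice of g for RBF nodes)."""
    return math.exp(-z * z)


def additive_node(a, b, x, g=sigmoid):
    """Additive hidden node: G(a, b, x) = g(a . x + b)."""
    return g(sum(ai * xi for ai, xi in zip(a, x)) + b)


def rbf_node(a, b, x, g=gaussian):
    """RBF hidden node: G(a, b, x) = g(b * ||x - a||), with b > 0."""
    dist = math.sqrt(sum((xi - ai) ** 2 for ai, xi in zip(a, x)))
    return g(b * dist)


def hidden_layer_matrix(X, A, B, node=additive_node):
    """N x L matrix H whose entry (j, i) is G(a_i, b_i, x_j)."""
    return [[node(a, b, x) for a, b in zip(A, B)] for x in X]


# Usage: 3 samples with 2 features, 4 randomly parameterized hidden nodes.
rng = random.Random(0)
X = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
A = [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(4)]
B = [rng.uniform(-1, 1) for _ in range(4)]
H = hidden_layer_matrix(X, A, B)
print(len(H), "x", len(H[0]))
```

Each row of `H` corresponds to one sample and each column to one hidden node, which matches the $N \times L$ shape used in the compact form.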



B. Principles of ELM
ELM [6], [7], designed as an SLFN with $L$ hidden neurons, can learn $L$ distinct samples with zero error. Even if the number of hidden neurons ($L$) is less than the number of distinct samples ($N$), ELM can still assign random parameters to the hidden nodes and calculate the output weights using the pseudoinverse of $H$, giving only a small error $\epsilon > 0$. The hidden node parameters of ELM, $a_i$ and $b_i$ (input weights and biases, or centers and impact factors), need not be tuned during training and may simply be assigned random values. The following theorems state the same.

Theorem 1: (Liang [12]) Let an SLFN with $L$ additive or RBF hidden nodes and an activation function $g(x)$ which is infinitely differentiable in any interval of $R$ be given. Then, for arbitrary $L$ distinct input vectors $\{x_i \mid x_i \in R^n, i = 1, \dots, L\}$ and $\{(a_i, b_i)\}_{i=1}^{L}$ randomly generated with any continuous probability distribution, respectively, the hidden layer output matrix $H$ of the SLFN is invertible with probability one and $\|H\beta - T\| = 0$.

Theorem 2: (Liang [12]) Given any small positive value $\epsilon > 0$ and an activation function $g(x): R \to R$ which is infinitely differentiable in any interval, there exists $L \le N$ such that for $N$ arbitrary distinct input vectors $\{x_i \mid x_i \in R^n, i = 1, \dots, N\}$ and for any $\{(a_i, b_i)\}_{i=1}^{L}$ randomly generated according to any continuous probability distribution, $\|H_{N \times L}\,\beta_{L \times m} - T_{N \times m}\| < \epsilon$ with probability one.

Since the hidden node parameters of ELM need not be tuned during training and are simply assigned random values, eqn (2) becomes a linear system and the output weights can be estimated as

$$\hat{\beta} = H^{\dagger} T,$$

where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix $H$. Here $H$ is the hidden layer output matrix of the SLFN, with the $i$th column of $H$ being the $i$th hidden node's output with respect to the inputs $x_1, \dots, x_N$. $H^{\dagger}$ can be calculated using several methods, including the orthogonal projection method, the orthogonalization method, the iterative method, and singular value decomposition (SVD). The orthogonal projection method can be used only when $H^T H$ is nonsingular, in which case $H^{\dagger} = (H^T H)^{-1} H^T$. Because they rely on searching and iterations, the orthogonalization method and the iterative method have limitations. Implementations of ELM use SVD to calculate the Moore-Penrose generalized inverse of $H$, since it can be used in all situations.
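The training procedure described in this subsection can be sketched in dependency-free Python as follows. This is an illustration under stated assumptions, not the authors' implementation: it uses sigmoid additive nodes, draws weights uniformly from $[-1, 1]$, and computes the output weights with the orthogonal projection formula $(H^T H)^{-1} H^T T$ rather than SVD, so it is valid only when $H^T H$ is nonsingular. The names `train_elm`, `predict` and `solve` are hypothetical.

```python
import math
import random


def sigmoid(z):
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hidden_matrix(X, A, B):
    """H[j][i] = g(a_i . x_j + b_i) for additive sigmoid nodes."""
    return [[sigmoid(sum(w * v for w, v in zip(a, x)) + b)
             for a, b in zip(A, B)] for x in X]


def solve(M, R):
    """Solve M Z = R by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(R[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("H^T H is numerically singular; use an SVD-based pseudoinverse")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [u - f * v for u, v in zip(aug[r], aug[col])]
    return [[aug[i][n + k] / aug[i][i] for k in range(len(R[0]))] for i in range(n)]


def train_elm(X, T, L, rng):
    """Random hidden parameters, then output weights from the linear system."""
    n = len(X[0])
    A = [[rng.uniform(-1, 1) for _ in range(n)] for _ in range(L)]  # input weights
    B = [rng.uniform(-1, 1) for _ in range(L)]                      # biases
    Ht = list(zip(*hidden_matrix(X, A, B)))                         # L x N
    Tt = list(zip(*T))                                              # m x N
    HtH = [[sum(p * q for p, q in zip(r, c)) for c in Ht] for r in Ht]
    HtT = [[sum(p * q for p, q in zip(r, c)) for c in Tt] for r in Ht]
    beta = solve(HtH, HtT)                                          # L x m
    return A, B, beta


def predict(X, A, B, beta):
    """Network output H beta for each row of X."""
    m = len(beta[0])
    return [[sum(h * beta[i][k] for i, h in enumerate(row)) for k in range(m)]
            for row in hidden_matrix(X, A, B)]


# Usage: fit a step-like target on 8 one-dimensional samples with 3 hidden nodes.
X = [[-3.0], [-2.0], [-1.0], [0.0], [1.0], [2.0], [3.0], [4.0]]
T = [[0.0], [0.0], [0.0], [0.0], [1.0], [1.0], [1.0], [1.0]]
A, B, beta = train_elm(X, T, 3, random.Random(1))
print([round(y[0], 2) for y in predict(X, A, B, beta)])
```

No iterative tuning takes place: the only fitted quantity is `beta`, obtained from one linear solve. An SVD-based pseudoinverse, as the text recommends, would also cover the case where $H^T H$ is singular.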

III. RANDOM ITERATIVE EXTREME LEARNING MACHINE
The working of the random iterative extreme learning machine is shown below.
1) Initialize $j = 1$ and the maximum performance $P_{\max} = 0$.
2) Randomly assign the input weights $a_i$ and bias weights $b_i$, $i = 1, \dots, L$.
3) Calculate the hidden layer output matrix $H$, with entries $H_{ji} = G(a_i, b_i, x_j)$, $j = 1, \dots, N$, $i = 1, \dots, L$.

On the coffee blends data, RI-ELM achieves a testing accuracy of 90.86%. The performance comparison of the RI-ELM algorithm with PCA+MLP is shown in Table II.

V. CONCLUSION
This paper has reviewed the extreme learning machine and has shown that it outperforms backpropagation neural networks. This paper also introduces random iterative ELM (RI-ELM), which is used to classify electronic nose data, and the results show the effectiveness of RI-ELM over other methods.

ACKNOWLEDGEMENT
This research is supported by the University Grants Commission, India, through a major research project grant (UGC F. No. 33-62/2007 (SR) dated 28th Feb 2008). The authors are also thankful to all the staff of the SCSE and Bharathiar University for their valuable help.

REFERENCES
[1] A. J. Annema, K. Hoen, H. Wallinga, "Precision requirements for single-layer feedforward neural networks," in: Fourth International Conference on Microelectronics for Neural Networks and Fuzzy Systems, pp. 145-151, 1994.
[2] James C. Bezdek, James M. Keller, Raghu Krishnapuram, Ludmila I. Kuncheva, Nikhil R. Pal, "Will the Real Iris Data Please Stand Up?," IEEE Transactions on Fuzzy Systems, Vol. 7, No. 3, June 1999.
[3] Shyi-Ming Chen, Yao-De Fang, "A New Approach for Handling the Iris Data Classification Problem," International Journal of Applied Science and Engineering, Vol. 3, No. 1, pp. 37-49, 2005.
[4] Corinna Cortes, Vladimir Vapnik, "Support-Vector Networks," Machine Learning, Vol. 20, pp. 273-297, 1995.
[5] S. Dehuri, S.-B. Cho, "Multi-criterion Pareto based particle swarm optimized polynomial neural network for classification: A Review and State-of-the-Art," Computer Science Review, pp. 19-40, 2009.
[6] Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew, "Extreme Learning Machine: A New Learning Scheme of Feedforward Neural Networks," International Joint Conference on Neural Networks, Vol. 2, pp. 985-990, 2004.
[7] Guang-Bin Huang, Qin-Yu Zhu, Chee-Kheong Siew, "Extreme Learning Machine: Theory and Applications," Neurocomputing, Vol. 70, pp. 489-501, 2006.
[8] Guang-Bin Huang, Lei Chen, Chee-Kheong Siew, "Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes," IEEE Transactions on Neural Networks, Vol. 17, No. 4, pp. 879-892, 2006.
[9] Karayiannis, A. N. Venetsanopoulos, "Artificial Neural Network: learning algorithms, performance evaluation, and applications," Kluwer Academic, Boston, MA, 1993.
[10] Derong Liu, Huaguang Zhang, Sanqing Hu (Guest Editors), "Neural networks: Algorithms and applications," Neurocomputing, Vol. 71, pp. 471-473, 2008.
[11] Nan-Ying Liang, Paramasivan Saratchandran, Guang-Bin Huang, Narasimhan Sundararajan, "Classification of mental tasks from EEG signals using Extreme Learning Machine," International Journal of Neural Systems, Vol. 16, No. 1, pp. 29-38, 2006.

4) Calculate the output weight matrix $\hat{\beta} = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix, calculated using singular value decomposition (SVD).
5) Evaluate the performance $P_j$ of the resulting ELM.
6) If $P_j > P_{\max}$
   $P_{\max} = P_j$
   Save $a_i$, $b_i$, $H$, $\hat{\beta}$
   End If
7) Repeat steps 2 to 6 $n$ times (i.e., $j = 1, 2, \dots, n$), where $n$ is the predefined maximum number of iterations.
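The search loop of the steps above can be sketched as follows. To keep the example short, ELM training and evaluation are passed in as hypothetical callables `train_once` and `evaluate`; these names, the returned `history` list, and the seeded random generator are assumptions for illustration. The usage part replaces ELM with a one-parameter threshold classifier as a stand-in, so it demonstrates only the keep-the-best mechanics.

```python
import random


def ri_elm(train_once, evaluate, n_iterations, seed=0):
    """Random iterative search: retrain with fresh random parameters, keep the best.

    train_once(rng) -> model   stands for steps 2-4 (random weights, H, output weights)
    evaluate(model) -> float   stands for step 5 (performance P_j)
    """
    rng = random.Random(seed)
    best_model, best_perf = None, 0.0          # step 1: maximum performance = 0
    history = []
    for _ in range(n_iterations):              # step 7: repeat n times
        model = train_once(rng)                # steps 2-4
        perf = evaluate(model)                 # step 5
        history.append(perf)
        if perf > best_perf:                   # step 6: save the better model
            best_model, best_perf = model, perf
    return best_model, best_perf, history


# Usage with a stand-in "model": a random threshold on 1-D points (not an ELM).
points = [(-2.0, 0), (-1.0, 0), (0.5, 1), (1.5, 1)]


def random_threshold(rng):
    return rng.uniform(-3.0, 3.0)


def accuracy(threshold):
    hits = sum((x > threshold) == bool(label) for x, label in points)
    return hits / len(points)


best, perf, history = ri_elm(random_threshold, accuracy, 50)
print(best, perf)
```

Which data the performance is measured on is left to `evaluate`; the paper reports testing accuracy. Since each iteration is independent of the others, the cost grows linearly with the number of iterations.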

In order to demonstrate RI-ELM, classification of Fisher's Iris data set is shown here. The data set [2] consists of 50 samples from each of three species of Iris flowers (Iris setosa, Iris virginica and Iris versicolor) with four features (length of sepal, width of sepal, length of petal and width of petal). RI-ELM with 28 hidden nodes is able to learn the data and achieve a testing accuracy of 98.67%. The performance comparison of ELM and RI-ELM with other algorithms is shown in Table I.

IV. CLASSIFICATION OF ELECTRONIC NOSE DATA SET USING RI-ELM
Commercial coffees are blends that contain coffees of various origins, and their analysis and control is of great importance. Characterization of coffee is usually done using the chemical profile of one of its fractions, such as the headspace of green or roasted beans or the phenolic fraction. The relative abundance of the 700 diverse molecules identified so far in the headspace depends on the type, provenance and manufacturing of the coffee. None of these molecules can alone be identified as a marker; instead, one has to consider the whole spectrum, for instance the gas chromatographic profile. Electronic noses capture this profile, and the related software helps to quantify the concentration of the constituents in the case of simple gaseous mixtures and to discriminate between different (but similar) mixtures.

In this section, coffee data obtained from the Pico Electronic Nose is considered for classification. A blends group of 7 coffees was analyzed using the RI-ELM classification algorithm. The blends group has 5 features extracted from a sensor response curve. In our simulations, RI-ELM with 33 hidden nodes is able to achieve a testing accuracy of 90.86%.



TABLE I
CLASSIFICATION ACCURACY OF IRIS DATA SET

| Algorithm | Accuracy % | Learning time |
|---|---|---|
| Chen-and-Fang method (2005) [3] | 97.33 | 1 min |
| ANN (2008) [15] | 94.87 | 1 min |
| ELM | 93.68 (averaged over 100 runs) | 3.2 sec |
| RI-ELM (no. of iterations = 100) | 98.67 | 3.2 sec |

TABLE II
CLASSIFICATION ACCURACY OF NOSE DATA SET

| Algorithm | Accuracy % | Learning time |
|---|---|---|
| PCA + MLP, M. Pardo et al. (2002) [13] | 87 | 1 min |
| RI-ELM (no. of iterations = 350) | 90.86 | 11.06 sec |

[12] Nan-Ying Liang, Guang-Bin Huang, Hai-Jun Rong, P. Saratchandran, N. Sundararajan, "A Fast and Accurate On-line Sequential Learning Algorithm for Feedforward Networks," IEEE Transactions on Neural Networks, Vol. 17, No. 6, pp. 1411-1423, 2006.
[13] M. Pardo, G. Sberveglieri, "Coffee analysis with an Electronic Nose," IEEE Transactions on Instrumentation and Measurement, Vol. 51, No. 6, December 2002.
[14] Lawrence R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition," Proceedings of the IEEE, pp. 257-286, 1989.
[15] N. P. Suraweera, D. N. Ranasinghe, "Adaptive Structural Optimisation of Neural Networks," The International Journal on Advances in ICT for Emerging Regions, pp. 33-41, 2008.
[16] S. Suresh, R. Venkatesh Babu, H. J. Kim, "No-reference image quality assessment using modified extreme learning machine classifier," Applied Soft Computing, Vol. 9, pp. 541-552, 2009.
[17] Dianhui Wang, Guang-Bin Huang, "Protein Sequence Classification Using Extreme Learning Machine," Proceedings of International Joint Conference on Neural Networks, Vol. 3, pp. 1406-1411, 2005.
[18] Paul John Werbos, "The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting," Wiley-Interscience, 1994.
[19] Chee-Wee Thomas Yeu, Meng-Hiot Lim, Guang-Bin Huang, "A New Machine Learning Paradigm for Terrain Reconstruction," IEEE Geoscience and Remote Sensing Letters, Vol. 3, No. 3, 2006.
