Optical Recognition of Handwritten Digits


**ECE539 Project Report**

Pradeep Rajendran

December 20, 2013

1 Introduction

Recognition of handwritten digits has many everyday uses. It is particularly applied in the automated sorting of postal addresses based on zipcode. A general implementation of such a system is briefly described.

As the postal item moves on a conveyor belt, robotic mechanisms reposition the item such that the address label is visible to a camera. Then, the camera registers an image of the label. The image is fed into an image processing system which dissects it into constituent character blocks. The digit blocks composing the zipcode field are then thresholded and pre-processed to ensure uniform scale and orientation. After preprocessing, the binary image of a digit block is ready for use in machine learning tools.
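The thresholding step described above can be sketched in a few lines of Python. The fixed threshold of 128 and the tiny 3 × 3 block are illustrative assumptions, not details from the report; real systems often pick the threshold adaptively (e.g. with Otsu's method):

```python
def binarize(block, threshold=128):
    """Threshold an 8-bit grayscale digit block (list of rows) to a binary image.

    Pixels at or above `threshold` become 1 (ink), the rest 0.
    The fixed threshold of 128 is an illustrative assumption.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in block]

# A tiny 3x3 example standing in for a full 28x28 digit block:
block = [[0, 200, 255],
         [10, 128, 90],
         [0, 0, 250]]
print(binarize(block))  # [[0, 1, 1], [0, 1, 0], [0, 0, 1]]
```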

The pre-processing step is rather involved and beyond the scope of this project, as it requires many image processing steps. It cannot be skipped, however, as it is crucial to the success of most machine learning tools.

In this project, I focus on how the performances of common machine learning tools compare with each other.

2 Dataset

The MNIST (Modified National Institute of Standards and Technology) dataset [1] is used in this project. All the digit images have been pre-processed such that the digit is centered on a 28 × 28 block of 8-bit gray values. Fig. 1 shows an example of a digit block containing the character ‘8’. Fig. 2 shows an ensemble of digit blocks containing the character ‘5’.
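The MNIST files are distributed in the IDX binary format documented on the dataset page [1]: a big-endian header (magic number 2051 for image files, then the image count, row count, and column count) followed by the raw unsigned-byte pixels. A minimal parsing sketch, using a synthetic 2 × 2 one-image buffer in place of a real file:

```python
import struct

def read_idx_images(buf):
    """Parse an MNIST IDX image buffer: a big-endian header of
    (magic=2051, count, rows, cols) followed by count*rows*cols
    unsigned bytes of grayscale."""
    magic, n, rows, cols = struct.unpack(">IIII", buf[:16])
    assert magic == 2051, "not an IDX image file"
    images, offset = [], 16
    for _ in range(n):
        images.append(list(buf[offset:offset + rows * cols]))
        offset += rows * cols
    return images

# Synthetic one-image file: a 2x2 "digit" with pixels 0, 128, 255, 64
data = struct.pack(">IIII", 2051, 1, 2, 2) + bytes([0, 128, 255, 64])
print(read_idx_images(data))  # [[0, 128, 255, 64]]
```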


Fig. 1: 28 × 28 digit block containing character ‘8’

Fig. 2: A small subset of the training ensemble containing character ‘5’

3 Tools applied

3.1 Multi-layer perceptron

A two-layer perceptron implementation found in the Neural Network Toolbox is utilized for this section. The first layer is the input layer and it has 784 inputs. The second layer has h hidden neurons. And the final layer is the output layer consisting of 10 neurons corresponding to the 10 class labels (i.e. the 10 digits).
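The 784-h-10 network described above can be sketched as a plain forward pass. The tanh hidden activation and softmax output are assumptions in this sketch (they mirror common Neural Network Toolbox defaults; the report does not state them), and the random weights stand in for trained ones:

```python
import math, random

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a two-layer perceptron: a tanh hidden layer
    followed by a softmax output layer (activation choices assumed)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(W1, b1)]
    logits = [sum(w * hi for w, hi in zip(row, hidden)) + b
              for row, b in zip(W2, b2)]
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

random.seed(0)
n_in, h, n_out = 784, 100, 10   # dimensions used in the report
x  = [random.random() for _ in range(n_in)]
W1 = [[random.uniform(-0.05, 0.05) for _ in range(n_in)] for _ in range(h)]
b1 = [0.0] * h
W2 = [[random.uniform(-0.05, 0.05) for _ in range(h)] for _ in range(n_out)]
b2 = [0.0] * n_out
y = mlp_forward(x, W1, b1, W2, b2)
print(len(y), round(sum(y), 6))  # 10 class probabilities summing to 1
```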

Fig. 3: An example of a 2-layer neural network with 10 hidden neurons

The table below shows the various values of h that were tried and the corresponding performances.

h     Error rate (%)
10    8.46
20    8.92
100   2.09
120   11.22
200   21.27

Table 1: Various values of h and corresponding error rates

From Table 1, it is clear that h = 100 gives better performance than h = 200. This might be due to over-fitting associated with h = 200. The ROC plots for h = 100 are given in Fig. 4.

(a) ROC for h = 100 (b) ROC for h = 100 (zoomed)

Fig. 4: ROC plot and its zoomed version (right)

From the ROC, it appears that the number 4 is often misclassified as it is furthest away from the point (0, 1).

3.2 Support vector machine

The LIBSVM suite of tools developed by Chih-Chung Chang and Chih-Jen Lin is used in this section [2]. The main tools are: svm-scale (used for scaling data), svm-predict (used to make predictions based on a trained model), and svm-train (used to obtain support vectors

from given data).

Method

svm-scale is first used to scale the input feature vectors in the training file to have a value of either +1 or -1. A scale file is also produced along with the scaled training file. The scale file contains the appropriate scale values that have to be applied to the testing data as a pre-processing step. The scaled training file is then input to svm-train, which produces a model file. This model file contains the support vectors identified in the scaled training file. Once the model file is obtained, testing is performed using svm-predict. svm-predict uses the model file to determine the classification of test vectors and produces an output file containing these predictions.

Choice of parameters

There are many parameters that can be chosen during the training phase of the SVM. Table 2 shows the error rate corresponding to different kernels and parameters.

Kernel Type           Error rate (%)
Linear                6.85
Polynomial (Order 4)  2.18
Polynomial (Order 5)  2.01
Polynomial (Order 6)  1.92
Polynomial (Order 7)  1.92
Polynomial (Order 8)  1.99
Gaussian              2.66
Sigmoidal             10.71

Table 2: Performance for different kernels

According to Table 2, it seems that the highest performance is achieved when a polynomial kernel of order 6 or 7 is used.

3.3 K-Nearest Neighbor (K-NN)

K-NN is the simplest classification method. But it is also a slow and memory-intensive method. This is because, during the testing phase, a similarity metric between each test feature vector and all the training feature vectors has to be calculated. For this dataset, 60 000 similarity metric calculations have to be performed for each test vector. Since there are 10 000 test vectors, the total number of similarity metric calls sums up to 600 000 000.

Eigen-digit method

The Eigen-digit method involves using PCA (Principal Component Analysis) to reduce the dimensionality of the feature space from M to m. While performing PCA, the m eigenvectors (vj) and 10 eigendigits (Ej = [v1 v2 . . . vm]) are also obtained. In this way, each training feature vector becomes an m-dimensional vector instead of the original 784-dimensional vector, as shown in Fig. 5.
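The polynomial kernels compared in Table 2 have the LIBSVM form (gamma * u'v + coef0)^degree. A small illustrative sketch; the gamma and coef0 values here are assumptions for the example, not the report's settings (LIBSVM's own default is gamma = 1/num_features and coef0 = 0):

```python
def poly_kernel(u, v, degree=6, gamma=1.0, coef0=1.0):
    """LIBSVM-style polynomial kernel: (gamma*<u,v> + coef0)**degree.
    gamma and coef0 here are illustrative, not the report's settings."""
    dot = sum(a * b for a, b in zip(u, v))
    return (gamma * dot + coef0) ** degree

print(poly_kernel([1.0, 0.0], [0.5, 0.5], degree=2))  # (0.5 + 1)**2 = 2.25
```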

Fig. 5: Example of dimensionality reduction for m = 5

During the testing phase, each 784-dimensional test feature vector is projected onto the m-dimensional feature space using the eigen-digits obtained in the training phase. The resulting m-dimensional feature vector is then compared with the huge collection of labeled m-dimensional training feature vectors (leftmost vectors in Fig. 6). During the comparison, the K labeled closest matches from the collection are identified. Amongst the closest K matches, the most frequently occurring label is taken as the classification output of the Eigen-digit method.

Fig. 6: Processing steps illustrated for m = 50 and K = 3
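The testing phase just described can be sketched as follows: project the test vector with the training-phase eigenvectors, rank the projected training vectors by distance, and take a majority vote over the K closest labels. Squared Euclidean distance and the toy axis-aligned "eigenvectors" are assumptions for this sketch; the report only speaks of a similarity metric:

```python
from collections import Counter

def project(x, eigvecs):
    """Project a high-dimensional vector onto m eigenvectors -> m weights."""
    return [sum(v * xi for v, xi in zip(vec, x)) for vec in eigvecs]

def eigen_digit_classify(x, eigvecs, train_feats, train_labels, K=3):
    """Project the test vector, find the K nearest projected training
    vectors (squared Euclidean distance assumed), and majority-vote
    their labels."""
    q = project(x, eigvecs)
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(q, t)), lbl)
        for t, lbl in zip(train_feats, train_labels))
    votes = Counter(lbl for _, lbl in dists[:K])
    return votes.most_common(1)[0][0]

# Toy 4-dim data reduced to m = 2 with axis-aligned "eigenvectors":
eigvecs = [[1, 0, 0, 0], [0, 1, 0, 0]]
train = [[0, 0], [0, 1], [5, 5], [5, 6], [6, 5]]   # already projected
labels = [0, 0, 8, 8, 8]
print(eigen_digit_classify([5, 5, 9, 9], eigvecs, train, labels, K=3))  # 8
```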

59 5. 2.7 4.981 8. 6 . Cortes.011 8. 27:1–27:27. 2010.01 4.23 4. “MNIST handwritten digit database. Combinations K=1 K=3 K=5 K=7 K = 11 Computation time (s) m = 10 61.991 13.com/exdb/mnist/. 2.731 3348 m = 784 3.661 5907 Table 3: Error rate and computation time for various combinations of m and K Increasing K does not seem to improve error rate.16 4. Chang and C. 2011.156 60. References [1] Y.” ACM Transac- tions on Intelligent Systems and Technology.12 4.851 410 m = 100 5. pp. [2] C.656 62.841 10.59 5. KNN with m = 784 and K = 1 nearly took 1 hour and 40 minutes of computation time for calculating labels of 10 000 test vectors.77 4. A trade-off between computation time and error rate can be acheived by picking a value for m for which the classification rate specifications are still met.496 71 m = 50 8.391 1786 m = 500 3.451 6.49 3.76 4.286 63.60 seconds for each test vector which is too slow. increasing m has a diminishing improvement on error rate.-C.261 5.-J.65 5.161 6.47 3.546 61. 4 Conclusion The SVM seems to be particularly well suited for digit recognition as it has the the best performance when compared to the MLP or KNN methods.381 6.47 respectively. This translates to about 0.78 5.501 837 m = 200 4.141 7. vol.lecun.931 9. LeCun and C. MLP and KNN are 1.03 4. Lin.” http://yann.09 and 3.621 1474 m = 250 3. And. The best error rates of SVM. SVM is not only accurate but also faster than the other methods tested. “LIBSVM: A library for support vector machines.92.
