BOBBY SOETARTO 301145953 APRIL 4TH 2014 Final Project: Object Recognition OVERVIEW The goal of our project is object recognition and classication. We are trying to build an application that will input an image and the application will ouput telling users what is the classication of it. In our project, we are using 2 category images, airplanes and cows. In order for this work, we decided to use machine learning.
IMAGE RECOGNITION NEURAL NETWORK Image recognition through neural network is a network that consist of many hierarchy layers of articial neuron. It is divided into 3 layers, the bottom, middle, top layer and it represents the input, image processing and output respectively. It is also divided into 2 segments visual and associative segment. How it works is that, there is an input image which creates features that can be passed into the image processing layer. The processing layer associate the features with the appropriat algorithm and outputs it. MULTILAYER PERCEPTION The multilayer Perceptron (MLP) is a type of neural network that is capable of image recognition. It consists a input, hidden layer and the output. The MLP architecture of neural networks works best in classifying nonlinearly-separable data points as it has hidden layers in between the input and the output layer in it architecture. Users can store image database in the network in which the MLP network will then train itself to learn these images using a backpropagation algorithm. BACKPROPAGATION ALGORITHM The multilayer Perceptron (MLP) is a type of neural network that is capable of image recognition. It consists a input, hidden layer and the output. The MLP architecture of neural networks works best in classifying nonlinearly-separable data points as it has hidden layers in between the input and the output layer in it architecture. Users can store image database in the network in which the MLP network will then train itself to learn these images using a backpropagation algorithm. NEUROPH STUDIO Neuroph simplies the development of neural networks by providing Java neural network library and GUI tool that supports creating, training and saving neural networks. This is the software that we have decided to work with to help us in machine learning. It uses the method and networks that was explained earlier. It has built-in image recognizing GUI that helps us in the project. WORK DESCRIPTION - Phase 1 Database Research Learning of visual object categories require many training images. So, we searched for databases of images in order to build object categories to be able to recognize objects and object categories in images. We found image databases of airplanes in the Computational VIsion at Caltech website. They also had 101 categories of different image database. Computational Vision at Caltech http://www.vision.caltech.edu/html-les/archive.html Here, we have found image database of cows. LabelMe http://labelme2.csail.mit.edu/Release3.0/index.php WORK DESCRIPTION - Phase 2 Software Research For building the framework of measuring similarities between the images, we searched for the softwares in which we can prototype example of categories. Weka http://www.cs.waikato.ac.nz/ml/weka/ Matlab http://www.mathworks.com/products/matlab/ Both Machine learning software, Matlab and Weka, use boosting, a machine learning meta-algorithm. Neuroph http://neuroph.sourceforge.net/screenshots.htm WORK DESCRIPTION - Phase 3 At rst, neuroph studio had a built in image recognition application. We tried using that, however, to no avail. The application was only able to do image recognition and not classication. For example, if we had a database image and when we run through neuroph studios built in image recognition, it will only output images that is similar to those that the users image input. It did not really classify. Weka is able to do image processing, but only using raw-text le that has the rgb values. However, Neuroph Studio was similar to Weka and Neuroph Studio had a better interface of using the software. So, we chose to use Neuroph Studio instead. Matlab is a machine learning software that uses java codes and we had to learn a whole new library that we could not fully grasp within a short period of time. And it was expensive. Trial and Error RESULT The result of the project was successful and will be shown in the results will be shown in the demo later on. We were able to get the neuroph studio to get it working and it was able to learng the images and ouput the correct classication.
Resizing the images Images were resized to 20x20 Getting the RGB values of the Images Getting a RGB value for each image for input to the neuroph studio. Creating the training and test sets in Neuroph Studio Using neuroph studio creating the training set and test sets, from the text le that contains the RGB values as inputs and outputs. 80% for training set 20% test set. Training the network We created networks with multilayer perceptron with the same number of inputs and outputs and the number of hidden layer of neurons Testing the network After training, is time to test it using the image set with 20% of the RGB value RESULT class getRGBvalue extends Frame { BufferedImage airImage; BufferedImage airImage1; BufferedImage airImage2; BufferedImage airImage3; BufferedImage airImage4; BufferedImage airImage5; BufferedImage airImage6; BufferedImage airImage7; BufferedImage airImage8; BufferedImage airImage9;
BufferedImage cowImage; BufferedImage cowImage1; BufferedImage cowImage2; BufferedImage cowImage3; BufferedImage cowImage4; BufferedImage cowImage5; BufferedImage cowImage6; BufferedImage cowImage7; BufferedImage cowImage8; BufferedImage cowImage9; // This is where the image is passed into the function where it will return an array containing // the RGB values of image. public ArrayList<Float> lterImage (BufferedImage img) { int width = img.getWidth(); int height = img.getHeight(); ArrayList<Float> myList = new ArrayList<Float>(); for (int i = 0; i < width; i++){ for (int j = 0; j < height; j++) { int rgb = img.getRGB(i, j); oat newR = 0, newG = 0, newB = 0;
} } RESULT RESULT Figure 1. Creating Test Set Figure 2. Creating Training Set Figure 3. Training Network Figure 4. Testing Network RESULT From the diagram below, you can see that in the test result, the Output value:0.0183;0.1183 Which is closed to the desired output of 0,1. This means that the network is stating that row of RGB value belongs to an airplane image. Thus, this is our result of project, where the software successfully machine learn and classify the 2 category images that we wanted. Figure 6. Total Mean Square Error Display Figure 5. Output Display