
2017 IEEE International Conference on Consumer Electronics - Taiwan (ICCE-TW)

Automatic Recognition of Clothes Pattern and Motifs


Empowering Online Fashion Shopping
K.S. Loke, Swinburne University of Technology Sarawak Campus, Member, IEEE

Abstract—Automatic recognition of clothes patterns and styles has broad application in consumer online fashion shopping. In this research, we propose a fast and accurate method of recognizing a clothes textile design and pattern. We used a modified six-channel co-occurrence matrix with a random forest classifier. We tested the accuracy of recognizing the clothing of fashion models and obtained results of 93%.

I. INTRODUCTION

In recent years, there has been renewed interest in clothes recognition, classification and retrieval. Recent research has begun to address clothes-based recognition and detection specifically, in parallel with a surge in startup investment in fashion-based apps. We are interested in the consumer application of clothes recognition, in particular content-based retrieval of similar clothes based on textile patterns and motifs. The general idea is to help the consumer search for and purchase similar textile patterns. For example, a consumer may see a dress with a particular pattern on television or on the street; he or she may capture it by camera and use it to search for a similar dress. This is not a new idea, but better precision is needed than general content-based retrieval provides. Our work does not yet address dress style.

Our work is broadly related to two categories of prior work: texture recognition and clothes recognition. It combines elements of both and, although closely related to texture recognition, is not strictly classifiable as such. Texture recognition research is mostly interested in identifying natural material surfaces such as marble, wood grain, tree bark and so on; natural textures tend to be uniform and finer in structure. The other body of work involves classification of clothes and their styles. A sub-category of that is classification by textile pattern and motif, such as floral, plaid or geometric. Many of these works attempt to classify textures and motifs into a broad themed category such as "floral". However, such broad classifications have indeterminate boundaries and can be subjective. In our view, linguistic classification is not very useful: retrieving all floral patterns, for example, is not a useful task when we are looking for a dress with a similar pattern. We are therefore interested in accurate recognition of clothes patterns and motifs.

II. RELATED WORKS

We briefly review some of the recent works. Bossard et al. [1] used a random forest [2] for classifying the type of clothes and support vector machines for the style of the apparel. They used multiple features, including the Scale-Invariant Feature Transform (SIFT) [3], to create a spatial pyramid of codebooks. Fu et al. [4] used the human pose to extract bag-of-words features embedded in a vocabulary tree. Yamaguchi et al. [5] also used the human pose to extract clothing segments, with a variety of features including color histograms and Gabor features. A couple of recent works also used convolutional neural networks to match clothes styles and patterns [6][7][8], but none involved direct textile patterns.

Some recent works have addressed texture recognition using features from filter banks [9] and Fisher vectors [10]. These local feature descriptors have been coupled with classifiers such as random forests, support vector machines and convolutional neural networks (CNNs) [11]. However, textures are not like clothes patterns.

III. APPROACH

We selected a range of images from the fashion website Forever21.com using the search terms floral, patterns, plaids, paisley, geometric and tribal. From these images, we collected 100 different classes of patterns with 10-11 samples per class, each 100 by 100 pixels, sampled from pictures of models. These image patches represent the clothes patterns and are extracted from images of fashion models (Fig. 1). From the above, we assembled three datasets. The first dataset, D1, consists of 100 classes with 1080 training samples, each image sampled from the front, side and back views of models. Training and test samples are 100x100 patches.

The second dataset, D2, consists of training image patches from D1 but uses the fashion models' front, side and back views as the test set. The size of this dataset is 98 classes.

The third dataset, D3, consists of training and test images taken from mutually exclusive model images (Fig. 2). The total number of classes is reduced to 84 due to a lack of suitable patches in some images.

Our approach centers on calculating second-order statistics of the image grey levels. The Grey Level Co-occurrence Matrix (GLCM) [12] counts the grey-level co-occurrences between all pairs of nearby pixels. If we consider the pixels' red, green and blue channels, there are six paired GLCMs: red-red, blue-blue, green-green, red-blue, red-green and blue-green. The matrix M_ab for channels a-b can be mathematically formulated as in (1):

    M_ab(i, j) = Σ_{x,y} Σ_{(tx,ty)∈U} δ[I_a(x, y) − i] δ[I_b(x + tx, y + ty) − j]    (1)

where δ[·] is the Kronecker delta, I_a(x, y) is the pixel intensity of channel a at (x, y), and (tx, ty) is the translation of the paired pixel from (x, y). U is the set of translations one pixel from (x, y) in all directions.
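In code, the six-channel co-occurrence counting of (1) can be sketched as follows. This is a minimal pure-Python illustration, not the paper's Java/Weka implementation; it assumes the image is a list of rows of (R, G, B) tuples with integer levels in [0, levels):

```python
from itertools import product

# Channel pairs for the six GLCMs: RR, GG, BB, RG, RB, GB.
PAIRS = [(0, 0), (1, 1), (2, 2), (0, 1), (0, 2), (1, 2)]

# U: the eight one-pixel translations around (x, y), per the definition above.
U = [(tx, ty) for tx in (-1, 0, 1) for ty in (-1, 0, 1) if (tx, ty) != (0, 0)]

def glcm(img, a, b, levels=256):
    """Co-occurrence matrix M_ab per Eq. (1): M[i][j] counts pixel pairs where
    channel a has level i at (x, y) and channel b has level j at (x+tx, y+ty)."""
    h, w = len(img), len(img[0])
    m = [[0] * levels for _ in range(levels)]
    for x, y in product(range(h), range(w)):
        i = img[x][y][a]
        for tx, ty in U:
            if 0 <= x + tx < h and 0 <= y + ty < w:
                j = img[x + tx][y + ty][b]
                m[i][j] += 1
    return m

def six_channel_glcm(img, levels=256):
    """The six paired GLCMs used as the feature input."""
    return [glcm(img, a, b, levels) for a, b in PAIRS]
```

Boundary pixels simply have fewer in-bounds neighbours; a production version would vectorize this loop, but the counting logic is the same.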

978-1-5090-4017-9/17/$31.00 ©2017 IEEE


Cheong et al. [13] used orthogonal polynomials (O.P.) to encode the GLCM. The calculation of the O.P. is slow, so we instead use the raw numbers in the matrix, reduced to 32x32 by averaging neighborhoods (Fig. 1). This method reduces overfitting and is fast. Cheong et al. tested only 30 classes using 150 samples based on Batik patches; here we test on 87-100 classes using fashion-model images.

We implemented the training and tests in Java using the Weka Java libraries. For the convolutional neural network test, we used the Keras Python libraries.
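The reduction to 32x32 by neighborhood averaging can be sketched as block averaging. This sketch assumes the full matrix has 256 grey levels per axis (our assumption; the text does not state the original matrix size), so each output cell averages an 8x8 block of cells:

```python
def reduce_glcm(m, out=32):
    """Reduce a square co-occurrence matrix to out x out by averaging each
    (n/out) x (n/out) neighborhood of cells. Assumes out divides n evenly."""
    n = len(m)
    k = n // out  # neighborhood size, e.g. 256 // 32 = 8
    return [[sum(m[i * k + di][j * k + dj]
                 for di in range(k) for dj in range(k)) / (k * k)
             for j in range(out)]
            for i in range(out)]
```

The averaged matrix both shrinks the feature vector and smooths sparse counts, which is consistent with the overfitting reduction noted above.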

Fig. 1. (Left) Image patch with the six GLCMs. (Right) Source of the patches. The GLCMs are used as input features.

Fig. 2. D3 dataset example. Training: back view (left); testing: front view (right). Testing is done by calculating the mean score over the patches, resulting in an overall accuracy of 93.4% for 87 classes.
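The mean-score rule described in the Fig. 2 caption (and in Section IV) can be sketched as follows. `classify_patch` is a hypothetical stand-in for the trained per-patch classifier, returning one score per class:

```python
def subdivide(img, size=100):
    """Split an image (list of rows) into non-overlapping size x size patches."""
    h, w = len(img), len(img[0])
    return [[row[x:x + size] for row in img[y:y + size]]
            for y in range(0, h - size + 1, size)
            for x in range(0, w - size + 1, size)]

def classify_image(img, classify_patch, n_classes, size=100):
    """Sum per-class scores over all 100x100 patches and pick the winner.
    Dividing by the patch count (the mean) does not change the argmax."""
    totals = [0.0] * n_classes
    for p in subdivide(img, size):
        for c, s in enumerate(classify_patch(p)):  # classify_patch: hypothetical
            totals[c] += s
    return max(range(n_classes), key=lambda c: totals[c])
```

In the paper's setup the per-patch scores would come from the Weka random forest's class probability distribution; any classifier that returns per-class scores fits this rule.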

IV. RESULTS & DISCUSSION

The results are presented in Table I.

TABLE I
Test type  | Test data                                                     | Algorithm                                       | Accuracy (%)
10-fold    | D1: 100 classes, 11 samples each                              | Random forest, 100 trees, 13 features           | 96.9
10-fold    | D1: as above                                                  | Multi-layer neural network, 3 hidden layers     | 97.0
Single run | D1: 10 training, 1 as test                                    | Convolutional NN, 2 layers of 32 conv filters   | 28.0
Single run | D2: 98 classes, training & test data overlap; 1-3 test images | Random forest, 100 trees, 13 features           | 97.0
Single run | D3: 84 classes, training data different from test             | Random forest, 100 trees, 13 features           | 93.4

The results obtained are quite respectable, and random forest training and testing are very fast. The CNN we used is essentially the same as that used by LeCun et al. [14]. CNNs are useful for extracting invariant image features, but in the case of textile patterns the statistical relationships between pixels are more important. The experiments with D2 and D3 used fashion-model images for testing (Fig. 2) but patches for training; the winning classification is determined by the largest average score after subdividing the image into 100x100 patches.

V. CONCLUSION

We demonstrated a fast and accurate method to distinguish patterns on clothes. This method can be used in conjunction with object detection, such as convolutional neural networks, to obtain full classification of the clothes. However, there is still an unresolved issue of how to deal with differences in scale, because scale affects the statistical properties of the pixel distribution. This is a topic of our future research.

REFERENCES

[1] L. Bossard, M. Dantone, C. Leistner, C. Wengert, T. Quack and L. Van Gool, "Apparel classification with style," in ACCV, 2012.
[2] L. Breiman, "Random forests," Machine Learning, vol. 45, pp. 5-32, 2001.
[3] D. G. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91-110, 2004.
[4] J. Fu, J. Wang, Z. Li, M. Xu and H. Lu, "Efficient clothing retrieval with semantic-preserving visual phrases," in ACCV, 2012.
[5] K. Yamaguchi, M. H. Kiapour, L. E. Ortiz and T. L. Berg, "Parsing clothing in fashion photographs," 2012.
[6] M. H. Kiapour, X. Han, S. Lazebnik, A. C. Berg and T. L. Berg, "Where to buy it: Matching street clothing photos in online shops," in ICCV, 2015.
[7] J. Huang, R. S. Feris, Q. Chen and S. Yan, "Cross-domain image retrieval with a dual attribute-aware ranking network," in ICCV, 2015.
[8] Z. Liu, P. Luo, S. Qiu, X. Wang and X. Tang, "DeepFashion: Powering robust clothes recognition and retrieval with rich annotations," in CVPR, 2016.
[9] V. Andrearczyk and P. F. Whelan, "Using filter banks in convolutional neural networks for texture classification," Pattern Recognition Letters, vol. 84, pp. 63-69, 2016.
[10] M. Cimpoi, S. Maji, I. Kokkinos and A. Vedaldi, "Deep filter banks for texture recognition, description, and segmentation," International Journal of Computer Vision, vol. 118, no. 1, pp. 65-94, 2016.
[11] F. H. C. Tivive and A. Bouzerdoum, "Texture classification using convolutional neural networks," in 2006 IEEE Region 10 Conference (TENCON 2006), Hong Kong.
[12] R. M. Haralick, K. Shanmugam and I. Dinstein, "Textural features for image classification," IEEE Trans. Syst., Man, Cybernetics, vol. 3, no. 6, pp. 610-621, 1973.
[13] M. Cheong and K.-S. Loke, "Textile recognition using Tchebichef moments of co-occurrence matrices," Lecture Notes in Computer Science, vol. 5226, pp. 1017-1024, 2008.
[14] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner, "Gradient-based learning applied to document recognition," Proc. IEEE, 1998.

