Professional Documents
Culture Documents
Conference Dates
ISBN
Published by
Table of Contents
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
Image segmentation is an important and
interesting digital image pre-processing phase to
enhance the performance of various pattern
recognition and computer vision applications.
Segmentation process enhance images analysis
through the extractions of features from the
relevance part of image only. In this paper, a
comparative study between five different color
segmentation techniques is performed. The
experimental results of PSNR and MSE metrics
show that K-means clustering algorithm has better
results when compared to the other algorithms,
but still need to be modified to deal with different
types of sharp and smooth edges.
KEYWORDS
Image segmentation, HSV color space, K-means,
Digital image processing.
1 INTROCUCTION
Image Segmentation as a pre-processing
phase is an effective process in many fields
such as computer vision, remote sensing,
traffic control, health care, industry, pattern
recognition, and video surveillance. The
segmentation process involves splitting an
image into a number of different regions in
which each region has a set of pixels with
high similarities but with high divergence in
pixels from pixels in other regions. The
splitting process is based on some extracted
features that may include color, intensity,
shape, depth, and/or texture gray-level. One
or more features can be used to perform the
segmentation process [1]. Several researchers
have focused on gray-level features during the
image
segmentation
process
[2].
Segmentation underlie its importance in
detecting specific part of an image that we are
Aalaa Albadarne
Princess Sumaya University for Technology
1438 Al-Jubaiha, 11941 Jordan
aalaaalbadarneh83@gmail.com
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
2 LITERATURE REVIEW
Previous work on image segmentation
includes many proposed systems such as the
work of Liu Yucheng who proposed a new
fuzzy morphological system that is a fusion
based image segmentation technique [4]. To
smooth the image, this work uses the opening
and closing morphological operations
followed by the gradient operation on the
image. In addition, they have solved the over
segmentation drawback of Watershed
algorithm. This system proved that the fusion
approach saved the information details of an
image and improved the speed of
segmentation [5].
Fernando C. Monteiro proposed an image
segmentation algorithm that is based on
morphological watershed edge and region
information using the spectral method [6]. It
applies an enhancement of a preprocessing
stage using bilateral filter to reduce the noise
from an image, then preliminary segmentation
is performed using region merging. Also, it
had generated region similarity and then
region based grouping using the Multi-class
Normalized Cut method.
Weihong Cui Yi Zhang [7] image
segmentation algorithm generates multi-scale
image segmentation with an edge based auto
threshold select method, then it calculates
edge weighte. Minimum spanning tree and
edge based threshold method are used to
perform image segmentation. Experiments
results have shown that this method
maintained the object information and kept
object boundaries while segmenting the
image.
R. V. Patil claims that K-means image
segmentation will provide better results if the
number of clusters is estimated in an accurate
manner based on edge detection [8]. Phase
congruency is used to detect the edges. Then
these edges with threshold and Euclidean
distance are used to find the clusters. Kmeans is used to find the final segmentation
of an image. Experiments results have
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
(1)
(2)
A high value of PSNR is better since it
indicates that the Signal to Noise ratio is
higher where the 'signal' is the original image
and the 'noise' is the error in rebuilding this
image. Therefore, a scheme that has a lower
MSE value and has a higher PSNR value is a
good schema [19].
4 COMPARISON OF SEGMENTATION
ARLGORITHMS
In this section, we present a performance
comparative study between the five color
image segmentation algorithms presented
earlier in this paper. These algorithms are;
Hill-climbing
with
K-means
(HKM)
algorithms, Fuzzy C-Means clustering (FCM),
OTSUs adaptive Thresholding K-means
clustering (KMC), Region Growing (RG),
Table 1. MSE and PSNR Performance Comparison of the Five Segmentation Algorithms
Image
Metric(dB)
KMC
FCM
RG
HKM
MWS
Beach
PSNR
58.69
52.58
52.35
57.92
57.92
Bird
MSE
PSNR
0.08
60.18
0.36
54.48
0.15
52.26
0.39
54.89
0.11
59.04
Building
MSE
PSNR
0.06
57.20
0.23
56.53
0.39
52.26
0.21
56.37
0.08
58.08
Car
MSE
PSNR
0.12
57.15
0.14
55.32
0.39
53.58
0.15
55.21
0.10
55.59
Flower
MSE
PSNR
0.12
58.69
0.19
58.10
0.29
51.72
0.20
53.21
0.18
58.93
MSE
0.13
0.08
0.44
0.31
0.08
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
KMC
58.38
0.10
FCM
55.40
0.2
RG
52.43
0.33
HKM
55.52
0.25
MWS
57.91
0.11
KMC
FCM
RG
HKM
MWS
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
[10]
[11]
[12]
[13]
[14]
[15]
R.
Vijayanandh,
and
G.
Balakrishnan,
"Hillclimbing Segmentation with Fuzzy C-Means
Based Human Skin Region Detection using Bayes
Rule," European Journal of Scientific Research
(EJSR), Vol. 76(1), pp.95-107, 2012.
[16]
[17]
[18]
[19]
[20]
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
KEYWORDS
Aesthetics, computational photography, composition rules/guidelines, image cropping and scaling,
image recomposition
Figure 1. Different cropping window overlays on
INTRODUCTION
The goal of image recomposition tools is to enhance the image quality in terms of aestheticsdriven composition of the given image. Ca1 http://mylio.com/true-stories/next/one-trillion-
photos-in-2015-2
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
RELATED WORKS
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
and deleting seams2 in the image frame. Noticeable artifacts are visible in complex and geometrical background images due to its discrete nature.
Continuous methods utilize the traditional
warping technique to recompose the image content. The non-homogeneous warping method [11] enhanced image aesthetics
by minimizing a set of aesthetic quality errors. Recently, Chang et al. [2] proposed an
exemplar-based warping method to modify the
image composition where prior composition
knowledge is not mandatory. More recently,
Islam et al. [10] applied non-homogeneous
warping to the stereoscopic image domain.
Notably, the noticeable feature distortions are
unavoidable for large scale warping due to over
composition.
Hybrid methods [14, 12, 22] utilized the benefits of more than one image operators. Crop
and retarget [14] used a passive cropping operator and then retargeted the cropping window
to get the aesthetically pleasing image which
was a relatively slow process. To speed up the
total process, crop-and-warping [12] used an
active crop operator to crop the input image
and then applied the non-homogeneous warping [11] on the cropped image. Tearable image warping [22] proposed to unify the cutand-paste [1] and non-homogeneous warping
[11]. This method segmented the foreground
objects, warps the inpainted backgrounds and
then pasted the segmented objects in the optimized background layer. Recently, Tan et al.
[21] improved the single tearable image warping to tearable stereo image warping.
3.1
Rule of Thirds
AESTHETICS COMPOSITION
Photographic composition rules can alone enhance the image aesthetics. We select two most
popular composition rules and apply them to
our AAICS method to enhance the image composition of the regular photographs captured.
2A
3.2
Visual Balance
10
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Figure 4. An overview of our proposed AAICS algorithm which has mainly three parts; (1) significant map
generation, (2) apply cropping operator based on the generated significant map which follows some pre-defined
composition rules, the red rectangle represents the cropping window in the given image, and (3) rescale the
cropped image to get the recomposed image. The horizontal and vertical yellow lines are representing the rule
of thirds photographic composition.
ject centroid are not located at the same position that creates an unbalanced harmony in the
frame. After minimizing the distance between
the center of visual mass and the image center, the balance of salient objects (triangle and
rectangle) is shown in Figure 3.
4
Our proposed AAICS method aims to enhance the image aesthetics of the regular images which is captured by casual photographers by rearranging the image components.
To maximize the image aesthetics, our proposed AAICS method minimizes a set of aesthetic quality errors, e.g. rule of thirds and
visual balance. Figure 4 shows the overview
and workflow of our proposed AAICS method.
Given an input image, we first compute the significant map using graph-based visual saliency
[8]. The girl is the most important content of
the input image which does not belong to the
rule of thirds composition. In the second stage,
our method minimizes a set of aesthetic quality errors to search an aesthetically pleasing
cropped window. Finally, the cropped window
is rescaled to get the respective enhanced recomposed image.
4.1
(left to right) input image, significant map and overlay of significant map to the given images respectively.
11
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
4.2
4.3
Rule of thirds composition is the most popular composition in the photography. We already discussed details in Section 3. The rule
of thirds error is defined as the minimization
problem between the most important content
of the image and one of the power points. The
rule of thirds error defined as,
2
E p =
D p Oo
(1)
(2)
EXPERIMENTAL RESULTS
Recomposition Results
Figure 7 shows different results of our proposed AAICS method. Our results are visually gaining more appealing compared to the
original images counterpart. The photos of airplane and man (row: 1, 2) do not follow the
rules of thirds composition in the original image, whereas our results relocate the salient
content to the nearest power points to get aesthetically pleasing photos. The location of the
boat in the original image (row: 3) is slightly
far from the right power points. Our recomposed result changes the location of the right
power points accurately which have visually
enhanced its aesthetic value. The landscape
image (row: 4) ideally has no significant object. Our method considers the island as most
salient part of the object instead of the golden
sun. The recomposed result is also pleasing to
the viewers, although the location of the sun is
nearly identical with the original image. For
visual balance, two salient objects (row: 5) are
required to balance its component. The original images seem to be unbalanced due to its
12
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Figure 7. Recomposition results of a set of images; (left to right) original image, graph-based visual saliency
(significant) map, overlay saliency map on the corresponding given image and recomposed image, respectively.
White rectangles represent the optimized cropping window which adheres the rule of thirds and visual balance
composition.
random location. Our result shows more balance and harmony between two men.
Figure 8 shows the comparison of our results
with the state of the art methods. Although,
crop-and-retarget [14] and data-driven cropping [17] generate some promising results, but
these methods may not be able to relocate the
salient objects according to the rule of thirds
composition in their results. Our result is more
convincing to adhere the rule of thirds composition. Figure 9 shows the comparison of
our results with the recent multiple models for
cropping [5]. Both of the recomposed images
enhance the visual aesthetics than the origi-
Retargeting Results
Our method enables to generate aestheticsdriven retargeting results by providing the retargeting scale factor whose aspect ratio may
not be identical to the given input image. Figure 10 shows the image retargeting results. The
original image is resized horizontally with two
different scaling factor Sx = 0.6 and Sx = 0.8
scaling factor. Our method is not only en-
13
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Figure 8. Compared our result with the result of the state of the art recomposition methods; (left to right)
original image, result of crop-and-retarget method [14], data-driven cropping method [17] and our proposed
method, respectively.
Figure 9. Compared our results with the results of a recent cropping using triple models [5] method. (left to
right) original images, results of triple models [5] and our results, respectively.
Limitations
14
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Figure 11. Compare our different retargeting results with the results of linear scaling. The reduction of the
image size at 40%, (top) horizontally (Sx = 0.6, Sy = 1), (middle) horizontally and vertically (Sx = 0.6, Sy =
0.6), and (bottom) vertically (Sx = 1, Sy = 0.6). (left to right) original image, result of linear scaling, and ours
respectively. Red rectangles are representing the inability of object protection by linear scaling.
15
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Figure 12. Failure case of our proposed method; (left to right) original image, overlay saliency map to the
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
[19]
[20]
[21]
16
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
[22]
[23]
[24]
[25]
scopic image retargeting. In Pacific-Rim Symposium on Image and Video Technology, pages 257
268. Springer, 2015.
L.-K. Wong and K.-L. Low. Tearable image warping for extreme image retargeting. In Computer
Graphics International (CGI), pages 18, 2012.
J. Yan, S. Lin, S. B. Kang, and X. Tang. Learning the change for automatic image cropping. In
IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pages 971978. IEEE, 2013.
F.-L. Zhang, M. Wang, and S.-M. Hu. Aesthetic
image enhancement by dependence-aware object
recomposition. IEEE Transactions on Multimedia
(TMM), 15(7):14801490, 2013.
M. Zhang, L. Zhang, Y. Sun, L. Feng, and W. Ma.
Auto cropping for digital photographs. In IEEE
International Conference on Multimedia and Expo
(ICME), pages 47. IEEE, 2005.
17
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
1 INTRODUCTION
KEYWORDS
Narration, First-person Games, RPG, Storytelling,
Artificial Intelligence, Interactive Game, Video
Games.
18
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
3 TECHNICAL RESEARCHES
According to Bourg and Seemann (2004), A*
Path-finding is one of the most efficient pathfinding methods to date. It always guarantees
finding the shortest path to the target, but it
does consume some CPU cycles, and may
19
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Setting
Lighting
Engine
Sound Effects
Combat
Point of View
Enemy Types
Darkness is Handicap
Limited Light Source
Amnesia:
The Dark Descent
Haunted castle
Light comes mostly
from the lantern and lit
torches
HPL
Haunting sounds and
musical cues
None
First-Person
Monsters
Yes
Yes
Afraid of Monsters
Mental asylum
Sparse electrical lighting
Varies
Electrical light sources
and flashlight
Unreal
Ambient sounds and
whispering
None
First-Person
Psychotic people
Yes
Yes
GoldSrc
Dark music and ambient
sounds
Yes
First-Person
Hallucinations
Yes
Yes
20
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
5 POTENTIAL BENEFITS
The tangible benefits for this project are of
course a more interesting game itself. The game
is developed for those who have an interest not
only in horror games, but also games that
require some thinking as well.
21
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
[3]
[4]
[5]
[6]
[7]
6 CONCLUSIONS
In this paper we tried to determine a conclusion
to fix giving the player any incentive to explore
problem. The project has been done and tested,
and issues were found in it that can be rectified
for future improvement efforts. The project
itself works fine and can be improved so that it
better resembles an actual game.
There were initially goals to actually make a
harder gameplay by involving a time limit. The
game gets progressively harder as time goes on,
making it harder and harder for the player to
escape the castle. Compared to the original
grand idea, it resembles some part of it.
However, given enough time and help, the
project can indeed be reworked to be
completely playable, fun, and interactive.
Hopefully, the above solutions can be carried
out and done so that the game project can show
the problems and solutions provided. It is
possible to make more levels, details, and new
types of enemy A.I., but all that will require a
substantial amount of time and manpower to
work on to have this game on market.
REFERENCES
[1]
[2]
22
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
Computer science and speech recognition have
enjoyed a long and successful relationship. Speech
recognition has been a useful tool to detect and
record voices. In computer science, a great
challenge is to interpret the speech signals into
purposeful and important data and to develop
algorithms and applications to establish an interface
between human voice signals and the computer.
Significant interest has been raised in speech
processing, especially of the Quran to provide a
second opinion on diagnosis with less error and
higher accuracy and reliability than the results
normally achieved by human experts. This paper
offers an overview of the use of this technology
related to the Holy Quran.
1. INTRODUCTION
The Holy Quran is the Holy Book or Scripture of
the Muslims. It is believed to be the word of Allah
revealed to His Prophet Muhammad (PBUH). A
Muslim must be careful to read it without making
mistakes. These mistakes may include misreading
words, and incorrect utterance, punctuation, and
pronunciation.
Nowadays, there are many applications related to
the Holy Quran based on the techniques of
Automatic Speech Recognition (ASR). Such
software helps people read the Quran verse
properly, such as detection/correction of specific
pronunciation errors, verification of recitation,
memorising Quran, auto reciter, Quran explorer
and checking the tajweed rules etc. This paper
highlights the progress in automatic speech
recognition for the Holy Quran.
23
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
A. Pre-processing:
In speech recognition, the first phase is the preprocessing where recording speech signals is passed
through the pre-processing block to remove the
noise contained in the speech signal and separate
voice from unvoiced speech to reduce the group of
attributes which assure only the information that
wants to be conveyed while discarding useless or
irrelevant information. The other main process is
endpoint detection. The start point and end points
are based on both energy and zero-crossing rates. In
this task, the information will be reduced to a group
of attributes which ensures only the intended
information is conveyed and discards useless and
irrelevant information.
24
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
3. PERFORMANCE OF AUTOMATIC
SPEECH RECOGNITION SYSTEM
Word Error Rate (WER) is the widely used metric
to evaluate speech recognition. The general
difficulty of measuring performance depends on the
fact that the recognised word series could have a
different length from the reference word sequence.
The word error rate is derived from the Levenshtein
distance, working at the word level instead of the
phoneme level. WER is defined as the sum of
substitutions (the reference word is replaced by
another word), deletion (a word in the reference
transcription is missed) and insertion (a word is
hypothesised that was not in the reference) words
by the number of reference words. [5]
= (H I) / N
Where H = (N-S-D) is the correctly recognised
words.
4. RELATED WORK
For several years, great effort has been devoted to
the study of Arabic speech by many researchers.
There already exists a detailed reviews of the
evolution of Arabic ASR. In this paper, only works
related to Holy Quran are highlighted where
several theories have been proposed to evaluate and
obtained high accuracy for Arabic Quran speech
recognition.
In recent years, there has been numerous research
related to Al-Quran or Arabic speech recognition
that uses CMU Sphinx (The Carnegie Mellon
University Sphinx 4 tools were used to train and
evaluate a language model which is a statistical
speaker-independent set of tools using the Hidden
Markov Models (HMM). It provides a more flexible
framework for research in speech recognition.) tools
to recognise the Arabic phoneme. However, Y.
Yekache describes the process of designing a taskoriented continuous speech recognition system for
Arabic, based on CMU Sphinx4, to be used in the
voice interface of Quranic reader. The concept of
the Quranic reader controlled by speech is
presented, the collection of the corpus and creation
of acoustic model is described in detail, taking into
account a specificities of Arabic language and the
desired application. [6]
Another system developed by T. Hassan et al.
demonstrates an analysis and implementation of a
Quranic verses delimitation system using the
sphinx tools. In this study, MFCC and HMM are
used as features extraction to extract verse from the
audio file and feature classier, respectively. This
research discovered that the verse of the holy AlQuran automated delimiter is possible to build. The
system carried two different types of test, where the
first test recorded 85% and 90% of mean
recognition ration for normal Arabic speaking for
short Al-Ikhlas for both females and males,
respectively. Meanwhile, the results from the
second test among 13 professional reciters for
tajweed and tarteel was 90% and 92% of the mean
recognition ration, respectively. The system also
had a problem with the extra noise due to audio file
compression and poor quality during the recording
process. [7]
El Amrani et al. investigated the use of a simplified
set of Arabic phonemes for the Arabic Automatic
Speech Recognition (AASR) using the CMU
Sphinx tools for the Holy Quran. The results
25
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
26
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Reference
01
Feature
Extraction and
Classification
Techniques
CMU Sphinx
Tools
02
CMU Sphinx
Tools
03
CMU Sphinx
Tools
Performance
Not stated
85% and 90% of mean recognition ration
for Arabic speaking (sourate Al-Ikhlas)
for both female and male.
Tajweed and tarteel was 90%and 92% of
mean recognition ration
. The Word Error Rates (WER) obtained
was 1.5%
Real-time
Factor
Non
Reported
Non
Reported
Non
Reported
27
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
04
CMU Sphinx
Tools
05
MFCC
(GMM) modelling
06
MFCC features
and HMM model
07
08
09
MFCC and VQ
MFCC for feature
extraction
and ANN for
classification
MFCC
MLP
hybrid MFCC-VQ
architecture
10
11
Non
Reported
Non
Reported
Non
Reported
Non
Reported
Non
Reported
Non
Reported
0.156,
0.161 and
0.261 for
male,
female and
children
respectively
MFCC
HMM
97.58% - 96.96%
System. Performance with
confidence=96.87%
12
13
Tri-phone/HMM model
MFCC
Mean Square Error
(MSE)
5. CONCLUSION
Researchers focusing on the particular promising
and challenging area of automatic speech
recognition specifically for Quranic verse recitation
1.192,
2.928 and
0.740 for
male,
female and
children
respectively
Non
Reported
Non
Reported
Non
Reported
28
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
[16]
[17]
[18]
29
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
This study provides a review of efforts made so far
to develop applications that assist in proper Quranic
recitation based on the spectrogram voice analysis.
The voice recorded from users recitation is used to
compare with the sample recitation features
collected from expert reciters that have been stored
in the database. These comparison is utilized to
evaluate the performance result of the various
algorithms that have been applied in Quranic
recitation processing. Before reviewing how to
develop the actual applications, a study was made to
determine the effectiveness of spectrogram voice
analysis. Samples are often collected to measure the
articulation of Arabic letters. The collected samples
are analyzed using Fourier analysis technique. The
sound features samples waveforms are transformed
into spectrums that are basically frequency
representations of the signals. The spectrogram is
used to determine the formant frequency and are
observed for each subject to determine its mean
formant frequency. Subsequently after features
extraction, a classification algorithm is employed to
compare the features extracted in real-time with that
available in the knowledge base. Based on the
results obtained, the processes are often developed
into mobile applications for portability.
KEYWORDS
Fast Fourier Transform, formant frequency, Signal
analysis, spectrogram, voice recognition.
1 INTRODUCTION
Quran is written in Arabic language and for the
general observance of religious duties, it is read
in Arabic by majority of the Muslims as well as
non-Muslims to learn and understand the
content. People who are intent to learn the
30
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
31
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
32
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
33
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Description
Settings
Principal
Component
analysis (PCA)
Independent
Component
Analysis (ICA)
Linear
Predictive
coding
Cepstral
Analysis
34
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Mel-frequency
cepstrum
(MFFCs)
Hidden Markov
Model
Artificial
Neural Network
Vector
Quantization
Spectrogram
35
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Recognition Technique
Spectrographic Analysis based on
different frequency band of intensity
Hidden Markov Model (HMM)
Recurrent Neural Network
(RNN)
Hidden Markov Model (HMM)
Hidden Markov Model (HMM)
MFCC
FFT
MFCC
MFCC
Performance
93.33%
References
[19][20]
90.2%
MFCC 95.9% 8.6%
LPCC 94.5%- 99.3%
85% - 92%
90.62%, 98.01 % and 97.99% for sentence correction,
word correction and word accuracy respectively.
86.41% to 91.95%
Over 90%
90%-92%
82.1-95%
[21]
[22]
[23]
[24]
[10]
[4]
[25]
[18]
36
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
REFERENCES
[1]
[2]
[3]
[4]
[5]
[6]
[7]
[8]
[9]
[10]
[11]
[12]
[13]
[14]
[15]
37
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
[16]
[17]
[18]
[19]
[20]
[21]
[22]
[23]
[24]
[25]
[26]
38
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
Spark has several advantages compared to other
big data and MapReduce technologies like
Hadoop and Storm.
Spark provides a
comprehensive, unified framework to manage
big data processing requirements with a variety
of data sets that are diverse in nature (text data,
graph data etc.) as well as the source of data
(batch vs. real-time streaming data). Spark
SQL is an easy-to-use and power API provided
by Apache Spark. Spark SQL makes it much
easier reading and writing data to do analysis.
MongoDB Connector for Apache Spark is a
powerful integration that enables developers
and data scientists to create new insights and
drive real-time action on live, operational and
streaming data. This paper demonstrates some
experimentation on the MongoDB Connector
for Apache Spark that how Spark SQL library
can be used to store, retrieve and execute the
structured/semi-structured datasets such as
BSON against the Non-Relational database
MongoDB, an open-source and leading NoSQL
database.
KEYWORDS
Spark SQL, MongoDB, NoSQL databases,
MongoDB Connector for Apache Spark, Data
Analytics with Spark SQL and MongoDB, Data
Mining on NoSQL data
1 INTRODUCTION
In the era of Big data, the Complex data is
growing. Unstructured data will account for
more than 80% of the data collected by
Volume
Structure
Database
Schema
Structure
GBs-TBs
Structured
TBs-PBs
Structured, Semistructured
and
Unstructured
Relational
Non-Relational
Datbases:
Datastores:
(Oracle, SQL (Hadoop,
Server,
HBASE,
MySQL etc.) MongoDB etc.)
Fixed
Dynamic/Flexible
DBAApplicationcontrolled
controlled
39
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
SPARK
40
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
3 SPARK SQL
SparkSQL is a distributed and fault tolerant
query engine. It allows users to run interactive
queries on structured and semi-structured data.
Spark SQL can support Batch or Streaming
SQL. It runs SQL/Hive queries, connects
existing BI tools to Spark through JDBC. It
binds Python, Scala, Java and R. Spark SQLs
data source API can read and write
DataFrames3 using a variety of formats such as
JSON, CSV, Parquet, MySQL, HDFS, HIVE,
HBASE, Avro or JDBC etc.
3.1 Capabilities of SparkSQL [5]
Seamless Integration
Spark SQL allows us to write queries inside
Spark programs, using either SQL or a
DataFrame API. We can apply normal
spark functions (map, filter, ReduceByKey
etc) to SQL query results.
Supports variety of Data Formats and
Sources
Data Frames and SQL provide connection
to access a variety of data sources,
including Hive, Avro, Parquet, Cassandra,
CSV, ORC, JSON or JDBC. We can load,
query and join data across these sources.
Hive Compatibility
We neither to make any changes to our data
in existing hive metastore to make it work
with Spark nor to change our hive queries.
4 MONGODB
MongoDB is an open-source document
database that provides high performance, high
availability and automatic scaling. MongoDB
obviates the need for an Object Relational
Mapping (ORM) to facilitate development. A
record in MongoDB is a document, which is a
data structure composed of field and value
pairs. MongoDB documents are similar to
JSON objects. The values of fields may include
other documents, arrays and arrays of
documents.
Table 2. Comparison of RDBMS terminology with
MongoDB [9]
RDBMS
Database
Table
Tuple/Row
Column
MongoDB
Database
Collection
Document
Field
4
3
41
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Table Join
Primary Key
Embedded Documents
Primary Key (Default key
_id is provided by
MongoDB itself)
42
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
sudo
service
mongod
6 EXPERIMENTATION
This experimentation demonstrates on how the
MongoDB Connector for Apache Spark can
access to all Spark libraries to analyze
MongoDB dataset [10][11][12].
MongoDB Spark Connector is compatible with
MongoDB 2.6 or later, Apache Spark 1.6.x and
Scala 2.10.x (if using the mongo-sparkconnector_2.10 package)
or
Scala
2.11.x (if using
the mongo-spark
connector_2.11 package). This experimentation
uses the Spark shell with Mongo shell6.
6.1 Proposed Model:
The MongoDB Connector is a plugin for both
Hadoop and Spark that provides the ability to
use MongoDB as an input source and/or an
output destination for jobs running in both
environments.
Start the Mongo shell
6
hduser@ubuntu:~$ mongo
MongoDB shell version: 3.2.8
connecting to: test
>
43
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Note:
The bin/spark-shell to start the Spark shell.
The --conf option
to
configure
the
MongoDB Spark Connector. These settings
configure the SparkConf object.
https://raw.githubusercontent.com/mongodb/docsassets/primer-dataset/primer-dataset.json
44
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Step (7):
To save the BSON file into
MongoDB collection
scala>
sc.parallelize(mydocs.map(Document.par
se)).saveToMongoDB()
show dbs
use Sivijaya
show collections
db.primer.find()
> db.primer.find().pretty()
45
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
scala> val
mydf2=sqlContext.sql("select
restaurant_id,address,grades,borough,c
uisine from primer where borough ==
'Queens' or cuisine == 'American '")
scala> mydf2.show()
46
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
> db.Query1_output.find()
> db.Query2_output.find()
8 CONCLUSION
service
mongod
http://www.simba.com/
https://www.progress.com/datadirect-connectors
47
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
REFERENCES
[1]
http://spark.apache.org/research.html
[2]
http://spark.apache.org/
[3]
https://www.infoq.com/articles/apache-sparkintroduction
[4]
http://www.slideshare.net/SparkSummit/addingcomplex-data-to-spark-stackneeraja-rentachintala
[5]
[6]
http://www.kdnuggets.com/2015/09/spark-sql-realtime-analytics.html
[7]
http://www.slideshare.net/databricks/sparkdataframes-simple-and-fast-analytics-on-structureddata-at-spark-summit-2015
[8]
https://www.mongodb.com
[9]
http://www.tutorialspoint.com/mongodb
[10]
https://docs.mongodb.com/
[11]
https://www.mongodb.com/press/mongodb-enablesadvanced-real-time-analytics-on-fast-moving-datawith-new-connector-for-apache-spark
[12]
[13]
http://www.tableau.com/
[14]
http://www.simba.com/webinar/connect-tableau-bigdata-source/
48
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
ABSTRACT
KEYWORDS
1.0.
INTRODUCTION
49
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
important
Many
images.
for image
are
in
such
Traditional
images
methods
[2].
calculated
which
is
invariant
to
of CBIR systems.
50
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
retrieval.
SIMBA,
CIRES,
approach
for
image
retrieval.
global
The
evaluation
by
of
proposed
technique
properties
of
the
image
by
between
has
RELATED WORK
been
proposed
in
[14].
Vector
users
RGB-space.
that
expectation-
segmentation
are
maximization
established
(EM)
in
like
query
region
of
number
interest
of
[16].
techniques
51
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
on
feature
extraction
at
global
level.
An imperative
introduced
vector
between
Euclidean distance.
in
[24]
for
feature
their
feature
vectors
through
four
different
values
extracting
comprehensive information.
3.0.
PROPOSED TECHNIQUE
52
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Algorithm
Step 1: Color histogram is created for the
for each
constructed
by
computing
Co-
3.1.
Patterns
operator used is
as
effectively
where
structural
and
statistical
information
is
53
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
image
important
properties
by the
calculate
the
regarding
contrast.
= t(sc , s0 - sc, s1 -
Therefore
it
is
.
classification.
is now a
(3)
(4)
Where
(2)
for every
and
Rcos(2 i/I).
Circular
(5)
54
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
The
For I = 8,
Figure 4: Uniform rotation invariant patterns for circularly symmetric neighboring pixels of
EXPERIMENTAL RESULTS
image
categories,
where
each
Precision indicates
55
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
been retrieved.
4.2.
The
Results
proposed
system
is
evaluated
by
Table 1: Percentage precision values for 5 different image categories from Corel image database
Image
Proposed
LBP_8
LBP-16
ICLTP -
ICLTP
LDP 8
LDP 16
Category
Method
(%)
(%)
8 (%)
16 (%)
(%)
(%)
(%)
Dinosaur
100
96
98
99
100
96
97
Mountain
67
42
40
57
62
40
44
Horse
96
71
72
92
93
77
76
People
78
60
62
76
77
62
66
Elephant
74
42
46
62
65
54
60
Average
83
62
64
77
79
66
69
that
our
proposed
method
outperforms
56
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
100
90
80
Proposed Method
70
LBP_8
60
50
LBP_16
40
ICLTP_8
30
ICLTP_16
20
LDP_8
10
LDP_16
0
Dinosaur
Mountain
Horse
People
Elephant
image categories
Figure 5: Comparison of retrieval precision of the proposed technique with 6 different methods for 5 distinct image
categories
(a) Dinosaur
(b) Vehicle
57
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
(c) Horse
(d) Flower
(e) Beach
(f) Building
(g) Mountain
(h) People
58
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
(i) Elephant
(j) Food
Figure 6: Comparison of the proposed method with proposed global method in terms of average retrieval precision at
different number of images retrieved.
operator is used to
CONCLUSION
invariant
REFERENCES
patterns
is
for
calculated
and
by
statistical
information
co-occurrence
matrix.
cbir
and
biometrics
systems,
International
operator
Journal
Of
Biology
And
2007.
59
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
International
conference
on
image
processing, 2009.
[9] Faloutsos et al., Effcient and effective
[3] I. FelciRajam and S. Valli, A survey on
2013.
[10] Pentland et al., Photobook: Content[4] G.Ohashi and Y.Shimodaira, Edge-based
Aug. 2002.
retrieval:
Inverted
weights
and
les,
frequency-based
relevance
feedback,
recognition,
doctoral
and
discriminative
dissertation,
RWTH
60
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
Multimedia, 2004.
features,
International
Conference
on
June 2011.
2005.
ISBN: 978-1-941968-37-6 2016 SDIWC
61
Proceedings of the International Conference on Data Mining, Multimedia, Image Processing and their Applications (ICDMMIPA), Kuala Lumpur, Malaysia, 2016
using
International
mixed
Journal
binary
of
patterns,
Engineering
62