$$O_{sum} = \sum_{i=1}^{k} O_i$$

where $O_i$ is the $i$th column of the output matrix. After combining by the Sum rule, the output matrix becomes

$$O_{sum} = \begin{bmatrix} a_1 + b_1 + c_1 \\ a_2 + b_2 + c_2 \\ a_3 + b_3 + c_3 \\ a_4 + b_4 + c_4 \\ a_5 + b_5 + c_5 \end{bmatrix}$$
When the value in the first row is larger than the other values, the result of the MCS is speaker1 (S1). Similarly, if the value in the second row is larger than the other values, then the result of the MCS is speaker2 (S2), and so on.
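The Sum rule and its decision criterion can be sketched as follows. This is a minimal illustration with plain Python lists; the function and variable names are our own, not from the paper:

```python
# Sketch of the Sum rule: outputs[i][j] holds the normalized score of
# classifier j for speaker i (rows = speakers, columns = classifiers).
def sum_rule(outputs):
    """Combine classifier scores by summing each speaker's row."""
    return [sum(row) for row in outputs]

def decide(combined):
    """MCS decision: the 1-based index of the largest combined score."""
    return combined.index(max(combined)) + 1

# Example: 5 speakers, 3 classifiers (the matrix from Example 1, Sec. 3.4).
O = [[0.0, 0.3, 0.2],
     [0.4, 0.3, 0.2],
     [0.6, 0.5, 0.4],
     [0.0, 0.0, 0.1],
     [0.2, 0.1, 0.3]]
print(decide(sum_rule(O)))  # speaker 3 (largest row sum, 1.5)
```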
3.3 Product Rule (Logarithmic Combination)
The Product rule, also called logarithmic combination, is another simple rule for a classifier combination system. It works in the same manner as linear combination, but instead of summing, the outputs for each speaker from all classifiers are multiplied [12], [16]. The product rule is defined as
$$O_{prod} = \prod_{i=1}^{k} O_i, \qquad i = 1, 2, 3, \ldots, k$$
When the output of any classifier for a particular speaker is zero, this value is replaced by a very small positive real number. After combining the output vectors of all classifiers, the output matrix becomes:
$$O_{prod} = \begin{bmatrix} a_1 b_1 c_1 \\ a_2 b_2 c_2 \\ a_3 b_3 c_3 \\ a_4 b_4 c_4 \\ a_5 b_5 c_5 \end{bmatrix}$$
The decision criterion is the same as for the Sum rule.
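The Product rule with the zero-replacement step can be sketched as below. The value of `EPS` is our own choice; the paper only says "a very small positive real number":

```python
# Sketch of the Product rule: multiply each speaker's scores across
# classifiers, replacing zeros with a tiny positive constant.
EPS = 1e-12  # assumption: the paper does not specify the constant

def product_rule(outputs):
    """outputs[i][j]: score of classifier j for speaker i."""
    combined = []
    for row in outputs:
        prod = 1.0
        for score in row:
            prod *= score if score > 0.0 else EPS
        combined.append(prod)
    return combined

O = [[0.0, 0.3, 0.2],
     [0.4, 0.3, 0.2],
     [0.6, 0.5, 0.4],
     [0.0, 0.0, 0.1],
     [0.2, 0.1, 0.3]]
combined = product_rule(O)
print(combined.index(max(combined)) + 1)  # speaker 3: 0.6*0.5*0.4 = 0.12
```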
122 A. Zulfiqar et al.
3.4 Min Rule
The Min rule combination method measures the likelihood of a given speaker by finding the minimum normalized measurement-level output for each speaker. The final decision for identifying a speaker is then made by determining the maximum value [12], [13].
Example 1: Consider an output matrix which is obtained by combining the output
vectors of the three classifiers. Each column of the matrix represents the output of a
classifier for five speakers.
$$O = \begin{bmatrix} 0.0 & 0.3 & 0.2 \\ 0.4 & 0.3 & 0.2 \\ 0.6 & 0.5 & 0.4 \\ 0.0 & 0.0 & 0.1 \\ 0.2 & 0.1 & 0.3 \end{bmatrix}$$
Each element of $O_{min}$ is the minimum value selected from the corresponding row of the output matrix $O$; each row holds the outputs of the classifiers for a particular speaker. The final decision is the maximum value of the vector $O_{min}$, which is 0.4. This value shows that the true speaker is speaker number 3.

$$O_{min} = \begin{bmatrix} 0.0 & 0.2 & 0.4 & 0.0 & 0.1 \end{bmatrix}^T$$
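The Min rule on the matrix of Example 1 can be sketched as follows (names are our own):

```python
# Sketch of the Min rule: keep the minimum score in each speaker's row,
# then decide in favor of the speaker with the largest surviving score.
def min_rule(outputs):
    """outputs[i][j]: score of classifier j for speaker i."""
    return [min(row) for row in outputs]

O = [[0.0, 0.3, 0.2],
     [0.4, 0.3, 0.2],
     [0.6, 0.5, 0.4],
     [0.0, 0.0, 0.1],
     [0.2, 0.1, 0.3]]
o_min = min_rule(O)
print(o_min)                        # [0.0, 0.2, 0.4, 0.0, 0.1]
print(o_min.index(max(o_min)) + 1)  # speaker 3
```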
3.5 Max Rule
In the Max rule, the combined output for a speaker is the maximum of the output values provided by the different classifiers for that speaker [12], [16]. For a better explanation, consider the following example.
Example 2: Assume that we have three classifiers and five speakers. Their output
matrix is given below:
$$O = \begin{bmatrix} 0.0 & 0.3 & 0.2 \\ 0.4 & 0.3 & 0.2 \\ 0.6 & 0.5 & 0.4 \\ 0.0 & 0.0 & 0.1 \\ 0.2 & 0.1 & 0.3 \end{bmatrix}$$
The combined output vector is obtained by selecting the maximum value from each row of the output matrix. The resultant vector is

$$O_{max} = \begin{bmatrix} 0.3 & 0.4 & 0.6 & 0.1 & 0.3 \end{bmatrix}^T$$

The maximum value in the vector $O_{max}$ is 0.6, which corresponds to speaker number 3. So the joint decision of all the classifiers is speaker3.
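The Max rule on the matrix of Example 2 can be sketched the same way (names are our own):

```python
# Sketch of the Max rule: keep the maximum score in each speaker's row,
# then decide in favor of the speaker with the overall largest score.
def max_rule(outputs):
    """outputs[i][j]: score of classifier j for speaker i."""
    return [max(row) for row in outputs]

O = [[0.0, 0.3, 0.2],
     [0.4, 0.3, 0.2],
     [0.6, 0.5, 0.4],
     [0.0, 0.0, 0.1],
     [0.2, 0.1, 0.3]]
o_max = max_rule(O)
print(o_max)                        # [0.3, 0.4, 0.6, 0.1, 0.3]
print(o_max.index(max(o_max)) + 1)  # speaker 3
```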
Text-Independent Speaker Identification Using VQ-HMM Model 123
3.6 Confusion Matrix
The confusion matrix is a handy tool to evaluate the performance of a classifier. It contains the information on both truly identified speakers and misclassified speakers [15], [18]. Each column of this matrix represents the true speaker. Let us assume that 50 voice samples of speaker3 are tested by the identification system. If all these voice samples are truly identified, then the value in the 3rd row, 3rd column will be 50, with zeros elsewhere. On the other hand, if the values in the 1st, 2nd, 3rd, 4th, and 5th rows of the 4th column are 3, 1, 0, 41, and 5 respectively, this shows that 3 times speaker4 is misclassified as speaker1, 1 time speaker4 is misclassified as speaker2, 41 times speaker4 is truly identified, and 5 times speaker4 is misclassified as speaker5 by the system. A confusion matrix is shown in Figure 4.
                          True Speakers
                        1    2    3    4    5
    Identified     1   50    0    0    3    0
    Speakers       2    0   47    0    1    0
                   3    0    2   50    0    1
                   4    0    0    0   41    0
                   5    0    1    0    5   49

Fig. 4. A Confusion Matrix
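As a worked check on reading Fig. 4, the overall identification rate implied by this example matrix is the diagonal (correct identifications) over the total number of samples. This only illustrates how to read the matrix; it is not the system's reported result:

```python
# Confusion matrix from Fig. 4: columns are true speakers, rows are
# identified speakers; correct identifications sit on the diagonal.
C = [[50, 0, 0, 3, 0],
     [0, 47, 0, 1, 0],
     [0, 2, 50, 0, 1],
     [0, 0, 0, 41, 0],
     [0, 1, 0, 5, 49]]

total = sum(sum(row) for row in C)        # 250 samples (50 per speaker)
correct = sum(C[i][i] for i in range(5))  # 237 correct identifications
rate = 100.0 * correct / total            # identification rate in percent
print(rate)
```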
4 Results
The identification rates of MSCIS, after applying the Sum, Product, Min, and Max rules to the outputs of the individual classifiers, are presented in Figure 5. The Max rule as a combination rule in the MCS has shown an increase of 4.54% in identification rate over that of the best individual classifier, which was 90.91%.
[Bar chart: Sum Rule 88.64%, Product Rule 81.82%, Min Rule 70.45%, Max Rule 95.45%; y-axis: Percentage]

Fig. 5. Identification Rates of Combination Techniques
Some combination techniques show an identification rate even poorer than the individual classifiers. A comparison between the identification rates of the best individual classifier K1, the best combination technique (Max rule), and the confusion matrix is depicted in Figure 6.
[Bar chart: Individual Classifier 90.91%, Max Rule 95.45%, Confusion Matrix 97.72%; y-axis: Percentage]

Fig. 6. Comparison of Confusion Matrix Technique with Max Rule and Individual Classifier
5 Conclusion and Future Work
The MFCC-based VQ classifier, the LPC-based VQ classifier, and the MFCC-based HMM are combined to make a Multiple Classifier System (MCS). The normalized measurement-level outputs of the classifiers are combined by using the Min rule, Max rule, Product rule, and Sum rule. The Max rule demonstrated good results compared to the other combination techniques, improving the identification rate by 4.54% over the best individual classifier. But when the classifiers are combined by using the confusion matrix, the proposed multiple-classifier text-independent system shows an improvement of 6.81% over the best individual classifier and 2.27% over the Max rule. The experiments show that the confusion-matrix-based MCS produces excellent results compared to each individual classifier. These results are also better than the various combination techniques, i.e. the Sum, Product, Min, and Max rules.
In the speaker identification case studied, our proposed MCS for the CISI system gives the same importance to the results obtained by each classifier. In order to enhance the performance of the decision process, the output of a classifier can be weighted when its performance is better than that of the other classifiers within the tested environment. This weighting scheme is left for future work, in which we will continue running tests.
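A minimal sketch of the weighted combination described above, assuming a weighted Sum rule; the weights below are purely illustrative placeholders, not measured classifier performances:

```python
# Weighted Sum rule sketch: scale each classifier's column by a weight
# before summing across classifiers. Weights here are hypothetical.
def weighted_sum_rule(outputs, weights):
    """outputs[i][j]: score of classifier j for speaker i."""
    return [sum(w * s for w, s in zip(weights, row)) for row in outputs]

O = [[0.0, 0.3, 0.2],
     [0.4, 0.3, 0.2],
     [0.6, 0.5, 0.4],
     [0.0, 0.0, 0.1],
     [0.2, 0.1, 0.3]]
weights = [0.9, 0.7, 0.8]  # e.g. proportional to each classifier's accuracy
combined = weighted_sum_rule(O, weights)
print(combined.index(max(combined)) + 1)  # speaker 3 for these weights
```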
References
[1] Furui, S.: Recent Advances in Speaker Recognition. Pattern Recognition Letters 18(9), 859–872 (1997)
[2] Chen, K., Wang, L., Chi, H.: Methods of Combining Multiple Classifiers with Different Features and Their Application to Text-independent Speaker Identification. International Journal of Pattern Recognition and Artificial Intelligence 11(3), 417–445 (1997)
[3] Reynolds, D.A.: An Overview of Automatic Speaker Recognition Technology. In: Proc. IEEE ICASSP, vol. 4, pp. 4072–4075 (2002)
[4] Godino-Llorente, J.I., Gómez-Vilda, P., Sáenz-Lechón, N., Velasco, M.B., Cruz-Roldán, F., Ballester, M.A.F.: Discriminative Methods for the Detection of Voice Disorders. In: ISCA Tutorial and Research Workshop on Non-Linear Speech Processing, The COST-277 Workshop (2005)
[5] Xugang, L., Jianwu, D.: An Investigation of Dependencies between Frequency Components and Speaker Characteristics for Text-independent Speaker Identification. Speech Communication 50(4), 312–322 (2007)
[6] Huang, X.D., Ariki, Y., Jack, M.A.: Hidden Markov Models for Speech Recognition. Edinburgh University Press, Edinburgh (1990)
[7] Linde, Y., Buzo, A., Gray, R.M.: An Algorithm for Vector Quantizer Design. IEEE Transactions on Communications 28, 84–95 (1980)
[8] Higgins, J.E., Damper, R.I., Harris, C.J.: A Multi-Spectral Data Fusion Approach to Speaker Recognition. In: Fusion 1999, 2nd International Conference on Information Fusion, Sunnyvale, CA, pp. 1136–1143 (1999)
[9] Premakanthan, P., Mikhael, W.B.: Speaker Verification/Recognition and the Importance of Selective Feature Extraction: Review. In: Proc. of 44th IEEE MWSCAS 2001, vol. 1, pp. 57–61 (2001)
[10] Razak, Z., Ibrahim, N.J., Idna Idris, M.Y., et al.: Quranic Verse Recitation Recognition Module for Support in J-QAF Learning: A Review. International Journal of Computer Science and Network Security (IJCSNS) 8(8), 207–216 (2008)
[11] Becchetti, C., Ricotti, L.P.: Speech Recognition: Theory and C++ Implementation. John Wiley & Sons, Chichester (1999)
[12] Kittler, J., Hatef, M., Duin, R.P.W., Matas, J.: On Combining Classifiers. IEEE Transactions on Pattern Analysis and Machine Intelligence 20(3), 226–239 (1998)
[13] Kuncheva, L.I., Bezdek, J.C., Duin, R.P.W.: Decision Templates for Multiple Classifier Fusion: An Experimental Comparison. Pattern Recognition 34(2), 299–314 (2001)
[14] Shakhnarovich, G., Darrell, T.: On Probabilistic Combination of Face and Gait Cues for Identification. In: Proc. 5th IEEE Int'l Conf. on Automatic Face and Gesture Recognition, pp. 169–174 (2002)
[15] Ho, T.K., Hull, J.J., Srihari, S.N.: Decision Combination in Multiple Classifier Systems. IEEE Transactions on Pattern Analysis and Machine Intelligence 16(1), 66–75 (1994)
[16] Tumer, K., Ghosh, J.: Linear and Order Statistics Combiners for Pattern Classification. In: Sharkey, A. (ed.) Combining Artificial Neural Networks, pp. 127–162. Springer, Heidelberg (1999)
[17] Chen, K., Chi, H.: A Method of Combining Multiple Probabilistic Classifiers through Soft Competition on Different Feature Sets. Neurocomputing 20(1-3), 227–252 (1998)
[18] Kuncheva, L.I., Jain, L.C.: Designing Classifier Fusion Systems by Genetic Algorithms. IEEE Trans. on Evolutionary Computation 4(4), 327–336 (2000)