
A Rohde & Schwarz Company

NiNA+Voice Measurement Description


Manual
June 2012

SwissQual License AG
Allmendweg 8 CH-4528 Zuchwil Switzerland
t +41 32 686 65 65 f +41 32 686 65 66 e info@swissqual.com
www.swissqual.com

Part Number: 16-100-200425 REV 1

SwissQual has made every effort to ensure that any instructions contained in the document are adequate and free
of errors and omissions. SwissQual will, if necessary, explain issues which may not be covered by the documents.
SwissQual's liability for any errors in the documents is limited to the correction of errors and the aforementioned advisory
services.
Copyright 2000 - 2012 SwissQual AG. All rights reserved.
No part of this publication may be copied, distributed, transmitted, transcribed, stored in a retrieval system, or translated
into any human or computer language without the prior written permission of SwissQual AG.
Confidential materials.
All information in this document is regarded as commercial valuable, protected and privileged intellectual property, and is
provided under the terms of existing Non-Disclosure Agreements or as commercial-in-confidence material.
When you refer to a SwissQual technology or product, you must acknowledge the respective text or logo trademark
somewhere in your text.
SwissQual, Seven.Five, SQuad, QualiPoc, NetQual, VQuad, Diversity as well as the following logos are
registered trademarks of SwissQual AG.

Diversity Explorer, Diversity Ranger, Diversity Unattended, NiNA+, NiNA, NQAgent, NQComm, NQDI,
NQTM, NQView, NQWeb, QPControl, QPView, QualiPoc Freerider, QualiPoc iQ, QualiPoc Mobile,
QualiPoc Static, QualiWatch-M, QualiWatch-S, SystemInspector, TestManager, VMon, VQuad-HD are
trademarks of SwissQual AG.
SwissQual acknowledges the following trademarks for company names and products:
Adobe, Adobe Acrobat, and Adobe Postscript are trademarks of Adobe Systems Incorporated.
Apple is a trademark of Apple Computer, Inc.
DIMENSION, LATITUDE, and OPTIPLEX are registered trademarks of Dell Inc.
ELEKTROBIT is a registered trademark of Elektrobit Group Plc.
Google is a registered trademark of Google Inc.
Intel, Intel Itanium, Intel Pentium, and Intel Xeon are trademarks or registered trademarks of Intel Corporation.
INTERNET EXPLORER, SMARTPHONE, TABLET are registered trademarks of Microsoft Corporation.
Java is a U.S. trademark of Sun Microsystems, Inc.
Linux is a registered trademark of Linus Torvalds.
Microsoft, Microsoft Windows, Microsoft Windows NT, and Windows Vista are either registered trademarks or
trademarks of Microsoft Corporation in the United States and/or other countries.
NOKIA is a registered trademark of Nokia Corporation.
Oracle is a registered US trademark of Oracle Corporation, Redwood City, California.
SAMSUNG is a registered trademark of Samsung Corporation.
SIERRA WIRELESS is a registered trademark of Sierra Wireless, Inc.
TRIMBLE is a registered trademark of Trimble Navigation Limited.
U-BLOX is a registered trademark of u-blox Holding AG.
UNIX is a registered trademark of The Open Group.

NiNA+Voice Measurement Description Manual


2000 - 2012 SwissQual AG

Contents

Introduction

Listening Quality
Introduction
The Definition of Listening Quality
Subjective and Objective Quality assessment
Assessment of Intrusive-/Non-Intrusive Calls

NiNA+ Network Quality Assessment
Introduction Why NiNA+
Technical Background of NiNA+
Technical requirements and performance
Measurement results of NiNA+

Figures
Figure 2-1 Subjective versus objective quality assessment
Figure 3-1 NiNA+ Listening Quality values for noise-free speech transmissions
Figure 3-2 NiNA+ Listening Quality values in GSM connections using real handsets
Figure 3-3 Example of NiNA+ measurements shown in NQDI
Figure 3-4 Average NiNA+ results
Figure 3-5 Signal Envelope [dB] (Received Speech Signal)
Figure 3-6 Time Domain Chart (Received Speech Signal)

Tables
Table 3-1 Correlation coefficients between MOS values obtained in auditory tests and scores of NiNA+
Table 3-2 Typical MOS values of auditory tests and NiNA+


Introduction

This document describes the technical background, the application scenarios, and the parameters that
are measured with the single-ended NiNA+ voice quality measurement. The application used was the
SwissQual QoS Measurement System; the screenshots were taken from the SwissQual post-processing
system NQDI.
NiNA+ makes it possible to assess the quality of a signal transmitted via a telecommunications network
without knowledge of the originally transmitted signal. The speech quality is determined using only the
output signal. SwissQual's NiNA+ solution can be applied to rate any connection where a self-answering
far-end side plays back human speech (e.g. a weather forecast or similar). Since NiNA+ can be applied on
the mobile unit, the radio link forms part of the tested connection. NiNA+ can also rate any fixed-line
connection, including Voice over IP.
Furthermore, the NiNA+ method is not restricted to end-to-end measurements; it can be used for quality
monitoring at any electrical measuring point within a real established voice link (e.g. in a VoIP gateway or
at an E1/T1 interface). The calculated score reflects the speech quality from the perspective of an end user
listening with a conventionally shaped handset at this measuring point.


Listening Quality

Introduction
For network operators and equipment manufacturers, it is important to know where and why speech quality
degrades. Since listening quality is a major factor determining customer satisfaction, encoding
techniques must be designed for optimal speech quality. In order to assess the quality of speech encoding
techniques, large-scale auditory tests are commonly employed. However, results obtained in this way are
very difficult to reproduce. Furthermore, such results depend on the level of motivation of the
individual test candidates. It is therefore a big advantage to have an automated method capable of
physically measuring speech quality parameters and producing results that correlate as closely as
possible with subjectively acquired results.
Listening quality is a vague term compared with bit rate, echo or loudness. Nevertheless, since customer
satisfaction depends directly on the quality of the transmitted speech, encoding techniques must be
selected and optimized based on their listening quality.

The Definition of Listening Quality


Listening Quality is defined as a measure of a listener's satisfaction based on their experience and
expectation regarding voice communication. It is generally expressed as a Mean Opinion Score (MOS).
Listening Quality is usually measured by applying Absolute Category Rating (ACR) tests, which yield the
MOS on a scale from 1 (bad) to 5 (excellent).
This measurement denotes the average of many individual opinions on speech quality, which are obtained
from a representative number of listeners. Listening quality is a complex psycho-acoustic phenomenon within
the process of human perception. As such, it is a subjective measurement.
Listening Quality is the main factor in the perceived overall quality of speech telecommunications. However,
as listed below, Listening Quality is only one of three dimensions determining the overall speech quality of a
telephone call:

Listening Quality: covers the listening situation between the two calling parties, where one party is
talking and the other party is listening (non-active).

Talking Quality: the quality perceived by the talker during their own speech activity (mainly influenced by
echoes and side tones).

Conversational Quality: the perceived overall quality of a human conversation. It combines Listening and
Talking Quality with signal delay and double-talk interference.

Detailed definitions of these dimensions and test scenarios for auditory tests can be found in the ITU-T P.800
series.

Subjective and Objective Quality assessment


Assessing the quality of a telecommunication network is an important instrument for achieving and
maintaining the required service quality. One method of assessing the service quality of a
telecommunications network involves determining the quality of a signal transmitted via the
telecommunications network. For this purpose, a test connection is established and a signal is
transmitted from A to B. In the case of audio signals, and in particular voice signals, several of these so-called
intrusive or double-ended procedures are used. As the name suggests, such procedures
intervene in the system to be tested in such a way that a transmission channel is allocated and a reference
signal is transmitted along it.
The transmitted speech signal can be collected and assessed in two ways:
Subjective assessment: This is where test persons conduct subjective auditory tests, either comparing the
received signal with the known reference signal or rating the received signal based on their own experience and
expectation. This procedure is, however, very time-consuming and therefore expensive.
Objective assessment: An automated speech quality assessment method performs either:

an evaluation and rating of the received signal compared to the known reference (double-ended and
intrusive method, requires a test call), or

an evaluation and rating of the received signal alone (single-ended method, which might be a
test call to an answering machine or even live monitoring).

The basic relationship between subjective/objective assessment and double-ended/single-ended methods is shown
in Figure 2-1.

[Diagram: the reference speech signal passes through the network under test and arrives as the transmitted speech signal. A human listener (experience, expectation, semantics), methods requiring a reference signal, and methods requiring no reference signal each produce a quality rating.]

Figure 2-1 Subjective versus objective quality assessment

Assessment of Intrusive-/Non-Intrusive Calls


With reference to objective speech quality testing, the Intrusive and Non-intrusive methods can be used in
several application scenarios. Namely, the test options are as follows:

Intrusive and double-ended: Both ends of the connection are under control and a defined audio signal
will be transmitted in this test connection.

Non-intrusive In-service Monitoring: Assessment of speech signals in real human conversation by
parallel monitoring (e.g. at an E1/T1 interface or VoIP gateway).

Intrusive and single-ended: A test connection is established to any answering station that
plays back a voice signal (e.g. a weather forecast). Here the same model is applied as for Non-intrusive
In-service Monitoring.

Intrusive and double-ended Speech Quality Assessment: Here, the methods that require a known
reference signal are normally applied. Both ends of the connection are under control and a pre-defined
voice signal is transmitted.
This approach generally has the disadvantage that it is necessary to intervene in the network to be tested.
This means that, to determine the signal quality, at least one transmission channel must be occupied for the
reference signal to be transmitted on it. This transmission channel cannot be used for data transfer purposes
during this period of time. In a broadcasting system such as a radio service, it is in principle possible to
use the signal source for transmitting test signals. However, since all channels would then be occupied and
the test signal would be transmitted to all receivers, this procedure is extremely impractical. Intrusive
procedures are likewise unsuitable for simultaneously monitoring the quality of a large number of
transmission channels.
The advantage of the double-ended method is that the input signal or reference signal is known, which
allows for very accurate and detailed analysis of voice quality impairments. Each change in the
signal during its transmission can be detected and its impact on perceived quality evaluated by applying
psycho-acoustic models. Such models are well suited to optimization processes in laboratories as well
as in real networks. They are able to predict even minimal degradations of the signals and can be
applied to compare different or similar transmission scenarios.
Non-intrusive and single-ended Speech Quality Assessment: Models that assess speech quality without a
pre-defined reference speech signal having to be transmitted are often called non-intrusive or single-ended
models. These models analyse the transmitted and possibly distorted speech without any possibility of
comparing it with a separate input or known reference signal.
Single-ended models often look for pre-defined distortions by applying conventional signal analysis
methods. This means they look for background noise, interruptions, frame repeats and so on. More
advanced solutions try to reconstruct a reference speech signal from the distorted one and apply
psycho-acoustic comparison methods similar to those of the intrusive and double-ended methods.
The accuracy of a single-ended approach is lower than that of an intrusive and double-ended
approach. However, due to the advanced integrated speech extraction and the psycho-acoustic
calculations, the single-ended approach is now accurate enough to be applied in real environments.
A non-intrusive, single-ended algorithm has two basic applications, namely:
In-Service Monitoring: Here the speech signal of a real conversation is assessed. This can be done
with a terminal or, perhaps more efficiently, at the PBX side at an E1/T1 link or even in a VoIP gateway. The
advantages are two-fold:

the ability to collect a large amount of measurement data without allocating network resources, and

a more realistic overview of the speech quality as perceived by the subscribers, because
the impact on speech quality coming from the sending side (e.g. background noise) is included in the
measurement and the end result.

NiNA+ is connected at an electrical interface, so the real acoustical environment of the listener
cannot be measured. Instead, a modelled handset is applied to the signal to act as an intermediate receiving
function.
For the network operator, the quality monitoring scenario can be used as a powerful quality reporting tool;
further applications such as quality-based routing or quality-based billing are also possible.
Intrusive and single-ended Quality Assessment: Here a test connection has to be established, but it is not
required that the far-end side plays back a pre-defined signal. This is an advantage as there is no need to
install a dedicated answering station. The model works with any speech signal from the far end, for example
public numbers such as the weather forecast or the time service. This is particularly helpful for monitoring
multi-link connections, especially to other providers or other countries. A test system has to be installed
only at the listening side. Furthermore, network providers can monitor their own voice-based
announcement services for possible impairments or accessibility problems.
NiNA+ is SwissQual's solution for predicting MOS-LQO with a single-ended approach. It includes signal
pre-processing and calculates additional parameters such as causes of quality degradations, noise and
speech levels. As a stand-alone solution, NiNA+ is a complete suite for non-intrusive listening quality
assessment.


NiNA+ Network Quality Assessment

Introduction Why NiNA+


SwissQual developed NiNA as early as 2001. The main structures of NiNA also form an integral part of
ITU-T P.563, which was developed in a joint process and was approved in 2003. However, ITU-T P.563 is a
very complex model that cannot readily be integrated into low-performing platforms such as mobile
operating systems or DSP solutions.
Due to the progress in transmission technologies and the experience gained, SwissQual decided to substantially
re-construct its own single-ended model. Since the methods used in NiNA as well as its performance
were improved significantly, the resulting solution was renamed NiNA+. The name shows the relationship
within SwissQual's family of measurements but also signals the step forward.
Like NiNA, NiNA+ predicts a MOS value on the well-known 1 to 5 point scale. NiNA+ takes into
account the full range of distortions occurring in public switched telephone networks and is able to
predict the speech quality on the perception-based MOS-LQO scale according to ITU-T Recommendation
P.800.1.
In addition, NiNA+ re-uses and extends the so-called cause analysis, which gives detailed technical
information about the reason for a quality degradation. Also new in NiNA+ is a signal classification, with
which NiNA+ itself can decide whether the signal is speech or not. NiNA+ further includes a plausibility
check of the signal to be evaluated. Consequently, mis-predictions are avoided for signals that do not fulfil
the requirements, such as silence or non-speech signals.
NiNA+ also provides additional information about the speech signal, such as speech and noise
level, interruptions and clippings, as is to be expected from single-ended measurement approaches.

Technical Background of NiNA+


As mentioned in the previous chapter, in contrast to SwissQual's SQuad-LQ (a so-called double-ended
method), which compares a high-quality reference signal with the degraded signal on the basis of a perceptual
model, NiNA+ predicts the Listening Quality without any knowledge of the input reference signal.
The NiNA+ approach can be visualized as a human expert who listens in parallel to a real call with a test
device such as a conventional handset connected to the line. This visualization also corresponds to the main
application and allows the user to interpret the scores produced by NiNA+.
After filtering, which excludes signal parts outside the telephone band, the active voice parts are identified
by a voice activity detector (VAD). Based on this detection, the signal and noise levels are calculated.
The next analysis step detects and scores the unnaturalness of the speech, using models and
expectations of human speech signals. Furthermore, interruptions, clippings, saturations and
bandwidth limitations are analysed.
Finally, a set of quality-describing characteristics is calculated and mapped to the MOS-LQO.
The cause analysis and the signal classification are also based on these characteristics.

Technical requirements and performance

SwissQual's NiNA+ solution runs on the Windows 32-bit platform. It requires only a speech signal with 8000 Hz
sampling frequency as input. Because of SwissQual's consistent run-time optimization, the complete
calculation requires only 0.25% of the speech sample duration on a state-of-the-art Pentium 4 processor
(2.6 GHz; Intel CPU required, 50% active speech assumed). For comparison, it runs nearly 100 times faster
than ITU-T P.563 and more than 20 times faster than SwissQual's speed-optimized solution for P.563.



SwissQual's NiNA+ solution also runs on the Windows 32-bit platform. Its low complexity makes NiNA+ an
ideal component for low-performing platforms such as mobile phone operating systems and digital signal
processors.
Furthermore, the NiNA+ method places some requirements on the speech signal to be assessed in order to avoid
false predictions or malfunctions.
Sampling Frequency:
The sampling frequency has to be 8000 Hz and a linearly quantized PCM signal (16 bit) is required. The
conversion from other formats is not part of the algorithm itself and has to be done separately. This
conversion is done automatically by SwissQual's QoS measurement systems, so no further work is required
from the customer.
Speech Sample Length:
A sample length between 5 and 20 seconds is recommended. The signal length is checked by
SwissQual's QoS system. Defined sample lengths below 5 seconds are not accepted. Sample lengths
above 20 seconds result in a warning message and the sample is truncated at 20 seconds. It is recommended
that the speech activity is at least 25%, and in any case more than three seconds, and does not exceed 90%
(especially for short samples).
Minimum Speech Activity:
The main requirement is the minimum amount of active speech in the file. To obtain accurate results, the
speech signal should contain at least 3 seconds of active speech. Otherwise, the processing might lead to
wrong results, because the balance between voiced and unvoiced sections is no longer given. Even for
auditory tests with human listeners, a minimum speech activity of 4 seconds is recommended. To avoid a
malfunction, the configuration of the measurement probe does not allow the definition of speech sample
lengths below 5 seconds. Nevertheless, the active speech might fall below the minimum speech activity.
Consequently, SwissQual's QoS system is configured not to process speech samples with less than 3 seconds of
active speech; instead, a warning message is displayed.
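The length and activity rules above can be summarized in a short sketch. It is only an illustration of the stated limits, not SwissQual's implementation: the function name `check_sample`, the `SampleCheck` type and the message texts are hypothetical, and the active speech duration is assumed to be measured on the part of the sample that is actually used (after any truncation).

```python
from dataclasses import dataclass, field

MIN_SAMPLE_S = 5.0          # shorter samples are not accepted
MAX_SAMPLE_S = 20.0         # longer samples are truncated with a warning
MIN_ACTIVE_SPEECH_S = 3.0   # below this, the sample is not processed
MIN_ACTIVITY_PCT = 25.0     # recommended lower bound for speech activity
MAX_ACTIVITY_PCT = 90.0     # recommended upper bound for speech activity


@dataclass
class SampleCheck:
    accepted: bool
    used_duration_s: float
    messages: list = field(default_factory=list)


def check_sample(duration_s: float, active_speech_s: float) -> SampleCheck:
    """Apply the sample length and speech activity rules (hypothetical helper)."""
    if duration_s < MIN_SAMPLE_S:
        return SampleCheck(False, 0.0, ["sample shorter than 5 s: not accepted"])
    messages = []
    used = duration_s
    if duration_s > MAX_SAMPLE_S:
        used = MAX_SAMPLE_S
        messages.append("sample longer than 20 s: truncated to 20 s")
    if active_speech_s < MIN_ACTIVE_SPEECH_S:
        messages.append("less than 3 s of active speech: not processed")
        return SampleCheck(False, used, messages)
    activity_pct = 100.0 * active_speech_s / used
    if not MIN_ACTIVITY_PCT <= activity_pct <= MAX_ACTIVITY_PCT:
        messages.append("speech activity outside the recommended 25-90% range")
    return SampleCheck(True, used, messages)
```

For example, `check_sample(25.0, 10.0)` accepts the sample but reports the truncation to 20 seconds, while `check_sample(10.0, 2.0)` rejects it because of insufficient active speech.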
Speech Level:
NiNA+ accepts active speech levels in the range from -16 dBov down to -45 dBov. Higher levels will lead to
annoying clipping of the higher amplitudes. If the high speech level is caused by the network
under test, it should be considered in the quality score; if the clipping is caused by the measurement
interface, however, it will lead to artificial quality impacts.
Likewise, measurements with a low speech level will have a reduced signal-to-noise ratio caused by the
limited digital resolution of the A/D converter used in the measurement environment. This also leads to
additional quality impacts. SwissQual's QoS system ensures the proper level adjustment for all supported
cellular phones and ISDN/PSTN cards. Only in transparent mode, when arbitrary terminals are used, do
customers have to control the level adjustment themselves. For that reason, speech levels outside
the recommended range are highlighted in red when the results are analyzed in SwissQual's NQDI data
interface. Please note that files with a speech level below -65 dBov will not be analyzed and a warning
message will be displayed.
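As a minimal sketch of the level rules above, the following function sorts an active speech level into the three cases described in the text. The function name and the returned labels are hypothetical and only illustrate the thresholds; they do not reflect how NQDI is implemented.

```python
UPPER_LIMIT_DBOV = -16.0    # above this, clipping of high amplitudes is likely
LOWER_LIMIT_DBOV = -45.0    # below this, the signal-to-noise ratio suffers
REJECT_BELOW_DBOV = -65.0   # below this, the file is not analyzed


def classify_speech_level(level_dbov: float) -> str:
    """Return a label for an active speech level given in dBov (hypothetical labels)."""
    if level_dbov < REJECT_BELOW_DBOV:
        return "not analyzed"
    if level_dbov > UPPER_LIMIT_DBOV or level_dbov < LOWER_LIMIT_DBOV:
        return "out of recommended range"
    return "ok"
```

A level of -26 dBov is classified as "ok", -50 dBov as "out of recommended range" (the case highlighted in red), and -70 dBov as "not analyzed".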
Accuracy of predicted Listening Quality:
The accuracy of the NiNA+ model was evaluated using large speech databases covering the complete scope of
today's public switched telephone networks.
The performance against well-known databases from the ITU-T set is shown below. Due to the target
applications of SwissQual's QoS system, a strong focus was placed on outstanding performance in real
live network connections, such as the test with real GSM connections and handset variations. The numbers
are the correlation coefficients between the MOS values obtained in the auditory tests and the
scores predicted by NiNA+. A third-order mapping was applied before calculation of the
correlation. The results below compare the NiNA+ performance with the current ITU-T standard P.563.
Table 3-1 Correlation coefficients between MOS values obtained in auditory tests and scores of NiNA+

Speech Database                           ITU-T P.563   NiNA+
Suppl. 23, Exp. 1 Am. English             0.902         0.905
Suppl. 23, Exp. 1 Japanese                0.842         0.918
Suppl. 23, Exp. 3 Am. English             0.916         0.857
Suppl. 23, Exp. 3 Japanese                0.929         0.903
Real GSM handsets, different positions    0.895         0.925
Real GSM Background Noises                              0.935
Real VoIP                                               0.950
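The evaluation procedure described above (third-order mapping followed by a correlation coefficient) can be sketched as follows. This is an illustration under assumptions, not the procedure actually used for Table 3-1: the mapping is fitted here by ordinary least squares on mean-centered scores, no monotonicity constraint is enforced, and all function names are hypothetical.

```python
import math


def fit_cubic(x, y):
    """Least-squares coefficients c0..c3 of y ~ c0 + c1*x + c2*x^2 + c3*x^3."""
    n = 4
    # Normal equations A c = b with A[i][j] = sum(x^(i+j)) and b[i] = sum(y * x^i).
    a = [[sum(v ** (i + j) for v in x) for j in range(n)] for i in range(n)]
    b = [sum(w * v ** i for v, w in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mx) * (q - my) for p, q in zip(x, y))
    sxx = sum((p - mx) ** 2 for p in x)
    syy = sum((q - my) ** 2 for q in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_after_mapping(predicted, subjective):
    """Map predicted scores onto the subjective scale with a cubic, then correlate."""
    center = sum(predicted) / len(predicted)   # centering improves conditioning
    xs = [p - center for p in predicted]
    c = fit_cubic(xs, subjective)
    mapped = [c[0] + c[1] * v + c[2] * v ** 2 + c[3] * v ** 3 for v in xs]
    return pearson(mapped, subjective)
```

The mapping compensates for systematic offsets and non-linearities between the scale of a particular auditory test and the model scale, so that the correlation reflects how well the model ranks the conditions rather than how well the absolute values match.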

[Scatter plot: NiNA+ MOS scores vs. auditory test results, ITU-T Suppl. 23 Exp. 1, American English, r = 0.905. Horizontal axis: Auditory Test (MOS), 1.0 to 5.0. Vertical axis: NiNA+, 1.0 to 5.0.]

Figure 3-1 NiNA+ Listening Quality values for noise-free speech transmissions

The database shown in Figure 3-1 is taken from the G.729 characterization phase of ITU-T and consists of a
wide range of existing codecs and combinations thereof. The results are given on a so-called per-condition
basis, which means that the results of four samples transmitted through the same application scenario were
averaged.


[Scatter plot: NiNA+ MOS scores vs. auditory test results, real GSM handsets, different positions, r = 0.925. Horizontal axis: Auditory Test (MOS), 1.0 to 5.0. Vertical axis: NiNA+, 1.0 to 5.0.]

Figure 3-2 NiNA+ Listening Quality values in GSM connections using real handsets

The database shown in Figure 3-2 is taken from a subjective test performed for
ITU-T within the P.563 competition phase. It was organized by SwissQual in the Deutsche Telekom
Laboratories in Berlin. In contrast to the common ITU-T databases, where simulated speech files are used,
this test contains speech recordings from real GSM circuits. The speech signals were fed into the handset
microphone using an artificial mouth in different acoustical environments.

Measurement results of NiNA+


The following figures and results were taken from SwissQual's post-processing tool NQDI. However, the
same set of results is available when SwissQual's NiNA+ solution is applied in other environments.
After the measurement results have been imported into NQDI, they can be analyzed as shown in
Figure 3-3, which gives a complete overview of all the results obtained. In addition to the calculated
parameters, the signal envelope and the signal in the time domain are presented graphically.
Please note that sequences with no or too little speech activity are also analyzed, but they are flagged
separately within SwissQual's QoS systems and, instead of the results, the information 'Silence' or 'Speech
Activity too low' is presented.


Figure 3-3 Example of NiNA+ measurements shown in NQDI

Typically, the value of most interest to users is the Listening Quality value obtained by applying NiNA+. In
line with ITU-T Recommendation P.800.1, it is called MOS-LQO, where LQO stands for Listening Quality
Objective. The MOS-LQO is defined in the range 1 to 5, where 1 stands for bad and 5 for excellent speech
quality. In real measurements, the value will scarcely exceed 4.5.
In addition to the MOS-LQO, further analysis can be done by analysing the average section as shown in
Figure 3-4.


Figure 3-4 Average NiNA+ results

The following values are presented in an average section:

MOS-LQO provided by NiNA+

Speech Level in dB OVL

Noise Level in dB OVL

Static SNR in dB

Amplitude Clipping in %

Speech Activity in %

DC Offset in %

Pitch frequency in Hz

Main Signal Distortion

Signal Class

The MOS-LQO is the main result of the analysis and summarizes the quality in a single
number. To give a better feeling for the results that can be expected, the following table lists
results obtained by analyzing speech coded with typical speech codecs.


Table 3-2 Typical MOS values of auditory tests and NiNA+

Codec               Typical MOS-LQS (Auditory Test)   Typical MOS-LQO (NiNA+)
G.711               4.3                               4.4
G.729               3.8                               3.8
G.728               3.7                               3.7
G.726 (32 kbit/s)   3.9                               3.8
GSM-FR              3.5                               3.2
GSM-EFR             3.9                               3.8

Speech Activity is the ratio

Speech Activity [%] = Number of speech frames / Total number of frames * 100

expressed as a percentage. If this value is 50%, the number of speech-active frames equals the number of
silent frames. The higher this number, the higher the speech density in the input signal. As mentioned above,
NiNA+ can deal with a range of 20 to 90%. A minimum of 3 s of active speech is required for both
approaches. The Speech Activity and the Speech Level are calculated by an internal voice activity
detection; the results are similar, but not identical, to the ITU-T P.56 Active Speech Level.
Speech Level shows the R.M.S. level of all frames containing active speech. Because silent intervals and
speech pauses will be not considered, it is a good measure for the actual speech level control in the channel.
The Speech Level is presented in dB rel. to the Overload Point (32768 for 16Bit quantization) and is close to
Active Speech Level according ITU-T P.56.
Noise Level is an estimation of the background noise floor. It is mainly calculated by the noise occurring in
speech pauses. The Noise Level is the r.m.s. in dB rel. to the Overload Point (32768 for 16Bit quantization)
and is spectral un-weighted (linear filter response for calculating) except a weak telephony bandpass.
Static SNR gives a brief indication of the signal-to-noise ratio of the signal. It is the ratio between the
active speech level and the estimated background noise floor.
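As a rough sketch of how these three values relate, the following computes an r.m.s. level in dB relative to the Overload Point and derives a static SNR as the difference of two such levels. It assumes the samples have already been split into active speech and speech pauses, and it omits the weak telephony bandpass mentioned above; the function names are hypothetical and this is not the NiNA+ algorithm.

```python
import math

OVERLOAD_POINT = 32768.0  # full scale for 16-bit quantization, as in the text


def rms_dbov(samples):
    """R.M.S. level of the samples in dB relative to the Overload Point."""
    if not samples:
        raise ValueError("no samples given")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence
    return 10.0 * math.log10(mean_square / OVERLOAD_POINT ** 2)


def static_snr_db(speech_samples, pause_samples):
    """Difference between the active speech level and the noise floor in dB."""
    return rms_dbov(speech_samples) - rms_dbov(pause_samples)


# Hypothetical data: speech at 10 % and noise at 0.1 % of full scale.
speech = [3276.8, -3276.8] * 100
noise = [32.768, -32.768] * 100
print(round(rms_dbov(speech), 1))              # about -20 dBov
print(round(rms_dbov(noise), 1))               # about -60 dBov
print(round(static_snr_db(speech, noise), 1))  # about 40 dB
```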
Pitch Frequency is the pitch frequency of the input signal in Hz. For a speech signal, the Pitch
Frequency is the fundamental oscillation frequency of the talker's vocal cords. Typical pitch
frequencies are in the range of 100 to 200 Hz for female speakers and 50 to 150 Hz for male speakers.
Even if the Pitch Frequency lies outside the telephone pass-band, it can be recovered by analyzing the
harmonic oscillations at higher frequencies.
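The last point can be illustrated with a small sketch: the harmonics of a voiced sound are spaced by the fundamental frequency, so the spacing of the spectral peaks that survive inside the pass-band reveals a pitch that lies below it. The peak list and the function name are hypothetical, and the median-spacing rule is a simplification chosen for illustration, not the method used by NiNA+.

```python
from statistics import median


def fundamental_from_harmonics(peak_frequencies_hz):
    """Estimate the fundamental as the median spacing of harmonic peaks."""
    peaks = sorted(peak_frequencies_hz)
    if len(peaks) < 2:
        raise ValueError("at least two harmonic peaks are required")
    spacings = [high - low for low, high in zip(peaks, peaks[1:])]
    return median(spacings)


# Hypothetical spectral peaks inside the telephone band; the 120 Hz
# fundamental itself is removed by the band limitation.
peaks = [360.0, 480.0, 600.0, 720.0]
print(fundamental_from_harmonics(peaks))  # 120.0
```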
DC Offset: This number shows the constant component of the input signal as a percentage. The DC Offset
does not influence the quality score because it is neither transmitted by the transducer in the terminal
nor perceived by the human ear. However, a noticeable DC Offset (>0.5 %) indicates problems in the
terminal interface or in the transmission channel itself.
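A minimal sketch of such a check follows. The manual does not state what the percentage refers to; the sketch assumes it is the mean sample value relative to the Overload Point (32768 for 16-bit quantization). The function names are hypothetical.

```python
OVERLOAD_POINT = 32768.0  # assumed reference for the percentage


def dc_offset_percent(samples):
    """Mean sample value as a percentage of the Overload Point (assumption)."""
    if not samples:
        raise ValueError("no samples given")
    return 100.0 * (sum(samples) / len(samples)) / OVERLOAD_POINT


def dc_offset_warning(samples, limit_percent=0.5):
    """True if the offset magnitude exceeds the 0.5 % quoted in the text."""
    return abs(dc_offset_percent(samples)) > limit_percent


# Hypothetical signal that swings around +328 instead of around zero.
samples = [1328, -672, 1328, -672]
print(round(dc_offset_percent(samples), 2))  # about 1 %
print(dc_offset_warning(samples))
```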
Amplitude Clipping: The latest versions of NiNA+ present the Amplitude Clipping as a separate value. In
this case the corresponding label is enabled. The value is a rough estimate of the amount of amplitude
clipping. Because no reference signal is available and the hard saturation in the time signal might be
affected by filtering, this figure reacts only to severe, clearly detectable clipping.
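Without a reference signal, one simple heuristic is to look for runs of consecutive samples that sit at the peak magnitude of the recording. The sketch below does this; the thresholds and the function name are assumptions for illustration, and the heuristic is not the NiNA+ estimator. As the text notes, filtering after the saturation would smear such plateaus, so a check of this kind only catches severe cases.

```python
def clipped_fraction(samples, rel_threshold=0.99, min_run=3):
    """Fraction of samples lying in runs at or near the peak magnitude.

    rel_threshold and min_run are illustrative values, not taken from
    the manual.
    """
    if not samples:
        return 0.0
    peak = max(abs(s) for s in samples)
    if peak == 0:
        return 0.0
    limit = rel_threshold * peak
    clipped = 0
    run = 0
    for s in samples:
        if abs(s) >= limit:
            run += 1
        else:
            if run >= min_run:
                clipped += run
            run = 0
    if run >= min_run:  # a run that lasts until the end of the signal
        clipped += run
    return clipped / len(samples)


# Hypothetical frames: the first has a flat top, the second does not.
print(clipped_fraction([0, 100, 200, 200, 200, 200, 100, 0]))    # 0.5
print(clipped_fraction([0, 100, 200, 100, 0, -100, -200, -100]))  # 0.0
```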
Signal Class classifies the analysed signal into one of the following classes:

Clean speech

Noisy speech

No speech

In case of No speech, no MOS-LQO is calculated; only the signal level is reported.


Problem code shows a possible cause for the speech degradation.

More than one cause (code) may be shown in the average section. There are eight different problem
codes:

Background noise is signalized if the Noise Level is higher than -50 dB or the Static SNR is below 20 dB.

Modulated Noise is signalized when the segmental SNR falls below a defined multi-dimensional threshold. It
mainly indicates waveform (signal-form) speech codecs.

Interruptions is signalized if one or more signal interruptions are detected in the speech signal.

Level problem is signalized if the signal level exceeds the nominal level by more than 10 dB. Likewise, this
problem is signalized if the signal level falls more than 12 dB below the nominal level. The nominal speech
level is -26 dBov (dB relative to the digital overload point).

DC Offset problem is signalized when the DC offset of the speech signal exceeds the predefined
thresholds of +/- 0.2 %.

Amplitude clipping is signalized if the saturation of the signal leads to significant distortions.

Restricted Audio Bandwidth is signalized if a significant limitation relative to the expected telephone
band (300 to 3400 Hz) is detected.

NotSpecified signalizes that the speech quality is degraded but no outstanding reason for that
degradation could be classified.

OK shows that the speech quality is nearly non-degraded.

Silence and LowSpeechActivity are also signalized, but no MOS-LQO is calculated in these cases.
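The rules with explicit thresholds in the list above can be written down as a small decision function. This is a sketch under stated assumptions, not the NiNA+ logic: Modulated Noise, Amplitude clipping and Restricted Audio Bandwidth are left out because the text gives no numeric thresholds for them, and whether the quality counts as degraded is passed in as a flag because the manual does not define that limit. All names are hypothetical.

```python
NOMINAL_LEVEL_DBOV = -26.0  # nominal speech level from the text


def problem_codes(noise_level_db, static_snr_db, speech_level_dbov,
                  dc_offset_percent, interruption_count, quality_degraded):
    """Return the problem codes whose thresholds are stated in the text."""
    codes = []
    if noise_level_db > -50.0 or static_snr_db < 20.0:
        codes.append("Background noise")
    if interruption_count >= 1:
        codes.append("Interruptions")
    if (speech_level_dbov > NOMINAL_LEVEL_DBOV + 10.0
            or speech_level_dbov < NOMINAL_LEVEL_DBOV - 12.0):
        codes.append("Level problem")
    if abs(dc_offset_percent) > 0.2:
        codes.append("DC Offset")
    if not codes:
        # No specific cause found: degraded without reason, or simply fine.
        codes.append("NotSpecified" if quality_degraded else "OK")
    return codes


# Hypothetical measurement: noisy, quiet, with two interruptions.
print(problem_codes(-45.0, 15.0, -40.0, 0.3, 2, True))
# Hypothetical clean measurement at nominal level.
print(problem_codes(-70.0, 40.0, -26.0, 0.0, 0, False))
```

Returning a list rather than a single value mirrors the remark that more than one code may appear at once.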

The next step in the analysis is to look at the signal envelope and to listen to the live
recordings.
Analyzing Envelope of Received Signal:
The signal envelope is presented graphically. It gives the experienced user visual information on
amplitude clipping, background noise and interruptions. The locations of interruptions are
marked separately by vertical lines. At the top of each line the detected length of the interruption is
printed in ms (Figure 3-5).
[Chart: Time Domain, showing Envelope and Interruptions. Y axis: Envelope [dBov], 0 to -90. X axis: Time [s], 0.00 to 6.00. Marked interruptions: 82 ms, 107 ms, 71 ms.]

Figure 3-5 Signal Envelope [dB] (Received Speech Signal)
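How interruption markers of this kind could be derived from an envelope can be sketched as follows. The manual does not describe the detection method, so this is purely illustrative: it assumes one envelope value in dBov per frame, treats short runs below an assumed floor as interruptions, and ignores silence at the very start or end of the signal. The floor, the maximum length and the function name are hypothetical.

```python
def find_interruptions(envelope_dbov, frame_ms, floor_dbov=-70.0, max_ms=500.0):
    """Return (start_ms, length_ms) for short envelope drops below the floor.

    floor_dbov and max_ms are assumed values, not taken from the manual.
    """
    found = []
    run_start = None
    for i, level in enumerate(envelope_dbov):
        below = level < floor_dbov
        if below and run_start is None:
            run_start = i
        elif not below and run_start is not None:
            length_ms = (i - run_start) * frame_ms
            # Skip leading silence (run starts at frame 0) and long pauses.
            if run_start > 0 and length_ms <= max_ms:
                found.append((run_start * frame_ms, length_ms))
            run_start = None
    # A run still open at the end is trailing silence and is not reported.
    return found


# Hypothetical envelope in 20 ms frames: speech, an 80 ms drop, speech, silence.
envelope = [-20.0] * 5 + [-90.0] * 4 + [-20.0] * 5 + [-90.0] * 3
print(find_interruptions(envelope, 20))  # [(100, 80)]
```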

The chart below presents the signal in the common time domain format (Figure 3-6). Here too the
experienced user can obtain information such as peaks and amplitude clipping.
[Chart: Coded Sample. Y axis: Level, -30'000 to 30'000. X axis: Time [s], 0.00 to 6.00.]

Figure 3-6 Time Domain Chart (Received Speech Signal)

Furthermore, the NQDI presentation sheet allows the received sample to be played back with the default or
a specified audio player and offers several options to export the results into external tables or text
documents.