
Real-time audio streaming in a mobile environment using J2ME

Magnus Sillen and Jan Nordlund

July 15, 2005

Master's Thesis in Computing Science, 2*20 credits


Supervisor at CS-UmU: Jerry Eriksson
Examiner: Per Lindström

Umeå University
Department of Computing Science
SE-901 87 UMEÅ
SWEDEN

Abstract

This master's thesis report describes the technology and implementation of a system
prototype to stream audio books to mobile phones in GPRS and 3G networks. The
application on the mobile phone is developed using Java, J2ME. The audio books are
streamed from a streaming server to mobile phones in real time. The audio format used
for streaming is AMR audio.

Contents

1 Introduction
1.1 Background
1.2 Goal
1.3 Purpose
1.4 Methods
1.5 Tools
1.5.1 J2ME

2 Pre-study

3 Requirements
3.1 System requirements
3.2 Application requirements
3.3 Simplified system overview

4 Mobile network technologies
4.1 GPRS
4.2 EDGE
4.3 3G
4.4 Mobile network architecture

5 Streaming technology
5.1 Streaming protocols
5.1.1 RTSP
5.1.2 RTP
5.1.3 RTCP
5.2 Session Description Protocol
5.3 Audio formats
5.3.1 AMR
5.3.2 AMR-WB
5.3.3 AMR RTP data to AMR audio data

6 Results
6.1 Fulfillment of requirements
6.2 System overview
6.3 Communication
6.4 Playing streaming audio in J2ME
6.5 Testing
6.5.1 Mobile network operators
6.5.2 Mobile phones
6.6 Requirements for running the J2ME application

7 User's Guide
7.1 Starting the application
7.2 Main menu
7.3 My books
7.4 Toplist
7.5 Offers
7.6 Book search
7.7 Player

8 Conclusions
8.1 Restrictions and limitations
8.2 Future work

9 Acknowledgments

References

A RTSP status codes

B Abbreviations

List of Figures

1.1 MIDP architecture

2.1 Difference in size between WAV, MP3 and AMR audio

3.1 An overview of the system

4.1 Typical mobile streaming architecture

5.1 3GPP Packet Switched Streaming protocol stack
5.2 Basic structure of the RTSP messages
5.3 Structure of the RTSP request message
5.4 Structure of the RTSP Request-Line. The WS stands for whitespace and
CRLF means carriage return followed by a line feed.
5.5 Structure of the RTSP response message
5.6 The Status-Line of the RTSP response message. The WS stands for
whitespace and CRLF means carriage return followed by a line feed.
5.7 Overview of streaming using RTSP, SDP and RTP
5.8 RTP header
5.9 The protocol stack for RTP
5.10 Header of AMR and AMR-WB payload within an RTP packet
5.11 The structure of AMR and AMR-WB payload within an RTP packet if
the payload contains more than one AMR or AMR-WB frame
5.12 AMR file header
5.13 AMR-WB file header
5.14 AMR frame header
5.15 AMR file

6.1 The three main modules in the system
6.2 The submodules of the J2ME application
6.3 Overview of the different data communication protocols used in the system
6.4 Book information sent from the servlet to the mobile phone
6.5 Toplist information sent from the servlet to the mobile phone
6.6 Offers-list information sent from the servlet to the mobile phone
6.7 An overview of the buffering and playback of the audio

7.1 Main menu
7.2 My books
7.3 Book options
7.4 Toplist
7.5 Book information
7.6 Buy confirmation
7.7 Offers list
7.8 Book search
7.9 Search result
7.10 Buffering
7.11 Playing
7.12 A bookmark is set

List of Tables

4.1 GPRS data-rates

A.1 RTSP Status codes

Chapter 1

Introduction

This is the final report for the master's thesis project to develop a system prototype for
delivering real-time streaming of audio books to mobile phones supporting Java[20].

1.1 Background
During the last few years the interest in listening to audio books in Sweden has
increased. In 2004 over 160 new Swedish audio book titles were published and the
total turnover for the audio book business was 96.4 million Swedish crowns, according
to statistics from Förläggareföreningen[7].

Through new technology and higher Internet bandwidth, distribution of audio books
in MP3 format (MPEG-1 Audio Layer 3) to computers has started. This, in combination
with the fact that ordinary audio books do not use DRM (Digital Rights Management),
has put the audio book publishing houses in a tricky situation concerning pirate
copying versus new technology. The pirate copying of audio books has increased
dramatically and costs the publishing houses a lot of money every year.

In cooperation with Bonnier Audio[4], the largest audio book publishing house in
Sweden, the idea arose to create a system prototype for delivering real-time streaming
of audio books to mobile phones. In such a system the mobile phone users could have
easy access to audio books without getting an actual copy of the audio data that could
potentially be saved and redistributed, because no data is permanently stored on the
mobile phone. This could be one way for the publishing houses to take advantage of
the new technology to reach new users without having to worry about pirate copying.

1.2 Goal
The goal of this thesis project is to develop a system prototype to deliver real-time
streaming of audio books to mobile phones supporting Java. The main task will be to
implement streaming using J2ME[19] (Java 2 Platform, Micro Edition), a version of Java
that is used in most modern mobile phones. More concretely, this means that the
following protocols need to be implemented in J2ME: RTSP[11], SDP[12] and RTP[10].
RTSP, SDP and RTP are described in the chapter "Streaming technology".

Another goal was that the application on the mobile phone should be easy to download,
install and use, and should run independently of the mobile phone's operating system.

1.3 Purpose
Since more and more people seem to like the idea of listening to audio books while on
the move, an application for this purpose in the mobile phone, which people often carry
around, would be a great feature. The purpose of this thesis project is to develop a
system prototype for such a feature using J2ME.

1.4 Methods
The following steps were used as the method for this project:

- Use case scenarios: To get a better overview of how the system should work and
how the user should interact with the application on the mobile phone, use case
scenarios were used.
- Technology: An in-depth study of the technology used for streaming had to be
done.
- Design and implementation: Design, implementation and testing of the system.
- Documentation: Writing the final thesis report.

1.5 Tools
The following tools were used during the development of the system:

- Platform: Windows XP
- Programming language: Java, J2ME
- Editor: jEdit
- Version control system: CVS
- Stream server: Darwin Streaming Server[16]
- Database: MySQL[9]
- Servlet container: Tomcat[6]
- Mobile phone: Nokia 6680
- Audio books: A few audio books from Bonnier Audio.
- Network sniffing program: Ethereal[5]
- Mobile phone simulation environment: Sony Ericsson J2ME SDK 2.2.0[1]

1.5.1 J2ME

J2ME stands for Java 2 Platform, Micro Edition. This is a version of Java that is
designed for devices with limited memory, display and processing power. Mobile phones
and PDAs (Personal Digital Assistants) are examples of such devices.

The configuration defines the Java language features and the core Java libraries of
the JVM (Java Virtual Machine). The configuration used in this project is CLDC
(Connected, Limited Device Configuration).

The JVM is a virtual machine that runs the Java applications. It translates the class
files into machine code for the platform where the JVM is running. It is the use of a
virtual machine that makes Java independent of underlying operating systems.

When using CLDC the virtual machine is called KVM (K Virtual Machine). It is a
virtual machine designed for limited devices.

Apart from the configuration, CLDC, a profile is needed to further specify what kind of
device the application will operate on. The profile is like an extension of the
configuration. The profile used in this project is MIDP 2.0 (Mobile Information Device
Profile).

Figure 1.1 shows the hierarchy of the virtual machine (KVM), the configuration (CLDC)
and the profile (MIDP 2.0).

MID Profile
CLDC Core Libraries
K Virtual Machine (KVM)
Host Operating system

Figure 1.1: MIDP architecture

In order to play back sound using J2ME an API called MMAPI (Mobile Media API) is
needed.

MMAPI

MMAPI is an extension to J2ME. It supports time-based multimedia on small wireless
devices, for example mobile phones.

MMAPI can be divided into the following four elements:

- Player. A Player accepts and decodes the media data. It is neutral to
whatever media data it receives: there is no difference between a Player for
audio data and one for video data. Differences between audio and video data are
handled by the controls associated with the Player.
- Controls. Controls are used to render media of a specific type. Every media type
requires or recommends that one or more controls are added to the Player. An
example of a control is a volume control for audio data.
- DataSource. The DataSource provides protocol handling and methods that control
media playback and synchronization. It hides the details of how the media data
is read from its source to the Player. For example, the media can be read over
HTTP[13], from a file or through other mechanisms.
- Manager. The Manager puts all these pieces together by letting the user create
Players and associate them with a DataSource.
Chapter 2

Pre-study

In the early stage of the pre-study, before any implementation decisions were made, some
use case scenarios were written to point out the issues that can arise when streaming
audio in a mobile environment.

The next stage was to choose a mobile application platform. To make the application
easy to install and independent of underlying operating systems, a decision was taken
to implement the application in J2ME. Another contributing factor in the choice of J2ME
was that most modern mobile phones support J2ME.

When the platform had been chosen, the focus turned to investigating the pros and cons
of different streaming protocols. To make the solution independent of underlying
transport protocols, a decision was made to use RTSP (Real-Time Streaming Protocol) in
combination with RTP (Real-Time Transport Protocol). These protocols are standard
in most streaming servers. However, these protocols have some limitations in their
ability to pass through firewalls and networks with NAT/NAPT (Network Address
Translation/Network Address Port Translation), which can be a major issue when
streaming data in mobile networks.

The trade-off between quality and bandwidth requirements of the audio data is a big
issue when streaming data in a mobile environment. The system prototype required an
audio format that was optimized for speech and demanded low bandwidth.

Two formats stood out, AMR[3] (Adaptive Multi-Rate Codec) and Speex[18], both built
on CELP (Code Excited Linear Prediction) technology. Speex is an open-source,
patent-free audio compression format, in contrast to the proprietary AMR format
patented by VoiceAge. Both are optimized for speech and have low bandwidth
requirements.

The fact that Speex is still under development means that it suffers from some major
drawbacks concerning efficiency on devices with low processing capacity. AMR on
the other hand has been used for a very long time and is supported in the hardware
of all mobile phones with MMS capability. After careful consideration a decision was
made to use AMR as the audio format.

Figure 2.1 shows the difference in size between WAV, MP3 and AMR (AMR-NB) audio.
The three bars represent an audio book CD in these audio formats. The sizes of these
files are:

- WAV: 357.25 MB
- MP3: 48.61 MB
- AMR: 14.77 MB

The AMR file is approximately 4.1% of the size of the WAV file and approximately
30.4% of the size of the MP3 file. This makes AMR audio the best choice of these three
audio types for this system.
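
The percentages above can be rechecked from the listed file sizes. The following is only a small arithmetic sketch in standard Java; the class and method names are hypothetical and not part of the thesis prototype.

```java
// Sketch only: recomputes the size ratios quoted above from the listed file sizes.
// AudioSizeRatio and percentOf are hypothetical names used for illustration.
final class AudioSizeRatio {
    private AudioSizeRatio() {}

    /** Returns partMb as a percentage of wholeMb, rounded to one decimal place. */
    static double percentOf(double partMb, double wholeMb) {
        return Math.round(partMb / wholeMb * 1000.0) / 10.0;
    }
}
```

Calling percentOf with the AMR and WAV sizes, and then with the AMR and MP3 sizes, reproduces the two ratios stated in the text.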

[Bar chart: file size in MB (axis from 0 to 400) for WAV, MP3 and AMR]

Figure 2.1: Difference in size between WAV, MP3 and AMR audio
Chapter 3

Requirements

To get organized and to keep focus on the main tasks during this project some
requirements were set. The requirements have been divided into two parts, SR (system
requirements) and AR (application requirements).

3.1 System requirements

Below are the proposed requirements for the system:

- SR1: All audio book information should be stored in a database.
- SR2: The system should work in 3G and GPRS networks.
- SR3: The audio format used for streaming should be suitable for speech and use
low bandwidth.
- SR4: The system should be scalable.
- SR5: The system should use open-source alternatives where possible.

3.2 Application requirements

Below are the proposed requirements for the mobile phone application:

- AR1: The application should be developed using J2ME.
- AR2: Support audio streaming through the RTSP, SDP and RTP protocols. RTSP
and SDP should have TCP as underlying transport protocol while RTP should
have UDP.
- AR3: The application should be able to fetch book information from the database.
- AR4: From the application a user should be able to buy a book.
- AR5: From the application a user should be able to search for books in the database.
- AR6: The application should have an audio book toplist and an offers-list.
- AR7: Bookmarks and information about books that have been bought should be
stored in the memory of the mobile phone.
- AR8: From the application a user should be able to play, pause, rewind, fast
forward, set a bookmark and change the volume.
- AR9: The application should be able to resume listening from where the user
last stopped listening.
- AR10: The application should have support for multiple languages.

3.3 Simplified system overview

To give a view of how the system should be constructed and how it links together with
the requirements, a simplified system description is shown in figure 3.1.

[Diagram components: Mobile phone, 3G/GPRS network, Internet, Servlet, Database, Stream server]

Figure 3.1: An overview of the system

The different parts of the figure are explained below.

- Database. In the database all information about the books is stored, such as
title, author, audio length etc.
- Servlet. To get information from the database to the mobile phone, the phone has
to communicate with a servlet that is connected to the database.
- Stream server. The stream server is where the audio books are stored. The server
streams these books to mobile phones.
- Mobile phone. A Java application is installed on the mobile phone. That application
is used to listen to the audio books.
Chapter 4

Mobile network technologies

As the application should be able to run in GPRS[17], EDGE[17] and 3G[17][8] networks,
this chapter explains these technologies and the differences between them.

4.1 GPRS
GPRS (General Packet Radio Service) is an additional component to the existing GSM
(Global System for Mobile Communications) architecture to support packet-switched
protocols for transferring data.

GPRS was introduced because the existing GSM network was not suited to handle
packet-switched data in an efficient way. GPRS has two main components, Coding
Schemes (up to four schemes) and Time Slots (up to eight slots).

GPRS theoretically consists of four channel coding schemes, CS-1, CS-2, CS-3 and CS-4.
All schemes have different properties. CS-1 has the most efficient error correction and
is suited to be used when the quality of the radio link is poor. CS-4 has no error
correction and should only be used in very good conditions. Today mostly CS-1 and CS-2
are implemented in actual mobile networks.

GPRS allows several mobile stations (usually mobile phones) to share the same
frequency by dividing it into different timeslots. Due to the packet-switched
characteristics of GPRS, the allocation of the available timeslots may vary from one
instant to the next (e.g. a mobile station may have eight timeslots at one time and four
later on). This allows multiple users to share the same transmission medium by using
only the part of the bandwidth they require. This could potentially be a problem when
dealing with real-time streaming, where one would like to have a steady bit rate at all
times.

The main features of GPRS are speed, immediacy and better utilization of network
resources. By using multiple timeslots simultaneously and more efficient algorithms
for channel coding, GPRS can achieve higher data-rates (speed) than GSM. Immediacy
means that no "dial-up procedure" must be used as in circuit-switched data networks.
The data is instead transferred in packets and routed individually, which means that
there is no need to establish connections between the network nodes. Therefore the
data can be transferred almost immediately to the mobile station upon request.

For the end user this also makes billing more flexible, because the user only has to
pay for the actual data transferred and not for connection time. GPRS can also use
timeslots that are left over from circuit-switched connections to transfer data, which
improves the utilization of the radio resources in the network.

GPRS can theoretically transfer data at 171.2 kbps, by using all eight timeslots
simultaneously and a channel coding with reduced error correction.

Coding scheme                         CS-1        CS-2         CS-3         CS-4
Data-rate with 1 timeslot             9.05 kbps   13.4 kbps    15.6 kbps    21.4 kbps
Maximum data-rate with 8 timeslots    72.4 kbps   107.2 kbps   124.8 kbps   171.2 kbps

Table 4.1: GPRS data-rates
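
The table rows are related by a simple multiplication: the data-rate for a given coding scheme is its per-timeslot rate times the number of allocated timeslots. The sketch below, in standard Java, illustrates this with the per-timeslot values copied from Table 4.1. The class name, method names and the fits helper are hypothetical, and the calculation ignores protocol overhead and varying timeslot allocation.

```java
// Sketch only: theoretical GPRS data-rates from Table 4.1.
// GprsRates, rateKbps and fits are hypothetical names used for illustration.
final class GprsRates {
    private GprsRates() {}

    // Per-timeslot data-rates in kbps for CS-1..CS-4, copied from Table 4.1.
    private static final double[] KBPS_PER_SLOT = {9.05, 13.4, 15.6, 21.4};

    /** Theoretical data-rate in kbps for coding scheme 1..4 and 1..8 timeslots. */
    static double rateKbps(int codingScheme, int timeslots) {
        if (codingScheme < 1 || codingScheme > 4) {
            throw new IllegalArgumentException("coding scheme must be 1..4");
        }
        if (timeslots < 1 || timeslots > 8) {
            throw new IllegalArgumentException("timeslots must be 1..8");
        }
        return KBPS_PER_SLOT[codingScheme - 1] * timeslots;
    }

    /** True if a stream needing requiredKbps fits in the theoretical data-rate. */
    static boolean fits(double requiredKbps, int codingScheme, int timeslots) {
        return rateKbps(codingScheme, timeslots) >= requiredKbps;
    }
}
```

Because the timeslot allocation can change from one instant to the next, a check like fits only gives an upper bound for a given moment, which is why buffering matters for real-time streaming.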

4.2 EDGE

EDGE (Enhanced Data Rates for Global Evolution) is an upgrade of the GSM and
GPRS network that improves the air interface between a mobile station, for example a
mobile phone, and a base station. Using EDGE, the speed of transferring data is
improved in both packet-switched and circuit-switched connections.

The network's capacity grows with EDGE because it makes it possible for more
users to share the same timeslots. EDGE can also share timeslots with conventional
GPRS networks, which improves the utilization of the radio resources.

EDGE and GSM/GPRS operate on the same frequencies, but they use different radio
channel modulations and protocols. EDGE uses 8-Phase Shift Keying, 8PSK, and
GSM/GPRS uses Gaussian Minimum Shift Keying, GMSK. 8PSK is usually more
effective than GMSK.

Where GSM/GPRS uses four different coding schemes, CS-1 to CS-4, EDGE uses nine
different schemes, called MCS-1 to MCS-9. The first four schemes use GMSK and are
effective in conditions with poor data-rates. The last five schemes use 8PSK and
offer higher data-rates. The first four schemes use rates from 8.8 kbps to 17.6 kbps per
timeslot and the last five use data-rates from 22.4 kbps to 59.2 kbps per timeslot.

The theoretical maximum data-rate using EDGE with eight timeslots is 473.6 kbps
(8 x 59.2 kbps).
4.3 3G
3G is short for third generation mobile system. One 3G system is UMTS[8] (Universal
Mobile Telecommunications System). UMTS is completely different from GSM/GPRS.
UMTS uses a technology called WCDMA (Wideband Code Division Multiple Access)
that does not use timeslots like GSM/GPRS. Instead the devices that use WCDMA,
such as mobile phones, share the same frequency and are separated from each other by
the use of unique codes (spreading codes).

Because of this, UMTS networks need different base station subsystems than GSM/GPRS
networks. UMTS base station subsystems are called UTRAN (UMTS Terrestrial Radio
Access Network).

UMTS uses the same GPRS core network as GSM/GPRS systems, and UMTS mobile
stations are also backward compatible with GSM/GPRS so that they can use GSM/GPRS
if there is no UMTS connection available.

UMTS has much higher bitrates than GSM, GPRS and EDGE, with theoretical maximum
speeds of 384 kbps for circuit-switched connections (voice and video calls) and
2 Mbps for packet-switched connections (data and Internet connections).

UMTS has four different quality-of-service classes. The list below gives examples of
applications that use the different classes.

- Conversational class: For real-time services like voice and video calls and real-time
gaming.
- Streaming class: For streaming multimedia content.
- Interactive class: For web browsing and non-real-time gaming.
- Background class: For background download of emails, for example.

The first two classes are transmitted as real-time connections over the WCDMA air
interface and the last two are transmitted as scheduled non-real-time packet data.

The conversational and the streaming classes are for services that need lower response
times and higher throughput than the interactive and the background classes.
4.4 Mobile network architecture
Figure 4.1 gives a simplified view of how the mobile technologies (GPRS/EDGE/3G)
link together as a network. Both UTRAN and GSM BSS (GSM Base Station Subsystem)
share the same GPRS backbone to send and receive data. From the GPRS
backbone the data is passed through a firewall to reach the Internet and the streaming
server.

[Diagram components: 3G mobile, UTRAN, GPRS/EDGE mobile, GSM BSS, GPRS backbone, Firewall, Internet, Streaming server]

Figure 4.1: Typical mobile streaming architecture


Chapter 5

Streaming technology

This chapter describes the streaming technology used in the system.

5.1 Streaming protocols

To stream audio books to mobile phones the system uses the RTSP, SDP and RTP
protocols, which are standard for network media streaming. RTSP is used to set up
the data stream between the server and the application on the mobile phone. It is also
used to handle events such as play and pause. RTP is the protocol that delivers the
actual data that is streamed, in this case the audio data. Figure 5.1 shows the
packet-switched streaming protocol stack that 3GPP (Third Generation Partnership
Project) has agreed upon for the 3G mobile networks.

Still images,     | Session set-up, | Quality feedback | Media delivery
graphics, text    | control         |                  |
                  | SDP             |                  | payload formats
HTTP              | RTSP            | RTCP             | RTP
TCP                                 | UDP
IP

Figure 5.1: 3GPP Packet Switched Streaming protocol stack

5.1.1 RTSP

RTSP stands for Real-Time Streaming Protocol. RTSP is used for establishment and
control of time-synchronized streams of continuous media such as audio
and video. One could think of RTSP as a remote control for multimedia servers. To
make the interaction between RTSP servers and clients flexible, the protocol does not
have a notion of a connection. Instead an RTSP server maintains a session identifier
associated with media streams and their state. This means that during an RTSP session
a client may use many different types of reliable transport protocols to issue RTSP
requests to the RTSP server as long as it knows the session identifier.

The multimedia streams controlled by RTSP are not required to use any specific
transport protocol to deliver the media. This makes RTSP very general and
extensible. However, the most common transport protocol for the media in use with
RTSP is RTP (Real-Time Transport Protocol).

RTSP is text-based and has to a high degree inherited its design from
HTTP/1.1. This makes it easy to read, understand and debug. However, RTSP differs
from HTTP in some key aspects:
- RTSP uses a session concept in the protocol.
- An RTSP server needs to maintain state, as opposed to the stateless nature of
HTTP.
- Both an RTSP server and client can issue requests.
- The Request-URI always contains the absolute URI.
The fact that RTSP is text-based (UTF-8) means that it is vulnerable to bit errors and
should not be exposed to them. The messages can be carried over any lower-level
transport protocol that is 8-bit clean. Every line is terminated by CRLF (carriage return
followed by a line feed). The basic structure of the text-based messages in RTSP is
shown in figure 5.2.

Message type | General header* | Message type header* | Entity header* | CRLF | [Message-body]

Figure 5.2: Basic structure of the RTSP messages

Explanation of figure 5.2:

- Message type can be either a request or a response.
- General-header, message type-header and entity-header are marked with * because
they are all optional.
- A message body follows only if the Content-Length entity-header is set.

RTSP request messages

As mentioned before, RTSP request messages may be issued by either the client or
the server. The structure of a request message is shown in figure 5.3.

Request-Line | General header* | Request header* | Entity header* | CRLF

Figure 5.3: Structure of the RTSP request message

The most important part of the request message is the so-called Request-Line, see figure
5.4. The Request-Line is the first line of a request message and consists of Method,
Request-URI and protocol version.

Method WS Request-URI WS RTSP-Version CRLF

Figure 5.4: Structure of the RTSP Request-Line. The WS stands for whitespace and
CRLF means carriage return followed by a line feed.

The methods that can be requested are the following:

- DESCRIBE
- GET_PARAMETER
- OPTIONS
- PAUSE
- PLAY
- PING
- REDIRECT
- SETUP
- SET_PARAMETER
- TEARDOWN

Below is an example of how a request method could be used by a client. The client
would like to know what types of methods the server supports and sends the following
request message.

OPTIONS rtsp://example.com/ RTSP/1.0

For more about the methods and how they all link together see section "RTSP Methods"
in this chapter.
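
As an illustration of the Request-Line and header layout described above, the following sketch assembles a request message as a string. It is written in standard Java for brevity and is not taken from the thesis prototype. The class name RtspRequest and its build method are hypothetical, and the CSeq header is added here because the examples in this chapter carry one.

```java
// Sketch only: builds an RTSP request as text, following the structure
// Method WS Request-URI WS RTSP-Version CRLF, then headers, then an empty line.
// RtspRequest and build are hypothetical names used for illustration.
final class RtspRequest {
    private RtspRequest() {}

    static final String CRLF = "\r\n";

    /**
     * @param method  RTSP method, for example "OPTIONS" or "DESCRIBE"
     * @param uri     absolute Request-URI
     * @param cseq    sequence number placed in the CSeq header
     * @param headers extra headers as {name, value} pairs, may be empty
     */
    static String build(String method, String uri, int cseq, String[][] headers) {
        StringBuilder sb = new StringBuilder();
        // Request-Line
        sb.append(method).append(' ').append(uri).append(" RTSP/1.0").append(CRLF);
        sb.append("CSeq: ").append(cseq).append(CRLF);
        for (int i = 0; i < headers.length; i++) {
            sb.append(headers[i][0]).append(": ").append(headers[i][1]).append(CRLF);
        }
        // The empty line marks the end of the header section.
        sb.append(CRLF);
        return sb.toString();
    }
}
```

For example, build("OPTIONS", "rtsp://example.com/", 1, new String[0][]) yields the OPTIONS request shown above followed by a CSeq header and the terminating empty line.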

RTSP response message

This type of RTSP message is a response to an RTSP request message. The structure
of such a message is shown in figure 5.5.

Status-Line | General header* | Response header* | Entity header* | CRLF

Figure 5.5: Structure of the RTSP response message

The first line of a response message is the Status-Line, see figure 5.6. The Status-Line
consists of the protocol version followed by a numeric status code and the textual
phrase associated with that code, the so-called Reason-Phrase.

RTSP-Version WS Status-Code WS Reason-Phrase CRLF

Figure 5.6: The Status-Line of the RTSP response message. The WS stands for
whitespace and CRLF means carriage return followed by a line feed.

The status codes of response messages are divided into the general groups seen
below:

- 1XX - Informational
- 2XX - Success
- 3XX - Redirection
- 4XX - Client Error
- 5XX - Server Error

Each of these groups is divided into specific status codes that give more
specific information. For example a server may send the client "RTSP/1.0 551 Option
not supported" when the client requests an option that is not implemented. For a
complete list see Appendix A.
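
A client has to split the Status-Line into its three parts before it can react to a response. The sketch below shows one way to do this in standard Java and to map the status code to its general group. The class RtspStatusLine and its members are hypothetical names chosen for this illustration, not the parser used in the prototype.

```java
// Sketch only: parses "RTSP-Version WS Status-Code WS Reason-Phrase CRLF".
// RtspStatusLine is a hypothetical class used for illustration.
final class RtspStatusLine {
    final String version;
    final int code;
    final String reason;

    private RtspStatusLine(String version, int code, String reason) {
        this.version = version;
        this.code = code;
        this.reason = reason;
    }

    static RtspStatusLine parse(String line) {
        String s = line.trim(); // drops the trailing CRLF if present
        int sp1 = s.indexOf(' ');
        int sp2 = (sp1 < 0) ? -1 : s.indexOf(' ', sp1 + 1);
        if (sp2 < 0) {
            throw new IllegalArgumentException("Malformed Status-Line: " + line);
        }
        int code = Integer.parseInt(s.substring(sp1 + 1, sp2));
        // The Reason-Phrase may itself contain spaces, so take the rest of the line.
        return new RtspStatusLine(s.substring(0, sp1), code, s.substring(sp2 + 1));
    }

    /** Maps the first digit of the status code to its general group. */
    String group() {
        switch (code / 100) {
            case 1: return "Informational";
            case 2: return "Success";
            case 3: return "Redirection";
            case 4: return "Client Error";
            case 5: return "Server Error";
            default: return "Unknown";
        }
    }
}
```

Parsing the example Status-Line "RTSP/1.0 551 Option not supported" gives the code 551, which falls in the Server Error group.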

RTSP Methods

The most important RTSP methods are described in the following section in order to
give a more practical view of RTSP. Figure 5.7 shows an overview of a streaming session
using RTSP, SDP and RTP. Time runs from top to bottom.

Streaming client                  Streaming server
  DESCRIBE              ------>
                        <------   RTSP/1.0 200 OK (with SDP description)
  SETUP                 ------>
                        <------   RTSP/1.0 200 OK
  PLAY                  ------>
                        <------   RTSP/1.0 200 OK
                        <------   RTP packet flow (UDP)
                          ...
  TEARDOWN              ------>
                        <------   RTSP/1.0 200 OK

Figure 5.7: Overview of streaming using RTSP, SDP and RTP

DESCRIBE
The DESCRIBE method retrieves the description of a presentation or media object
identified by the request URI from a server. The DESCRIBE request-response pair can
be seen as an initialization phase of RTSP. The description of the media in the example
below is in SDP (Session Description Protocol) format. For easier reading, the CRLF
line endings are printed only in this example.

(Client -> Server)

DESCRIBE rtsp://example.com/file.3gp RTSP/1.0\r\n
CSeq: 1\r\n
Accept: application/sdp\r\n
User-Agent: MobiBook (V 0.1)\r\n\r\n

(Server -> Client)

RTSP/1.0 200 OK\r\n
Cseq: 1\r\n
Last-Modified: Fri, 14 Jan 2005 09:53:08 GMT\r\n
Cache-Control: must-revalidate\r\n
Content-length: 451\r\n
Date: Thu, 12 May 2005 15:27:09 GMT\r\n
Content-Type: application/sdp\r\n\r\n
v=0
o=StreamingServer 3324900457 1105696388000 IN IP4 193.0.0.2
s=/file.3gp
c=IN IP4 193.0.0.2
t=0 0
a=control:*
a=range:npt=0-19.52000
m=audio 0 RTP/AVP 97
a=control:trackID=4
a=rtpmap:97 AMR/8000/1
a=fmtp:97 octet-align=1;

SETUP
The SETUP request for a URI specifies the transport mechanism to be used for the
streamed media. The client specifies its acceptable transport parameters in the
Transport header. The server response contains the transport parameters for both
client and server, including an SSRC (synchronization source) number used by RTCP [10].
Even port numbers are used for transmitting the data and odd port numbers are
used by RTCP.

(Client -> Server)


SETUP rtsp://193.0.0.2:554/file.3gp/trackID=4 RTSP/1.0
CSeq: 2
Transport: RTP/AVP;unicast;client_port=6974-6975
User-Agent: MobiBook (V 0.1)

(Server -> Client)


RTSP/1.0 200 OK
Cseq: 2
Session: 5394457754580508750
Transport: RTP/AVP;unicast;source=193.0.0.2;client_port=6974-6975;
server_port=6970-6971;ssrc=0F0F0E42
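
The client needs the server's port pair and the SSRC from the Transport header of the SETUP response. The following sketch shows one way to pull individual parameters out of that semicolon-separated header value. The class `TransportHeader` and its method names are hypothetical, and the header value is assumed to have been joined into a single line first.

```java
/** Minimal sketch: read parameters from an RTSP Transport header value. */
final class TransportHeader {

    /** Returns the value of a "name=value" parameter, or null if absent. */
    static String parameter(String transport, String name) {
        int start = 0;
        while (start <= transport.length()) {
            int end = transport.indexOf(';', start);
            if (end < 0) {
                end = transport.length();
            }
            String part = transport.substring(start, end).trim();
            if (part.startsWith(name + "=")) {
                return part.substring(name.length() + 1);
            }
            start = end + 1;
        }
        return null;
    }

    /** Splits a port pair such as "6970-6971" into {rtpPort, rtcpPort}. */
    static int[] portPair(String value) {
        int dash = value.indexOf('-');
        if (dash < 0) {
            throw new IllegalArgumentException("expected a port pair: " + value);
        }
        return new int[] {
            Integer.parseInt(value.substring(0, dash)),
            Integer.parseInt(value.substring(dash + 1))
        };
    }
}
```

With the response above, `parameter(value, "server_port")` yields the string `6970-6971`, where the even port carries RTP and the odd port carries RTCP.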

PLAY
The PLAY request tells the server to start streaming the data to the client. The server
sends the data according to the transport parameters agreed upon in the SETUP
request. Because the request includes the session identification number retrieved from
the SETUP response, the server knows what data to send. In the example below, a Range
header is also included that tells the server to stream a specific interval of the media.
The server response to the PLAY request, in this example, also includes an RTP-Info
header. The RTP-Info header consists of a semicolon-separated string.

(Client -> Server)


PLAY rtsp://193.0.0.2:554/file.3gp RTSP/1.0
CSeq: 3
Range: npt=0.000000-19.520000
Session: 5394457754580508750
User-Agent: MobiBook (V 0.1)

(Server -> Client)


RTSP/1.0 200 OK
Cseq: 3
Session: 5394457754580508750
Range: npt=0.00000-19.52000
RTP-Info: url=rtsp://193.0.0.2:554/file.3gp/trackID=4;seq=43413;rtptime=310761492

PAUSE
The PAUSE request causes the stream to be halted temporarily. The server's resources
are kept until a PLAY request is sent to the server or the session times out.

(Client -> Server)


PAUSE rtsp://193.0.0.2/file.3gp RTSP/1.0
CSeq: 4
Session: 5394457754580508750
User-Agent: MobiBook (V 0.1)

(Server -> Client)


RTSP/1.0 200 OK
Cseq: 4
Session: 5394457754580508750

TEARDOWN
The TEARDOWN request stops the stream for the given URI, freeing the resources
associated with it on the server side.

(Client -> Server)


TEARDOWN rtsp://193.0.0.2:554/file.3gp RTSP/1.0
CSeq: 5
Session: 5394457754580508750
User-Agent: MobiBook (V 0.1)

(Server -> Client)


RTSP/1.0 200 OK
CSeq: 5

5.1.2 RTP

RTP stands for Real-time Transport Protocol. It is a protocol for streaming data in
real time. RTP delivers the data from the server to the party that requested
the data.

RTP does not ensure that the packets are delivered to the receiver in the right
order. That is up to the receiver to ensure. To make this possible, every packet has a
sequence number that increments by one for every RTP packet sent.

RTP also does not ensure timely delivery or give other quality-of-service
guarantees. For this it relies on lower-layer services.

The structure of an RTP packet header is shown in figure 5.8. The header is followed
by the payload, which contains the media that is streamed. Each number at the top of
the figure represents one bit.
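 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P|X|  CC   |M|     PT      |        Sequence number        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           Timestamp                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Synchronization source (SSRC) identifier            |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            Contributing source (CSRC) identifiers             |
|                              ...                              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+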

Figure 5.8: RTP header

The fixed header does not contain the CSRC (contributing source) identifiers. They
are present only when a mixer has combined several contributing sources into one
stream. In the audio book system each session has a single source streaming to a
single receiver, so no CSRC identifiers are necessary.

Explanation of figure 5.8:

- V (2 bits) = Version. Describes what version of RTP this packet uses. In the figure
  the version is 2.
- P (1 bit) = Padding. If this bit is set to 1, the RTP packet has at
  least one padding octet at the end. The last byte of the packet shows the number
  of padding octets in the packet.
- X (1 bit) = Extension. If the extension bit is set to 1, the fixed header is extended
  by one header extension. That is not relevant in this project, but more information
  about it can be found in [10].
- CC (4 bits) = CSRC count. The CC contains the number of CSRC identifiers
  contained in the RTP header. The CSRC identifiers follow the fixed header.
- M (1 bit) = Marker. How the marker bit is interpreted is defined by a profile. For
  example, events such as frame boundaries can be marked this way.
- PT (7 bits) = Payload type. The PT holds information about what kind of payload
  the packet contains. In this system the PT identifies that AMR audio is streamed.
- Sequence number (16 bits). The sequence number is the packet's sequence number.
  It increments by one for each packet sent by the server. The initial value of the
  sequence number is random and unpredictable, which makes the protocol
  more secure. The sequence number is used by the receiver to put all the received
  packets in the right order.
- Timestamp (32 bits). The timestamp reflects the sampling instant of the first
  octet of data in the payload of the RTP packet. The initial value is random, and
  it increases linearly and monotonically in time.
- SSRC (32 bits). The SSRC identifies the synchronization source. It is chosen
  randomly so that two or more synchronization sources in the same RTP session
  do not get the same SSRC identifier.
- CSRC (0-15 items, 32 bits each). This list of CSRC identifiers identifies the
  contributing sources for the payload contained in this packet.
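
The field layout above maps directly onto bit operations on the first twelve bytes of a received datagram. The following is a minimal sketch of such a decoder. The class `RtpHeader` is hypothetical and is not the parser used in the application, and it ignores any header extension when computing the header length.

```java
/** Minimal sketch: decode the fixed RTP header from a received datagram. */
final class RtpHeader {
    final int version;        // V, 2 bits
    final boolean padding;    // P, 1 bit
    final boolean extension;  // X, 1 bit
    final int csrcCount;      // CC, 4 bits
    final boolean marker;     // M, 1 bit
    final int payloadType;    // PT, 7 bits
    final int sequenceNumber; // 16 bits, unsigned
    final long timestamp;     // 32 bits, unsigned
    final long ssrc;          // 32 bits, unsigned
    final int headerLength;   // 12 bytes plus 4 per CSRC (extension ignored)

    RtpHeader(byte[] p) {
        if (p.length < 12) {
            throw new IllegalArgumentException("RTP packet shorter than 12 bytes");
        }
        version = (p[0] >> 6) & 0x03;
        padding = (p[0] & 0x20) != 0;
        extension = (p[0] & 0x10) != 0;
        csrcCount = p[0] & 0x0F;
        marker = (p[1] & 0x80) != 0;
        payloadType = p[1] & 0x7F;
        sequenceNumber = ((p[2] & 0xFF) << 8) | (p[3] & 0xFF);
        timestamp = readUnsigned32(p, 4);
        ssrc = readUnsigned32(p, 8);
        headerLength = 12 + 4 * csrcCount;
    }

    /** Reads four bytes in network (big-endian) order as an unsigned value. */
    private static long readUnsigned32(byte[] b, int off) {
        return ((long) (b[off] & 0xFF) << 24)
                | ((b[off + 1] & 0xFF) << 16)
                | ((b[off + 2] & 0xFF) << 8)
                | (b[off + 3] & 0xFF);
    }
}
```

The payload of a packet then starts at offset `headerLength`, and the receiver can use `sequenceNumber` to put packets back in order.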

The implementation of RTP in this project has UDP [15] (User Datagram Protocol)
as its underlying transport protocol. UDP in turn runs on top of IP [14] (Internet
Protocol). See figure 5.9 for an overview of the different layers that are used in
the RTP implementation.

RTP

UDP

IP

Figure 5.9: The protocol stack for RTP

5.1.3 RTCP

There was no time to implement this protocol, and RTCP is not strictly necessary for
the application. But RTCP is often used in systems that use RTP, so a short
description of RTCP is given here. For further information about RTCP, see [10].

RTCP stands for RTP Control Protocol. As the name indicates, it is used as a
control protocol while streaming with RTP. RTCP is used to monitor the quality of
service in the streaming session. It is also used to convey information about the
members of a streaming session.

Using RTCP it is possible to monitor delay and bandwidth quality and to gather
statistics. This can be used to deliver the best quality possible to the members of a
session. For example, if there is a video conference with three participants and one of
them does not have as good bandwidth as the other two, that member can receive its
data in lower quality than the others, using a so-called mixer.

Since there will only be one member in each streaming session of the audio book system,
one mobile phone that receives the data for each specific stream, the use of RTCP is not
as necessary as in sessions with more than one member. For now the implementation of
RTCP is left for future work.
5.2 Session Description Protocol
SDP (Session Description Protocol) is used to describe general multimedia sessions.
The protocol describes the media a user wants to receive, such as audio, video or both,
which codecs to use and so on. SDP is used in the DESCRIBE request-response
pair to describe the media session of the audio book the user wants to listen
to.

An SDP session description consists of a number of text lines of the form Type=Value.
Type is always exactly one character and is case-significant. The different types that
are used are:

Session description

v=   protocol version
o=   owner/creator and session identifier
s=   session name
i=*  session information
u=*  URI of description
e=*  email address
p=*  phone number
c=*  connection information - not required if included in all media
b=*  bandwidth information
<Block with one or more time descriptions, see below>
z=*  time zone adjustments
k=*  encryption key
a=*  zero or more session attribute lines
<Block with zero or more media descriptions, see below>

Time description

t=   time the session is active
r=*  zero or more repeat times

Media description

m=   media name and transport address
i=*  media title
c=*  connection information - optional if included at session-level
b=*  bandwidth information
k=*  encryption key
a=*  zero or more media attribute lines

The types marked with * signs are optional.
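
Because every line has the form Type=Value, a description can be read with a simple line scan. The sketch below looks up the first line of a given type and the first attribute with a given name. The class `Sdp` and its methods are hypothetical, and the sketch deliberately does not distinguish session-level lines from media-level lines, which a complete parser would have to do.

```java
/** Minimal sketch: look up values in an SDP description by line type. */
final class Sdp {

    /** Returns the value of the first "type=value" line, or null if absent. */
    static String firstValue(String sdp, char type) {
        return valueAfter(sdp, type + "=");
    }

    /** Returns the value of the first "a=name:value" attribute, or null. */
    static String attribute(String sdp, String name) {
        return valueAfter(sdp, "a=" + name + ":");
    }

    /** Scans line by line and returns what follows the first matching prefix. */
    private static String valueAfter(String sdp, String prefix) {
        int start = 0;
        while (start < sdp.length()) {
            int end = sdp.indexOf('\n', start);
            if (end < 0) {
                end = sdp.length();
            }
            String line = sdp.substring(start, end).trim(); // drops a trailing CR
            if (line.startsWith(prefix)) {
                return line.substring(prefix.length());
            }
            start = end + 1;
        }
        return null;
    }
}
```

For the description in the DESCRIBE example, `Sdp.attribute(sdp, "rtpmap")` would give `97 AMR/8000/1`, which tells the client that payload type 97 carries AMR audio.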


5.3 Audio formats
The sound format primarily used in the system is the AMR (Adaptive Multi-Rate) speech
codec. This chapter does not describe how to encode or decode audio files of other
formats to AMR files; it describes how to send and receive AMR audio through
RTP.

5.3.1 AMR

Each audio frame of AMR-NB (Narrow-Band), or simply AMR, delivered through RTP
is 20 ms long, so the actual data size of the frame varies depending on the bit rate
of the audio.
The AMR bit rate varies from 4.75 kbit/s to 12.2 kbit/s.

The header of the AMR payload within the RTP packets is shown in figure 5.10. It
is one byte long.
 0 1 2 3 4 5 6 7
+-+-+-+-+-+-+-+-+
|F|  FT   |Q|P|P|
+-+-+-+-+-+-+-+-+

Figure 5.10: Header of AMR and AMR-WB payload within an RTP packet

Explanation of figure 5.10:

- F (1 bit). If this bit is set to 1, it indicates that this frame is followed by another
  frame in this payload; otherwise it should be set to 0.
- FT (4 bits) = Frame type. This indicates what kind of AMR or AMR-WB [2]
  (Wide-Band) speech coding mode or comfort noise mode (using SID - Silence
  Descriptor) the frame is in. From this it is possible to look up the bit rate
  of the frame.
- Q (1 bit) is a frame quality indicator. If Q is set to 0, it means that the frame is
  severely damaged.
- P (2 bits) are padding bits and must be set to zero.
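
The header byte can be decoded with a few shifts and masks, and the frame type can then be used as an index into a table. The sketch below does this for AMR-NB only. The class `AmrFrameHeader` is hypothetical; the bit rates and frame sizes are the standard AMR-NB values for 20 ms frames (for example 12.2 kbit/s gives 244 bits, which rounds up to 31 bytes), and frame type 8 is the AMR SID frame.

```java
/** Minimal sketch: decode the one-byte AMR payload header (AMR-NB only). */
final class AmrFrameHeader {
    // Bit rates in bit/s for the speech frame types 0..7.
    private static final int[] BIT_RATE =
            { 4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200 };
    // Data bytes per 20 ms frame for frame types 0..7, and 8 (SID).
    private static final int[] FRAME_BYTES =
            { 12, 13, 15, 17, 19, 20, 26, 31, 5 };

    final boolean followed; // F: another frame follows in this payload
    final int frameType;    // FT: 0..15
    final boolean good;     // Q: false means the frame is severely damaged

    AmrFrameHeader(int headerByte) {
        followed = (headerByte & 0x80) != 0;
        frameType = (headerByte >> 3) & 0x0F;
        good = (headerByte & 0x04) != 0;
    }

    /** True for the AMR comfort-noise (SID) frame type. */
    boolean isSid() {
        return frameType == 8;
    }

    /** Bit rate in bit/s, or 0 if the frame type is not a speech mode. */
    int bitRate() {
        return frameType < 8 ? BIT_RATE[frameType] : 0;
    }

    /** Number of data bytes that follow for this frame type, or 0 if none. */
    int dataBytes() {
        return frameType <= 8 ? FRAME_BYTES[frameType] : 0;
    }
}
```

For example, the header byte 0x3C decodes to F=0, FT=7 and Q=1, that is, a single undamaged 12.2 kbit/s frame with 31 bytes of data.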

If the RTP payload contains only one AMR or AMR-WB frame, the actual AMR/AMR-WB
audio data follows directly after the header. If it contains more than one frame, the
payload is structured as in figure 5.11.

 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7 0 1 2 3 4 5 6 7
+---------------+---------------+---------------+---------------+
|    Header1    |    Header2    |    Header3    |    Header4    |
+---------------+---------------+---------------+---------------+
| Frame 1 ...                                                   |
| Frame 2 ...                                                   |
| Frame 3 ...                                                   |
| Frame 4 ...                                                   |
+---------------------------------------------------------------+

Figure 5.11: The structure of AMR and AMR-WB payload within an RTP packet if the
payload contains more than one AMR or AMR-WB frame

Explanation of figure 5.11:

- Header1 - Header4 are all headers of the type shown in figure 5.10. Each header
  is one byte long.
- Frame 1 - Frame 4 contain the data of the AMR/AMR-WB audio. Frame 1 is
  the payload/data corresponding to Header1 and so forth. The frames are marked
  with "..." because they vary in size. As said before, the size of the frames depends
  on the bit rate of the AMR/AMR-WB audio.

5.3.2 AMR-WB

AMR-WB (Wide-Band) has much better sound quality than AMR audio, but it
needs more bandwidth because it contains more data.

The bit rate of AMR-WB audio varies from 6.60 kbit/s up to 23.85 kbit/s.

The header of the AMR-WB payload within an RTP packet is the same as for AMR
audio, see figure 5.10. Also, as in the case of AMR audio, AMR-WB audio can be
sent with more than one frame per RTP packet, see figure 5.11.
5.3.3 AMR RTP data to AMR audio data

The AMR/AMR-WB format used when the audio is streamed as payload within an RTP
packet differs from the format used when it is played back as an audio file. This
means that the AMR/AMR-WB audio data received from the RTP packets needs to be
converted to the format used when playing an AMR/AMR-WB audio file.

A header that identifies the audio type is needed at the beginning of the file. For
AMR audio the header looks like figure 5.12 and for AMR-WB the header looks like
figure 5.13. Each character in the two headers is one byte long.

#!AMR\n

Figure 5.12: AMR file header

#!AMR-WB\n

Figure 5.13: AMR-WB file header

The audio data from the RTP packets needs to be appended to the header. Each audio
frame needs a header that looks like figure 5.14, where each character represents one
bit. P stands for padding, T for payload type and V for valid. The frame header is one
byte long.

PTTTTVPP

Figure 5.14: AMR frame header

The audio data frames need to be packed in big-endian order, that is, with the most
significant bit of each byte as the first bit.
Figure 5.15 shows how the headers and the audio data are combined into an AMR audio
file. The figure does not show the size of the file or of its different parts, only the
structure.

#!AMR\n

Header1

Frame1
...
Header2

Frame2
...
.
.
.
HeaderN

FrameN
...

Figure 5.15: AMR file

For more information about this see [21].
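
To make the conversion concrete, the following sketch builds an AMR-NB file image in memory from RTP payloads. It is an illustration only, not the code of the application. The class `AmrFileWriter` is hypothetical, and the sketch assumes that each payload starts directly with the header bytes laid out as in figure 5.11; any extra leading bytes that a particular payload format carries would have to be skipped first.

```java
/** Minimal sketch: turn AMR-NB RTP payloads into an in-memory AMR file. */
final class AmrFileWriter {
    // Data bytes per 20 ms frame for frame types 0..7, and 8 (SID).
    private static final int[] FRAME_BYTES =
            { 12, 13, 15, 17, 19, 20, 26, 31, 5 };

    private final java.io.ByteArrayOutputStream out =
            new java.io.ByteArrayOutputStream();

    AmrFileWriter() {
        byte[] magic = { '#', '!', 'A', 'M', 'R', '\n' }; // file header
        out.write(magic, 0, magic.length);
    }

    /** Appends all frames of one payload laid out as header list + frames. */
    void appendPayload(byte[] payload) {
        // Count the header bytes: the last header has its F bit cleared.
        int count = 0;
        do {
            if (count >= payload.length) {
                throw new IllegalArgumentException("truncated header list");
            }
        } while ((payload[count++] & 0x80) != 0);

        int pos = count; // frame data starts right after the headers
        for (int i = 0; i < count; i++) {
            int frameType = (payload[i] >> 3) & 0x0F;
            int size = frameType <= 8 ? FRAME_BYTES[frameType] : 0;
            if (pos + size > payload.length) {
                throw new IllegalArgumentException("truncated frame data");
            }
            out.write(payload[i] & 0x7C);   // frame header PTTTTVPP, F cleared
            out.write(payload, pos, size);  // frame data
            pos += size;
        }
    }

    /** Returns the file image built so far. */
    byte[] toByteArray() {
        return out.toByteArray();
    }
}
```

The resulting byte array has the structure of figure 5.15 and could be handed to a player through a `java.io.ByteArrayInputStream`.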


Chapter 6

Results

The result of this master's thesis project is a system prototype that shows that it is
possible to stream audio books to mobile phones using J2ME and the RTSP, SDP and
RTP protocols.

6.1 Fulfillment of requirements

The implemented system has reached the following status with respect to the proposed
requirements:

System requirements:

- SR1: All audio book information should be stored in a database.
  Fulfilled. All audio book information is stored in a MySQL database.
- SR2: The system should work in 3G and GPRS networks.
  Fulfilled. The system has been tested in both 3G and GPRS networks and works fine.
- SR3: The audio format used for streaming should be suitable for speech and use
  low bandwidth.
  Fulfilled. The audio format used is AMR, which is optimized for speech and very
  bandwidth-efficient.
- SR4: The system should be scalable.
  Fulfilled. Both the streaming server and the database have been proven to scale
  very well; together with the mobile network, they are the potential bottlenecks in
  the system.
- SR5: The system should use open-source alternatives where possible.
  Not fulfilled. The system unfortunately uses a proprietary sound format (AMR).

Application requirements:

- AR1: The application should be developed using J2ME.
  Fulfilled.
- AR2: Support audio streaming through the RTSP, SDP and RTP protocols. RTSP
  and SDP should have TCP as underlying transport protocol while RTP should
  have UDP.
  Fulfilled. All protocols are implemented, although in a slightly reduced version.
- AR3: The application should be able to fetch book information from the database.
  Fulfilled. The application uses HTTP to communicate with a servlet that sends
  the requested information from the database back to the application.
- AR4: From the application a user should be able to buy a book.
  Not fulfilled. No actual billing system has been connected to the application, but
  it is possible to buy a book without any billing.
- AR5: From the application a user should be able to search for books in the database.
  Fulfilled. It is possible to search for books by author, title, ISBN and category.
- AR6: The application should have an audio book toplist and an offers-list.
  Fulfilled.
- AR7: Bookmarks and information about books that have been bought should be
  stored in the memory of the mobile phone.
  Fulfilled.
- AR8: From the application a user should be able to play, pause, rewind, fast
  forward, set a bookmark and change the volume.
  Fulfilled.
- AR9: The application should be able to resume listening from where the user
  last stopped listening.
  Fulfilled.
- AR10: The application should have support for multiple languages.
  Fulfilled. Currently implemented languages are Swedish and English.
6.2 System overview
The system can be divided into three main modules, see figure 6.1.

Servlet/Database

J2ME application

Streaming server

Figure 6.1: The three main modules in the system

The J2ME application on the mobile phone can be divided into five submodules, see
figure 6.2.

Main-menu

Servlet communication

Streaming communication

Player-GUI

Audio player

Figure 6.2: The submodules of the J2ME application

Explanation of the different submodules in figure 6.2:

- Main-menu: The main module and the "Main-menu" of the J2ME application.
  The application starts and ends in this module. The other
  submodules of the application are started from here.
- Servlet communication: This submodule handles the HTTP communication
  between the application and the servlet/database.
- Streaming communication: This submodule handles the RTSP/SDP and RTP
  communication between the application and the streaming server.
- Player-GUI: The player-GUI handles the GUI of the player, that is, what is painted
  on the screen and when.
- Audio player: This part handles the playback of the audio received from the
  streaming server. It extracts the audio from the RTP packets, puts it together
  as an AMR audio file and then plays it.

6.3 Communication

Figure 6.3 shows the different data communication protocols and between which parts
of the system they are used.

[Figure: the mobile phone connects through the 3G/GPRS network and the Internet
to the servlet/database using HTTP, and to the stream server using RTSP/SDP and RTP]

Figure 6.3: Overview of the different data communication protocols used in the system

As seen in figure 6.3 there are four different data communication protocols in the
system: HTTP, RTSP, SDP and RTP.

HTTP stands for Hypertext Transfer Protocol. This is the standard protocol for the
WWW (World-Wide Web). The mobile phone uses HTTP to communicate with the
servlet and vice versa. The information sent between these two parts consists of text
strings containing information about audio books stored in the database.

HTTP is thus the protocol used when searching for books, getting the toplist and
the offers-list, buying books and receiving more information about books.
The string with book information is built up as in figure 6.4.

Author;Title;Info;Category;Year;Length;ISBN;
Price;Image_URL;Stream_server_URL;Nr_of_parts$

Figure 6.4: Book information sent from the servlet to the mobile phone

The string with toplist information is built up as in figure 6.5.

Author;Title;Category;ISBN;...;...$

Figure 6.5: Toplist information sent from the servlet to the mobile phone

The string with offers-list information is built up as in figure 6.6.

Author;Title;Category;Price;ISBN;...;...$

Figure 6.6: Offers-list information sent from the servlet to the mobile phone

An explanation of the different parts of figure 6.4, figure 6.5 and figure 6.6 follows
below.

- Author = The name of the author of the book.
- Title = The title of the book.
- Info = Information about the story of the book.
- Category = The category of the book.
- Year = The year the book was published.
- Length = The length of the book in seconds.
- ISBN = The ISBN of the book.
- Price = The price of the book.
- Image_URL = The URL to an image of the book cover.
- Stream_server_URL = The URL to the streaming server, containing the name of
  the audio file.
- Nr_of_parts = The number of parts the book is divided into.

The different parts in the string are separated with ";". At the end of the information
string from the servlet there is a "$". "..." means that there can be information about
more than one book in a string.
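
Splitting such a string on the phone is straightforward. The sketch below cuts the response at the "$", splits it at every ";" and groups the fields into one row per book. The class `BookInfo` and its methods are hypothetical, and the sketch assumes that no field itself contains ";" or "$", which the format described above does not guarantee.

```java
/** Minimal sketch: split a servlet response into per-book field arrays. */
final class BookInfo {

    /** Splits "a;b;c$" into {"a", "b", "c"}; text after '$' is ignored. */
    static String[] splitFields(String response) {
        int end = response.indexOf('$');
        String body = end < 0 ? response : response.substring(0, end);
        int count = 1;
        for (int i = 0; i < body.length(); i++) {
            if (body.charAt(i) == ';') {
                count++;
            }
        }
        String[] fields = new String[count];
        int start = 0;
        for (int i = 0; i < count; i++) {
            int sep = body.indexOf(';', start);
            if (sep < 0) {
                sep = body.length();
            }
            fields[i] = body.substring(start, sep);
            start = sep + 1;
        }
        return fields;
    }

    /** Groups the fields into rows of fieldsPerBook fields, one row per book. */
    static String[][] books(String response, int fieldsPerBook) {
        String[] fields = splitFields(response);
        if (fields.length % fieldsPerBook != 0) {
            throw new IllegalArgumentException("incomplete book record");
        }
        String[][] books = new String[fields.length / fieldsPerBook][fieldsPerBook];
        for (int i = 0; i < fields.length; i++) {
            books[i / fieldsPerBook][i % fieldsPerBook] = fields[i];
        }
        return books;
    }
}
```

A toplist response would be read with `BookInfo.books(response, 4)`, an offers-list response with five fields per book, and a full book information string with eleven.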

RTSP is described in the chapter "Streaming Technology" and is used to set up and
control the state of the audio book stream, such as play, pause etc. It is used between
the streaming server and the mobile phone.

RTP is also described in the chapter "Streaming Technology". It is the protocol that
transports the audio data from the streaming server to the mobile phone.
6.4 Playing streaming audio in J2ME
J2ME and MMAPI do not support playback of streaming media. The player in
MMAPI needs to buffer (realize) a whole audio file to be able to play it. To get around
this problem the application uses two buffers that take turns in receiving the audio data
from the RTP stream. By using this "two buffer" method the application can continue
receiving data in one buffer while feeding the other buffer to the player as an audio file
that it can realize.

One problem with this solution arises when the switch from one buffer to the other
takes place in the middle of a word. When this happens there is a short break in
the middle of the word. To get around this, the system cuts the audio at the last silent
part of the buffer before sending it to the player. The part that has been cut off is added
to the next audio part.

The silent parts of the AMR audio that is streamed in the system are represented by
SID (Silence Descriptor) frames.

This solution cannot be used while streaming music, because music generally does not
have any recurring silent parts throughout a song.

Figure 6.7 shows an overview of how the buffering and playback of the audio is done.
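
The cut at the last silent part can be located by walking through the buffered frames and remembering where the last SID frame ends. The sketch below is one guess at how this could be done, not the code of the application. The class `SilenceCut` is hypothetical, and it assumes that the buffer holds AMR-NB frames in file format, each with a one-byte frame header followed by its data, where frame type 8 is the SID frame.

```java
/** Minimal sketch: find where to cut a buffer of AMR-NB frames at silence. */
final class SilenceCut {
    // Data bytes per 20 ms frame for frame types 0..7, and 8 (SID).
    private static final int[] FRAME_BYTES =
            { 12, 13, 15, 17, 19, 20, 26, 31, 5 };
    private static final int SID = 8;

    /**
     * Returns the offset just after the last SID frame in frames[0..length).
     * If there is no SID frame, returns the end of the last complete frame.
     * Bytes from the returned offset onwards belong in the next buffer.
     */
    static int cutOffset(byte[] frames, int length) {
        int pos = 0;
        int lastSidEnd = -1;
        while (pos < length) {
            int frameType = (frames[pos] >> 3) & 0x0F;
            int size = 1 + (frameType <= SID ? FRAME_BYTES[frameType] : 0);
            if (pos + size > length) {
                break; // incomplete trailing frame, keep it for the next buffer
            }
            if (frameType == SID) {
                lastSidEnd = pos + size;
            }
            pos += size;
        }
        return lastSidEnd < 0 ? pos : lastSidEnd;
    }
}
```

Everything before the returned offset would be played now, and the remaining bytes would be copied to the start of the other buffer before new frames are appended to it.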

Receiving (time runs downward)       Converting and playing
------------------------------       ----------------------
RTP packets into buffer 1            -
RTP packets into buffer 2            buffer 1 is converted to a playable AMR audio
                                     file, which is then played
RTP packets into buffer 1            buffer 2 is converted to a playable AMR audio
                                     file, which is then played
RTP packets into buffer 2            buffer 1 is converted to a playable AMR audio
                                     file, which is then played
-                                    buffer 2 is converted to a playable AMR audio
                                     file, which is then played

Figure 6.7: An overview of the buffering and playback of the audio


6.5 Testing
The application was tested on a Nokia 6680 mobile phone, a 3G phone that also supports
GSM and GPRS. It also has support for Bluetooth, which makes it easy to install the
application on the phone at no cost.

Simulations alone cannot tell how the application will behave on a real phone, or
whether it will work at all. Testing on real hardware is therefore the most critical part.

Unfortunately, the test equipment (mobile phones and money for traffic costs) was not
received until the end of the project. Apart from simulations, no real testing was
therefore done until the project was almost finished.

The "Main menu" GUI and the HTTP connections between the servlet and the phone
worked without problems.

There was trouble with RTSP/SDP, however. The way the TCP packets from the
streaming server were read on the phone had to be modified before a streaming session
could be started. This had worked fine in the simulation environment, but not on the
phone.

A worrying factor was the rumor that it is not possible to connect to a mobile phone
using UDP unless the phone initiates the UDP connection. NAT and NAPT in the mobile
networks can make it difficult to establish UDP connections to mobile phones. If this
were the case, the streaming server would not be able to connect to the phone and
stream the RTP packets containing the audio data.

But once the TCP connection worked between the server and the phone, the UDP
connection also worked without any problems. Different mobile network operators
handle NAT/NAPT in different ways, so the system will not necessarily work in all
mobile networks, but it worked in the networks that Telia and 3 provide.

The connection worked, but there was another problem. The application on the phone
received the streamed audio data, but what came out of the phone's speaker was a
lot of noise. The player-GUI also did not update as smoothly on the real phone as on
the simulated phone.

It turned out that the way the GUI was updated took too much processing power
from the other threads in the application, so everything was severely slowed down,
including the player-GUI itself. There were also a few busy waits in the player-GUI
that used a lot of cycles.

When the player-GUI was optimized and the busy waits were eliminated, everything
worked much better. The player-GUI updated smoothly and the noise turned into the
audio that was streamed to the phone.
The main problem when testing the system was thus the update/repainting of the
player-GUI.

6.5.1 Mobile network operators

It is best to test the system in as many different mobile operators' networks as possible.
The system has been tested in the following operators' networks:

- Telia
- 3

The first tests were made in Telia's network. When the system worked well in Telia's
network, it was also tested in 3's network. It worked fine there as well.

The system was also tested outdoors, while walking around and while driving in a car.
There were no problems with this in Telia's and 3's networks.

6.5.2 Mobile phones

The system has only been tested using the Nokia 6680 mobile phone. It would be better
to test the application on different phones and compare the results, but there were no
resources to do that.

6.6 Requirements for running the J2ME application

These are the requirements the mobile phone must meet to run the application:

- Disk memory: 60 KB
- Java version: J2ME - MIDP 2.0
- Communication: Socket and HTTP support
- Additional J2ME APIs: MMAPI
- Supported sound formats: AMR, AMR-WB
Chapter 7

User's Guide

This chapter is a guide to using the application on the mobile phone.

7.1 Starting the application

To start the application, go to where it is stored on the mobile phone and choose to
start it.

7.2 Main menu

When the application is started the main menu is shown, see figure 7.1.

Figure 7.1: Main menu

Explanation of the main menu list:

- My books: Choose this to go to "My books", where the books that
  have been bought are listed. The number inside the parentheses indicates how many
  books are stored in "My books". In this example there is one book. If no books
  have been bought, the number inside the parentheses is "0".
- Toplist: Choose this to open the audio book toplist.
- Offers: Choose this to open the audio book offers.
- Book search: Choose this to search for audio books.
- Help: Under help there should be a user's guide and other relevant information,
  but this is not implemented in this version.
- Settings: Under settings it should be possible to set language, network options,
  performance etc. Only the language setting is implemented in this version.

7.3 My books
Figure 7.2 shows what the "My books" menu looks like.

Figure 7.2: My books

The books that have been bought are listed here. If no books have been bought, a
message saying that there are no books in "My books" appears and the application
goes back to the main menu. If a book under "My books" is chosen, another menu
appears, see figure 7.3.
Figure 7.3: Book options

There are five options to choose from in figure 7.3.

- Resume: Choose this to resume listening from where it was last stopped. By
  default, if the audio book has never been played, it starts from the beginning.
- Listen from bookmark: Choose this to start listening from the bookmark that has
  been set by the user. By default, if the bookmark is not set, the audio book is
  played from the beginning.
- Listen from beginning: Choose this to listen from the beginning of the book.
- Information: Choose this to get more information about the book, such as author,
  publishing year etc.
- Remove the book: Choose this to remove the book from the "My books" list. A
  confirmation screen appears to ask if you really want to remove the book from
  the application.

How to interact with the audio book player is described in section "Player" in this
chapter.
7.4 Toplist
The toplist is shown in gure 7.4.

Figure 7.4: Toplist

Here the user can choose to get more information about a book. Press "Information"
and more information about that book is shown, such as book cover, author and
price. Figure 7.5 shows what that looks like.

Figure 7.5: Book information

From this menu the book can be bought by pressing "Buy". A confirmation screen
is then shown, see figure 7.6. After the purchase has been confirmed, the book is added
to the "My books" list. The application then shows the "My books" list and is ready
to start playing the book.

Figure 7.6: Buy confirmation

7.5 Offers
The offers list has the same functions as the toplist. The only difference is that the
prices of the books are shown in the offers list, see figure 7.7. See section "Toplist" in
this chapter for more information.

Figure 7.7: Offers list


7.6 Book search
The book search window is shown in figure 7.8.

Figure 7.8: Book search

Here the user can choose to search for audio books by different categories. These
categories are:

- Author: Search by the author's name.
- Title: Search by the book title.
- ISBN: Search by the book's ISBN.
- Category: Search by the book's category, such as novel or love story.

To perform the search, enter a search string and press the "Search" button. The
results are shown in the window in figure 7.9.

Figure 7.9: Search result

The search result list looks like the toplist and has the same functionality. See section
"Toplist" in this chapter for more information.

If the search gives no results, a window appears with a notice that
no search results were found. The application then automatically goes back to the
"Search book" window.
7.7 Player
This section goes through how to use the audio book player. When the user has
chosen to listen to a book from the "My books" list, the player-GUI is shown on
the phone. The player starts buffering the book immediately. This is shown in figure
7.10.

Figure 7.10: Buffering

When the application has finished buffering, the player starts to play the audio book
automatically, see figure 7.11.

Figure 7.11: Playing


Below is a list explaining how to operate the player.

- Play/Pause: Press Play or Pause, depending on whether the player is in playing mode
  or in paused mode, or press "2" on the phone to play or pause the audio book.
- Stop: To stop playing, press stop. The player is then closed and the application
  goes back to the "My books" list.
- Rewind: To rewind, press and hold down "1" on the phone.
- Fast forward: To fast forward, press and hold down "3" on the phone.
- Set bookmark: To set a bookmark, press "5" on the phone. A message indicating
  that a bookmark is set appears on the screen, see figure 7.12, and then disappears
  automatically after a few seconds.
- Control the volume: Press the "*" key to decrease and the "#" key to increase
  the volume.
- Exit: To exit, press stop.

Figure 7.12: A bookmark is set


Chapter 8

Conclusions

The hardest part of the project was to implement the streaming protocols RTSP, SDP
and RTP in J2ME. No information about such implementations by other people or
organizations was found, so implementing them was a challenge.

During the project many ideas for extra features that could be added to the system
came up. Not all of these features had to do with the audio streaming, so it was
important that the focus did not slip away from the main task, which was to implement
streaming protocols in J2ME.

8.1 Restrictions and limitations

If no RTP packets are received by the application during a specified time, the player
assumes that the audio book part that was played before the break is finished and
starts to play the next part of the audio book. If it is the last part of the audio
book, it assumes that the book is finished. This is because the time of the audio part
received from the stream does not fully agree with the time of the audio that is played
by the J2ME player. There was no time to fix that problem during the project, as this
was not the main focus.

Making this a complete, redundant and scalable system would take much more
time than a master's thesis project offers. But that was not the goal of this project.
The goal was to see if it was possible to stream audio books to mobile phones using
J2ME and to make a prototype of the system.

The billing part of the system is not fully developed. The idea was that billing should
be done by sending an SMS.

The reason this was not fully developed is that there were not enough resources
for doing it; for example, an SMS server and a deal with a mobile network operator
would be needed.

8.2 Future work

It would be valuable to implement the control protocol RTCP to measure data rates, delay
and packet loss of the RTP streaming. With this information it would be possible to
make the application adapt its data rate dynamically, so that it could adjust the sound
quality to the available bandwidth in the mobile network.
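
As a rough sketch of the kind of bookkeeping this would involve, the example below estimates packet loss from RTP sequence numbers and picks the highest AMR-NB bit rate that fits a given bandwidth. It is not an RTCP implementation and is not part of the prototype. The class name RtpLossMeter and its methods are assumptions made for the example, the wrap-around handling is simplified, and the bit rate choice ignores RTP, UDP and IP header overhead.

```java
/**
 * Hypothetical sketch (not from the prototype): receiver-side loss
 * estimation from RTP sequence numbers, plus a simple AMR-NB rate choice.
 */
class RtpLossMeter {
    // AMR-NB codec mode bit rates in bit/s, lowest to highest.
    static final int[] AMR_NB_RATES =
        {4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200};

    private boolean started = false;
    private int baseSeq;        // first sequence number seen
    private int maxSeq;         // highest sequence number in the current cycle
    private long cycles = 0;    // 65536 added for every sequence wrap-around
    private long received = 0;  // packets actually received

    /** Record one received packet with its 16-bit RTP sequence number. */
    void onPacket(int seq) {
        seq &= 0xFFFF;
        if (!started) {
            started = true;
            baseSeq = seq;
            maxSeq = seq;
        } else {
            int delta = (seq - maxSeq) & 0xFFFF;
            if (delta != 0 && delta < 0x8000) {  // packet is ahead of maxSeq
                if (seq < maxSeq) {
                    cycles += 0x10000;           // sequence number wrapped
                }
                maxSeq = seq;
            }
            // Otherwise the packet is a duplicate or arrived late;
            // it is counted as received but does not move maxSeq.
        }
        received++;
    }

    /** Number of packets that should have arrived so far. */
    long expected() {
        return started ? cycles + maxSeq - baseSeq + 1 : 0;
    }

    /** Number of packets that are missing so far (never negative). */
    long lost() {
        long l = expected() - received;
        return l < 0 ? 0 : l;
    }

    /** Packet loss in whole percent of the expected packets. */
    int lossPercent() {
        long e = expected();
        return e == 0 ? 0 : (int) (lost() * 100 / e);
    }

    /** Highest AMR-NB rate not above the given bandwidth, else the lowest rate. */
    static int pickAmrRate(int availableBitsPerSecond) {
        int chosen = AMR_NB_RATES[0];
        for (int i = 0; i < AMR_NB_RATES.length; i++) {
            if (AMR_NB_RATES[i] <= availableBitsPerSecond) {
                chosen = AMR_NB_RATES[i];
            }
        }
        return chosen;
    }
}
```

A client could feed every received RTP packet into onPacket() and use lossPercent() together with a bandwidth estimate to decide when to ask the server for a stream encoded at a different AMR mode.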

More testing on different mobile phones and in different environments would be good.
There were no resources for doing this during the project.

A user interface for updating the audio book database would make the system
easier to administer and use.

Chapter 9

Acknowledgments

We would like to thank the following people:

- Our supervisor Jerry Eriksson.
- Jonas Bystrom and Annelie Malmstrom at Bonnier Audio for their cooperation
and for providing us with test equipment.
- Peter Lindblom and Gothe Lindahl at Vimio AB for showing interest in our work.

References

[1] Sony Ericsson J2ME SDK 2.2.0.
http://developer.sonyericsson.com/site/global/docstools/java/p_java.jsp
(visited 2005-06-10).
[2] 3GPP. ARIB STD-T63-26.201 V5.0.0 Speech codec speech processing functions; AMR
wideband speech codec; frame structure (Release 5). 2001.
[3] 3GPP. ARIB STD-T63-26.101 V4.2.0 - Mandatory speech codec speech processing
functions; AMR speech codec frame structure (Release 4). 2002.
[4] Bonnier Audio. http://www.bonnieraudio.se (visited 2005-06-10).
[5] Ethereal. http://www.ethereal.com (visited 2005-06-10).
[6] The Apache Software Foundation. Apache Tomcat.
http://jakarta.apache.org/tomcat (visited 2005-06-10).
[7] Forlaggareforeningen. http://www.forlaggareforeningen.se (visited 2005-06-10).
[8] Harri Holma and Antti Toskala. WCDMA for UMTS - Radio Access For Third
Generation Mobile Communications. Wiley, 2000.
[9] MySQL. http://www.mysql.com (visited 2005-06-10).
[10] RFC-1889. RTP: A transport protocol for real-time applications.
http://www.faqs.org/rfcs/rfc1889.html (visited 2005-05-06), 1996.
[11] RFC-2326. Real time streaming protocol.
http://www.rtsp.org/2003/drafts/draft05/draft-ietf-mmusic-rfc2326bis-05.pdf
(visited 2005-05-12), 2003.
[12] RFC-2327. SDP: Session description protocol.
http://www.faqs.org/rfcs/rfc2327.html (visited 2005-05-12), 1998.
[13] RFC-2616. Hypertext transfer protocol - HTTP/1.1.
http://www.w3.org/Protocols/rfc2616/rfc2616.html (visited 2005-05-12), 1999.
[14] RFC-760. Internet protocol. http://www.faqs.org/rfcs/rfc760.html
(visited 2005-05-12), 1980.
[15] RFC-768. User datagram protocol. http://www.faqs.org/rfcs/rfc768.html
(visited 2005-05-12), 1980.
[16] Darwin Streaming Server. http://developer.apple.com/darwin/projects/streaming
(visited 2005-06-10).

[17] MediaLab Telia Sonera. Streaming in mobile networks - white paper. 2004.
[18] Speex. http://www.speex.org (visited 2005-06-10).
[19] Sun. J2ME. http://java.sun.com/j2me/index.jsp (visited 2005-05-05).
[20] Sun. Java. http://java.sun.com (visited 2005-05-05).
[21] Eric Woudenberg. Conversion between AMR (adaptive multi-rate codec) file formats.
http://www.connactivity.com/~eaw/amrwork/ (visited 2005-05-23), 2003.

Appendix A

RTSP status codes

Status code  Meaning                        Status code  Meaning
100          Continue                       413          Request Entity Too Large
200          OK                             414          Request-URI Too Large
201          Created                        415          Unsupported Media Type
250          Low on Storage Space           451          Parameter Not Understood
300          Multiple Choices               452          Conference Not Found
301          Moved Permanently              453          Not Enough Bandwidth
302          Moved Temporarily              454          Session Not Found
303          See Other                      455          Method Not Valid in This State
304          Not Modified                   456          Header Field Not Valid for Resource
305          Use Proxy                      457          Invalid Range
400          Bad Request                    458          Parameter Is Read-Only
401          Unauthorized                   459          Aggregate operation not allowed
402          Payment Required               460          Only aggregate operation allowed
403          Forbidden                      461          Unsupported transport
404          Not Found                      462          Destination unreachable
405          Method Not Allowed             500          Internal Server Error
406          Not Acceptable                 501          Not Implemented
407          Proxy Authentication Required  502          Bad Gateway
408          Request Time-out               503          Service Unavailable
410          Gone                           504          Gateway Time-out
411          Length Required                505          RTSP Version not supported
412          Precondition Failed            551          Option not supported

Table A.1: RTSP status codes
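
As an illustration of how a client might use these codes, the sketch below parses an RTSP status line of the form "RTSP/1.0 200 OK" and groups the code by its first digit. It is a minimal example and not the parser used in the prototype: the class name RtspStatusLine and its methods are assumptions made here.

```java
/**
 * Hypothetical sketch (not from the prototype): parses the first line of
 * an RTSP response, e.g. "RTSP/1.0 200 OK".
 */
class RtspStatusLine {
    final int code;       // numeric status code, e.g. 200
    final String reason;  // reason phrase, e.g. "OK"

    private RtspStatusLine(int code, String reason) {
        this.code = code;
        this.reason = reason;
    }

    /**
     * Parses "RTSP-Version SP Status-Code SP Reason-Phrase".
     * Throws IllegalArgumentException if the line has another shape
     * (NumberFormatException is a subclass of it).
     */
    static RtspStatusLine parse(String line) {
        int firstSpace = line.indexOf(' ');
        if (!line.startsWith("RTSP/") || firstSpace < 0) {
            throw new IllegalArgumentException("not an RTSP status line");
        }
        int secondSpace = line.indexOf(' ', firstSpace + 1);
        String codeText;
        String reasonText;
        if (secondSpace < 0) {
            codeText = line.substring(firstSpace + 1);
            reasonText = "";
        } else {
            codeText = line.substring(firstSpace + 1, secondSpace);
            reasonText = line.substring(secondSpace + 1).trim();
        }
        return new RtspStatusLine(Integer.parseInt(codeText.trim()), reasonText);
    }

    boolean isSuccess()     { return code >= 200 && code < 300; }
    boolean isRedirect()    { return code >= 300 && code < 400; }
    boolean isClientError() { return code >= 400 && code < 500; }
    boolean isServerError() { return code >= 500 && code < 600; }
}
```

A client would typically continue with the session only when isSuccess() is true and report the reason phrase to the user otherwise.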

Appendix B

Abbreviations

- 3G - Third Generation Mobile System
- 3GPP - Third Generation Partnership Project
- 8PSK - 8-Phase Shift Keying
- AMR - Adaptive Multi-Rate
- AMR-NB - Adaptive Multi-Rate Narrow-Band
- AMR-WB - Adaptive Multi-Rate Wide-Band
- AR - Application Requirement
- CD - Compact Disc
- CELP - Code Excited Linear Prediction
- CLDC - Connected Limited Device Configuration
- CRLF - Carriage return (CR) line feed (LF)
- CSRC - Contributing Source
- DRM - Digital Rights Management
- EDGE - Enhanced Data Rate for Global Evolution
- GMSK - Gaussian Minimum Shift Keying
- GPRS - General Packet Radio Service
- GSM - Global System for Mobile Communications
- GSM BSS - GSM Base Station Subsystem
- GUI - Graphical User Interface
- HTTP - Hypertext Transfer Protocol
- IP - Internet Protocol
- ISBN - International Standard Book Number
- J2ME - Java 2 Platform, Micro Edition
- JVM - Java Virtual Machine
- KVM - K Virtual Machine
- MIDP - Mobile Information Device Profile
- MMAPI - Mobile Media API
- MMS - Multimedia Messaging Service
- mp3 - MPEG-1 Audio Layer 3
- MPEG-1 - Moving Picture Experts Group - 1
- NAPT - Network Address Port Translation
- NAT - Network Address Translation
- PDA - Personal Digital Assistant
- RMS - Record Management System
- RTCP - RTP Control Protocol
- RTP - Real-Time Transport Protocol
- RTSP - Real-Time Streaming Protocol
- SDP - Session Description Protocol
- SID - Silence Descriptor
- SR - System Requirement
- SSRC - Synchronization Source
- TCP - Transmission Control Protocol
- UDP - User Datagram Protocol
- UMTS - Universal Mobile Telecommunications System
- URI - Uniform Resource Identifier
- URL - Uniform Resource Locator
- UTF-8 - 8-bit Unicode Transformation Format
- UTRAN - UMTS Terrestrial Radio Access Network
- WAV - Waveform Audio
- WCDMA - Wideband Code Division Multiple Access
- WS - White-Space
- WWW - World-Wide Web
