
Voice and Internet multimedia in UMTS networks

M C Bale

Voice telephony is the predominant service on today’s cellular mobile networks, in terms of number of customers, revenues
and network usage. However, it is difficult to predict how long this will be the case given the rising demand for new Internet
multimedia services. It is therefore essential that 3rd generation (3G) mobile networks not only support a voice telephony
service, but are also capable of providing Internet multimedia services using the same technology.

This paper provides an overview of how voice telephony is provided in the initial phase of the universal mobile
telecommunications system (UMTS). It then describes how this is expected to evolve in later phases — so that voice
telephony becomes one of a large number of multimedia services provided from a common Internet protocol-based mobile
network.

1. Introduction

The main driver behind 2nd generation digital mobile networks, such as the global system for mobile
communications (GSM) [1], was the need to provide a voice telephony service to mobile users. This has been
achieved with incredible success. Moreover, GSM has established the starting point from which future mobile
networks must evolve and an important benchmark for voice services that the 3rd generation of mobile networks
must exceed in terms of functionality and quality.

The Universal Mobile Telecommunications System (UMTS), the 3rd generation network and systems standardised by
the 3rd Generation Partnership Programme (3GPP) [2], aims to provide voice services that will meet the needs of
mobile users. This is being done in collaboration with the International Telecommunications Union (ITU)
‘International Mobile Telecommunications 2000’ project [3].

In the initial phase of UMTS, defined by the 3GPP Release 1999 standards, the voice telephony service is
essentially an evolution of the GSM voice service that benefits from the 3rd generation technologies adopted for
the UMTS Terrestrial Radio Access Network (UTRAN) [4].

However, the customer’s needs for mobile voice telephony must also be considered in the light of the growing
demand for mobile Internet multimedia services. In particular, voice will be a feature of many of these
multimedia services, e.g. videoconferencing, mobile commerce (mCommerce), games and multimedia mail. To enable
such services, it is important that the voice service is as much part of the mobile Internet as the data and
information services with which it will be integrated. This requires a more radical approach to the provision of
voice services, one that is more aligned with the Internet and the protocols standardised by the Internet
Engineering Task Force (IETF) [5]. This challenge is being addressed by 3GPP in the production of the Release 4
and 5 standards, and by the IETF in the production of the protocols needed to realise mobile Internet multimedia.

This paper initially provides an overview of how a voice telephony service is supported by a UMTS network
conforming to the 3GPP Release 1999 standards. It then describes the proposed solution currently being
standardised by 3GPP for Internet multimedia services (including voice) known as the Release 5 standards. This
solution is illustrated with message sequence flows to show the dynamic aspects of the solution and the
application of the various protocols. It is assumed that the reader already has an awareness of GSM and general
packet radio service (GPRS) networks.

Work to address the challenges of providing voice and multimedia services in a mobile and wireless Internet
environment is progressing rapidly within 3GPP as well as the other bodies producing standards for this area
(such as the IETF). However, the reader should be aware that there is still much work to be done, especially at a
detailed level. At the time of writing, this paper reflects current views, which may differ from the actual
standards when they are completed. To aid understanding of some of the issues, potential solutions are described,
but it should be recognised that these are only illustrative and may not be endorsed as standards in the future.

BT Technol J Vol 19 No 1 January 2001



2. Voice in the 3GPP Release 1999 network

Release 1999 is the first phase of the 3GPP standards for UMTS. This is a completed set of standards that
defines a UMTS network able to provide users with voice and data services fully compatible with those of GSM and
GPRS. The standards allow users to migrate on to the UMTS and to roam seamlessly between UMTS and GSM/GPRS
networks without any loss of capability. It also has the benefit to the network operator of being able to target
the introduction of UMTS to specific geographical areas, while relying on existing GSM and GPRS networks to
provide coverage in other areas.

Specifically, current GSM networks support voice and low-speed data services that are circuit-switched, so called
because the voice or data is carried between users in bearer circuits that are switched into place across the
network for a time period, under the control of signalling from the users. In contrast, the GPRS network supports
packet-switched data services. For the purposes of this paper, only the voice services in Release 1999 are
described, but the descriptions also apply to low-speed circuit-switched data services.

Figure 1 shows the overall network for the support of voice services in the 3GPP Release 1999 standards, and is
more fully described in Lobley [6].

[Figure: network diagram. Elements shown: GSM radio access network (BSS) with the GSM A interface; UMTS
terrestrial radio access network (UTRAN) with RNCs, the UMTS Iu-CS interface and the UMTS Iu-PS interface;
circuit-switched domain (TDM or ATM AAL2 core network) with MSC, VLR and GMSC; HLR; EIR; application and service
environment. Line styles distinguish signalling, speech paths and packet data. External connections:
A  mobility management signalling to other networks
B  speech circuits and call signalling to other networks (e.g. PSTN)
C  packet data and signalling to the packet-switched domain (i.e. GPRS)]

Fig 1 3GPP Release 1999 voice network overview.

To achieve compatibility with GSM, the Release 1999 network effectively adopts the GSM core network and service
architecture. This has a significant benefit to the network operator since it enables a cost-effective and
low-risk evolutionary approach to be taken for the deployment of UMTS. However, in the UTRAN, changes in both the
architecture and the radio and transport technologies employed result in differences from GSM, but also enable
some improvements to the way in which voice services are provided. The main areas where the UTRAN affects voice
services are described below.

• Improved quality of service in the radio access

The use of wideband code-division multiple access (WCDMA) and the various modes of operation in the radio access
can improve the quality of the voice service in terms of availability and reliability [4].


• Well-defined interface between a UTRAN and a core network

The interface between the radio network controllers (RNCs) and the core network, the Iu interface, is more
clearly defined and open, such that a UTRAN from one vendor will interoperate with a core network from another
vendor. The Iu interface itself is separated into the Iu-CS interface between the RNC and the core network
circuit-switched domain, and the Iu-PS interface between the RNC and the core network packet-switched domain (not
shown in Fig 1). The separation of the core network domains and the Iu interface allows the deployment and
evolution of voice services independently of packet data services in Release 1999.

• Use of ATM as the transport technology

ATM is used as the transport technology between the radio base-stations and RNCs, between RNCs, and between the
RNCs and the core network (the Iu interface). Both circuit-switched and packet-switched services are carried in
ATM cells, using appropriate adaptation layer protocols. In the case of the voice bearer circuits this is ATM
adaptation layer 2 (AAL2), and for the signalling it is ATM adaptation layer 5 (AAL5). ATM provides a number of
benefits in the access network, such as the ability to transport packet- and circuit-switched services with low
delay, high bandwidth and manageable quality of service. Conversion of ATM to the circuit-switched time division
multiplexed (TDM) technology, if used to switch the voice paths in the core network, can be performed by the
mobile switching centre (MSC) or by a gateway function between the RNC and the MSC.

• Speech transcoders located in the core network

Speech transcoding is performed in the MSC in Release 1999, rather than at the base-station sub-system of the
GSM radio access network. The relocation of this function into the core network allows operators to provide lower
cost access transmission networks, and eases the introduction of transcoder-free and tandem-free operation.

A significant benefit of retaining the GSM core is that the MSCs can interface to both the UTRAN and existing
GSM radio access networks, and more easily support user roaming and in-call handover from the UMTS to GSM
networks. Within the core network, the only notable change from GSM is that voice services may be supported either
on circuit-switched TDM (as in GSM) or via ATM transport. Again, AAL2 is recommended for providing the voice
bearer circuits and switching if ATM is used. Other transport protocols such as ATM adaptation layer 1 or
voice-over-IP solutions could in theory be used instead, although these may not meet all the quality-of-service
requirements of a Release 1999 network.

In Release 1999, the user’s speech is digitally sampled by the mobile user equipment, and then coded for
transmission. The default speech coding, which must be supported by all mobile user equipment (terminals) and the
UTRAN, is adaptive multi-rate (AMR). The AMR coder supports eight source rates ranging from 4.75 kbit/s to
12.2 kbit/s, and is rate-controlled, which enables it to switch rapidly between these at any point in the call.
The AMR coder encodes and decodes the digitally sampled speech to make optimum use of the battery power and
bandwidth available, particularly on the radio link between the mobile equipment and the radio base stations
(node B). The bit rates are selected depending on the quality of speech required and the quality of the transport
provided by the network, primarily that of the radio link. The AMR coder also supports a low-rate background
noise encoding mode to reduce transmission during silence, further reducing bandwidth and battery usage in the
user equipment. In addition to AMR, other speech coding may be optionally selected, such as enhanced full rate
(EFR) or full rate (FR), as also specified for GSM. Within the core network, the ITU-T Recommendation G.711 speech
coding at 64 kbit/s or 56 kbit/s is generally used, as in the public switched telephony network (PSTN) and GSM
core networks. Transcoding from AMR (or other speech coding) to G.711 is performed in the MSC.

If the user’s equipment at both ends of a voice call uses the same coding, then transcoding to G.711 (or other
codings) is not necessary. There are two procedures that can be adopted to remove or reduce transcoding, namely:

• tandem-free operation of transcoders, where inband signalling between the transcoders determines the
transcoders in use and allows the transcoders to drop out of the speech circuit if both terminals are using the
same speech coding,

• transcoder-free operation, where the mobile terminals negotiate the speech coding during call set-up, and
transcoders are only inserted into the speech path if end-to-end compatibility cannot be achieved.

Although considered for Release 1999, it is not until Releases 4 and 5 that standards will be available for
tandem-free operation and transcoder-free operation.

As with GSM, signalling from the user to the network broadly falls into two categories: call-related signalling
for establishing, maintaining and terminating voice calls, and non-call-related signalling for mobility
management (e.g. for location registration, roaming and in-call handover). The signalling protocols and
procedures are generally the same as for GSM, although new lower layer protocols provide adaptation to the
underlying ATM
transport in the UTRAN. Within the core network and for interconnect to other networks, the ITU-T
recommendations for Signalling System 7 (SS7) are used, again with lower-level adaptation layer protocols in the
case where ATM transport is used.

Supplementary services, such as call diversion and caller identity, are provided from the MSCs, which also
provide tones and announcements to the user. More advanced voice services can be provided from the application and
service environment [7]. A user profile, containing information on the individual’s subscribed services, is
provisioned into the home location register (HLR) for that user. This is then copied into the corresponding
visitor location register (VLR) in the MSC responsible for controlling users’ calls, so that their services can
be provided as they change location. For billing purposes, call detail records, for example containing
information on call duration and destination, are generated by the MSCs and sent to the operator’s billing
engine. Information may also be collected from the HLR for billing purposes. The MSC also communicates with an
equipment identification register (EIR), for example to validate whether the mobile terminal is a stolen one.

With the Release 1999 standards completed, it is anticipated that UMTS Release 1999 voice networks will be
operational by 2002, with operators already beginning to deploy network and UTRAN equipment in order to meet this
date. However, it is not until the second phase of UMTS standards that support for other real-time multimedia
services is defined.

3. Voice and multimedia in the 3GPP Release 5 network

The following phases of UMTS evolution specify how voice and multimedia can be supported by an Internet Protocol
(IP) transport service. Currently, two phases are defined:

• Release 4, which includes the migration of the Release 1999 circuit-switched domain core network and services
to an IP transport, and is described further in section 5 of this paper,

• Release 5, which takes a more radical approach to the introduction of conversational and interactive
multimedia services on to an end-to-end IP transport provided by an enhanced general packet radio service in the
packet-switched domain.

Together, these releases were previously known as Release 2000.

Release 5 specifies voice and multimedia services that make use of GPRS for the transport of speech and
signalling, rather than the circuit-switched domain transport. A new core network domain, the Internet multimedia
(IM) core network subsystem, or IM domain for short, is introduced for the control of voice and multimedia calls
and sessions and the interconnection to other networks, such as the PSTN and other UMTS networks. The IM domain
also relies on a managed core IP network that is enabled to provide the quality of service needed for voice and
multimedia services.

The main reasons for the introduction of the IM domain are to enable new services and to reduce cost. The IM
architecture uses IP and the other protocols standardised by the Internet Engineering Task Force (IETF) as
interfaces to component ‘building blocks’ of the Release 5 network.

These protocols provide a very adaptable suite of technologies for building packet-based networks and services,
and the growth in the use of these protocols and associated networking equipment over the last decade has
resulted in considerable cost reductions. However, while the IETF protocols can be adopted to provide many of the
functions of the IM domain, each UMTS service has specific requirements that impact on the overall design of the
network and the detailed information carried within the protocols. Therefore, to determine the IM network and
protocol design, the services to be supported must be understood.

Examples of the services that will be supported in Release 5 by the IM domain are:

• voice telephony,
• real-time interactive games,
• videotelephony,
• instant messaging,
• emergency calls,
• multimedia conferencing.

These services tend to share a number of characteristics: they are generally a conversational session between
two or more parties requiring some degree of real-time interactivity. The real-time aspects of the service can be
described in terms of the quality of service of the transport (such as transmission delay or packet jitter) and of
the session (or call) control, such as time to establish the session.

To meet the interactive needs of these services, the GPRS network provides quality-of-service levels, for
example by operating at low levels of network utilisation or by employing mechanisms such as Diffserv (see RFCs
2474 and 2475 [5]). Additionally, IP version 6 (IPv6, see RFC 2460 [5]) has been recommended as the transport
protocol to be used for the IM domain, since this has a number of features that are beneficial to UMTS networks
(such as a large address space, support for packet prioritisation, and easier manageability).

The IM domain has four important roles in meeting the requirements of services:


• it enables users and applications to control the sessions and calls between multiple parties, for example to
establish, maintain, modify and terminate calls (see note 1),

• it controls and supports network resources (such as media gateways and gateway GPRS support nodes (GGSNs),
multimedia resource functions (MRF) and the core IP network) to provide the functionality, security and quality
required for the call,

• it provides for registration of users on the ‘home’ and ‘roamed to’ networks, so that users may access their
services from any UMTS network,

• it generates call detail records (CDRs), for example containing information on time, duration, volume of data
sent/received, and the call participants. The CDRs, together with records from the GPRS network on the data
volumes transmitted and received, are used for charging purposes.

Note 1: Strictly speaking, sessions and calls are different (see RFC 2543 [5] for a definition of each).
However, for the purposes of this paper the term ‘call’ is used to refer to simple cases where calls and sessions
can be considered the same, for example, in the case of a point-to-point voice telephony call.

An overview of the IM domain and its relationship with the GPRS packet-switched domain is shown in Fig 2. The
purpose of each of the functional entities is more fully described in Lobley [6].

[Figure: network diagram. Elements shown: UMTS terrestrial radio access network (UTRAN) with RNCs and the UMTS
Iu-PS interface; packet-switched domain (GPRS) with SGSN, GGSN and EIR; Internet multimedia domain (IPv6 core
network) with CSCF, MGCF, MRF, HSS, DHCP and DNS servers, signalling gateways and media gateway; application and
service environment. Line styles distinguish signalling and speech paths. External connections:
A  mobility management signalling to other networks
B  call related and mobility management signalling to other Release 5 networks
C  call related signalling to other circuit-switched and VoIP networks
D  circuit-switched speech circuits to other networks (e.g. PSTN and GSM)
E  speech paths to other Release 5 and other VoIP networks]

Fig 2 3GPP Release 5 network overview.

The IM domain architecture complements the voice over IP (VoIP) protocols and architectures developed by the
IETF [5], ETSI Tiphon [8] and ITU-T Study Group 16 [3], although these were primarily developed for fixed IP
networks. Supporting VoIP in a mobile and wireless
environment raises a number of additional requirements, which are being addressed by the 3GPP group. In
particular, they include:

• the ability of the network to hand over a call (signalling and speech paths) from one radio base-station
(node B) to another, without perceivable loss of speech quality, for example as a user moves between radio
base-stations,

• the ability of the network to cope with the additional delays imposed on the speech path due to radio access
and use of AMR coding, without perceivable loss of speech quality,

• the ability of the network to allow users to roam to another operator’s network (a visited network), and still
receive service,

• the ability of the network to control the voice service of a roaming user from either the user’s home network
or the visited network.

The first two points are addressed by the mechanisms used to transport IP packets carrying speech and IP packets
carrying signalling over the GPRS network and IM domain core IP network. The subsequent points are addressed by
the registration, discovery and call control procedures of the IM domain.

3.1 Overview of VoIP in 3GPP Release 5

In common with fixed network VoIP, digitised speech from each user is carried in IP packets between one user’s
terminal equipment and another by an IP network. The path that these packets take through the network is referred
to as the speech path. Unlike a circuit-switched environment, the packets may individually take different routes
through the IP core network to a common exit point of the IP core network, rather than be forced along a specific
circuit. However, in reality, it is likely that the packets will follow the same route through the network if the
network is not congested.

To establish a speech path, and synchronise the users and their equipment, call control functionality is
programmed into the user’s equipment and network. These call control functions communicate using signalling
messages. For example, the call control enables passing of the endpoint addresses for the speech paths on the
user’s equipment and the negotiation of the network and user equipment resources needed for the call, such as
codecs and the quality of service required. Figure 3 shows a simple call establishment to create a VoIP speech
path, which is described below.

[Figure: message sequence between user A equipment, IM domain call control and user B equipment.
1  INVITE (A to IM domain), INVITE (IM domain to B), trying (IM domain to A)
2  ringing (B to IM domain), ringing (IM domain to A)
3  OK (B to IM domain), OK (IM domain to A)
4  ACK (A to IM domain), ACK (IM domain to B)
   both way speech path established
5  BYE (A to IM domain), BYE (IM domain to B), OK (B to IM domain), OK (IM domain to A)]

Fig 3 Simple establishment of a speech path.

• Invite (1)

The calling user (A) initiates the call by inviting the called user (B) to the call. This invitation contains a
name representing the called user (either similar to an Internet e-mail address or a telephone number), a
description of the call (e.g. codec to be used) and the address of the endpoint of the speech path on user A’s
equipment (e.g. a telephone). A call control entity in the IM domain receives this invitation, and confirms back
to user A’s telephone that it is trying to contact user B’s telephone. The call control entity then performs a
database look-up to translate user B’s name to an address to which it can route the invitation. On resolving the
address, the IM domain call control routes the invitation on to user B’s telephone.

• Alerting (2)

On receiving the invitation, user B’s telephone alerts user B of the incoming call, and informs user A via the
IM domain that the called telephone is ringing.

• Answer (3)

When user B answers, the telephone accepts the call by sending an OK back to user A’s telephone via the IM
domain. This message contains the address on user B’s telephone on which the speech path should terminate, as
well as the agreed call description.

• Acknowledge (4)

User A’s telephone acknowledges acceptance of the call, and the speech path is established: both telephones now
know each other’s address and are able to send speech packets to each other.

• Clear (5)

When the users have finished talking to each other, the call is cleared, for example by user A’s telephone
sending a BYE to user B’s telephone via the IM domain. Both telephones then free up any resources
allocated for the speech path, and user B’s telephone confirms that the speech path has cleared by sending an OK
back to user A’s telephone via the IM domain.

3.2 Signalling

The signalling protocol for registration and call control in the IM domain is based on the session initiation
protocol (SIP) (see RFC 2543 [5]). In simple terms, the control of the call relates to inviting and synchronising
the various participants in the call. It also enables the participants to describe and share information about
the characteristics of the terminating equipment and the speech path between the users. This information is known
as the session description, and could include, for example, the coder used for the speech and the bandwidth
needed for the speech paths. SIP essentially provides the invitation and synchronisation of the participants, and
it uses the session description protocol (SDP) (see RFC 2327 [5]) to describe the session.

Both of these protocols are standardised by the IETF, are text-based and simple to use and program, and can be
readily adapted to support a wide range of multimedia applications.

In the SIP protocol, users are addressed by a SIP uniform resource locator (URL), which has the form
user_name@network_domain_name, where the user_name and network_domain_name are textual names (similar to an
Internet e-mail address). PSTN telephone numbers may be textualised so that they can conform to this format, and
thus allow users to be addressed from the PSTN (and vice versa).

The SIP URLs provide a flexible means of addressing, and can also be easily included in Web pages as hyperlinks
that, when activated, initiate a SIP session to that user.

The functional entity in the IM domain that performs the call control is known as the call state control
function (CSCF). These have been classified into different types [6], and provide the functions of a stateful SIP
proxy server, as defined in RFC 2543. Correspondingly, the user’s equipment provides the functions of a SIP user
agent, as defined in RFC 2543. SIP messages are usually transported by the transmission control protocol (TCP)
(see RFC 761 [5]) or user datagram protocol (UDP) (see RFC 768 [5]). However, SIP is transport independent and
other protocols, such as the stream control transmission protocol (SCTP) which runs over UDP, can be used to
provide a higher level of quality than TCP.

SIP itself includes reliability mechanisms that can be used if running over an unreliable transport, but these
can be omitted if a reliable protocol transport such as TCP or SCTP is used. These protocols are carried over IP
over the GPRS network and IM domain IP core network (see Fig 4).

[Figure: protocol encapsulation. The session initiation protocol (SIP) and session description protocol (SDP)
are carried in the TCP or UDP payload; the TCP or UDP header and payload are in turn carried in the IP payload,
behind the IP header.]

Fig 4 Transport of SIP and SDP in IP.

To send and receive SIP messages over the GPRS network, the user’s equipment must establish a bi-directional
packet data session with the IM domain for the SIP signalling path. This is known as a packet data protocol (PDP)
context activation and is a common GPRS procedure for establishing an IP data path between the user’s terminal and
the network. The signalling path is a separate PDP context to the speech path, and must be established before any
SIP messages can be sent (e.g. for registration). The GPRS quality-of-service class for signalling is interactive,
although the detailed parameters define the specific transport quality requirements (such as high priority, but
lower sensitivity to delay and jitter). The establishment of the PDP context for signalling assigns an IP address
to the mobile terminal and allocates bandwidth and the required quality of service over the UTRAN and GPRS
network for the signalling. The assigned IP address, together with the SIP port number, is used to address the SIP
client in the mobile terminal. This IP address can also be used subsequently as the IP address for the speech
paths. When the mobile terminal is to be switched off or roams to another network, the PDP context for the
signalling is deactivated.

One of the benefits of using GPRS to carry the signalling path is that the GPRS controls the handover of the
signalling path as a user moves between the radio cells. It is therefore not essential that the IM domain be aware
of the geographical location of a user. However, some voice services may require location information (for
example, to restrict the user to certain cellular areas or for emergency services). In these cases, the IM domain
will need to obtain location information from the GPRS network or home subscriber server (HSS).

It should be recognised that SIP only supports the call control procedures for the establishment of the speech
path. The allocation of the actual bandwidth and quality of service needed to provide the IP transport for the
speech packets over the UTRAN and GPRS network is requested by the user’s equipment as additional PDP contexts
using the GPRS protocols. Similarly, the control of quality-of-service mechanisms in the core IP network is
independent of the call control procedures, and instead relies on other solutions (such as Diffserv).
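As an illustration of the text-based SIP and SDP formats and of SIP URL addressing described in this section, the following Python sketch builds an INVITE that carries a session description for a speech path. It is a minimal sketch rather than a conforming implementation: the function names, the example domains and addresses, the dynamic RTP payload type 97 and the ‘AMR/8000’ codec mapping are all assumptions made for illustration, and only a handful of the headers defined in RFC 2543 are shown.

```python
"""Sketch: a SIP INVITE carrying an SDP session description (illustrative only)."""


def sip_url(user_name: str, network_domain_name: str) -> str:
    """Form a SIP URL of the shape user_name@network_domain_name."""
    return f"sip:{user_name}@{network_domain_name}"


def sip_url_from_phone(number: str, network_domain_name: str) -> str:
    """Textualise a PSTN telephone number so that it fits the SIP URL format.

    The ';user=phone' parameter marks the user part as a telephone number.
    """
    digits = "".join(ch for ch in number if ch.isdigit())
    return f"sip:+{digits}@{network_domain_name};user=phone"


def build_sdp(ipv6_address: str, rtp_port: int, payload_type: int = 97) -> str:
    """Describe one audio stream: the speech path endpoint and the codec.

    Assumption: payload type 97 and the 'AMR/8000' mapping are hypothetical
    values chosen for this example.
    """
    lines = [
        "v=0",
        f"o=- 0 0 IN IP6 {ipv6_address}",
        "s=voice call",
        f"c=IN IP6 {ipv6_address}",      # endpoint address of the speech path
        "t=0 0",
        f"m=audio {rtp_port} RTP/AVP {payload_type}",
        f"a=rtpmap:{payload_type} AMR/8000",
    ]
    return "\r\n".join(lines) + "\r\n"


def build_invite(caller_url: str, callee_url: str, via_host: str,
                 call_id: str, sdp: str) -> str:
    """Assemble a minimal INVITE request with the SDP as its message body."""
    body = sdp.encode("utf-8")
    headers = [
        f"INVITE {callee_url} SIP/2.0",
        f"Via: SIP/2.0/UDP {via_host}",
        f"From: <{caller_url}>",
        f"To: <{callee_url}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",
        f"Content-Length: {len(body)}",
    ]
    # A blank line separates the SIP headers from the SDP body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp


if __name__ == "__main__":
    sdp = build_sdp("2001:db8::1", 49170)
    invite = build_invite(
        caller_url=sip_url("user_a", "home.example"),
        callee_url=sip_url_from_phone("+44 1234 567890", "home.example"),
        via_host="ue-a.home.example",
        call_id="abc123@ue-a.home.example",
        sdp=sdp,
    )
    print(invite)
```

In this sketch the SDP body plays the role of the call description exchanged in the Invite and Answer steps of Fig 3, while the request line and headers carry the addressing that a CSCF acting as a SIP proxy would use to route the invitation.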


Within the IM domain, additional protocols are used between network elements in order to provide the full voice
service. These include a mobility management protocol between the CSCFs and the HSS (this could be MAP [2] or
LDAP (see RFC 2251 [5])), and a media gateway control protocol, such as the H.248/Megaco protocol (see RFC 2885
[5]), jointly produced by the ITU-T and IETF.

3.3 Transport of speech packets

So that speech may be sent and received in IP packets, the user’s actual speech is sampled by the user’s
equipment and coded for transmission (e.g. using the AMR coder). Once a certain number of samples have been taken,
usually between 10 ms and 40 ms, the coded samples are packetised and sent to the network. The time taken to
packetise the speech samples adds considerable delay to the speech path and can necessitate echo cancellation
devices within the terminal equipment or network. However, it is inefficient to simply send smaller packets of
speech samples, since this increases the bandwidth needed and requires the IP routers to route more speech
packets. In addition to the end-to-end transmission delay, delay due to packet jitter is also encountered at the
termination of the speech path where the packets have to be buffered so that the digitised speech can be
synchronised before it is played out.

The speech packets are transported between users’ equipment in UDP/IP packets by the GPRS network and IM domain
core IP network. A framing protocol is required for the speech samples, e.g. to synchronise samples and control
the sampling rate. The IETF protocols for framing voice and multimedia are the real-time transport protocol (RTP)
(see RFC 1889 [5]) and RTP control protocol (RTCP), which are carried in UDP/IP packets. This is shown in Fig 5.

[Figure: protocol encapsulation. AMR coded speech samples (e.g. 20 ms of speech) form the RTP payload; the RTP
header and payload form the UDP payload; the UDP header and payload form the IP payload, behind the IP header.]

Fig 5 Transport of speech in IP.

Currently, as RTP and RTCP do not support rate-controlled codecs such as AMR, another framing protocol such as
the Iu user plane protocol (IuUP) [2], which is used to frame the speech on the Iu interface in Release 1999,
could be used instead of RTP and RTCP. However, for the purposes of this paper, RTP and RTCP are used to
illustrate the framing and transport of speech and multimedia in the IM domain.

For the user’s equipment to be able to send and receive speech packets to and from the IM domain, it must
activate a bi-directional PDP context between itself and the IM domain. This allocates bandwidth and the required
quality of service over the UTRAN and GPRS network for the transport of speech packets. The entry point to the IM

To provide a higher quality of service than ‘best-effort’, the GPRS network specifies a conversational class of
service that prioritises speech packets for low delay and low jitter. Similarly, mechanisms such as Diffserv or
the resource reservation protocol (RSVP) (see RFC 2205 [5]) may be used within the IM domain core IP network to
provide a high-quality service.

The AMR coder is the default codec that all Release 5 terminals must support, although other codecs may also be
supported. As transcoder-free operation is supported by the call control signalling, AMR coding of the speech can
be used end-to-end between the users’ items of equipment, without the need to transcode to another standard.
However, with AMR coding of speech at 12.2 kbit/s, a 20 ms sample of speech results in an RTP speech payload that
is roughly half the size of the combined IP, UDP and RTP packet headers. This makes for a very inefficient use of
bandwidth, especially in the costly radio access. Increasing the sample size reduces the problem, but increases
the end-to-end delay of the speech packets, as well as increasing the likelihood of packet loss on the radio
interface.

One solution to this is to perform header compression between the user’s equipment and the UTRAN, where the
bandwidth is most expensive. This could theoretically bring the overhead of the IP, UDP and RTP headers down to
less than 10% (not including the overhead of the lower layer GPRS and UTRAN protocols). Another possible solution
is to transport the speech from the user’s equipment through the UTRAN using AAL2, and to packetise the speech
into IP payloads at the node B or RNC. However, both of these solutions require that speech be transported
differently to other real-time and best-effort services that can be sent in uncompressed packets all the way to
the user’s equipment.
domain will contain firewalls for security and prevention of For example, media such as video have transport-quality
denial of service attacks, and these may also be controlled requirements that are similar to voice, but the higher
dynamically by the call control, on a call-by-call basis, to bandwidth nature means that the payload-to-header ratio is
prevent speech packets being sent or received before the much greater, and hence less wasteful of bandwidth.
call is established. Deactivating the appropriate GPRS PDP
context disconnects the speech path between the user’s As with the signalling path, a benefit of using GPRS to
equipment and the IM domain. carry the speech paths is that the GPRS controls the 55

BT Technol J Vol 19 No 1 January 2001


VOICE AND INTERNET MULTIMEDIA IN UMTS NETWORKS

handover of the speech paths as a user moves between the user’s equipment is switched on. Once registered, users can
radio cells. However, this does require that the GPRS make and receive IM domain calls until they deregister.
handover procedures be enhanced to ensure that the quality Before the registration procedure can take place, the user
of service required for voice is met throughout the equipment has to connect to the network and discover an
handover. entry point into the IM domain. This entry point is the proxy
call state control function (P-CSCF), and it provides a
3.4 Roaming, registration and discovery simple, generic call control function as well as potentially
providing a SIP firewall to ensure security of the IM
One of the main benefits of current GSM networks is domain.
the ability for the user to make and receive calls while
travelling abroad. To provide such a benefit, the user must The P-CSCF always resides in the network to which the
have the capability to be able to connect to a network that is UE is connected, and therefore the procedure for discovery
controlled by an operator other than that to which they are of the P-CSCF is the same, irrespective of whether the user
subscribed. This benefit is also an essential feature of the is roaming or not. Additionally, the P-CSCF could provide
Release 5 standards, although additional procedures are access to services that are not user specific but that are
required to provide a roaming capability for the IM domain. specific to the ‘roamed to’ network, such as emergency
The reasoning behind this is the fact that the user’s voice calls.
service can be controlled by one of two methods — home or
visited.
The procedure for discovery relies on GPRS signalling
In Lobley [6] it is shown that the call is controlled by an with the use of the IETF dynamic host configuration
entity known as a serving call state control function (S- protocol (DHCP) (see RFC 2131 [5]) and domain name
CSCF). The S-CSCF can be located either in the network system (DNS) (see RFC 1035 [5]) protocols. The idea of the
owned by the operator to which the user is subscribed, procedure is for any UE to be able to attach to a GPRS
known as home control, or alternatively in the network network, and be provided with an IP address of the P-CSCF.
owned by another operator if the user has roamed to that All SIP-based signalling from the UE then goes via the P-
network, known as visited control. This is the main CSCF which is responsible for routeing the messages on to
difference when roaming in an IM domain compared to the S-CSCF.
today’s GSM network and Release 1999, where the visited
network always controls a roaming user’s voice service. Figure 6 shows the sequence of events in the ‘discovery’
procedure.
In order that the user can make and receive calls, the
user equipment (UE) has to be registered with an S-CSCF. The sequence of events that make up the discovery
The registration procedure happens immediately after the procedure is described below.

Fig 6 Discovery message sequence. [Message sequence between the user equipment, SGSN, GGSN, DHCP server and DNS server: (1) activate PDP context activation, create PDP context activation, create PDP context response, activate PDP context response; (2) DHCP DISCOVER, DHCP OFFER; (3) DHCP REQUEST, DHCP ACK; (4) QUERY, QUERY response.]

• PDP context activation (1)

The UE activates a PDP context to the GPRS network, which will be used for the discovery procedures, and later for the IM domain registration and call control procedures using SIP. To achieve this, the UE sends an activate PDP context activation request to the SGSN. Upon receipt of the request, the SGSN sends a create PDP context activation request to the GGSN. If the GGSN is able to establish a PDP context (e.g. after checking that the UE has the necessary permission), it returns a create PDP context response to the SGSN, which in turn replies to the UE with an activate PDP context response. This is a standard GPRS procedure, although the details, such as the GPRS access point name used and the nature of the PDP address returned, may be specific to the discovery procedure.

• DHCP discovery (2)

The UE broadcasts a DHCP DISCOVER message to the network. Upon receiving this message the DHCP server can respond with a DHCP OFFER message or it may not respond at all. If the DHCP server decides to respond it broadcasts the DHCP OFFER message with a specified available IP address. Note: at this stage there is no agreement of an assignment between the DHCP server and the UE. The UE may receive more than one DHCP OFFER response (if more than one DHCP server responds) and therefore will have to choose one.

• DHCP request (3)

Using the IP address received within the DHCP OFFER response, the UE broadcasts a DHCP REQUEST message containing the chosen IP address to the server(s). Each server checks the returned IP address. If it does not match, the server considers it as an implicit decline. However, the selected DHCP server sends a DHCP ACK to the UE.

• DNS query (4)

The UE sends a DNS QUERY to the DNS server for resolution of the predefined name for P-CSCFs to an IP address. The DNS server replies to the UE with a QUERY response containing the IP address of an appropriate P-CSCF.

On disconnection of the UE, such as just before the device is turned off, the IP address can be released back to the DHCP server and the signalling PDP context can be deactivated.

Now that the UE has knowledge of the proxy CSCF address, the registration procedure can take place in order that an S-CSCF can be selected. Unlike the discovery procedure, the registration procedure differs depending on whether the S-CSCF is to be located in the home network, or the visited network. However, the home network, the network to which the user is subscribed, always carries out the decision on whether home control or visited control is used. Figure 7 shows the functional entities involved in registration for visited network control, and Fig 8 shows the message sequence required. The message sequences for home network control are the same except the visited I-CSCF is not required since the S-CSCF is in the home network. If a user is connected to the home network rather than a visited network, the visited I-CSCF is not required and the S-CSCF and P-CSCF will be in the home network.

After the UE has obtained a signalling path through the GPRS network, it can perform the IM registration.
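The discovery steps (1) to (4) described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class `DhcpServer`, the function `discover_p_cscf`, the server names, the IP addresses and the name `pcscf.visited.example` are all hypothetical, the PDP context activation of step (1) is assumed to have already succeeded, and the rule the UE uses to choose between offers is an arbitrary placeholder, since the text only says that the UE has to choose one.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class DhcpServer:
    """Toy DHCP server (hypothetical) that offers a single free address."""
    name: str
    free_address: str
    leased: Optional[str] = None

    def offer(self) -> str:
        # Step (2): reply to DHCP DISCOVER with a DHCP OFFER.
        # No assignment is agreed at this stage.
        return self.free_address

    def request(self, chosen_address: str) -> bool:
        # Step (3): a REQUEST naming another server's address is treated
        # as an implicit decline; only the selected server acknowledges.
        if chosen_address != self.free_address:
            return False
        self.leased = chosen_address
        return True  # stands in for the DHCP ACK


def discover_p_cscf(
    servers: List[DhcpServer],
    dns_records: Dict[str, str],
    p_cscf_name: str,
) -> Tuple[str, List[str], str]:
    """Return (UE address, servers that sent an ACK, P-CSCF address)."""
    # Step (2): collect one offer per responding DHCP server.
    offers = {server.name: server.offer() for server in servers}
    # Assumed selection policy: take the offer from the first server by name.
    chosen_server = sorted(offers)[0]
    ue_address = offers[chosen_server]
    # Step (3): broadcast the chosen address to every server.
    acks = [server.name for server in servers if server.request(ue_address)]
    # Step (4): resolve the predefined P-CSCF name with a DNS query.
    p_cscf_address = dns_records[p_cscf_name]
    return ue_address, acks, p_cscf_address


servers = [DhcpServer("dhcp-b", "10.0.0.20"), DhcpServer("dhcp-a", "10.0.0.10")]
dns = {"pcscf.visited.example": "10.1.1.1"}
ue_address, acks, p_cscf_address = discover_p_cscf(
    servers, dns, "pcscf.visited.example"
)
```

In this run only the server whose offer was chosen records a lease, which mirrors the implicit decline described in step (3).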

Fig 7 Functional entities for visited network registration. [Entities shown: user equipment, GPRS network, visited core IP network, home core IP network; proxy CSCF, serving CSCF and interrogating CSCF in the visited IM domain; interrogating CSCF and HSS in the home IM domain. Signalling shown: registration signalling (SIP), mobility management signalling, discovery signalling (GPRS, DHCP, DNS).]

Fig 8 Message sequences for visited network registration. [Message sequence between the user equipment, P-CSCF, visited S-CSCF, visited I-CSCF, home I-CSCF and HSS: (1) REGISTER, REGISTER; (2) Cx-Query; (3) Cx-Select-Pull; (4) REGISTER; (5) REGISTER; (6) Cx-Put; (7) Cx-Pull; (8) 200 OK, 200 OK; (9) 200 OK, 200 OK.]
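As an aid to reading Fig 8, the message sequence can be written down as data and traced hop by hop. This is an illustrative sketch only: the tuple layout, the constant `VISITED_REGISTRATION` and the function `hop_path` are hypothetical, and the sender and receiver given for each message are one reading of the registration steps described in this section rather than a normative definition. Responses to the Cx information flows are not modelled.

```python
from typing import List, Tuple

# (step number in Fig 8, message name, sender, receiver)
Message = Tuple[int, str, str, str]

VISITED_REGISTRATION: List[Message] = [
    (1, "REGISTER", "UE", "P-CSCF"),
    (1, "REGISTER", "P-CSCF", "home I-CSCF"),
    (2, "Cx-Query", "home I-CSCF", "HSS"),
    (3, "Cx-Select-Pull", "home I-CSCF", "HSS"),
    (4, "REGISTER", "home I-CSCF", "visited I-CSCF"),
    (5, "REGISTER", "visited I-CSCF", "visited S-CSCF"),
    (6, "Cx-Put", "visited S-CSCF", "HSS"),
    (7, "Cx-Pull", "visited S-CSCF", "HSS"),
    (8, "200 OK", "visited S-CSCF", "visited I-CSCF"),
    (8, "200 OK", "visited I-CSCF", "home I-CSCF"),
    (9, "200 OK", "home I-CSCF", "P-CSCF"),
    (9, "200 OK", "P-CSCF", "UE"),
]


def hop_path(flow: List[Message], name: str) -> List[str]:
    """Chain the hops of one message type into an ordered list of entities."""
    hops = [(sender, receiver) for _, msg, sender, receiver in flow if msg == name]
    if not hops:
        raise ValueError(f"no {name} messages in flow")
    path = [hops[0][0]]
    for sender, receiver in hops:
        # Each hop must start where the previous one ended.
        if sender != path[-1]:
            raise ValueError(f"{name} hop from {sender} does not follow {path[-1]}")
        path.append(receiver)
    return path


register_path = hop_path(VISITED_REGISTRATION, "REGISTER")
ok_path = hop_path(VISITED_REGISTRATION, "200 OK")
```

Under these assumptions the 200 OK retraces the REGISTER route in reverse, from the visited S-CSCF back to the UE.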

Signalling based on SIP is used to perform the registration between the UE and the CSCFs. The protocol between the CSCFs and the HSS is as yet undefined, but is represented in this paper by information flows prefixed by the letters Cx (since this is the Cx reference point in the architecture). IM domain registration for visited network control requires the following steps.

• Register (1)

The UE sends a REGISTER message to the P-CSCF. This message contains the subscriber identity and the domain name of the home network. Upon receipt of the REGISTER, the P-CSCF examines the home domain name to discover the entry point to the home network. This entry point is an interrogating CSCF (I-CSCF), which provides policing of the SIP interface to other networks and interrogation of the home subscriber server. The P-CSCF forwards the REGISTER message on to the I-CSCF in the home network, adding the name of the P-CSCF, a visited network contact point name, and the visited network capabilities. A name-address resolution mechanism is utilised in order to determine the address of the home network from the home domain name.

• Cx-Query (2)

When the I-CSCF receives the REGISTER message, it queries the HSS by sending a Cx-Query containing the parameters of the REGISTER message. The HSS checks whether the user is already registered, and if not, selects whether the serving CSCF is to be in the home network or the visited network (for example, based on the user’s service profile). The HSS then issues a response indicating the serving network selection back to the home I-CSCF.

• Cx-Select-Pull (3)

At this stage, it is assumed that the authentication of the user has been completed (although it may have been determined at an earlier point in the message sequence). The home I-CSCF then sends a Cx-Select-Pull to the HSS to request the information related to the S-CSCF capabilities required by the user. The HSS responds with the necessary information on the required S-CSCF capabilities to the home I-CSCF.

• Home I-CSCF forwards message (4)

The home I-CSCF determines the address of an I-CSCF in the visited network from the visited network contact point name, and forwards the REGISTER message on to the visited I-CSCF².

² This step is not required if home network control is selected. Instead, the functions performed by the visited I-CSCF are performed by the home I-CSCF.

• Visited I-CSCF forwards message (5)

The visited I-CSCF, using its role of S-CSCF selection, determines the name and address of an appropriate S-CSCF based on the required S-CSCF capabilities, and forwards the REGISTER message on to it.

• S-CSCF contacts HSS using Cx-Put (6)

On receiving the REGISTER, the S-CSCF associates the subscriber and the S-CSCF name in the HSS using the Cx-Put, which is acknowledged by the HSS.

• S-CSCF retrieves profile using Cx-Pull (7)

The S-CSCF then uses the Cx-Pull request/response to retrieve the user’s subscriber profile from the HSS, which it then stores locally. The S-CSCF also stores the name of the P-CSCF.

• Serving contact name determination (8)

The S-CSCF then determines whether the serving contact name should be that of the S-CSCF or the visited I-CSCF. The S-CSCF then returns a 200 OK message with this information to the visited I-CSCF. The visited I-CSCF forwards the 200 OK to the home I-CSCF, and then releases all knowledge of the registration information for that user. Similarly, the home I-CSCF forwards the 200 OK to the P-CSCF, and then releases all knowledge of the registration information for that user.

• Registration completion

On receiving the 200 OK message, the P-CSCF stores the serving network contact name, before sending the 200 OK to the UE and completing the registration procedure.

The user is now registered with an S-CSCF in the visited network and is able to make and receive voice and multimedia calls with the IM domain, irrespective of the location.

A deregistration procedure is invoked by either the UE or the network in order to remove the registration of the user on an S-CSCF, for example if a user roams to a different visited network or the user disconnects from the network.

3.5 Control of voice (and multimedia) calls

Once the user is registered with an S-CSCF, voice and multimedia calls may be made to other users. The S-CSCF provides the main point of control of the call and any supplementary or advanced service features for that user. SIP signalling between the user equipment and the S-CSCF is routed via a P-CSCF, which provides a (secure) entry point to the IM domain and a point of flexibility for routeing SIP messages to home or visited network S-CSCFs.

Each user will be registered with an S-CSCF, so that a simple voice call between two users will usually require two S-CSCFs to communicate (i.e. one for each user). Additionally, an I-CSCF is required in order to interrogate the HSS to find the S-CSCF on which the called user is registered. Figure 9 shows the main functional entities involved in the control of voice calls between two mobile users on a Release 5 network. For simplicity, this scenario assumes that both users are connected to, and registered on, their home network (i.e. they are not roaming). However, the sequence of events is similar for roaming users, with home or visited control.

An IM domain call comprises the following five distinct phases.

Fig 9 Functional entities in a Release 5 mobile-to-mobile call. [Entities shown: user A equipment, GPRS network (A), core IP network (A), proxy CSCF (A) and serving CSCF (A) in IM domain (A); user B equipment, GPRS network (B), core IP network (B), proxy CSCF (B), serving CSCF (B), interrogating CSCF (B) and HSS (B) in IM domain (B). Paths shown: speech path (IP media stream), call control signalling (SIP), location management signalling.]

• Call invitation

The calling user invites the called user to participate in a call. This is supported by the SIP INVITE method and the 100 trying provisional response.

• Resource reservation

The resources (such as the GPRS and UTRAN bandwidth) are reserved so that early tones and announcements can be played, and that transport for the speech path is available when the called user answers. This is currently not supported by SIP, although the use of the 183 session progress provisional response and 200 OK final response with a new SIP method, COMET, has been proposed in the IETF and 3GPP.

• Call offering

The called user is alerted to the incoming call. Support for informing the calling user of this event is provided by the SIP 180 ringing provisional response.

• Call connection

The called user answers, the speech path is connected and charging begins. This is supported in SIP by the 200 OK final response and the ACK method.

• Call termination

The call and speech path is cleared by one of the users. This is supported in SIP by the BYE method and the 200 OK final response.

The addition of the resource reservation phase of the call to the SIP protocol is necessary in the mobile environment for a number of reasons.

• Path establishment prior to ringing

The establishment of the PDP contexts for transport of the speech path should occur prior to the called user’s telephone ringing. While this may not need to be the case for all multimedia services, there is a user expectation that when a ringing telephone is answered, a speech path will be in place. Given the scarcity of bandwidth on the radio interface, if reservation is not performed early, then in some cases a ringing telephone could be answered only for the users to find that there is no speech path available.

• Quality of service

The service may require that the quality of service of the speech channels be established end-to-end, using a protocol such as the IETF RSVP, or that the speech paths need to be secure. If so, the procedures to reserve the appropriate quality-of-service level or to implement the security should occur prior to the call ringing and alerting user B.

• PDP context activation

The UE IP address for the IP speech paths may not be known until the GPRS PDP context for the speech path has been activated. In this case, without a resource reservation phase, the PDP context would have to be activated prior to the INVITE, before the session description is sent.

• Tone/announcement provision

Without early reservation of the speech path, it is not possible for the network or end equipment to provide tones or announcements in the speech path back to user A prior to the call being answered (such as ring tone or busy tone).

Figure 10 illustrates the SIP signalling flows for a simple mobile-to-mobile call with a resource reservation phase, based on the scenario in Fig 9. It assumes that the underlying GPRS and IP core network provides the necessary quality of service for the speech paths.

Fig 10 Signalling message sequences for a simple mobile-mobile call. [Message sequence between user A equipment, proxy CSCF (A), serving CSCF (A), interrogating CSCF (B), HSS (B), serving CSCF (B), proxy CSCF (B) and user B equipment, numbered (1) to (10) and grouped into: call invitation (INVITE, Cx-Query, 100 trying); resource reservation (183 session progress, activate PDP context, COMET, 200 OK); call offering (180 ringing); call connection (200 OK, ACK); speech transmission (both way RTP media); call termination (BYE, deactivate PDP context, 200 OK).]

The message sequences in Fig 10 are described below.

• Invite (1)

User A initiates the call by sending an INVITE message to the P-CSCF, which contains the names (SIP URLs) of the calling and called users. The session description part of the message includes the IP address of user A’s UE and a description of the speech path (e.g. AMR coded speech using RTP, with the UDP port number). This description may include options, such as a range of codecs that could be used. Additionally, the session description indicates that the reservation of the speech path IP transport and quality of service is a mandatory pre-condition to ringing.

The P-CSCF confirms receipt of the INVITE by replying with a 100 trying message, and forwards the INVITE on to the S-CSCF, adding the name of user A’s S-CSCF to the message. This allows tracing of the signalling route back through the network.

The S-CSCF confirms receipt of the INVITE by replying with a 100 trying message, and then invokes any necessary service features for user A (for example, outgoing call-barring). The S-CSCF then determines a SIP entry point for user B’s home network from the SIP URL for user B, for example by performing a DNS query. The SIP entry point to user B’s home network will usually be an I-CSCF. User A’s S-CSCF then sends the INVITE on to user B’s I-CSCF, adding the name of user A’s S-CSCF to the message.

• Cx-Query (2)

On receiving the INVITE, the I-CSCF interrogates the HSS with a Cx-Query to determine the address of user B’s S-CSCF. It also confirms receipt of the INVITE from user A’s S-CSCF by replying with a 100 trying message. Once the HSS responds with the address of the S-CSCF on which user B is registered, the I-CSCF forwards the INVITE on to that S-CSCF, adding its name to the message.

User B’s S-CSCF receives the INVITE and invokes any necessary service features for user B, before forwarding the INVITE on to user B’s P-CSCF adding its name to the message. The S-CSCF confirms receipt of the INVITE by replying to the I-CSCF with a 100 trying message. The P-CSCF receives the INVITE and forwards it on to user B’s UE. The P-CSCF confirms receipt of the INVITE by replying to the S-CSCF with a 100 trying message.

• Invite acceptance (3)

User B’s equipment accepts the call invitation, but does not alert user B at this stage. Instead, it responds with a 183 session progress message, indicating in the session description that it accepts the pre-condition, and requesting confirmation that user A’s UE has itself met the pre-condition. This message traverses the signalling path via the CSCFs back to user A’s UE. In the meantime, user B’s UE activates a GPRS PDP context for the user-plane speech path through to an IM domain IP entry point (e.g. a firewall that protects the IM domain IP core network).

• PDP context activation (4)

User A’s UE receives the 183 session progress, and activates a GPRS PDP context for the speech path through to the IM domain IP entry point. As required, the UE confirms that the speech path is reserved by sending a COMET message back to user B’s UE along the signalling path, which also contains the address details of the speech path on user A’s UE and can also confirm the agreed session description.

• Acknowledgement of reserved speech paths (5)

On receiving the COMET, user B’s UE now knows that the necessary IP transport and quality of service for the speech paths has been reserved at both ends, and the address to use for the speech path. It acknowledges the COMET with a 200 OK, which can contain the address details of the speech path on user B’s equipment.

• User B alert (6)

User B’s equipment alerts user B, for example by ringing. It indicates this back to user B’s S-CSCF using a 180 ringing message, which is sent back via the signalling path to user A’s UE. User A’s UE will then provide an indication of this back to user A, such as a locally generated ringing tone.

• User B answer (7)

User B answers the call. User B’s UE sends a 200 OK message via the signalling path to user A’s UE. If not already sent, this message will contain the address details of the speech path on user B’s equipment.

• User A acknowledgement (8)

User A’s UE acknowledges the establishment of the call by sending an ACK, which traverses the signalling path back to user B’s UE. The UEs are now able to send IP speech packets to each other. It is likely that the P-CSCFs will have some control over the IM domain speech-path entry points (firewalls), and not permit the speech packets through until this stage, or on receipt of the prior 200 OK (the choice may be service dependent). This control could also be the point at which the call charging commences.

• Call release (9)

To release the call, user A’s UE sends a BYE message to user B’s UE via the signalling path, and deactivates its PDP context for the speech path. At this point, the P-CSCF may close the IM domain speech-path entry point to further traffic and cease charging.

• Deactivation (10)

User B’s UE responds by deactivating its PDP context for the speech path and acknowledging the BYE with a 200 OK. This traverses the signalling path back to user A’s UE, releasing each of the CSCFs from the call.

4. Interworking 3GPP Release 5 with other networks

Interworking with circuit-switched networks, such as the PSTN, GSM and 3GPP Release 1999 networks, requires interworking at both the speech path level and the signalling level. Media gateways (MGWs) are included in the IM domain to terminate the IP-transported speech paths and convert them to circuit-switched TDM, transcoding if necessary. The circuit-switched networks generally use the ITU-T SS7 ISDN user part (ISUP) to control calls. Signalling gateways (SGW) in the IM domain map between the message transfer part levels of SS7 and the SIP transport protocol (e.g. TCP/IP) used in the IM domain.

It is not possible to simply map ISUP signalling messages into SIP messages, since the service context of the messages must be known. A media gateway control function (MGCF) is used to perform the mapping of the IM domain voice service (and SIP signalling) to the voice service of the other network (e.g. PSTN voice service and ISUP signalling). The MGCF communicates with the S-CSCF or I-CSCF using SIP. The MGCF also controls the MGW, for example using the H.248/Megaco protocol, jointly developed by the IETF and ITU-T.

Interworking with other VoIP networks that are not compatible with 3GPP Release 5, such as those based on ITU-T Recommendation H.323, also requires signalling and media gateways in order to map any differences in lower layer protocols and police the IM domain. An MGCF is also needed to ensure appropriate mapping of the voice service between the networks.

Figure 11 shows the functional entities involved in interworking 3GPP Release 5 voice calls with PSTN or GSM networks.

Figure 12 illustrates the message sequences for a simple mobile originated voice call that terminates in the PSTN network. The message sequences are described below.

• Invite (1)

The UE of the mobile calling party initiates the call to the PSTN user by sending an INVITE to their P-CSCF, which contains an appropriate session description. The PSTN user is identified as such by the SIP URL, which contains the PSTN telephone number encoded into the SIP URL format. Additionally, the session description indicates that the reservation of the speech path IP transport and quality of service is a mandatory pre-condition to ringing.

The P-CSCF confirms receipt of the INVITE by replying with a 100 trying message, and forwards the INVITE on to the S-CSCF, adding the name of user A’s S-CSCF to the message. This allows tracing of the signalling route back through the network.

The S-CSCF confirms receipt of the INVITE by replying with a 100 trying message, and then invokes any necessary service features for user A (for example, outgoing call-barring). The S-CSCF then determines that the call is destined for the PSTN, and routes the INVITE to an appropriate MGCF, adding the name of user A’s S-CSCF to the message.

Fig 11 Functional entities in a Release 5 mobile-to-PSTN or GSM call. [Entities shown: user equipment, GPRS network, core IP network; proxy CSCF (A), serving CSCF (A), media gateway controller, signalling gateway and media gateway in the IM domain; PSTN or GSM. Paths shown: speech path (IP media stream), speech path (circuit-switched TDM), call control signalling (SIP), call control signalling (ISUP), media G/W control signalling (H.248/Megaco).]

• MGW configuration (2) so that the backward speech path from the PSTN to the
UE is switched through so that the mobile user can hear
The MGCF initially responds to the S-CSCF with a
tones and announcements from the PSTN. It may also
100 trying. It then configures the MGW for the speech
select the chosen codec if the codecs have been
path3 (for example using the H.248/Megaco protocol),
negotiated. The MGCF then acknowledges the
by seizing an already created circuit-switched trunk
COMET with a 200 OK, which can contain the address
termination on the PSTN side of the MGW, and adding
details of the IP termination on the MGW.
a new IP speech-path (e.g. RTP) termination to the IM
domain side of the MGW. This is done by the ‘add’ The MGCF now initiates the call establishment to the
command, which additionally creates a new context in PSTN by sending an initial address message (IAM) to
the MGW, and associates the IP termination and PSTN the signalling gateway (SGW). The SGW relays the
termination. The PSTN termination is configured for IAM from the IP-based transport protocol (for example
both-way speech. The MGW returns a description of SCTP/UDP/IP) to the SS7 message transfer part, and
the ports to the MGCF in response.

The MGCF, knowing the description of the IP speech-path port and its capabilities, sends a 183 session progress message back to the UE, via the signalling path. This indicates that the precondition can be met by the MGW, and that confirmation that the UE can meet the precondition is required.

• PDP context activation (3)

The UE receives the 183 session progress, and activates a GPRS PDP context for the speech path through to the IM domain entry point. As required, the UE confirms that the speech path is reserved by sending a COMET message back to MGCF along the signalling path, which also contains the address details of the speech path on the UE.

• MGW 1-way connection (4)

On receiving the COMET, the MGCF now knows both that the necessary IP transport and quality of service for the speech paths has been reserved at both ends, and the address to use for the speech path. It then modifies the IP speech-path termination on the MGW

on to the PSTN entry point (for example a PSTN gateway trunk exchange).

• PSTN call acceptance (5)

The PSTN accepts the call with an address complete message (ACM), which is sent back to the MGCF via the SGW. So that the mobile user may now hear any in-band tones and announcements from the PSTN, the MGCF sends a 183 session progress message back to the UE. This contains a session description indicating that one-way IP speech packets may be received and the address of the RTP termination on the MGW, if not already sent. This message follows the signalling path, and may cause the P-CSCF to control the IM domain IP speech-path entry point (e.g. firewall) from the GPRS network to allow the media to be played to the UE.

• PSTN alerting (6)

The PSTN sends a call progress message (CPG) to the SGW, indicating that the called user’s telephone is ringing. This is accompanied by in-band ring tone in the speech path back to user A. The SGW relays this message back to the MGCF. The MGCF sends a 180 ringing message to the S-CSCF, which is forwarded via the signalling path to the UE.

3 It is assumed that the MGW has already established a control relationship with the MGCF and the terminations on the TDM circuit-switched side have already been provisioned and configured.
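As an illustration of the precondition handshake in steps (3) and (4), the following Python sketch models the UE's reaction to the first 183 session progress and the MGCF's check on the resulting COMET. It is a minimal sketch only: the class names, field names, the `activate_pdp_context` callback and the address format are hypothetical and are not taken from the 3GPP or IETF specifications.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SessionProgress183:
    # True when the UE is asked to confirm that it can meet the precondition.
    confirmation_required: bool


@dataclass
class Comet:
    # Address details of the speech path on the UE (format is an assumption).
    ue_speech_address: str


def ue_on_first_183(msg: SessionProgress183,
                    activate_pdp_context: Callable[[], str]) -> Optional[Comet]:
    """Step (3): reserve the speech path, then confirm it if asked to."""
    # Hypothetical callback: activates the GPRS PDP context for the speech
    # path and returns the UE-side address of that path.
    address = activate_pdp_context()
    if msg.confirmation_required:
        return Comet(ue_speech_address=address)
    return None


def mgcf_on_comet(comet: Comet, mgw_precondition_met: bool) -> Optional[str]:
    """Step (4): proceed only when both ends have reserved resources."""
    if not mgw_precondition_met:
        return None
    # The COMET tells the MGCF which address to use for the speech path.
    return comet.ue_speech_address
```

In this sketch the MGCF proceeds only once it holds both pieces of information named in step (4): confirmation that resources are reserved at both ends, and the UE address for the speech path.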

BT Technol J Vol 19 No 1 January 2001



Fig 12 Signalling message sequences for a simple mobile-PSTN call. [Message sequence chart between the mobile terminal, proxy CSCF, serving CSCF, media G/W controller, media G/W, signalling G/W and PSTN entry point, with steps 1 to 12 grouped into call invitation, resource reservation, call offering, call connection and call termination.]
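The MGCF's part in steps 4 to 12 of Fig 12 can be sketched as a small event handler. This is an illustrative model only: the class name, the state labels and the flat `(target, message)` log are assumptions, intermediate hops (CSCFs, SGW) are collapsed into the final target, and the 200 OK responses to the COMET are omitted.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class MgcfCallModel:
    """Toy model of the MGCF during one mobile-to-PSTN call."""
    mgw_mode: str = "added"  # hypothetical label: termination created in step 2
    charging_started: bool = False
    sent: List[Tuple[str, str]] = field(default_factory=list)

    def _send(self, target: str, message: str) -> None:
        self.sent.append((target, message))

    def on_event(self, event: str) -> None:
        if event == "COMET":      # step 4: resources reserved at both ends
            self._send("MGW", "modify")
            self.mgw_mode = "one-way"
            self._send("PSTN", "IAM")  # via the SGW, as drawn in Fig 12
        elif event == "ACM":      # step 5: let the UE hear in-band tones
            self._send("UE", "183 session progress")
        elif event == "CPG":      # step 6: called telephone is ringing
            self._send("UE", "180 ringing")
        elif event == "ANM":      # steps 7 and 8: call answered
            self._send("MGW", "modify")
            self.mgw_mode = "both-way"
            self._send("UE", "200 OK")
            self.charging_started = True
        elif event == "ACK":      # step 9: nothing further to send
            pass
        elif event == "BYE":      # steps 10 and 11: mobile user clears
            self._send("PSTN", "REL")
            self._send("UE", "200 OK")
            self._send("MGW", "subtract")
            self.mgw_mode = "released"
        elif event == "RLC":      # step 12: release confirmed by the PSTN
            pass
        else:
            raise ValueError(f"unexpected event: {event}")
```

Feeding the events `COMET`, `ACM`, `CPG`, `ANM`, `ACK`, `BYE`, `RLC` in that order moves the MGW termination from one-way to both-way and finally to released, with charging starting at the 200 OK sent in step 8.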

• PSTN answer (7)

When the call is answered, the PSTN sends an answer message (ANM) to the SGW, which relays it back to the MGCF.

• MGW 2-way connection (8)

At this point, the MGCF issues another modify command to change the IP speech-path termination in the MGW to allow both-way speech paths to be switched through. The MGCF then sends a 200 OK message back to the UE, with a session description indicating that two-way media may be sent and received. This message follows the signalling path, and may cause the P-CSCF to control the IM domain IP speech-path entry point to allow both-way media. This is the point at which call charging commences.

• Acknowledgement of call establishment (9)

The UE receives the 200 OK. The call is now established and two-way speech can take place between the mobile and PSTN user. The UE acknowledges this by sending an ACK message back to the MGCF via the signalling path.

• PDP context deactivation (10)

When the mobile user clears the call, the UE sends a BYE to the P-CSCF and deactivates the PDP context for the speech path. The P-CSCF forwards the BYE on to the MGCF via the S-CSCF.

• Call release (11)

The MGCF releases the call into the PSTN with a release message (REL) and confirms the BYE by sending a 200 OK message back to the UE via the CSCFs on the signalling path, which each release the call in turn.

The MGCF then clears the speech path in the MGW by issuing a subtract command to delete the terminations from the call context, and the call context itself. The MGW optionally responds by sending an audit report for the call to the MGCF that contains information such as the number of packets sent/received and the packet loss.

• Release confirmation (12)

The release message sent to the PSTN is confirmed back to the MGCF (via the SGW) by a release complete message (RLC), which completes the release procedure.

5. Voice and multimedia in the 3GPP Release 4 network

The 3GPP Release 4 standards provide a means for operators to migrate the Release 1999 circuit-switched domain to an IP-based core network infrastructure.

An overview of the Release 4 network is shown in Fig 13. The MSC call server concept is more fully described in Lobley [6].

In this case, the Release 4 circuit-switched domain connects to the UTRAN via the Iu-CS interface, which supports the same speech transport and signalling protocols as in Release 1999. GSM radio networks can also be connected via the GSM A interface. The Iu-CS interface (and similarly the GSM A interface) is terminated at the circuit-switched domain entry point by a media gateway. This relays the signalling path from the ATM transport on to IP transport (such as TCP or SCTP) and on to the MSC call server. Speech paths in the Iu-CS interface are relayed into the IP core network from ATM AAL2 transport on to UDP/IP transport. For the speech circuits, the media gateway operates in a similar way to the one used for PSTN interconnect (in Fig 2 and Fig 11), and is controlled by the MSC call server using an appropriate protocol such as the H.248/Megaco protocol.

The signalling from the GMSC call server to other networks (including Release 5 CSCFs) is via a signalling gateway. The speech paths are interconnected to other circuit-switched and other VoIP networks via a media gateway. The operation of these gateways is similar to the PSTN interconnect case for the Release 5 voice service.

Additionally, interconnect of the speech paths to Release 5 networks may not require a media gateway if the speech paths are compatible, although additional security measures (such as firewalls) will be required in the case of interconnect to other operators.

The MSC call server supports the Release 1999 call control, service features and mobility management of an MSC, while the GMSC call server performs the call control and HSS interrogation of a Release 1999 GMSC. Both, however, use media gateways to perform the circuit-switching functions, with IP providing the core transport network.

Using this design, the Release 4 networks are capable of supporting the Release 1999 voice service with minimal enhancement to the network and little, if any, impact on the end user.

6. Conclusions

Voice telephony is an essential service for many mobile network users, and one that must be supported by 3rd generation networks. This paper has shown how the initial 3GPP UMTS standards have taken an evolutionary approach to providing a voice service compatible with GSM to maximise the benefits of the new radio access technologies. It has then described how a more innovative approach to providing voice and multimedia integration with the Internet protocols is being developed for the Release 4 and 5 standards.

As a founder member of the 3G.IP group [9], BT has played an influential role in moving the mobile network standards towards an Internet solution for voice and multimedia.

BT continues to make an active contribution to 3G.IP and the 3GPP standards needed to realise a mobile voice and multimedia solution on a global scale.

Acknowledgements

The author would like to thank his colleagues in the BTexaCT 3G Networks unit who provided information for, and valuable discussion on, this paper.


Fig 13 3GPP Release 4 network overview. [Diagram elements: GSM radio access network (BSS), UMTS terrestrial radio access network (UTRAN) with RNCs, GSM A interface, UMTS Iu-CS interface, media gateways, MSC call server, VLR, GMSC call server, signalling gateways, EIR, HSS, application and service environment, circuit-switched domain (IPv6 core network).]

Key to Fig 13: A mobility management signalling to other networks; B call-related signalling to other networks; C circuit-switched speech circuits to other networks (e.g. PSTN and GSM); D speech paths to Release 5 and other VoIP networks.
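The relay performed by the media gateway at the Release 4 circuit-switched domain entry point can be summarised in a few lines of Python. This is only a sketch of the transport mapping described in section 5; the dictionary layout and the `relay` helper are hypothetical names introduced for illustration.

```python
# Transport relay at the circuit-switched domain entry point (illustrative).
RELAY = {
    # Iu-CS signalling path: ATM transport on to IP transport.
    "signalling": {"from": "ATM", "to": "IP (TCP or SCTP)",
                   "towards": "MSC call server"},
    # Iu-CS speech paths: ATM AAL2 transport on to UDP/IP transport.
    "speech": {"from": "ATM AAL2", "to": "UDP/IP",
               "towards": "IP core network"},
}


def relay(plane: str) -> str:
    """Describe how the media gateway relays one plane of the Iu-CS interface."""
    hop = RELAY[plane]
    return f"{plane}: {hop['from']} -> {hop['to']} -> {hop['towards']}"
```

For example, `relay("speech")` describes the speech path being carried from ATM AAL2 on to UDP/IP and into the IP core network, under the control of the MSC call server.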

References

1 Mehrotra A: ‘GSM System Engineering’, Artech House (1997).

2 3GPP, http://www.3gpp.org

3 International Telecommunication Union, Telecommunication Standardization Sector, http://www.itu.int

4 Harris J W: ‘The future of radio access in 3G’, BT Technol J, 19, No 1, pp 106-113 (January 2001).

5 Internet Engineering Task Force, http://www.ietf.org

6 Lobley N C: ‘GSM to UMTS: Architecture evolution to support multimedia’, BT Technol J, 19, No 1, pp 38-47 (January 2001).

7 Cookson M D: ‘3G service control’, BT Technol J, 19, No 1, pp 67-79 (January 2001).

8 European Telecommunications Standards Institute project Tiphon, http://www.etsi.org/tiphon

9 3G.IP, http://www.3gip.org

Mel Bale is a senior technical consultant on 3rd generation networks in BTexaCT. He joined BT in 1987 after graduating from the University of East Anglia, initially leading a number of Unix software developments for network test systems. In 1993, he moved into the field of intelligent networks, where he managed a team of voice network designers and had responsibility for defining service architectures for future networks, including the Parlay API.

He is currently leading teams designing fixed and mobile VoIP networks and researching future IP mobile network technologies. He is a Chartered Engineer and a Member of the IEE.
