
Voice over IP (VoIP)

Hashir Zariwala – Mohammad Adil Khan – Saghir Malik


Students of Engineering (Telecommunication),
Bahria University Karachi Campus

Abstract

This report presents information on VoIP and how it is used to establish multimedia sessions (audio, video) whose traffic is transmitted over the Internet Protocol. VoIP uses communication services for the transmission of voice, fax, SMS and other voice messaging applications via the Internet rather than the PSTN (Public Switched Telephone Network). Signaling and media channel establishment are handled by two protocols, H.323 and SIP.

Key Words- Teleconferencing, H.323, SIP, gatekeepers, gateway, and Unified communications

1. Introduction

Voice over Internet Protocol (Voice over IP, VoIP) is a general term for the transmission of voice and multimedia sessions over Internet Protocol (IP) networks, such as the Internet. Other terms used for VoIP are IP telephony, VoBB (voice over broadband), Internet telephony and broadband phone. VoIP uses communication services for the transmission of voice, fax, SMS and other voice messaging applications via the Internet rather than the PSTN (Public Switched Telephone Network). The steps involved in originating a VoIP telephone call are signaling and media channel setup, digitization (and under some conditions compression) of the analog voice signal, packetization, and transmission as Internet Protocol (IP) packets over a packet-switched network. On the receiving side similar steps reproduce the original voice stream. The following steps are used in establishing a VoIP call:
• Signaling
• Media Channel Setup [1]

2. Signaling

The VoIP signaling protocols typically use TCP to set up, manage and tear down the VoIP phone call. Signaling protocols are not concerned with the actual media stream of voice or video. Their basic functions are first to initiate a session, then to find common ground for communication between the parties involved, and finally to terminate the session when the call ends. On a closed IP network, real-time two-way communications between nodes can be managed relatively easily. Signaling protocols take on additional responsibilities such as address translation, bandwidth management and authorization, and in some cases make routing decisions. Many sets of protocols have been developed over the years to handle the functions described above.

There are two types of signaling protocol in VoIP:
• H.323
• SIP (Session Initiation Protocol)

2.1 H.323

H.323 was developed in the mid-nineties and was originally established for videoconferencing over a packet-based network, but was quickly adopted for Voice over IP. As a session layer protocol, its main function is to perform call control and management on an IP network. The following H.323 suite of protocols is used for signaling purposes:

• DVB - Digital Video Broadcasting
• H.225 - Call Setup and Termination
• H.235 - Security and Authentication
• RTP - Real-time Transport Protocol
• RTCP - RTP Control Protocol
• H.245 - Flow control, Port Number
• Q.931 - Management of Call Setup

The H.323 specification relies on two additional signaling protocols, H.225 and H.245, for call setup and management. When a session is initiated between two H.323 devices, the H.225 standard uses the Q.931 ISDN protocol to perform setup and teardown functions, using TCP for a reliable connection. H.245 then opens another TCP connection to establish the capabilities of the devices, negotiate the codecs, and determine which ports will be used for the session. A channel is then opened on which the actual media will travel, using UDP for transport because of its speed and relying on the upper-layer Real-time Transport Protocol (RTP) for sequencing and timing information. It is important to note that for session setup, negotiation and management, TCP is used for its reliability. For time-sensitive media such as voice and video, UDP is utilized as the transport mechanism because of its speed and low overhead. The packet size can be further
reduced by using RTP header compression, which shrinks the combined IP/UDP/RTP header from over 65% to as low as 10% of the entire packet size. Smaller means faster, and in IP telephony, the name of the game is speed.

2.2 Components of H.323

An H.323 network comprises four logical components, not all of which are needed on every network, and which can reside on a wide variety of devices. [2]

2.2.1 Terminal

A terminal is an endpoint device such as an IP telephone, a computer running an H.323 software application, or a dedicated conferencing device. An H.323 compliant terminal must support H.245 for channel and capabilities negotiation, RAS (Registration, Admission, Status), Q.931 for signaling and setup, and RTP and RTCP for streaming the media. Terminals must support audio; video and T.120 data communications are optional. [3]

2.2.2 Multipoint Control Units

An MCU provides services that allow three or more terminals to participate in a conference. The MCU consists of a Multipoint Controller (MC) and an optional Multipoint Processor (MP). The MC is responsible for H.245 functions (negotiating common ground), while the optional MP handles the actual mixing of media streams and manages the streams to avoid bandwidth contention. H.323 supports centralized, decentralized and hybrid multipoint conferencing. When the terminals participating in a conference reside in both a centralized and a decentralized environment (mixed), the MCU acts as a bridge between the two.

MC and MP functions can reside on a dedicated component, a terminal, a gateway or a gatekeeper, but when endpoints exist off network (i.e. on the PSTN), it is recommended that the MC be placed on a gateway. [2]

2.2.3 Gateways

Gateways provide a variety of services, not the least of which is protocol conversion between H.323 networks and non-H.323 networks, e.g. switched circuit networks (SCN). A gateway performs call setup and teardown, translates audio, video and data formats, and can perform RAS for registration with the gatekeeper. On the H.323 side, the gateway uses H.225 and H.245 for call setup and management, and on the circuit-switched side, it uses the protocols specific to SCNs such as ISDN and SS7. A gateway can be implemented on a gatekeeper, an MCU, or a voice-enabled router or switch.

2.2.4 Gatekeepers

As an optional component that is usually found in larger enterprise networks, the gatekeeper is the most important component in the H.323 configuration. The gatekeeper manages all the registered terminals, gateways and MCUs in a single H.323 zone, which can span multiple LAN/WAN segments. Services such as addressing, authorization and authentication of H.323 components, bandwidth management, accounting and billing can all be configured on the gatekeeper. Once they are configured, all endpoints must obey them. As the king of the zone, the gatekeeper can make routing decisions and can also simplify the management of multiple gateways by handling their call control functions in a centralized manner. While a gatekeeper can be implemented on a gateway or MCU, in larger organizations it is usually found on a dedicated server (such as Microsoft's ISA server) or on a Cisco IOS router. [2]

Figure 1: Implementing Gatekeepers

One big advantage that the H.323 standard seems to have over its nearest competitor is in the area of address resolution. The gatekeeper can use a number of methods and protocols to resolve a destination address. First, it can ask another gatekeeper. If that does not work, it can use Annex G/H.225.0, TRIP, ENUM or DNS for address resolution.

Security enhancements to H.323 are provided by H.235, which adds authentication, encryption and integrity to the mix. Optional password-based and PKI security profiles can be used to authenticate the person, and the call signaling channel can be encrypted using TLS or IPSec. [2]
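As a worked illustration of the header compression figures quoted in Section 2.1, the sketch below computes what share of a voice packet is taken up by headers. It assumes an IPv4 header without options (20 bytes), a UDP header (8 bytes) and a basic RTP header (12 bytes). The codec (8 kbit/s, such as G.729, with 20 ms of speech per packet) and the 2-byte compressed header are assumptions chosen for illustration and are not taken from the sources cited in this report; the function names are hypothetical.

```python
# Header sizes in bytes: IPv4 without options, UDP, and RTP without CSRC entries.
IPV4_HEADER = 20
UDP_HEADER = 8
RTP_HEADER = 12
FULL_HEADER = IPV4_HEADER + UDP_HEADER + RTP_HEADER  # 40 bytes

# Assumption: RTP header compression shrinks the combined header to about 2 bytes.
COMPRESSED_HEADER = 2


def payload_bytes(bitrate_bps: int, packet_ms: int) -> int:
    """Codec payload carried in one packet, in bytes."""
    return bitrate_bps * packet_ms // 1000 // 8


def header_share(header: int, payload: int) -> float:
    """Fraction of the whole packet that is header."""
    return header / (header + payload)


# Assumption: an 8 kbit/s codec (such as G.729) sending one packet every 20 ms.
payload = payload_bytes(8000, 20)

print(f"payload per packet:  {payload} bytes")
print(f"uncompressed header: {header_share(FULL_HEADER, payload):.1%}")
print(f"compressed header:   {header_share(COMPRESSED_HEADER, payload):.1%}")
```

Under these assumptions the header share falls from roughly two thirds of the packet to under a tenth, which is consistent with the figures quoted above. A larger payload per packet (a higher bit rate or a longer packet interval) would lower both percentages.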
3. SIP (Session Initiation Protocol)

SIP (Session Initiation Protocol) is another signaling protocol used to control multimedia communication such as voice and video calls over Internet Protocol (IP). The protocol can be used for creating, modifying and terminating two-party (unicast) or multiparty (multicast) sessions consisting of one or several media streams. The modification can involve changing addresses or ports, inviting more participants, and adding or deleting media streams. SIP is an application layer protocol designed to be independent of the underlying transport layer; it can run on Transmission Control Protocol (TCP), User Datagram Protocol (UDP) or Stream Control Transmission Protocol (SCTP). It is a text-based protocol, incorporating many elements of the Hypertext Transfer Protocol (HTTP) and the Simple Mail Transfer Protocol (SMTP). Other feasible application examples include video conferencing, streaming multimedia distribution, instant messaging, presence information, file transfer and online games. [4]

3.1 SIP Network Elements

SIP employs design elements similar to the HTTP request/response transaction model. Each transaction consists of a client request that invokes a particular method or function on the server and at least one response. SIP reuses most of the header fields, encoding rules and status codes of HTTP, providing a readable text-based format.

A SIP user agent (UA) is a logical network endpoint used to create or receive SIP messages and thereby manage a SIP session. A SIP UA can perform the role of a User Agent Client (UAC), which sends SIP requests, and of a User Agent Server (UAS), which receives the requests and returns a SIP response. These roles of UAC and UAS only last for the duration of a SIP transaction. A SIP phone is a SIP user agent that provides the traditional call functions of a telephone, such as dial, answer, reject, hold/unhold and call transfer. SIP phones may be implemented by dedicated hardware controlled by the phone application directly or through an embedded operating system (hardware SIP phone), or as a soft phone, a software application that is installed on a personal computer or a mobile device, e.g. a personal digital assistant (PDA) or cell phone with IP connectivity.

Each resource of a SIP network, such as a User Agent or a voicemail box, is identified by a Uniform Resource Identifier (URI), based on the general standard syntax also used in Web services and e-mail. A typical SIP URI is of the form sip:username:password@host:port. The URI scheme used for SIP is sip. If secure transmission is required, the scheme sips: is used and SIP messages must be transported over Transport Layer Security (TLS).

In SIP the user agent may identify itself using a message header field 'User-Agent', containing a text description of the software/hardware/product involved. The User-Agent field is sent in request messages, which means that the receiving SIP server can see this information. SIP network elements sometimes store this information, and it can be useful in diagnosing SIP compatibility problems.

SIP also defines server network elements. Although two SIP endpoints can communicate without any intervening SIP infrastructure, which is why the protocol is described as peer-to-peer, this approach is often impractical for a public service. [4]

3.2 RFC 3261 Server Elements

A proxy server "is an intermediary entity that acts as both a server and a client for the purpose of making requests on behalf of other clients. A proxy server primarily plays the role of routing, which means its job is to ensure that a request is sent to another entity "closer" to the targeted user. Proxies are also useful for enforcing policy (for example, making sure a user is allowed to make a call). A proxy interprets, and, if necessary, rewrites specific parts of a request message before forwarding it."

"A registrar is a server that accepts REGISTER requests and places the information it receives in those requests into the location service for the domain it handles."

"A redirect server is a user agent server that generates 3xx responses to requests it receives, directing the client to contact an alternate set of URIs. The redirect server allows SIP Proxy Servers to direct SIP session invitations to external domains."

The RFC specifies: "It is an important concept that the distinction between types of SIP servers is logical, not physical."

Other SIP related network elements are:
• Session border controllers (SBC), which serve as middle boxes between the UA and the SIP server for various types of functions, including network topology hiding and assistance in NAT traversal.
• Various types of gateways at the edge between a SIP network and other networks (such as a phone network). [4]

Figure 2: RFC 3261 Server Elements [4]
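The SIP URI form given in Section 3.1, sip:username:password@host:port, can be made concrete with a short parsing sketch. This is a minimal illustration and not a complete RFC 3261 parser: it assumes a plain host name or IPv4 address, discards any URI parameters or headers, and does no escaping. The names SipUri, parse_sip_uri and requires_tls, and the example.com addresses, are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SipUri:
    scheme: str              # "sip" or "sips"
    user: Optional[str]
    password: Optional[str]
    host: str
    port: Optional[int]

    @property
    def requires_tls(self) -> bool:
        # The sips scheme means the message must travel over TLS.
        return self.scheme == "sips"


def parse_sip_uri(uri: str) -> SipUri:
    """Split a SIP URI into its parts (simplified: no IPv6, no escaping)."""
    scheme, colon, rest = uri.partition(":")
    scheme = scheme.lower()
    if not colon or scheme not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")

    # The user part is optional; it ends at the first "@".
    if "@" in rest:
        userinfo, _, hostport = rest.partition("@")
    else:
        userinfo, hostport = "", rest

    # Simplification: drop URI parameters (;...) and headers (?...).
    hostport = hostport.split(";", 1)[0].split("?", 1)[0]

    user, _, password = userinfo.partition(":")
    host, _, port_text = hostport.partition(":")
    if not host:
        raise ValueError(f"missing host in SIP URI: {uri!r}")

    port = int(port_text) if port_text else None
    return SipUri(scheme, user or None, password or None, host, port)


uri = parse_sip_uri("sip:alice:secret@example.com:5060")
print(uri.user, uri.host, uri.port, uri.requires_tls)
```

Keeping the scheme as a separate field makes the sip/sips distinction explicit, so calling code can decide whether TLS is needed before it opens a connection.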
3.3 SIP Messages

SIP is a text-based protocol with syntax similar to that of HTTP. There are two different types of SIP messages: requests and responses. The first line of a request has a method, defining the nature of the request, and a Request-URI, indicating where the request should be sent. The first line of a response has a response code.

For SIP requests, RFC 3261 defines the following methods:
• REGISTER: Used by a UA to indicate its current IP address and the URLs for which it would like to receive calls.
• INVITE: Used to establish a media session between user agents.
• ACK: Confirms reliable message exchanges.
• CANCEL: Terminates a pending request.
• BYE: Terminates a session between two users in a conference.
• OPTIONS: Requests information about the capabilities of a caller, without setting up a call.

The SIP response types defined in RFC 3261 fall into one of the following categories:
• Provisional (1xx): Request received and being processed.
• Success (2xx): The action was successfully received, understood, and accepted.
• Redirection (3xx): Further action needs to be taken (typically by the sender) to complete the request.
• Client Error (4xx): The request contains bad syntax or cannot be fulfilled at the server.
• Server Error (5xx): The server failed to fulfill an apparently valid request.
• Global Failure (6xx): The request cannot be fulfilled at any server. [4]

3.4 SIP-ISUP Networking

SIP-I, or the Session Initiation Protocol with encapsulated ISUP, is a protocol used to create, modify, and terminate communication sessions based on ISUP using SIP and IP networks. Services using SIP-I include voice, video telephony, fax and data. SIP-I and SIP-T are two protocols with similar features, notably to allow ISUP messages to be transported over SIP networks. [4]

4. Advantages of VoIP

4.1 Cost Saving and Free Calls

This is perhaps the most obvious advantage. The very nature of VoIP technology means that a business can make significant cost savings, especially if it has multiple branches nationwide or overseas. Cheap calls and free calls may be the attraction of VoIP right now, but the future will be about value-added VoIP services, and cost will take a back seat. [5]

4.2 Portability

One important concept to understand about VoIP is that, unlike its forefathers (let's call them PSTN for now), it is not distance or location dependent. As far as VoIP is concerned, you could be calling your supplier 1,000 miles away in Indonesia or calling your business partner on the other end of town, and it makes no difference at all in terms of connectivity and cost. Many of our recommended VoIP service providers have this feature. A VoIP phone number, unlike your regular phone number, is completely portable. Most commonly referred to as a virtual number, it can be taken with you anywhere you go. Even if you change your office address to another state, your phone number can go with you. You can even take your whole business with you wherever you travel.

Integrated Communications

The best business phone systems can stand the test of time and grow with your business needs. Businesses simply send all of their information over their broadband Internet connection, whether it is Internet data from PCs or voice calls from their employees. Features include:
• Making cheap local and international phone calls
• Audio conferencing and video conferencing
• Having voice messages sent to your e-mail
• Call forwarding, call waiting
• Fax through e-mail
• Sending and receiving multimedia files
• Sharing photos while talking
• And much, much more. [5]

5. Disadvantages of VoIP

Although most experts agree that the minus points of VoIP are just a "temporary" problem that will be eliminated as the technology goes from strength to strength, they are still worth a look. While the pros may be overwhelmingly attractive, as a small business owner you should know the disadvantages as well. Here are some of the main disadvantages of VoIP. [5]

5.1 Power Supply Dependency

VoIP service depends on the power supply. No power, no phone calls. This is mainly because the equipment is hosted on your side, and not in a telecommunications exchange as with the usual PSTN networks. However, in the future, when most countries make a complete shift to IP based networks and most telcos go 100% VoIP, this problem should cease to exist. Most PSTN lines are power-independent simply because they have back-up power in the exchanges, and even during blackouts you'll be
able to make calls. This is one thing that VoIP does not have at the moment. This disadvantage means that you will need a back-up PSTN line in case of emergencies; in fact, most businesses we have consulted do indeed have at least one backup PSTN phone line. [5]

5.2 Security Issues

Experts agree that security is a major concern when choosing a VoIP solution. One of the major concerns is "packet sniffing", which means a call can be spied on without affecting the conversation at all. However, there are things in your control, and you can take simple steps to make sure you are getting the best protection that money can buy. [5]

5.3 Less Quality Control

The basics of VoIP are very different from regular PSTN, which uses "C7 signals" for controlling quality of service. Due to the nature of VoIP, your calls are streamed in packets to the destination, and any inconsistency leads to issues such as jitter, packet loss and echo. These problems, while posing considerable inconvenience a few years back, are being eliminated even as you are reading this. [5]

5.4 Bandwidth Dependent

In any small business or home office setting, you will typically have one broadband line which is shared by multiple users for downloading data, sending e-mails, and viewing web sites and multimedia applications. Add a VoIP system to that and your bandwidth will soon be sucked dry. The only way to solve this problem is to have a dedicated E1 line (for a larger user base) or at least a dedicated broadband connection. Home offices may not face this problem if there are fewer than three simultaneous users. [5]

6. Telepresence

Telepresence refers to a set of technologies which allow a person to feel as if they were present, to give the appearance that they were present, or to have an effect, via telerobotics, at a place other than their true location.

Telepresence requires that the users' senses be provided with such stimuli as to give the feeling of being in that other location. Additionally, users may be given the ability to affect the remote location. In this case, the user's position, movements, actions, voice, etc. may be sensed, transmitted and duplicated in the remote location to bring about this effect. Therefore information may be traveling in both directions between the user and the remote location.

A popular application is found in Telepresence videoconferencing, a higher level of video telephony which deploys greater technical sophistication and improved fidelity of both video and audio than in traditional videoconferencing. [6]

6.1 Implementation

Telepresence has been described as the human experience of being fully present at a live real-world location remote from one's own physical location. Someone experiencing video Telepresence would therefore be able to behave, and receive stimuli, as though part of a meeting at the remote site. This would result in interactive participation in group activities that would bring benefits to a wide range of users.

To provide a Telepresence experience, technologies are required that implement the human sensory elements of vision, sound, and manipulation. [6]

6.1.1 Vision

A minimum system usually includes visual feedback. Ideally, the entire field of view of the user is filled with a view of the remote location, and the viewpoint corresponds to the movement and orientation of the user's head. In this way, it differs from television or cinema, where the viewpoint is out of the control of the viewer.

In order to achieve this, the user may be provided with either a very large (or wraparound) screen, or small displays mounted directly in front of the eyes. The latter provides a particularly convincing 3D sensation. The movements of the user's head must be sensed, and the camera must mimic those movements accurately and in real time. This is important to prevent unintended motion sickness. Another source of future improvement to Telepresence displays, compared by some to holograms, is a projected display technology featuring life-sized imagery. [6]

6.1.2 Sound

Sound is generally the easiest sensation to implement with high fidelity, based on the foundational telephone technology dating back more than 130 years. Very high-fidelity sound equipment has also been available for a considerable period of time, with stereophonic sound being more convincing than monaural sound. [6]

6.1.3 Manipulation

The ability to manipulate a remote object or environment is an important aspect of real Telepresence systems, and can be implemented in a large number of ways depending on the needs of the user. Typically, the movements of the user's hands (position in space, and posture of the fingers) are sensed by wired gloves, inertial sensors, or absolute spatial position
sensors. A robot in the remote location then copies those movements as closely as possible. This ability is also known as teleoperation.

The more closely the robot re-creates the form factor of the human hand, the greater the sense of Telepresence. Complexity of robotic effectors varies greatly, from simple one-axis grippers to fully anthropomorphic robot hands.

Haptic teleoperation refers to a system that provides some sort of tactile force feedback to the user, so the user feels some approximation of the weight, firmness, size, and/or texture of the remote objects manipulated by the robot. [6]

6.2 Transparency of Implementation

A good Telepresence strategy puts the human factors first, focusing on visual collaboration solutions that closely replicate the brain's innate preferences for interpersonal communications, separating from the unnatural "talking heads" experience of traditional videoconferencing. These cues include life-size participants, fluid motion, accurate flesh tones and the appearance of true eye contact. This is already a well-established technology, used by many businesses today. Telepresence has been compared to teleporting from Star Trek and described as a potential billion dollar market for Cisco.

Rarely will a Telepresence system provide such a transparent implementation with such comprehensive and convincing stimuli that the user perceives no differences from actual presence. But the user may set aside such differences, depending on the application.

The fairly simple telephone achieves a limited form of Telepresence using just the human sensory element of hearing, in that users consider themselves to be talking to each other rather than talking to the telephone itself.

Watching television, for example, although it stimulates our primary senses of vision and hearing, rarely gives the impression that the watcher is no longer at home. However, television sometimes engages the senses sufficiently to trigger emotional responses from viewers somewhat like those experienced by people who directly witness or experience events. Televised depictions of sports events, or disasters such as the September 11 terrorist attacks, can elicit strong emotions from viewers.

As the screen size increases, so does the sense of immersion, as well as the range of subjective mental experiences available to viewers. Some viewers have reported a sensation of genuine vertigo or motion sickness while watching IMAX movies of flying or outdoor sequences.

Because most currently feasible Telepresence gear leaves something to be desired, the user must suspend disbelief to some degree, and choose to act in a natural way, appropriate to the remote location, perhaps using some skill to operate the equipment. In contrast, a telephone user does not see herself as "operating" the telephone, but merely talking to another person with it. [6]

7. Advantages of Telepresence

An industry expert described some benefits of Telepresence: "There were four drivers for our decision to do more business over video and Telepresence. We wanted to reduce our travel spend, reduce our carbon footprint and environmental impact, improve our employees' work/life balance, and improve employee productivity."

Rather than traveling great distances in order to have a face-to-face meeting, it is now commonplace to instead use a Telepresence system, which uses a multiple codec video system (which is what the word "Telepresence" most currently represents). Each member/party of the meeting uses a Telepresence room to "dial in" and can see/talk to every other member on a screen as if they were in the same room. This brings enormous time and cost benefits. It is also superior to phone conferencing (except in cost), as the visual aspect greatly enhances communications, allowing for perceptions of facial expressions and other body language. [6]

8. Unified Communication

Unified communications (UC), also called unified messaging or UM, is the new buzzword in the IT industry, but what does it really mean? In some cases, it depends on whom you ask; vendors tend to put their own spin on the definition depending on what they are trying to sell you. But by most definitions, UC refers to the ability to integrate different types of communications (including voice mail, e-mail, faxes, instant messages, and video conferencing) into one common interface and/or repository.

Unified Communications is not necessarily a product; it is really a strategy: a strategy to dramatically reduce the effort needed to establish effective communications between people, whether they are colleagues, partners or customers. Effective communication is accomplished over the most appropriate medium to reach the right person, with the right device, the very first time.

That said, there are many ways to implement a unified communications solution in an organization. In its simplest form, it provides a way for users to access their faxes and voice mail messages via their e-mail clients. More sophisticated implementations provide advanced features such as the ability to hear e-mail messages read to you over the phone as well as the ability to dictate a reply and send it as an e-mail, instant message, fax, or audio message. [7]

You don't necessarily need VoIP to implement UC; you can use the regular phone system. But VoIP does make it easier: VoIP services already include mechanisms for forwarding voice mail to e-mail, Find Me Follow Me (FMFM) functionality, and other features used in a UC
system. In addition, you get more scalability and better integration with VoIP than with UC-type products that rely on traditional phone services.

Combining an asynchronous communication type such as e-mail with a real-time communication type such as telecommunications presents some challenges. However, it gives users far more flexibility and allows each of them to receive, process, and send messages in the way that works best for that individual.

VoIP and Unified Messaging Challenges

Enterprises are moving towards Voice over Internet Protocol (VoIP) and Unified Communications (UC) in corporate networks because of their many benefits, including:
• Substantial cost savings by using the Internet to bypass long distance tolls.
• Implementing advanced applications such as unified messaging and presence.
• Improving employee collaboration and productivity.

VoIP and Unified Communications applications are poised to become the dominant form of communications within enterprises, replacing traditional circuit-switched telephony technology. Enterprises must ensure their networks are secure, scalable and reliable; dial tone reliability is paramount.

VoIP and Unified Communications applications pose numerous challenges for enterprise networks. All VoIP handsets require IP addresses to send and receive data over the network; in most cases, deploying VoIP will double IP allocation requirements overnight. Traditional DHCP services offered on a switch or router do not address the issues of failover, centralized management, firmware management or management of custom vendor options. Furthermore, services offered on call managers and media servers are rudimentary at best, lacking adequate service level guarantees.

Adoption of VoIP brings with it numerous potential risks that must be addressed to maximize benefit and minimize risk exposure. Today, compromising a legacy phone system means someone has to physically cut the line to the PBX or gain access to the physical phone network. This is no longer the case with VoIP. Modern messaging protocols such as SIP remove call functions from the circuit switch (PBX) and place call management (setup, teardown) at the VoIP endpoint, greatly increasing an organization's exposure to risk. Like your data network, your phone system is now vulnerable to viruses, Trojans, hijacking, spoofing, and denial-of-service attacks. A defense-in-depth strategy is required, yet this strategy cannot come at the cost of serviceability and QoS. Legacy solutions are not a survivable solution. [8]

9. References

[1] http://en.wikipedia.org/wiki/Voice_over_IP.htm
[2] www.protocols.com/pbook/h323.htm
[3] www.protocols.com/pbook/terminal.htm
[4] http://en.wikipedia.org/wiki/Session_Initiation_Protocol.htm
[5] http://www.cisco.com/en/US/products/sw/voicesw/advantage.html
[6] http://en.wikipedia.org/wiki/Telepresence.htm
[7] McGowan_VoIP_June17th2010.pdf (pages 4-20)
[8] http://www.cisco.com/en/US/tech/tk652/tk701/technologies_white_paper09186a00800d6b75.shtml
