
International Packet Communications Consortium

Defining SIP in the Next-Generation Network from Theory to Deployment

2694 Bishop Drive, Suite 275 San Ramon, CA 94583 +1 925 275 6635 www.packetcomm.org

SIP Overview
As emerging next-generation networks face the challenge of delivering converged services, they need a robust yet simple and lightweight protocol for establishing and tearing down calls. The Session Initiation Protocol (SIP) is a flexible call control and signaling protocol that is driving the delivery of converged services in next-generation networks. It was developed through the Internet Engineering Task Force (IETF) for establishing real-time IP sessions across diverse network infrastructure. These sessions consist of end-to-end connections between two or more intelligent endpoints.
SIP brings three key capabilities to telecommunications. The most important is that it leverages the proven development and deployment model found in today's web-based applications. This encourages innovation, gives access to large pools of development talent and available tools, and simplifies the deployment of new services. The second key capability is scale: since SIP is inherently distributed, users bring their own inexpensive resources to the network, enabling a SIP network to scale efficiently. The third key capability is ubiquity. Since SIP is based on established protocols, it has access to a large body of open implementations, ensuring a vibrant, competitive set of offerings.
SIP is an application-level protocol for setting up, changing and terminating multimedia sessions between participants. It is primarily used on IP networks but it is both network- and application-agnostic. Although most deployments of SIP will be over IP, it can also be used on non-IP networks such as Asynchronous Transfer Mode (ATM). The IETF developed SIP to aid in the move toward convergence by allowing voice, video and data to be integrated over the same network.
SIP enables integration with existing Time Division Multiplexing (TDM) networks, and also allows integration with e-mail, the World Wide Web and next-generation technologies for wireless 2.5G and 3G networks as well as broadband IP networks. It is an important element of the International Packet Communications Consortium's (IPCC's) Reference Architecture, which defines the many functional elements that constitute a softswitch and promotes interoperability of IP services. SIP supports the IPCC's Call Agent Function (CAF) for handling call control and state maintenance. IPCC reference implementations are used by wireline, wireless and broadband service providers, and they form the basis for many commercial implementations.
Sessions range from multimedia conferencing to content services such as Internet telephone calls and multimedia distribution. SIP is a request-response protocol that closely resembles the HyperText Transfer Protocol (HTTP), which forms the basis of the World Wide Web. SIP provides a simple method of allowing users to establish sessions over a network. It is more akin to HTTP than to telephony protocols such as Signaling System 7 (SS7) because of its simplicity, openness, ease of deployment and compatibility with existing IP protocols. Like HTTP, SIP is a text-based protocol, which makes it easy for developers to write applications.
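To make the HTTP resemblance concrete, the following minimal Python sketch serializes a SIP request as plain text and splits a response status line. The helper names (build_request, parse_status_line), the example.com and example.org addresses, and the header values are illustrative assumptions, not part of any SIP library or of the IPCC Reference Architecture.

```python
CRLF = "\r\n"


def build_request(method, request_uri, headers, body=""):
    """Serialize a SIP request: start line, headers, blank line, body."""
    lines = [f"{method} {request_uri} SIP/2.0"]
    lines += [f"{name}: {value}" for name, value in headers]
    # Content-Length counts the bytes of the body, as in HTTP.
    lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    return CRLF.join(lines) + CRLF + CRLF + body


def parse_status_line(line):
    """Split a status line such as 'SIP/2.0 200 OK' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason


# Hypothetical addresses and identifiers, for illustration only.
request = build_request(
    "OPTIONS",
    "sip:bob@example.com",
    [
        ("Via", "SIP/2.0/UDP client.example.org;branch=z9hG4bK776asdhds"),
        ("Max-Forwards", "70"),
        ("To", "<sip:bob@example.com>"),
        ("From", "<sip:alice@example.org>;tag=1928301774"),
        ("Call-ID", "a84b4c76e66710@client.example.org"),
        ("CSeq", "1 OPTIONS"),
    ],
)

if __name__ == "__main__":
    print(request)
    print(parse_status_line("SIP/2.0 200 OK"))
```

As with HTTP, the whole exchange can be read in a text editor or a packet trace, which is a large part of what makes SIP approachable for web developers.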

Copyright 2003 The International Packet Communications Consortium

January 24, 2003

Applications and Services with SIP


Using SIP, telephony or multimedia services are just like other web applications that can be easily integrated into existing and new voice communications. SIP is used to establish a session and it provides a mechanism for exchanging information. It is used to let devices access each other. SIP allows the delivery of Voice over Packet (VoP) and enables exciting new multimedia services.
SIP uses smart endpoints, where the intelligence resides at the edge rather than in the core of the network. The SIP model puts most of the intelligence for call setup and features on the SIP-enabled device, such as a Media Gateway Controller (MGC); a PC; or a SIP mobile, broadband or landline phone. SIP offers a more flexible approach than the traditional telephony model, where the call processing and control intelligence resides on a centralized phone switch or server, and SIP devices can therefore include many more features. For example, any end user can now for the first time control the handling of inbound calls: who may call, when they may call and where they may call. SIP truly enables personal service. SIP addresses belong to the users, not to their phones as in the traditional PSTN model.
With SIP, it is easy to establish conferences and converge services like telephony, data and the web. SIP enables the delivery of services in an Internet model based on simple establishment of peer-to-peer connections through the use of intelligent endpoints. It is becoming widely deployed on the desktop. For example, Microsoft has endorsed SIP by including it within the Windows XP operating system.
There are many benefits to be realized by both service providers and enterprise networks as they deploy services based on SIP.
These benefits include:
Open Standard: Like Linux, SIP evolved from the university community and its development and growth reflect a communal, collegial approach to scholarship in an environment where anyone can contribute to the advancement of the standard. SIP has therefore evolved quickly while ensuring interoperability and easy adoption. Developers do not have to pay to participate and can download new specifications and develop new features quickly and easily. This open systems standard is being rapidly adopted as a universal protocol within next-generation networks and it fully enables interoperability with other mainstream IP protocols. This text-based protocol is easy to read and understand, and since it has evolved from the globally accepted HTTP and Simple Mail Transfer Protocol (SMTP) standards, a large pool of innovative web developers is available worldwide to develop new SIP-based applications.
Flexibility: SIP applications can be easily developed to interoperate with the existing communications infrastructure to ensure maximum flexibility when deploying new applications and services. In the true tradition of applications development, software code can be re-used so that developers can easily migrate and extend applications and add new services using SIP. SIP is a network-independent and media-independent protocol and enables maximum deployment flexibility. In many ways, it operates similarly to HTTP. Just as an HTTP session is unaware of the underlying HTML being transported, SIP is a protocol for establishing any kind of interactive communication session as easily as setting up a simple phone call. SIP can allow developers to inject rich media services, and SIP itself is independent of any of these multimedia offerings.
Reliable and Secure: SIP interoperates with proven IP protocols as well as TDM infrastructure to enable the delivery of carrier-class services. Under the IPCC Reference Model, SIP interoperates with SS7 and H.323 to enable signaling interaction between different networks. SIP can be easily integrated with existing security protocols to enable secure end-to-end sessions. It borrows from established methods for security procedures, including Secure Sockets Layer (SSL) for hop-by-hop encryption and PGP or S/MIME for end-to-end encryption and authentication. It was built from the ground up as a lightweight protocol to facilitate minimal call setup delay, and it can interoperate with existing and proven IP protocols for enabling QoS control.
Extensible: The SIP baseline standard is clearly defined to ensure interoperability, and SIP supports the creation of extensions to allow creative new services to be developed. SIP defines a formal mechanism for negotiating support of enhanced features, and this mechanism allows endpoints to specify both the extensions they require and those that are desired but optional.
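The extension negotiation described above can be sketched in a few lines of Python. In RFC 3261, as we understand it, a request lists mandatory extensions as option tags in a Require header and optional ones in a Supported header, and a server that does not understand a required tag answers 420 (Bad Extension) with an Unsupported header. The function names, the dictionary shape of the result, and the tag "x-hypothetical" are assumptions made for illustration.

```python
def parse_option_tags(header_value):
    """Split a comma-separated header value such as '100rel, timer' into tags."""
    return [tag.strip() for tag in header_value.split(",") if tag.strip()]


def check_required_extensions(require_value, local_tags):
    """Return a 420 rejection if any required tag is unknown, else None."""
    required = parse_option_tags(require_value)
    unsupported = [tag for tag in required if tag not in local_tags]
    if unsupported:
        return {
            "status": 420,
            "reason": "Bad Extension",
            "Unsupported": ", ".join(unsupported),
        }
    return None  # every required extension is understood


def usable_optional_extensions(supported_value, local_tags):
    """Optional extensions the peer offers that this endpoint can also use."""
    return [t for t in parse_option_tags(supported_value) if t in local_tags]


# Option tags this (hypothetical) endpoint implements.
LOCAL_TAGS = {"100rel", "timer"}

if __name__ == "__main__":
    print(check_required_extensions("100rel, x-hypothetical", LOCAL_TAGS))
    print(usable_optional_extensions("timer, replaces", LOCAL_TAGS))
```

The point of the design is that an unknown required extension fails loudly and early, while unknown optional extensions are simply ignored, so baseline interoperability is preserved.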

Defining SIP

SIP is a lightweight text-based signaling protocol used for establishing sessions, usually in an IP network. It re-uses many of the constructs and concepts of Internet protocols such as HTTP and SMTP. Based on principles learned from the Internet community, SIP is an application-independent protocol designed at the outset to be flexible and extensible. The sessions are described using a separate protocol named the Session Description Protocol (SDP). SDP is transported in the message body of a SIP message.
SIP works with existing Internet protocols by enabling endpoints to discover one another and agree on a session they would like to share. It enables the creation of an infrastructure of network hosts through which end users can send invitations to sessions. This application-layer control protocol can establish, modify and terminate multimedia sessions such as Internet telephony calls or instant messaging sessions. SIP can also invite participants to already-existing sessions, such as multicast conferences. Media can be added to and removed from an existing session, and SIP transparently supports name mapping and redirection services to enable personal mobility. SIP provides the call control and signaling that is essential for establishing multimedia sessions and it enables the delivery of converged services over IP networks.
SIP supports five facets of establishing and terminating multimedia communications:
User location: Determination of the end system to be used for communication.
User availability: Determination of the willingness of the called party to engage in communications.
User capabilities: Determination of the media and media parameters to be used.
Session setup: Establishment of session parameters for both the calling and called parties.
Session management: Transfer and termination of sessions, modifying session parameters and invoking services.
It is important to note that SIP is not a vertically integrated communications system but rather a component that can be used with other IETF protocols to build complete multimedia architectures. SIP is used in conjunction with other protocols, but the basic functionality and operation of SIP is not dependent on any of these protocols. Put simply, SIP does not provide services but allows the users to establish sessions and parameters for injecting media into services.
There are four main entities defined in the SIP specification, and they can be implemented as standalone components or combined into a consolidated platform according to the application requirements.
User agents are SIP endpoints that initiate and respond to requests and communicate with other user agents to establish and release sessions. User agents can communicate with each other directly; however, there are often one or more intermediate servers involved as either proxy or redirect servers.
Proxy servers can be stateful or stateless, and they forward messaging on to the user agents and enable functions such as location services, authorization and accounting. These are the roles of the Routing Function and the Accounting Function in the IPCC Reference Architecture.
Redirect servers are always stateless, and they simply respond to requests with locations where the originating user can contact the desired party directly.
Registration servers allow agents to identify their location, enabling a rich set of mobility features to be implemented using SIP.
SIP allows for services to be delivered in an Internet model while users are connected to an IP infrastructure. It allows both carriers and enterprise networks to quickly introduce new services and applications and easily establish multimedia sessions.
SIP allows simplified multimedia conferencing and allows the integration of diverse services, such as telephony and web applications for click-to-talk customer support services. SIP establishes the sessions, negotiates the media requirements, manages location and enables enhanced services such as call forwarding, call transfer, identity delivery, privacy and 3G wireless or desktop-based instant messaging.
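The interplay between a registration server and a redirect server can be illustrated with a small in-memory sketch. The class and function names (LocationService, redirect_response) and the addresses are hypothetical; a real registrar also handles expiry, authentication and much more. The 302 (Moved Temporarily) and 404 (Not Found) status codes are standard SIP response codes.

```python
class LocationService:
    """Toy location database: maps a user's public address to current contacts."""

    def __init__(self):
        self._bindings = {}

    def register(self, address_of_record, contact):
        """Record where a user can currently be reached (what a REGISTER conveys)."""
        contacts = self._bindings.setdefault(address_of_record, [])
        if contact not in contacts:
            contacts.append(contact)

    def lookup(self, address_of_record):
        """Return a copy of the known contacts, or an empty list."""
        return list(self._bindings.get(address_of_record, []))


def redirect_response(location, request_uri):
    """Answer as a redirect server would: tell the caller where to go, then step aside."""
    contacts = location.lookup(request_uri)
    if not contacts:
        return 404, "Not Found", []
    return 302, "Moved Temporarily", contacts


if __name__ == "__main__":
    loc = LocationService()
    # Hypothetical user registering from a desk phone and a mobile device.
    loc.register("sip:bob@example.com", "sip:bob@192.0.2.10:5060")
    loc.register("sip:bob@example.com", "sip:bob@mobile.example.net")
    print(redirect_response(loc, "sip:bob@example.com"))
```

Because the address belongs to the user rather than to a device, the same public address can resolve to different contacts over time, which is the basis of the mobility features mentioned above. A proxy server would use the same lookup but forward the request itself instead of returning the contacts.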

SIP Evolution

In 1999, the IETF defined SIP in RFC 2543. Since that time, RFC 2543 has undergone numerous changes and advancements, and work continues to further define and advance SIP. RFC 3261 rolls up those changes and is now the current SIP standard. To the IETF's credit, RFC 3261 is backwards compatible with RFC 2543. Therefore older implementations will still work with newer ones.
Each SIP message has two parts, a set of SIP headers and a body. The SIP body enables a great deal of application flexibility. Originally used just for carrying the SDP session parameters such as codec and endpoint IP addresses, the SIP body can carry multiple body parts. Examples of innovative uses of the body part are SIP-T for PSTN-SIP-PSTN interworking and the Media Server Control Markup Language (MSCML) for conference control.
The IETF also has a working group on SIP for Instant Messaging and Presence Leveraging Extensions (SIMPLE). SIMPLE defines how SIP can be used to build instant messaging and presence. Both MSN and America Online have announced they will embrace SIMPLE as the standard for their instant messaging products. This will provide the technology to allow the two largest instant messaging communities to interoperate. All of the extensions built on the SIP protocol can advertise their capabilities to other users in the network so that they can be used to enable enhanced services.
SIP has also been adopted by major standardization bodies that also provide guidelines and enhancements to support adaptation to various environments. For example, the Third Generation Partnership Project (3GPP) considered the radio aspect, the compression, the security and the specificities of the existing mobile architecture and services. This work is still ongoing and is done in close collaboration with the IETF.
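The two-part structure of a SIP message, headers followed by a body, can be shown with a deliberately simplified parser. This is a sketch under stated assumptions: the function name parse_message is our own, the sample message is invented, and the code ignores details a real stack must handle, such as folded header lines, compact header names and multipart bodies.

```python
def parse_message(raw):
    """Split a SIP message into (start_line, headers, body).

    Headers are returned as a dict mapping lower-cased names to lists of
    values, because a header such as Via may legitimately appear more than once.
    """
    # A blank line (CRLF CRLF) separates the headers from the body.
    head, _, body = raw.partition("\r\n\r\n")
    start_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers.setdefault(name.strip().lower(), []).append(value.strip())
    return start_line, headers, body


# Invented example: an INVITE whose body is a small SDP session description.
sdp_body = "v=0\r\nm=audio 49170 RTP/AVP 0\r\n"
raw_invite = (
    "INVITE sip:bob@example.com SIP/2.0\r\n"
    "Content-Type: application/sdp\r\n"
    f"Content-Length: {len(sdp_body)}\r\n"
    "\r\n" + sdp_body
)

if __name__ == "__main__":
    start, hdrs, body = parse_message(raw_invite)
    print(start)
    print(hdrs["content-type"])
    print(repr(body))
```

The Content-Type header is what lets the same envelope carry SDP, SIP-T payloads or other body types without any change to SIP itself.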

The Protocol Difference


SIP offers a simpler, more scalable, flexible and open implementation than the older H.323 protocol. H.323 is an end-to-end smart endpoint protocol derived from the ISDN standards. The approach of creating applications using H.323 is to extend the protocol. Part of the reason for this is that much of the network intelligence is in the H.323 Gatekeeper. The Gatekeeper is an architectural bottleneck that limits the scale of H.323 networks, as well as the speed with which one can introduce new applications.
Conversely, SIP is an end-to-end smart endpoint protocol derived from a data session management base. It is a lightweight protocol, although as it has evolved into a telephony control protocol it has gained weight and become quite robust. SIP is an alternative to H.323 as an end-to-end protocol. Some argue that it is easier to deploy and manage SIP endpoints, and that SIP provides a far more scalable implementation than H.323. SIP also supports a variety of endpoints including wireline, wireless and soft phones, Personal Digital Assistants (PDAs) and instant messaging applications. Because it is based on Internet protocols, SIP is extensible and relatively simple to understand and implement. Its text-based messages can be understood simply by reading them, which makes it readily adopted by the worldwide community of Internet application developers.
Like H.323, SIP is a peer-to-peer protocol. Both SIP and H.323 contrast with master/slave protocols such as MGCP and H.248, which are nearly identical in purpose and very similar in syntax to each other. Both MGCP and H.248 are used by call-aware control entities to control unaware devices in a master/slave relationship. Often, people ask about the difference between MGCP, H.248 and SIP. This question, however, misses the fundamental difference between the device control protocols (MGCP and H.248) and the session protocols (SIP and H.323). Device control protocols are needed to control devices at a low level.
For example, the Media Gateway (MG) function requires the passing of many low-level events such as hook-flash, digit detection, A/B bit status, etc. to the MGC function. This decomposition is appropriate because the MG function scales at a different rate than that of the MGC function. Since an MGC can control many MGs, it is possible to distribute the MGs geographically. For relatively small networks, one can use the MGC to route traffic between the different MGs. Conceptually, this takes a mainframe switch and distributes the line, trunk and switch functions.

[Figure 1: Telco-Centric Mainframe Model. An MGC in the network core controls MGs in City 1, City 2 and City 3 over a device protocol. Characteristics noted in the figure: hard to scale (the MGC is a chokepoint); the MGC must be aware of current and future MGs; expensive; secure and stable.]

The call model for MGCP and H.248 maps more closely to the traditional PSTN call model. The call agent must supply all instructions to the dumb end devices, telling them to wait for signals, collect digits, play tones, open ports and release connections. Figure 1 shows a typical configuration. Note in this example that the MGC becomes a congestion point, both in terms of performance and flexibility. All signaling and device control traffic for the entire network transits the MGC. Moreover, the MGC must be aware of any possible type of end device. The introduction of a new device requires an upgrade of the MGC. More importantly, mixing device control with call signaling means that the introduction of new services also requires an upgrade of the MGC.
Session protocols such as SIP are not suited to the direct control of devices; they are better suited to communication between intelligent endpoints. However, SIP addresses the scale and flexibility needs of the telecommunications network. As shown in Figure 2, any class of endpoint can use the same session protocol, while each device uses the appropriate device control protocol.


[Figure 2: SIP Web-Centric IP Architecture. Elements shown across City 1 through City 4 include the PSTN, 2G/2.5G and 3G wireless, cable (CM, IAD, CMSS) and DSL access, MGs, MGCs, a SIP proxy, a SIP phone, and a services core of application servers (AS) and a media server (MS), interconnected over the network by SIP. Characteristics noted in the figure: distributed, flexible, open, inexpensive.]

For example, City 1 with its wireless network may use H.248 between the MG and the MGC, and City 2 may use MGCP between the MG and MGC. City 3 might also use MGCP, but it is quite likely that the cable modem in City 3 uses a different variant of MGCP than that used by City 2. This problem also occurs when an MG is added that an existing MGC does not know how to control. Access methods that do not rely on device control and have no need for an MGC, such as a SIP phone, PC or 3G wireless device, can participate directly on the telecommunications network. Services can become simple SIP endpoints. No modifications or upgrades are necessary to MGCs when a provider introduces new or enhanced services. Because the SIP architecture is based on the web architecture, it is inherently distributed and scales massively and economically compared to the mainframe model.

SIP in the Next-Generation Network

SIP is an application-oriented protocol that operates within the IPCC Reference Architecture to allow network operators and enterprise networks to deliver end-to-end services over next-generation networks. It is used to invite participants to a session, while the SDP-encoded body of the SIP message contains information about what media types the parties can use.
Within the IPCC Reference Architecture, the Applications Server Function (AS-F) provides the service logic and execution for one or more applications or services. Application servers are deployed throughout the network to enable enhanced and highly tailored SIP services. They can support powerful integrated applications or can be highly specialized, such as a voice mail server or a server to support prepaid calling applications.
The Media Gateway Controller Function (MGC-F) provides the call state for the endpoints, and its primary role is to provide the call logic and call control signaling for one or more MGs. Often the AS-F and the MGC-F work together to provide enhanced call control services, such as network announcements, three-way calling or call waiting. Vendors often use an Application Programming Interface (API) to link the AS-F and MGC-F in a single system. In this scenario, the platform that supports both of these functions is referred to as a Feature Server.
The Media Server Function (MS-F) provides media manipulation and treatment of a packetized stream on behalf of any application. Its primary role is to operate as a server that handles requests from the AS-F or MGC-F for performing media processing on packetized media streams. An application server can control the media server directly, or pass control to an MGC. Media servers are also used to inject rich multimedia content into sessions.
Once information is exchanged about the media types and parties, all participants are aware of each other's IP addresses, available bandwidth and media capabilities. Data transmission then begins using the appropriate transport protocol. Throughout the session, participants can make updates by sending additional SIP messages. These updates can include adding new parties or inserting other multimedia capabilities.
Destinations in SIP are represented with Uniform Resource Identifiers (URIs), which have a format similar to that of e-mail addresses. This implies the use of the Domain Name System (DNS) to map hosts and domain names to IP addresses. Integration with DNS services is a crucial element of SIP. It facilitates interoperability with telephone systems and addressing mechanisms so that SIP servers and clients can send, receive and route telephone numbers.
The user agent resides on a participant's access device, which for example could be a SIP phone, a PC, a mobile telephone or a PDA. The user agent contains both server and client elements.
The redirect server and SIP proxy server perform routing and search functions, and both elements are typically integrated into application servers. The registration server receives and stores participant locations, and the media server provides media processing services such as playing announcements, detecting tones, recording media, real-time transcoding and multimedia conferencing. The media server works closely with the application server to enable the delivery of IP-based services, including enhanced services such as call screening or voice mail.
SIP can enable a range of robust communications features, including find me/follow me services, instant messaging, teleconferencing and video conferencing, as well as Centrex-type services such as caller ID, call waiting and call holding. It can also support multiplayer gaming, real-time group authoring and remote whiteboarding, as well as traditional services such as VoP and Virtual Private Networks (VPNs). With SIP it is easy to combine conversational multimedia services with other categories of services, such as directory information, web browsing, positioning and presence.
Because it is an application-layer protocol, it is access independent. Participants can be reached on any form of IP network. For example, SIP can offer seamless service capabilities for both fixed and mobile networks, which is a key requirement for making the promise of wireline/wireless convergence a reality.
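The e-mail-like addressing used by SIP can be illustrated with a small URI parser. This is a simplified sketch rather than a complete implementation of the URI grammar: the function name parse_sip_uri and the sample addresses are our own, and the code does not handle IPv6 literals, passwords or escaped characters. The host part it extracts is what would then be resolved through DNS.

```python
def parse_sip_uri(uri):
    """Break a sip: or sips: URI into scheme, user, host, port and parameters."""
    scheme, sep, rest = uri.partition(":")
    if not sep or scheme.lower() not in ("sip", "sips"):
        raise ValueError(f"not a SIP URI: {uri!r}")
    rest, _, _uri_headers = rest.partition("?")  # URI headers are ignored here
    address, *raw_params = rest.split(";")
    user, _, hostport = address.rpartition("@")  # user is '' when there is no '@'
    host, _, port = hostport.partition(":")
    params = {}
    for item in raw_params:
        key, _, value = item.partition("=")
        params[key] = value
    return {
        "scheme": scheme.lower(),
        "user": user or None,
        "host": host,
        "port": int(port) if port else None,
        "params": params,
    }


if __name__ == "__main__":
    # Hypothetical addresses: a named user, and a telephone number routed via a gateway.
    print(parse_sip_uri("sip:alice@example.com:5060;transport=tcp"))
    print(parse_sip_uri("sip:+19255550100@gateway.example.com;user=phone"))
```

The second example shows how a telephone number can ride in the user part of a SIP URI, which is one way SIP servers and clients can route calls toward the telephone network.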


SIP Applications

Because SIP has evolved from Internet protocols such as HTTP and SMTP, it is relatively simple for existing Internet developers to swiftly create new applications. Software developers can easily write native web applications for a variety of devices ranging from SIP phones to wireless phones, PDAs and personal computers. SIP makes it easy to add new services and allows developers to leverage their expertise in developing IP applications to quickly develop multimedia services. Because SIP uses so many of the web's components, it is easy for developers to work with SIP to generate new applications.
When developers seamlessly integrate a SIP standalone phone or a SIP client with a PC, they enable flexible integrated services, including instant messaging, caller and called party preferences and third-party call control. SIP is now widely deployed in the Windows Messenger real-time conferencing application that is part of the Windows XP operating system. Microsoft is creating a potential SIP client on every desktop, thus enabling developers to widely deploy SIP-based services across the network.
The range of applications can be mind-boggling, particularly when one considers that there may be a new generation of devices equipped with IP addresses and connected over the network. If a carrier or enterprise network has SIP as a core backbone protocol, it will become inherently easier to develop new applications. Developers will be able to port and re-use code, and organizations will gain increased independence. SIP supports a broad range of application types and will be the common protocol for communication between equipment and the network. The range of applications supported by SIP includes everything from voice mail, unified messaging, instant messaging, call center, customer relationship management and multimedia conferencing to voice/web integration.
SIP is opening up a new era of multimedia services and allowing enterprise networks and carriers to create converged services and applications with rich multimedia integration. A very important part of SIP is its ability to negotiate capabilities between end devices, such as codecs and displays. It is also key that SIP even offers the ability to execute scripts. With some basic add-on capabilities, SIP may be used by both application and provider to exchange session-based information such as subscriber and user capabilities, charging information or security and integrity data.
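Capability negotiation between end devices can be sketched as follows. The example reads the codec names advertised in an SDP body through its a=rtpmap attributes and keeps those the answering device also supports, in the offerer's order of preference. This is a much simplified view of the SDP offer/answer exchange: the function names and the sample offer are assumptions for illustration, and real negotiation also covers payload types, clock rates, ports and media direction.

```python
def codecs_from_sdp(sdp):
    """Collect codec names from 'a=rtpmap:<payload> <name>/<rate>' lines, in order."""
    prefix = "a=rtpmap:"
    names = []
    for line in sdp.splitlines():
        if line.startswith(prefix):
            _payload, encoding = line[len(prefix):].split(" ", 1)
            names.append(encoding.split("/")[0])
    return names


def negotiate_codecs(offered, supported):
    """Keep the offered codecs the answerer supports, preserving the offer's order."""
    supported_lower = {name.lower() for name in supported}
    return [name for name in offered if name.lower() in supported_lower]


# Invented audio offer listing three codecs in order of preference.
offer_sdp = (
    "v=0\r\n"
    "m=audio 49170 RTP/AVP 0 8 18\r\n"
    "a=rtpmap:0 PCMU/8000\r\n"
    "a=rtpmap:8 PCMA/8000\r\n"
    "a=rtpmap:18 G729/8000\r\n"
)

if __name__ == "__main__":
    offered = codecs_from_sdp(offer_sdp)
    print(offered)
    print(negotiate_codecs(offered, ["G729", "PCMU"]))
```

An empty result would mean the two devices share no codec, in which case the session cannot carry that media stream without help such as a transcoding media server.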

SIP Benefits

SIP is an enabling standards-based protocol that is based on the web model to enable rapid development of innovative new services. New multimedia applications and services can be developed in Internet time with dramatically enhanced speed and flexibility. SIP can be used to easily integrate IP applications and to accelerate application development and deployment. SIP is an open protocol that allows developers to swiftly build enhanced services that leverage multimedia capabilities. Both carriers and enterprise networks can deploy SIP solutions within the IPCC Reference Architecture to ensure standards-based interoperability. Applications can be quickly developed using a large talent pool of existing web developers, and new and compelling applications can be quickly deployed that leverage rich multimedia capabilities.


Anyone can download the SIP specifications and begin developing applications. SIP protocol stacks and services are available that can be easily adapted to further accelerate development. The IPCC offers a number of documents on SIP, the IPCC Reference Architecture and how to use SIP for applications. Just as HTTP and HTML revolutionized the Internet, SIP is changing the face of telecommunications. Unlike TDM networks that require specialized hardware, SIP networks allow organizations to easily add intelligent endpoints with minimal infrastructure investments to enable rich multimedia services.
As next-generation networks evolve, SIP has already gone from theory to deployment. Major carriers such as Level 3 and WorldCom are already deploying SIP, and SIP capabilities are shipping with each new Microsoft Windows XP computer. Building IP services based on SIP is easier and less capital intensive, and it allows developers to leverage existing standards, re-use application code and accelerate the deployment of enhanced services over the next-generation IP network.

About the International Packet Communications Consortium


The International Packet Communications Consortium (IPCC) is the premier forum for the worldwide advancement of next-generation networks through products, services, applications and solutions that use the packet-based voice, data and video communications technologies available today over any transport medium, including but not limited to copper, broadband and fiber optics. The IPCC establishes a common terminology for the softswitch-based architecture, and it promotes interoperability, conducts research, and liaises with governmental and industry organizations to address industry issues that service providers and vendors face. By providing a variety of educational seminars and by fostering the open network and standard interfaces, the Consortium accelerates the advancement and usage of softswitch-based networks. The IPCC membership includes wireline and wireless service providers and carriers, governmental agencies, standards bodies, and equipment and software vendors representing all network elements involved in the softswitch-based and next-generation network.
