Unified Communications: transitioning toward an interoperable multimedia world

The inner workings and possibilities of XMPP and its multimedia enabling extension Jingle

Thursday, April 21, 2011
Graduation Thesis

Roel van de Wiel

Contents
Terms of reference
    Reason for writing
    Capgemini
    Research objectives
    Target audience
    Build-up
    Acknowledgments
    Contact
    Abbreviations
Summary
Introduction
    Hypothesis
        Explanation
1. The Why and What of UC
    1.1 A changing society
    1.2 What is Unified Communications
        1.2.1 History of digital communications systems
        1.2.2 UC functionality
2. UC market analysis
    2.1 What analysts say about UC
        2.1.1 The Market
        2.1.2 What Gartner says about XMPP
    2.2 Potential of UC in commercial appliances
        2.2.1 Banks
    2.3 UC vendors
        2.3.1 Leading UC vendors
    2.4 Internet services
        2.4.1 Other noticeable use of XMPP
    2.5 Institutes
    2.6 Reflection
3. Technology explained
    3.1 The way SIP works
        3.1.1 Call setup
        3.1.2 SIP software architecture
        3.1.3 SIP network architecture
        3.1.4 SIP message format
        3.1.5 SDP
        3.1.6 Call setup
        3.1.7 Why does SIP not provide the solution?
        3.1.8 Problems with SIP
    3.2 XMPP
        3.2.1 Brief History of XMPP
        3.2.2 Network architecture
        3.2.3 Extensibility
        3.2.4 Roster
        3.2.5 Stream
        3.2.6 Security
        3.2.7 XMPP Jingle
        3.2.8 Advantages of XMPP
    3.3 Technology to guarantee the smooth delivery of real-time data
        3.3.1 Overprovisioning
        3.3.2 QOS
        3.3.1 Codecs
        3.3.2 Call admission control
4. Examples of possible XMPP use in practice
5. Final recommendations
List of Figures
Bibliography

Terms of reference
Reason for writing
In November 2010 I accepted a traineeship at the financial division of Capgemini, under the supervision of Arnoud Vons and with Ron Mandjes as my personal mentor. The traineeship is the final part of my Bachelor's degree in Network Infrastructure Design at the University of Applied Sciences in Heerlen. One of the requirements of the degree was to write a thesis. The subject set by Capgemini was Unified Communications (UC) and its potential application in the banking sector. Corporations are increasingly implementing UC systems as part of the New Way of Working, of which UC is an important cornerstone. The New Way of Working is a popular new philosophy on how work is organized; it has gained many followers and has become a major accelerator of UC adoption. Numerous reports list and substantiate the benefits of implementing the New Way of Working, which is discussed in more detail in the first chapter.

Capgemini
Capgemini was founded in 1967 by Serge Kampf in Grenoble under the name Sogeti (Société pour la Gestion de l'Entreprise et le Traitement de l'Information). Its present name is the result of mergers with CAP in 1974 and Gemini in 1975. Capgemini is one of the world's leaders in information technology, with a workforce of over 100,000 people in 39 countries. Capgemini has four divisions: Consultancy, Outsourcing, Technology Services and Financial Services. Consultancy gives business advice to companies that are facing important decisions. Outsourcing takes over internal services of companies that do not belong to their core business. Technology Services focuses on delivering and supporting the physical side of IT. Financial Services provides all kinds of services for the finance industry. The department where I was stationed, TDI, was a subsection of B60, which is a section of Financial Services (FS). During my internship the Capgemini structure was reorganized and TDI became part of Technology Services (TS).

Research objectives
My research objective was to investigate the possibilities for Unified Communications within the New Way of Working philosophy for the banking industry. The assignment was fairly open-ended, so I added the future of Unified Communications with regard to interoperability to the research plan, focusing primarily on interdomain interoperability. This led me to look further into the potential of XMPP as a standard for UC, including real-time media such as voice and video. During an interview with Daniel Hillster, I got the impression that the facilitation of real-time streams is often neglected in corporate network architecture. I therefore added it to my list of research objectives, in order to give an overview of techniques that can be used to safeguard real-time streams.

Target audience
This thesis is written for Capgemini consultants who are in some way involved in UC technology in general, and in the opportunities for using UC in the banking sector in particular, and for fellow students who are interested in UC. Specific technical knowledge of IT is assumed.

Build-up
After reading the available material and looking at the fundamentals of most UC systems, I came to the conclusion that no universal standard is in use today to connect different UC systems and thereby make them interoperable. An older protocol that is attracting new interest, XMPP, looked promising as the new standard for IM owing to its simplicity and well-thought-through design. In chapter two, I research its current and potential use and the companies that use it for their Internet services. Chapter two also takes a closer look at the potential use of UC in the banking sector for communicating directly with customers. Chapter three explains the technical workings of XMPP and SIP and how they compare with each other, to give the reader a better understanding of both. In an interview with Daniel Hillster I got the impression that handling RTP traffic is still problematic, so I have included ways of handling RTP traffic and of ensuring that the quality of the media stream is up to par. In chapter four, a theoretical case is built to show what an XMPP infrastructure could look like, with examples of use cases and references to extensions, so that the reader can get an impression of how XMPP could be applied in a corporate environment to suit business needs. The thesis ends with the conclusion in chapter five, where my findings are reported.

Acknowledgments
While writing this thesis I received help and support from people whom I would like to thank for their contribution. I interviewed two people, which might not seem many, but they gave me a great deal of material to work with. First I interviewed Daniel Hillster, an employee of Didacticum who was involved in the introduction of the New Way of Working at SNS Reaal. He gave me an insight into what is involved in implementing UC systems, how they are used, and the needs and opportunities when it comes to improving current UC systems. My second interview was with Thiago Camergo. He is an experienced SIP / XMPP Jingle engineer currently working at Nimbuzz. He is also developing a NAT-traversal extension for XMPP Jingle and is a strong advocate of XMPP Jingle, as can be seen from his blog, XMPPjingle. He helped me to understand the concept of Jingle and gave me an insight into current developments in the UC field.

Contact
Author
Name: R. (Roel) van de Wiel
Function: Intern at Capgemini, student at HsZuyd
Address: Theems 70, 5152 SN Drunen
Tel: 06-28079738
Corporate email: Roel.vande.Wiel@capgemini.com
Private email: wielrvd@gmail.com

Graduation Committee

Mentor
Name: R. (Ron) Mandjes
Function: Managing consultant
Email: ron.mandjes@capgemini.com
Phone: 00 31 (0)3 68 99 115

Supervisor (due to illness, not in function)
Name: A. (Arnoud) Vons
Function: Principal consultant
Email: arnoud.vons@capgemini.com
Phone: 00 31 (0) 6 150 303 43

HsZuyd supervisor
Name: J.C.C. (Jean-Paul) Brands
Email: j.brands@hszuyd.nl
Phone: 00 31 (0) 45 400 6765

Abbreviations
FEC: Forward Error Correction
IM: Instant Messaging
H.264 SVC: Scalable Video Coding
MCU: Multipoint Control Unit
NAT: Network Address Translation
QOS: Quality of Service
RTP: Real-time Transport Protocol
SASL: Simple Authentication and Security Layer
SIP: Session Initiation Protocol
SIMPLE: SIP for Instant Messaging and Presence Leveraging Extensions
TLS: Transport Layer Security
UC: Unified Communications
URI: Uniform Resource Identifier
XMPP: Extensible Messaging and Presence Protocol

Summary
It is a well-known economic pattern that new industries are often subject to forces that fragment them and, as a result, make their components non-interoperable. UC is no exception. There are still no common standards in place that facilitate the exchange of real-time media streams across company boundaries. SIP is regarded as the standard for setting up real-time communication, but it has splintered into different dialects. The reasons are explained in more detail in this thesis; the main one could be seen as the looseness of the SIP standard in combination with the business motives of the UC industry leaders. For an industry leader, investing in interoperability costs money, opens up the playing field for challengers and makes customers less dependent on their UC supplier. But as in any maturing industry, there comes a point where the mystery disappears, the novelty wears off and the products become commoditized. When that point is reached, vendors had better be on the interoperability bandwagon, or their business will suffer because their products cannot interwork.

It is hard to predict the future of UC with regard to interoperability, but there is hope. Microsoft and Cisco, two important UC leaders, have committed themselves to interoperability. Microsoft heads the Unified Communication Interoperability Forum in the person of Bernard Aboa. Cisco's current approach is to be adaptive: it supports any dialect of SIP with its IME solution, while adopting XMPP for native communication and increasing the level of supported XMPP functionality. This approach is also very common among consumer-oriented UC services on the Internet. Challengers like Nimbuzz and Fring make their solutions interconnectable with multiple service providers, e.g. Skype, Gmail and SIP providers, and use XMPP as the foundation for their own services. One company worth mentioning is Google, which uses XMPP for several services, the most notable being Gtalk.

Interoperability is very likely travelling down the same path as network protocols did at the end of the 80s and the beginning of the 90s. First, there were separate network devices for each network protocol, e.g. Token Ring, IPX and IP. After that, network devices became available that could handle several network protocols. In the end, IP became the popular choice for network traffic and all other network protocols became unsupported and obsolete. It is likely that future communication applications will offer XMPP Jingle alongside SIP, or other protocols for that matter. After a while, protocols that are not actively used will be dropped from support.

XMPP is already seen by Gartner as the IM protocol of the future and the most popular protocol on the web. Once a service provider uses XMPP, it is only a small step to allow and actively support Jingle on its network. In the current situation, however, Jingle has not proven itself in the same way as SIP. The Jingle standard is mature enough and ready for implementation, but there is little experience with implementing it in enterprise environments, and as a result little support can be expected from UC vendors. SIP was meant to replace the PSTN and has several RFCs defined that make it backward compatible with the PSTN. Vendors have a long history of making SIP backward compatible with the PSTN. There are some XMPP extensions that could provide Jingle compatibility with the PSTN, but they are still experimental and have not been widely implemented. Jingle will therefore at first be used mostly for transmission of multimedia directly over the Internet. The recently launched Cisco Jabber client, which can set up a voice or video conference with Gtalk, works in the same way.

XMPP in itself has ample potential, not only for Unified Communications. Owing to its flexibility, push model and inherent security, it can serve as a very good base for other applications that need a secured, connected network, e.g. home automation systems, financial systems or industrial environments with machine-to-machine communication.

Introduction
Problem statement
At this moment there is no industry standard that can deliver truly interoperable unified communications. It is still very difficult to make video conferencing work between devices of different vendors, let alone between two different companies or separate networks. The lack of interoperability affects not only video conferencing but also other forms of communication, such as instant messaging and presence status. Anyone can call or email anyone, but it is still not possible to make a video call or use IM in the same way. Enterprise IM is usually deployed only company-wide, and the same holds for video calling. To make telephony over the Internet possible, a telecom provider needs to provide a link to the isolated virtual telephone network. Only email is sent directly to the receiving party.

Hypothesis
XMPP will be the protocol of choice for instant messaging, presence and video communication in both the private domain and the public domain (the Internet). It will function as the lingua franca of the UC field. Jingle will be added to make video or voice communication possible.

Explanation
Jingle is an extension to XMPP that enables the setup of real-time media streams between two hosts. XMPP Jingle will coexist with SIP, and in the future UC vendors will include XMPP Jingle in their products; SIP will most likely be kept to maintain backward compatibility with the PSTN. Jingle can already be used with Cisco's CUCM (Cisco Unified Communications Manager) and the recently released Cisco Jabber client. XMPP Jingle is also used for Google Talk. Google Talk is presently available for Android (only in the US), and given Google's large presence in the mobile market with Android, it is only a matter of time before Google Talk is available worldwide. The introduction of 4G technologies will probably be an accelerator of truly IP-enabled voice services.

The advantage of XMPP over SIP is the integration of IM functionality, such as presence and resource identification in the URI, and a simpler and clearer process for extending its functionality, owing both to the setup of its managing organization and to its technical architecture. The IT environment is evolving into a multi-screen environment where the distinction between the personal and the professional environment is blurred. With its versatility and well-thought-through design, XMPP has the right architecture to meet the requirements of this new environment.

Questions remain about the maturity of UC vendors' XMPP integration, which is needed for Jingle support, because Jingle is not yet common in enterprise implementations. Work is currently being done to close that gap. But the XMPP fundamentals are solid and clear and, most importantly, they are complete.
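To make the "resource identification in URI" advantage concrete: an XMPP address has the general form local@domain/resource, where the optional resource part identifies one specific client or device of a user. The following is a minimal Python sketch of splitting such an address. The function name `split_jid` and the example addresses are illustrative assumptions, and the normalization and validation rules that a real XMPP implementation applies to addresses are deliberately left out.

```python
from typing import NamedTuple, Optional


class Jid(NamedTuple):
    local: Optional[str]     # account name; absent for plain server addresses
    domain: str              # the server (domain) part, always required
    resource: Optional[str]  # one specific client or device of the account


def split_jid(address: str) -> Jid:
    """Split an address of the form [local@]domain[/resource] (hypothetical helper)."""
    # The resource is everything after the FIRST slash, so it may itself
    # contain further "/" or "@" characters.
    bare, slash, resource = address.partition("/")
    if slash and not resource:
        raise ValueError("empty resource part")

    if "@" in bare:
        local, _, domain = bare.partition("@")
        if not local:
            raise ValueError("empty local part")
    else:
        local, domain = None, bare

    if not domain:
        raise ValueError("missing domain part")
    return Jid(local, domain, resource if slash else None)


# The same account reached on two different devices (example values).
print(split_jid("roel@example.com/laptop"))
print(split_jid("roel@example.com/mobile"))
```

Because the device is part of the address itself, a message or call can be directed at one particular screen of a user, which fits the multi-screen environment described above.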

1. The Why and What of UC

1.1 A changing society


To explain the increased popularity of, and need for, UC systems, merely looking at the advances made in the technical field is not enough. The changes in society as a whole also need to be taken into consideration. This section describes some of the changes that influence UC directly or indirectly. This helps to see UC as one piece of a puzzle and to see the whole picture which that puzzle represents.

Twenty-four-seven economy
Service expectations have changed. People used to have a less hectic and more regular lifestyle, working five days a week and relaxing at the weekend. When needed, they took a day off to take care of their personal matters. Now people have a more chaotic lifestyle and consider their spare time more valuable. They try to make the best use of it by planning tasks as efficiently as possible. New means of communicating are accelerating these changes. Mobile devices and the Internet have made an impact on people's day-to-day lives by making instant contact and instant answers to questions possible. Society as a whole is subject to these changes, including organizations that provide commercial or public services, such as the government or the retail industry. These organizations ought to adapt to the contemporary way of life. Longer shop opening hours are a prime example of adaptation to modern life. Another is the increase in services made accessible through the Internet, whether services for consumers or business tools for employees.

New way of working
The main goal of the new way of working is to make work flexible by replacing or complementing the physical world with the virtual world. It is a whole new approach to doing highly skilled work more efficiently and effectively. It combines new technology, a differentiated work environment and a new style of management.

The office
New ways of multimedia communication will change the way the office is used. It will no longer be necessary to be physically present in the office in order to be a productive member of a team. That does not mean that offices will be a thing of the past; they will still fulfill a role. They will become more of a meeting centre with a relaxed and productive atmosphere, catering to the needs of the employees. These meeting centres will have a smart building system with more natural light, improved climate control and a more thought-out design, so that they feel more like a natural habitat for people. The presence of employees will no longer be measured by physical presence; instead, virtual presence will be used as a metric. Virtual presence is more than a simple on/off switch. Employees can choose between different statuses, for example Available or Busy. The device or location can also be part of this status, and the status can be taken into account when contacting a person, shielding him or her from unnecessary distraction.
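This richer notion of presence maps naturally onto an XMPP presence stanza. The sketch below builds one with Python's standard library. The `<show/>` values (away, chat, dnd, xa) and the `<status/>` and `<priority/>` elements come from the XMPP instant messaging specification; the function name, the mapping from user-facing labels to `<show/>` values, and the example address are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from user-facing labels to XMPP <show/> values.
# Plain availability is expressed by leaving <show/> out altogether.
SHOW_VALUES = {
    "Available": None,
    "Busy": "dnd",          # "do not disturb"
    "Away": "away",
    "Extended away": "xa",
}


def build_presence(sender: str, label: str, note: str = "", priority: int = 0) -> str:
    """Return a presence stanza as a string (illustrative helper, not a library API)."""
    if not -128 <= priority <= 127:
        raise ValueError("priority must be between -128 and 127")

    # The resource part of the sender address (after "/") names the device.
    presence = ET.Element("presence", {"from": sender})
    show = SHOW_VALUES[label]
    if show is not None:
        ET.SubElement(presence, "show").text = show
    if note:
        # Free-text, human-readable detail such as a location.
        ET.SubElement(presence, "status").text = note
    ET.SubElement(presence, "priority").text = str(priority)
    return ET.tostring(presence, encoding="unicode")


print(build_presence("roel@example.com/tablet", "Busy", "In a video conference", 5))
```

A client that receives such a stanza knows the status, the device in use and a free-text note, and can use all three to decide whether and how to contact the person.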

New devices
Because of the commoditization of consumer electronics, new types of electronic devices will be brought into the business, with or without the permission of management. Instead of prohibiting these devices in a business environment, companies should take advantage of the value they can add to the office. Device hardware is also an important factor when transitioning to the new way of working. The main criterion for choosing a computer used to be raw processing power; now the criteria are shifting toward the form factor, the quality of the sensory input (microphone and camera) and the quality of the output (screen and speakers).

For voice-only communication, a small device with a high-quality microphone and good recording quality is considered best. A small form factor takes precedence over video output, so a 3-inch screen will do. If you need something small to take notes, edit graphical data with the touch of a finger or read books on an e-ink screen, a tablet is the best choice. For writing a lengthy report or designing software, a laptop with a big 17-inch screen is best. For important videoconferences, sit down and relax in front of a display with an HD camera. For brainstorming sessions, use a 100-inch smart board and publish the end product as soon as you are finished. Devices will be used in conjunction with each other: editing and sharing data together on one screen while seeing each other face to face on another.

The cloud
A popular word in the IT world is Cloud. The Cloud is a broad concept, but in short it is about offering IT services through the Internet. Like electricity or water, services are provided without the need to invest in infrastructure on site. Access to data and applications will no longer be limited to the one device that happens to have the right software installed and the data on its hard drive. They will be accessible from every device, served independently of the hardware through rich internet applications, streaming virtualization or a combination of both. This is referred to as the Martini Principle: anytime, anyplace, any device.

The problems
For the new way of working to be successful, a number of problems must be dealt with. Most of them are not of a technical nature. People are creatures of habit; they find it hard to change the way they have been doing things. These technologies will change the way we work and influence business processes and organizations themselves. It will become easier to work across boundaries and to scale a process or project up or down when needed by adding people or computing power to the pool. That will require a different attitude from people, especially management.

Importance of open standards
Although most problems are not of a technical nature, some technical questions still give food for thought. To prevent a legacy problem in the future, it is important to choose open standards whenever possible. Data saved in the Cloud should be accessible across Cloud providers. For communication, too, it is important to adopt an open standards policy. For example, in a project group with external partners, should the partners get a local account, or can they simply use the account from their own company to join the project? The same question applies to video conferencing. The ability to communicate without boundaries depends on the acceptance of open standards.

Intuitive communications
The new way of working is all about an intuitive way of communicating. We use our five senses when interacting with objects and people in our daily lives. We smell, touch, look and listen; we process this in our brain and draw conclusions about what is actually happening and how we should react. A frequently cited rule of thumb holds that in a normal conversation 55% of the message is conveyed by body language, 38% by tone of voice and only 7% by the actual content. When we replace real-life contact with virtual contact, we miss out on the input provided by our senses. People will experience this as an unnatural way of communicating and will prefer real-life contact for most of their communications. The way to mitigate this problem is to design our communication systems to be as intuitive as possible. This means that, whenever possible, the best video and audio codecs should be used. The more realistic virtual meetings become, the more they will replace live meetings. But it does not stop with high-fidelity codecs.

Hardware independent
As stated above, devices will be brought into the corporate domain by employees who want access to the same functionality as they have at home. Some issues need to be taken into consideration, such as securing company data in a (semi-)controlled environment and maintaining the application landscape on these devices. One way of looking at it is to regard the device as merely a temporary container for the user's applications and data. The IT department should make its support independent of the underlying hardware and focus only on delivering added value. Detaching the OS from the device (which could be a laptop, tablet or mobile phone) through virtualization, and creating an isolated runtime environment to safeguard the data and the updates, makes the software more secure and easier to maintain. Another approach is to use the Cloud to provide all the necessary applications, including rich internet applications that interact with the user and provide all the functionality local applications would normally provide. These internet applications will be based on the new HTML5 standard. A Cloud-based infrastructure requires devices to have Internet access at all times.



1.2 What is Unified Communications

1.2.1 History of digital communications systems
The ways of communicating have drastically improved in the last 200 years. In the year 1800 it could take up to a year to send a message through the postal system from Europe to a colonial country. But within a century, the time was reduced to a couple of minutes by phone. Nowadays we use email widely to communicate with friends, family and colleagues.

The first truly electronic web of communications was the telegraph system. With the commercial use of the first intercontinental undersea cable in 1866, countries could react quickly to important matters and businessmen to trade fluctuations. Messages were brief due to the high cost involved when communicating over a single cable. Cost decreased later as the capacity and speed of transmission over the cable increased. The next big electronic web was the telephone, with an intercontinental link completed in 1915. In the beginning it was only used for verbal communications, but with the passage of time text messages were transmitted through a telex system. Later the fax became another popular means of communication.

In the 60s the basis was laid for the Internet. At first this was only theory, followed in 1968 by the start of a digital network called Arpanet. This network eventually evolved into the Internet as we know it today. In the 90s the Internet became generally available to the public. Nowadays we use the World Wide Web for a large part of our communication. This communication can be of different types, e.g. email, IM, VOIP and video, but in essence it is the same thing: a large volume of bytes moving over a wire, going from one place to another at very high speed and passing through networks that serve as intermediaries.

When the fundamentals of the Internet were designed, a packet-switched network was chosen over a circuit-switched network. Data is sent as a packet full of bits with a destination address attached to it. That packet passes through a series of network devices, e.g. routers, and each device it passes through looks up the destination address of the packet and decides in which direction it should be sent. This increases the flexibility of the network but makes it unsuited for real-time data like voice and video. This was so until a decade ago, when the rapid increase in connection quality and the introduction of Quality of Service made voice, and later videoconferencing, viable over the Internet.

These days there is still a separation between the Internet and the telephony network. Even though the telephone network has converged with the data network in the background, they still function as separate networks. A good example is Internet-enabled mobile phones. A person is able to reach his email on any mobile device as long as it has Internet access. But the owner can only call and be called on his personal number on one mobile phone at a time: the one which has his personal SIM card inserted.
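The hop-by-hop forwarding decision described in this section can be illustrated with a small sketch. It is not taken from any real router: the forwarding table, the next-hop names and the rule of preferring the most specific matching prefix are assumptions made for the example.

```python
import ipaddress

# Hypothetical forwarding table of a single router: destination prefix -> next hop.
FORWARDING_TABLE = {
    ipaddress.ip_network("10.0.0.0/8"): "router-a",
    ipaddress.ip_network("10.1.0.0/16"): "router-b",
    ipaddress.ip_network("0.0.0.0/0"): "default-gateway",
}


def next_hop(destination: str) -> str:
    """Look up the destination address of a packet and pick a direction.

    Assumption: when several prefixes match, the most specific one wins.
    """
    address = ipaddress.ip_address(destination)
    matches = [network for network in FORWARDING_TABLE if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return FORWARDING_TABLE[best]
```

Every device on the path repeats this lookup independently, which is why the route of a packet is not fixed in advance as it would be in a circuit-switched network.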



The next step in communications advancement is one platform supporting all communications protocols, regardless of whether it is voice, video or IM. This new category of software solution is called Unified Communications and has recently started to appear. Eventually UC systems will evolve into a platform that also provides collaboration tools, document management, social media and other applications. According to Gartner this transformation will have taken place by 2013.

"Business end-users typically treat the UCC components (voice, messaging, conferencing, instant messaging (IM)/presence, applications, clients, social networks and collaboration tools) in silos," said Jeff Mann, research vice president at Gartner. "They can no longer work this way as UCC represents a fusion of different communications cultures and work styles. The artificial separation they are used to will become a thing of the past." (Gartner Says Distinctions in Unified Communications and Collaboration Will Disappear by 2013)

1.2.2. UC functionality
A UC system is made up of different components that could be used, and are used, as autonomous systems in their own right. The concept of Unified Communications exploits these separate systems by combining them and offering them as one unified communication experience to users. At the heart of this unified communication experience is the UC core system itself. The UC core system manages the incoming and outgoing direct connections like phone calls, video conferences or IM conversations. The core system should provide plugins, open standards and APIs so that it can be integrated into other enterprise applications like Sharepoint and Outlook. This integration completes the UC experience. The connections made by the UC core system can be divided into real-time and near-real-time.

Real-time
Real-time communication is voice and video. With real-time communication there are two separate channels: a signaling channel, using a protocol such as SIP, H.323 or XMPP Jingle, and a data channel. The signaling channel is responsible for setting up a connection between two end points and for negotiating the data stream and the parameters used to transport the actual voice or video data. RTP is used for the data stream. Most of the time the data stream is routed directly between the end points, whereas the signaling itself is relayed through intermediate stations. The direct route is used because real-time data has a low tolerance for latency and jitter.

VOIP
Phone infrastructure was originally managed by a PBX. The PBX was a large piece of machinery which used ISDN as the standard for setting up phone connections. With the rise of the Internet and the increased reliability of IP networks, the IP-PBX became popular from 2000. An IP-PBX uses SIP in an IP network instead of ISDN, thereby eliminating the cost associated with a separate ISDN infrastructure. Cost reduction was the primary reason for implementing IP-PBX.
Now the IP-PBX is gradually evolving into a UC manager that offers more functionality than simply managing phone calls. There is an important distinction to be made in how VOIP is used. Phone calls are connections made with E.164 numbers (standard telephone numbers) that use the legacy PSTN (Public Switched Telephone Network) infrastructure for interconnecting domains. They need to conform to the ITU-T specifications that apply to the PSTN, for example those for early media and DTMF. There are also voice calls initiated by a soft phone or communicator that uses a URI (Uniform Resource Identifier, for example an email address) to connect directly to the other party. Such a call uses the Internet instead of the PSTN infrastructure to transfer the voice call, and as such does not need to be backward compatible with the PSTN or to comply with any specification that applies to a PSTN phone call.

Video
The use of video for real-time communication has seen a slow rise through the years. This is due to the steep costs that were involved with real-time video technology. Although it was first introduced at the World's Fair in New York in 1964, it was not until the eighties that video conferencing became commercially viable. The cost and the demands it put on the infrastructure made it a rare product, used only in the company boardroom. With the rise of IP networks, video conferencing systems became more accessible to the masses. Webcams became more common, and with software like MSN Messenger or Yahoo Messenger a video connection with a friend (albeit still of poor quality) was easily set up. Currently the infrastructure has improved so much that the quality is acceptable for use on PCs in corporate settings. There is still hardware available that is dedicated to video conferencing. It is used in conference rooms and offers better video and audio quality. The cameras used are more advanced and are known as PTZ cameras: they can pan left or right, tilt up or down, and zoom in.

Near real-time
Near-real-time communication is presence and text-based communication like IM and email. Text-based and presence communications have a higher tolerance for bad network conditions involving jitter or latency. As such there is no need to set up a separate channel of the kind used to accommodate real-time media.
Instant Messaging
Instant messaging has been in use for three decades as a true Internet technology. IM is as old as networking itself. Notable milestones are IRC (Internet Relay Chat), introduced in 1988 and still in use today, and the rise of IM clients like MSN Messenger and AOL Instant Messenger in the nineties. Today there are clients available which support multiple IM networks.

Email
Email is a separate system that can be used in conjunction with UC. It completes the UC experience; examples are add-ons to Outlook. The integration with email makes it easier to decide how to contact the other party.

Collaboration
Products built for teamwork, such as Sharepoint and Quadnet, can facilitate live team meetings and presence checks by using UC APIs. The expectation is that collaboration and communications will eventually be integrated into one.



2. UC market analysis
The UC market is subject to certain forces which drive the development of UC products and determine the adoption of standards. These forces arise in organizations which have different interests and different views regarding the future of UC and XMPP. Some of these organizations are discussed in this chapter. First, the views of analysts are discussed. In the second and third sections, UC vendors and Internet services which provide some form of UC are listed. There is a clear distinction between those two: the former provide infrastructure for enterprises, the latter UC services that are mostly consumer oriented. After this, institutes that play an important part in guiding the development of UC are mentioned. And finally, an appraisal of the current market is made.

2.1 What analysts say about UC


2.1.1 The Market
Forrester predicted that the market will grow from $1.2 billion in 2008 to $14.5 billion in 2015. One of the inhibitors of the UC market is poor interoperability. This makes businesses cautious when it comes to implementing UC in their own corporate infrastructure. They do not want to run the risk of investing in technology that might become obsolete tomorrow. Forrester predicts that when UC software becomes more standards based, managers responsible for IT infrastructure will lose their reservations and start investing in UC infrastructure. (Dewing, 2009)

Gartner sees problems when it comes to intercompany multi-channel UC use. None of the current UC vendors can facilitate intercompany multi-mode channel communication. There are some initiatives by vendors and service providers to offer some kind of directory through which URI-based multi-mode channel communication is possible. But they rely on the support and understanding of the UC market for success. QoS is difficult to guarantee when using the plain Internet. Without QoS, the connection could deteriorate to below the acceptable norm and could lead to a negative experience for both users. Studies have shown that a negative perception of a connection tends to carry over into the judgment of the conversation partner. Gartner also expresses its concern about privacy. When a corporation opens up its UC infrastructure, Gartner wonders how much personal information employees would want to share with customers, partners or other third parties. It also thinks it is likely that employees will be interrupted more frequently, which could be found undesirable. As a result it thinks that in 2015 only 1 % will have become interoperable. (Predicts 2011: Adoption of Unified Communications Creates New Sourcing and Deployment Challenges, 2010)

2.1.2 What Gartner says about XMPP
Recently Gartner published an interesting report concerning XMPP (see below). It believes XMPP has the potential to become the standard for near-real-time communication in 2015. Previously it thought SIMPLE would be the first choice for the UC industry, but the IETF has not ratified SIMPLE for presence information while XMPP has already been ratified. Also, XMPP has seen wider acceptance by vendors. Twitter, Facebook and Google use XMPP for their services and so do the government and the defense industry in the US (Jabber XCP). One reason for adopting XMPP is that, due to the increasing heterogeneity of IT infrastructure, companies would prefer one standard that can be used as middleware to share advanced presence information of the users. (Take Four Steps to Prepare for XMPP Becoming an Universal Standard, 2010)

2.2 Potential of UC in commercial appliances


Before corporations can decide to implement new technology, there needs to be a clear business case which can guarantee a cost reduction or an income improvement. Even then they only choose proven technology, because they are uncomfortable with taking any risks outside their primary business process. One of the things Capgemini was interested in was how UC could be used in the banking environment and whether there was any need for it in the first place.

2.2.1 Banks
What sets banks apart from other businesses is the type of products they offer. Financial products are all non-material and information based. The complexity of those financial products makes them hard to understand and to sell. Trust is therefore absolutely necessary to sell these complex products. To achieve that level of trust, there needs to be enough authentic communication between both parties; that can be a telephone call, a face-to-face meeting or letters. The possibility of adding rich media communication, like video conferencing and IM, makes it easier to gain the trust of the customer. This has already been shown by research. Mirjam Schmidt concludes in her report "Financial advice online" that virtual interaction with an online human expert adds to the trust that the customer has in the advice given. Not only is virtual interaction important, the context of these interactions also adds to the trust people have in the human expert. (FINANCIAL ADVICE ONLINE, 2009)

A small community bank in Missouri USA has made a connector to the AOL service through which its employees are directly listed in the contact list of their customers. As mentioned in the article, this places the bank in the same contact list as the customer's friends. (Instant messaging helping Mass. bank build report, 2006)

In an interview I had with Daniel Hillster, who was involved with the New Way of Working project at SNS Reaal, he said he could imagine a request being made to use UC technology to contact consumers. He also said there is a need for ways to connect UC software with partners who are only involved with the bank for a short period of time, for example in projects. (Hilster)

In the book "Bank 2.0", written by Brett King in 2010, numerous reports are mentioned and examples are given of how the rise of Internet use has changed customers' expectations of service delivery and how banks could further benefit from this paradigm change. Dutch banks are often mentioned, like ABN AMRO for its introduction of a videoconference teller who looks after multiple branches at a time. It has been so successful that the bank expanded its teleconferencing services. Furthermore, new technology and the developments of the past years are often referred to. (King, 2010)



"The vision of the ultimate contact centre currently appears to be a convoluted mix of unified messaging platform, IP-based architecture, automated voice response systems and first-call resolution KPIs." Page 145, (Bank 2.0, 2010)

From the research done, it can be concluded that UC can be used in commercial appliances. In a changing world, customers can choose between different forms of communication. A bank could benefit from this by offering customers more communication channels and, by so doing, making the bank more accessible to its customers.

2.3 UC vendors
This section lists some interesting vendors and their UC experience. They are interesting because of their support for XMPP or their leadership position. The list of active UC vendors is too long to discuss them all. There is a clear distinction between UC vendors who provide UC infrastructure for a business environment and Internet service providers who deliver mostly consumer-oriented services.

2.3.1 Leading UC vendors

Cisco
Cisco says its policy is to use open standards as much as possible. This is partially marketing hype, but in the field of UC it is living up to that promise by fully supporting Jabber's XMPP technology. It bought the Jabber XMPP server in 2008, and the former CEO of Jabber and president of the XMPP foundation, Peter Saint-Andre, is leading the Jabber technology division at Cisco. Cisco UC technology natively supports XMPP for IM and presence. Cisco currently has an extensive portfolio of UC-related products. One of its products (Interoperability Media Engine) can be used as a mediator between different SIP dialects and as such can facilitate interoperability. The latest news is the release, on 1 March 2011, of a new XMPP client called Cisco Jabber with support for Jingle.

Microsoft
Lync technology does not use XMPP natively. Recently Microsoft has added an XMPP gateway to its product portfolio. This allows Lync to work with IM services like Gmail. The H.264 video codec from Microsoft uses proprietary techniques to make it error resistant, but this makes its video solution difficult to interoperate with.

2.3.2 Challengers/innovators
Process-one
Process-one is a developer of XMPP software, including servers and clients. Its open source XMPP server, Ejabberd, is written in Erlang, a programming language developed by Ericsson to build robust telecommunication applications. It is even possible to do hot code loading on the server, which means that the server can continue to run while code is being added to it.



Jive software
The Openfire server of Jive Software is an open source XMPP server written in Java. It is well documented and easy to extend with extra modules. It also has a loyal community whose members assist in its development.

Ezuce
Ezuce was established by the founders of the open source initiative SIPfoundry. They present their product as an open-standards-based alternative to Microsoft UC. Ezuce is based on SIPfoundry, which has recently added XMPP support. They offer a migration route for Nortel, which is also SIPfoundry based.

After investigating the UC vendor field, it can be concluded that multimedia extensions are not actively supported and promoted by UC vendors. Cisco has XMPP support but uses SIP for setting up real-time media (update: on 1 March 2011 Cisco launched Cisco Jabber, which seems to support Jingle). The challengers consider the ability to use Jingle a bonus rather than a primary business function. Jingle can be used for setting up multimedia streams in a business environment, but without a clear business case and support from UC vendors this is ill-advised.

2.4 Internet services


Due to the commoditization of electronic devices, consumers have more media-rich devices at their disposal than the average employee. Another trait which sets consumers apart from companies is that they are not bound by business processes and by the investments needed when adopting new technology. As a consequence, they are more flexible when it comes to adopting new technologies. These days the consumer has access to many different ways of communicating: IM, email, telephony, VOIP and SMS, to name a few. The majority of these communication tools are segregated into separate channels. With IM alone, one user can have accounts at several IM services. One development that can be noticed is the evolution of these consumer-oriented communication services from providing only a single channel, like only IM or only voice, to a fully rounded communication platform for IM, voice, video or any other plausible way of communicating. Many of these upcoming communication platforms are XMPP based.

Gtalk
Gtalk is a product from Google. It uses Jingle for voice and video and is the most open product available. Other XMPP networks can connect to Google and use Jingle for voice or video setup. Google is a big supporter of XMPP. Besides Gtalk, XMPP is used in Google Wave and in Google Chrome. Google can be considered a leader in the area of XMPP development.

Nimbuzz
A Dutch startup company that uses XMPP Jingle to enable voice calls, as well as phone calls to the PSTN. It can connect to different IM networks as well as to SIP providers. There are many similar products, like Fring and Talkonaut, which use XMPP Jingle just as Nimbuzz does.


Figure 1 Nimbuzz on Nokia

Whatsapp
A startup that lets you chat on your smartphone with a phone number as user identifier. It adds contacts who are already listed in your phonebook and have Whatsapp installed to your contact list.

Yammer
Due to the nature of its service, i.e. microblogging, there is heavy traffic between the client and its server. With XMPP implemented, updates get pushed to the client rather than the client having to poll the server every time.

Nokia Ovi
Nokia's chat service is based on XMPP. It is somewhat comparable to BlackBerry's ping service.

Facebook
Facebook is the biggest social media website. It added XMPP support in February 2010. Currently Facebook uses XMPP to let users connect with its IM services. It blocks any XMPP service that is not IM related, like Jingle, and does not allow interdomain communication. The reasons for this could be technical, but this is unlikely. What is more plausible is that Facebook limits it for business reasons. If Jingle were enabled, it would pose a threat to mobile service providers. Mobile service providers are of strategic importance to Facebook because they control the mobile phone market. With the tight integration that Facebook has with some mobile phones, it depends on mobile service providers to push those mobile phones and promote Facebook. It is a similar story with Google, which depends on providers to promote phones with the Android OS installed. In Europe, Gtalk cannot be used for making phone calls. It is only in America that it is possible to use Gtalk on the mobile phone.

The primary motivation for communication services to adopt XMPP seems to be a shorter time-to-market. XMPP makes for a shorter time-to-market due to its well-thought-through design with the possibility of adding your own customized extensions. Making cooperation possible with other parties seems to be of lesser importance. The most likely cause for this is possible interference with the business model. Cooperation is only considered in cases where the other party represents a substantial user base. Gtalk seems to be an exception to this, as it is open to small private initiatives.

Conventional mobile phone functionality falls short. For instance, each unique mobile number can only be connected to one mobile phone in the operator's network. When evolving to a multi-device environment, people will expect to be available on multiple devices with just one account or number. That is why there is a paradigm shift taking place from calling and texting with mobile phones in the traditional way towards using the Internet for voice, video and text-based communication. This paradigm shift will most likely accelerate with the introduction of 4G networks.
In developing countries like India, mobile service providers already promote the use of voice over Internet services like Nimbuzz. Because of the cost involved with a mobile phone call, people wanted cheaper alternative ways of setting up a telephone call. At first, mobile phone service providers tried to prevent the use of the Internet for phone calls, but realizing it could not be stopped they adopted the motto "if you can't beat them, join them", and they now use VOIP to actively promote their network.

2.4.1 Other noticeable use of XMPP
The US government and the US defense department implemented a highly adapted version of Jabberd (now part of Cisco) called Jabber XCP. It meets the strict requirements of the US government. Prorail uses a version of XMPP for sending updates about rail traffic to train drivers. Furthermore, there are many applications in the financial world and other sectors where XMPP is used.

2.5 Institutes
Institutes are responsible for guiding technology development and defining open standards. There is a need for common agreement on the specifications of standards, so that when standards find their way into vendors' products, it is still possible to combine these products and exchange data between services.



The mandate that these institutes have varies and depends on the level of recognition given by the industry or the authority given by government institutes. These institutes address different needs and focus on different areas in which they are active. These areas of focus can overlap.

IETF
The Internet Engineering Task Force is an organization that is responsible for standardizing protocols used in Internet communication. It has standardized over 100 RFCs for SIP and has formed a working group to standardize the SIMPLE extensions. The IETF has also formed a working group to ratify the core standard of XMPP.

UCIF
The Unified Communication Interoperability Forum was founded in 2010 by Logitech, HP, Microsoft, Juniper and Polycom, among others. It has the goal of making UC technology compatible, not by defining standards but by interoperability testing, sponsoring other organizations, and providing a platform for vendors to align their products. That is why the absence of Cisco, Google and Avaya could undermine the credibility of the organization. The UCIF has incorporated XMPP Jingle into its testing framework, even though the testing was done in an ad hoc manner. The head of the UCIF is Bernard Aboba, who is a principal architect at Microsoft.

ITU
The ITU regulates information and communication technology. It has a long history that dates back to xxx and received its mandate from the United Nations to be the leading international institute for standardizing telecommunications protocols. It has three subgroups: ITU-T, ITU-R and ITU-D. ITU-R manages the international radio spectrum, while ITU-D promotes and supports ICT in developing countries. ITU-T is the best-known subgroup. It standardizes ICT technology. Its international standards are given names which begin with a letter followed by a number, e.g. the codecs G.711 and H.264 and the signaling standard H.323.
3GPP
The 3rd Generation Partnership Project is an organization consisting of multiple partners that came together in an effort to specify the requirements for a universal 3G standard acceptable for mobile phones. Since then it has continued defining more standards. Rather than designing a standard itself from the beginning, it listed the requirements that need to be met in order for any standard to be ratified as a 3G or, more specifically, a 4G standard. One of its current standards is IMS (IP Multimedia Subsystem), which has become a core component of multimedia transmission in 3G and 4G networks. IMS was meant to make it possible to differentiate between services like video calling and instant messaging, so that each service can be billed separately, and to provide QoS when required. SIP has a prominent place in IMS, but so far IMS has not been implemented successfully. Source: (10-years-of-SIP-dominance)

XSF
The XMPP Standards Foundation looks after the standardization process for XMPP and its extensions. It is an open forum and anyone can participate. The current president of the organization is Peter Saint-Andre, who is also head of Jabber technology at Cisco. The XSF formulates the core specifications of XMPP and hands them to the IETF working group (WG), which focuses on making the XMPP specification compliant with IETF requirements.

SIPforum
This organization facilitates the adoption of SIP by vendors, carries out interoperability testing and provides a platform for vendors to meet. One of its efforts is developing a common approach to SIP trunking in the form of SIPconnect. In 2008, SIPconnect 1.0 was ratified. This is a set of recommendations which should guarantee interoperability if vendors and service providers implement them in their products.

2.6 Reflection
There is still a clear distinction between service providers on the Internet, which have evolved from standard text-based communication (like IM and email) to a more advanced form of communication, and the telecom industry, which is responsible for the mobile phone network and the phone infrastructure used in businesses. The former have fewer restrictions placed on them and are more flexible when implementing new functionality than the latter. The reason for this is that the telecom industry is very institutionalized and needs to comply with many standards. Some of those are in place to remain backwards compatible with the PSTN or legacy equipment.

This distinction will most likely slowly disappear. This can already be observed in the increasing number of applications on mobile phones that enable the customer to use the Internet for phone calls rather than the mobile phone network. Mobile phone network operators are struggling to adapt to these developments. Banning these applications from the network, or limiting their network access, means risking customer dissatisfaction. One possible solution could be to implement IMS, which allows mobile operators to include those services (their own or those of a third-party service provider) in their mobile phone network, making them chargeable and providing QoS. But as yet, there has been no successful introduction of IMS.

Due to their adaptability and rich feature set, it is most likely that the future of communication will be dictated by service providers like Gtalk, Fring and Nimbuzz, which are all XMPP based (but are also SIP capable). Industry leaders in the UC sector, like Cisco, Microsoft and Avaya, are already moving towards XMPP support and UC interoperability. Cisco is ahead in this effort, as it recently (1 March 2011) announced Cisco Jabber, a UC client based on XMPP that supports Jingle and ultimately Gtalk.



3. Technology explained
In this chapter, an overview is given of the standards that are related to UC use and their inner workings. These standards are SIP which is the most dominant standard in use by UC servers today, and XMPP which is a protocol used mainly for Instant messaging, including multimedia capabilities. The last part of this chapter looks further into safeguarding the quality of video and voice.

3.1. The way SIP works

3.1.1 Call setup
SIP is a session protocol that manages a connection for as long as it is required. A SIP session is set up when one end-point, e.g. a mobile phone, initiates contact through an invitation and the other party sends an acknowledgement. The whole process is explained in more detail later in this chapter. After this initial contact, the connection is considered established. While the connection is active, real-time data (mostly voice) is transmitted through a separate real-time channel using the RTP protocol. Voice and signaling thus travel over two different channels, which can be routed independently. When a session is to be ended, one of the parties sends a termination command to request the end of the session. In the main RFC for SIP, RFC 3261, a SIP session is broken down into five facets:
User location: determining which end system will be used for communication.
User availability: determining whether or not the called party is willing to engage in communications.
User capabilities: determining the media and media parameters to be used for this communication.
Session setup: establishing the session parameters at both the called and calling parties.
Session management: including the transfer and termination of sessions, the modifying of session parameters, and the invoking of session services.

3.1.2 SIP software architecture
The SIP software architecture consists of two elements: a SIP client (User Agent Client, UAC) and a SIP server (User Agent Server, UAS). A SIP client sends SIP requests and receives SIP responses. A SIP server receives the requests and returns responses. For example, a SIP client sends a request in the form of an invitation; the server receives this request and determines whether to accept it or to deny it with an error or unavailable response.



3.1.3 SIP network architecture
The SIP network architecture is by default a P2P model, but the majority of SIP deployments use a quasi client-server model. There are three types of servers: registrar, redirect and proxy.
A registrar server registers the IP address of the SIP client and links it to its URI. URI stands for Uniform Resource Identifier and identifies the user.
The redirect server responds to SIP requests with the address of the next server in line.
The proxy server serves as an intermediary between two clients. It forwards messages from UAC to UAS and vice versa. The reason for using a proxy server is to centralize SIP network traffic and to make network management easier.
The proxy server has two modes of operation, stateless and stateful:
Stateless. The proxy server only forwards traffic and is unaware of the state of the session itself.
Stateful. The proxy server is aware of the state of the session. This enables functionality like forking and call forwarding on busy.

These are just server roles and most of the time they reside on one physical server.
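The division of roles described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the class `Registrar`, the function `route_invite` and the example addresses are hypothetical names, no real SIP messages are parsed, and the three roles are combined in one process, as they often are on one physical server.

```python
# Hypothetical in-memory registrar: maps a user's SIP URI to the contact
# address the client registered from. Not a real SIP implementation.
class Registrar:
    def __init__(self):
        self._bindings = {}

    def register(self, uri, contact):
        """Handle a REGISTER: remember where this URI can currently be reached."""
        self._bindings[uri] = contact

    def lookup(self, uri):
        """Return the registered contact, or None if the URI is unknown here."""
        return self._bindings.get(uri)


def route_invite(registrar, local_domain, uri):
    """Decide what a combined proxy/redirect server would do with an INVITE."""
    domain = uri.split("@", 1)[1]
    if domain != local_domain:
        # Interdomain URI: point the caller at the next server (3xx class).
        return ("redirect", domain)
    contact = registrar.lookup(uri)
    if contact is None:
        # Local domain but no registration: would become a 4xx response.
        return ("not_found", None)
    # Local and registered: the proxy passes the INVITE on to the client.
    return ("forward", contact)


registrar = Registrar()
registrar.register("sip:bob@biloxi.example.com", "192.0.2.4:5060")
print(route_invite(registrar, "biloxi.example.com", "sip:bob@biloxi.example.com"))
```

The registrar is consulted only for URIs in the local domain; everything else is handed to the redirect logic, which mirrors the local versus interdomain check a proxy performs during call setup.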

3.1.4 SIP message format
There are six types of request messages defined, and they are referred to as Methods. These are sent by the UAC:
REGISTER: Is used by a client to register an address with a SIP server.
INVITE: Indicates that the user or service is being invited to participate in a session. The body of this message would include a description of the session to which the callee is being invited.
ACK: Confirms that the client has received a final response to an INVITE request, and is only used with INVITE requests.
CANCEL: Is used to cancel a pending request.
BYE: Is sent by a User Agent Client to indicate to the server that it wishes to terminate the call.
OPTIONS: Is used to query a server about its capabilities.

The response messages contain Status Codes and Reason Phrases that indicate the current condition of the request. These responses are sent by the UAS. The status code values are divided into six general categories:
1xx: Provisional: The request has been received and processing is continuing.
2xx: Success: The action was successfully received, understood, and accepted.
3xx: Redirection: Further action is required to process this request.
4xx: Client Error: The request contains bad syntax or cannot be fulfilled at this server.
5xx: Server Error: The server failed to fulfill an apparently valid request.
6xx: Global Failure: The request cannot be fulfilled at any server.

The inspiration for this code scheme is the one used in the HTTP protocol, with its most famous error: 404 Not Found.
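Because the category is determined by the first digit alone, a client can react to a response class without knowing every individual code. The sketch below shows this; the names `STATUS_CLASSES`, `status_class` and `is_final` are illustrative and not part of any SIP library.

```python
# Map the first digit of a SIP status code to its general category.
STATUS_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def status_class(code):
    """Return the category name for a three-digit SIP status code."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP status code: {code}")
    return STATUS_CLASSES[code // 100]


def is_final(code):
    """1xx responses are provisional; every other class ends the transaction."""
    return code >= 200


for code in (180, 200, 302, 486, 603):
    print(code, status_class(code), "final" if is_final(code) else "provisional")
```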

3.1.5 SDP
As mentioned earlier, the signaling and the data channels are separated. The SIP protocol is responsible for setting up the data channel and it uses the SDP protocol to negotiate this. SDP originated from its implementation in SAP, but has been reused in SIP and is defined in RFC 4566. SDP is basically a standard that specifies the parameters of a real-time data channel that need to be published to other parties. Optional parameters are marked with an asterisk.
Session description
   v=  (protocol version)
   o=  (owner/creator and session identifier)
   s=  (session name)
   i=* (session information)
   u=* (URI of description)
   e=* (email address)
   p=* (phone number)
   c=* (connection information - not required if included in all media)
   b=* (bandwidth information)
   One or more time descriptions (see below)
   z=* (time zone adjustments)
   k=* (encryption key)
   a=* (zero or more session attribute lines)
   Zero or more media descriptions (see below)

Time description
   t=  (time the session is active)
   r=* (zero or more repeat times)

Media description
   m=  (media name and transport address)
   i=* (media title)
   c=* (connection information - optional if included at session-level)
   b=* (bandwidth information)
   k=* (encryption key)
   a=* (zero or more media attribute lines)

And this is an example of how SDP is used to set up an RTP connection:

[Offer]
   v=0
   o=alice 2890844526 2890844526 IN IP4 host.atlanta.example.com
   s=
   c=IN IP4 host.atlanta.example.com
   t=0 0
   m=audio 49170 RTP/AVP 0 8 97
   a=rtpmap:0 PCMU/8000
   a=rtpmap:8 PCMA/8000
   a=rtpmap:97 iLBC/8000
   m=video 51372 RTP/AVP 31 32
   a=rtpmap:31 H261/90000
   a=rtpmap:32 MPV/90000

[Answer]
   v=0
   o=bob 2808844564 2808844564 IN IP4 host.biloxi.example.com
   s=
   c=IN IP4 host.biloxi.example.com
   t=0 0
   m=audio 49174 RTP/AVP 0
   a=rtpmap:0 PCMU/8000
   m=video 49170 RTP/AVP 32
   a=rtpmap:32 MPV/90000
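The offer and answer above boil down to a comparison of the payload types listed on the m= lines. The following Python sketch extracts those lines and reports which formats survive in the answer. It is a minimal illustration, not a complete SDP parser: the function names are hypothetical, only the m= lines are read, and the session-level lines are left out of the sample strings for brevity.

```python
# Media lines taken from the offer and answer in the example above.
OFFER = """\
m=audio 49170 RTP/AVP 0 8 97
m=video 51372 RTP/AVP 31 32
"""

ANSWER = """\
m=audio 49174 RTP/AVP 0
m=video 49170 RTP/AVP 32
"""


def parse_media(sdp):
    """Return {media name: {"port": int, "formats": [payload type strings]}}."""
    media = {}
    for line in sdp.splitlines():
        if line.startswith("m="):
            # m=<media> <port> <proto> <fmt> <fmt> ...
            name, port, _proto, *formats = line[2:].split()
            media[name] = {"port": int(port), "formats": formats}
    return media


def agreed_formats(offer, answer):
    """For each media stream, list the answered formats that were also offered."""
    offered = parse_media(offer)
    agreed = {}
    for name, desc in parse_media(answer).items():
        offered_formats = offered[name]["formats"] if name in offered else []
        agreed[name] = [f for f in desc["formats"] if f in offered_formats]
    return agreed


print(agreed_formats(OFFER, ANSWER))
```

In this example Bob narrows the audio stream to payload type 0 (PCMU) and the video stream to payload type 32 (MPV), so those are the codecs the RTP streams will carry.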

3.1.6 Call setup
Now that the basics of SIP operation have been explained, it is easier to visualize how a call is made. When a voice call is made, the phone (UAC part) sends an INVITE request with an SDP offer to the proxy server. The proxy server checks whether the URI is local, and thus registered with the registrar server, or interdomain. If it is an interdomain URI, the redirect server is queried and its response is a 3xx redirection response that points to the next server in line. The proxy server of the other domain receives an INVITE with a URI identifying the receiving party and passes it on to the phone. The phone (UAS part) replies with a response that will most likely be a 1xx provisional response. When the phone call is accepted, by picking up the phone for example, a 2xx success response is sent back, which the caller confirms with an ACK. An RTP stream that directly connects the caller to the callee is then set up according to the agreed SDP offer. It is also possible for the callee to respond with a 3xx redirection response, for example to forward the call when the phone line is busy. If the connection attempt is unsuccessful, a 4xx, 5xx or 6xx error code is returned as a response.


Figure 2 SIP architecture: (Understanding SIP)

Figure 3 SIP call flow (Internet Communications using SIP, 2006)


In this day and age, phone calling or video calling is simply not enough. There is a need for IM communication, or chatting as it is known, where people can have a text conversation with their colleagues. This has proven to be ideal in some situations, for example when asking a quick question and receiving a short answer. It should also be possible to see the current availability status of a person to determine whether they are open to communication. To serve these needs, an extension to the SIP protocol has been defined, called SIMPLE. SIMPLE is the acronym of Session Initiation Protocol for Instant Messaging and Presence Leveraging Extensions. The SIMPLE protocol adds two distinct features to the SIP protocol: instant messaging and presence sharing. Unfortunately, it is considered by many to be too complex to implement.

3.1.7 Why does SIP not provide the solution?
Understanding the deficiencies of SIP is easier when comparing it to another standard like SMTP. When sending an email, SMTP is used between two servers. SMTP is a straightforward protocol with only two different versions implemented in email servers worldwide: ESMTP (Extended Simple Mail Transfer Protocol) and SMTP (Simple Mail Transfer Protocol). ESMTP is the standard for email in use nowadays. When writing email server software, the SMTP protocol is followed exactly without any twists or changes, resulting in full compatibility with the standard. Therefore, when sending an email from one server to another, there is no risk of the email being rejected by the receiving server because of protocol differences. SIP is based on HTTP and SMTP. It uses the same schemas and grammar, but SIP is designed for voice communications whereas HTTP and SMTP are designed for one-way and two-way text-based communications. The main difference between SIP and the other protocols is the looseness of the specifications, which results in vendors implementing SIP according to their own interpretation of the protocol. All the functionality that has been defined in IETF standards in relation to SIP can be found in RFC 5411, the Hitchhiker's Guide to SIP. It serves as a reference guide to the roughly 100 SIP-related RFCs.

3.1.8 Problems with SIP
SIP was developed in the late nineties to make voice over the Internet possible. At that time there was already a digital voice standard, namely ISDN. ISDN is a great protocol by itself, but it was not designed for IP networks like the Internet. So there were still two segregated infrastructures being used alongside each other: the packet-switched IP network called the Internet and the circuit-switched telephone network. Maintaining two independent infrastructures is more expensive than maintaining one, hence the need for voice transportation over the Internet. There were two protocols competing to become the standard for voice over the Internet: H.323, developed by the ITU-T, and SIP, developed by the IETF.



The ITU-T is an official body of the UN dealing with the standardization of telecommunications protocols. It is often considered to be very bureaucratic and slow. The IETF, on the other hand, is more loosely organized. SIP eventually succeeded in becoming the de facto standard due to the simplicity of its design in comparison to H.323. SIP became the dominant standard for voice communication, but the developers of SIP software implemented SIP in different ways in their products. This made interworking between the software implementations very difficult, and compatibility was often impossible. The reason for this difficulty can be seen in the way the standard came about. The IETF is a democratic organization where many different points of view have to be accommodated in the final proposal in order to receive approval by the majority of voters. This results in a protocol that is left open to individual interpretation. To illustrate the open-endedness of the SIP RFC 3261, a simple word count is enough.

Weak Terms        Strong Terms
Can = 475         Shall = 4
Option = 144      Must = 631
Should = 344
May = 381

The excessive use of weak terms is a good indicator of how open the SIP standard is to interpretation by developers. As a result, when using SIP in a heterogeneous environment, great effort has to be made to maintain compatibility for even the most basic functionality. The common architecture in enterprises that use SIP as a replacement for telephony consists of a central PBX system, which communicates with the internal clients in the vendor-dependent dialect of SIP, and a SIP gateway, which makes SIP interworking possible and provides SIP trunking capabilities. One of the advantages of the Internet over the original PSTN network is its mesh design and the ability to set up a connection on a peer-to-peer basis. With the need for SIP trunking to connect SIP infrastructure to telephony service providers, nothing has changed: the SIP infrastructure as a whole mimics the PSTN network over the Internet. Sources: (sip interoperability ) & (Real-world SIP Interoperability: Still an elusive quest , 2007)
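A word count like the one above can be reproduced with a short script. The sketch below is one possible way of counting and rests on an assumption: the thesis does not state how its figures were obtained, so matching whole words case-insensitively is a choice made here, and the totals it yields for the full RFC text may differ from the table. The sample sentence is invented for illustration.

```python
import re
from collections import Counter

WEAK_TERMS = ("can", "option", "should", "may")
STRONG_TERMS = ("shall", "must")


def count_terms(text, terms):
    """Count whole-word, case-insensitive occurrences of each term."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return {term: words[term] for term in terms}


# Invented sample text; for a real count, pass in the text of RFC 3261.
sample = ("The UAC MUST send an ACK. A proxy MAY fork; it SHOULD NOT loop. "
          "A server may also redirect.")

print(count_terms(sample, WEAK_TERMS))
print(count_terms(sample, STRONG_TERMS))
```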

3.2 XMPP
3.2.1 Brief History of XMPP
XMPP technology was invented by Jeremie Miller in 1998. His motivation came from the desire to open up IM services. The first release of a working product was on 4 January 1999. Soon there was a whole group of developers designing clients and libraries for various programming languages. Work on the XMPP protocol was still in progress in May 2000 when Jabberd was released. To organize the development of XMPP and its extensions, the Jabber Software Foundation, later renamed the XMPP Standards Foundation (XSF), was formed in August 2001. To increase adoption, an effort was made to make XMPP an official IETF standard. Within two years the IETF published the core XMPP specifications in its RFC series as RFC 3920 and RFC 3921. In 2005 Google Talk was launched with VoIP support based on XMPP Jingle. Jingle was at first a proprietary extension by Google, but Google was committed to open standards: it eventually decided to publish the specifications and supported the further development of Jingle. In 2008 Cisco purchased Jabber Inc., the company that was responsible for the development and commercial exploitation of Jabber. Cisco now uses XMPP as a base for communication with clients.

3.2.2 Network architecture
XMPP uses the client-server model. An XMPP client contacts the server by sending it a message, and the server forwards it to the receiving XMPP client. Keeping track of the presence information of the clients is also done by the server. XMPP uses the push model for sending and receiving information rather than the pull model. When the presence status of someone in the client's roster changes, the new presence status is pushed by the server to the client. This decreases the load on the server because the client does not have to poll for updates. XMPP uses a URI to identify users in the same manner that SMTP and SIP do. The format of a URI is: name@domain.tld. New to this URI is the addition of resources. A resource labels one of the user's connections, so that different open connections with different characteristics can be used in different ways. The resource can be anything that distinguishes the connection, like the location, the type of Internet connection or the client software. Examples are: John@jabber.org/mobile or John@jabber.org/home.

3.2.3 Extensibility
The architecture of XMPP is designed to allow the addition of extensions to it.
Extensions add functionality to the server (as a service to the domain), to the client, or to both. The MUC (Multi-User Chat) extension could, for example, be added to the domain example.com. The service could then be found on, for example, conference.example.com. Clients can also have multiple extensions active. To find out which extensions are supported, a discovery request can be sent directly to the client. All extensions are clearly described on the XSF web page, but not all of them specify additions to the XMPP protocol. Some are informational in nature and describe a best practice or the way certain events should be handled. How extensions are handled and defined is described in the first XMPP extension, XEP-0001.

3.2.4 Roster
One of the cornerstones of the XMPP protocol is the inclusion of presence status in the protocol. Presence is used to keep track of a user's availability. Different presence statuses can be selected so that other parties can see whether the user is available and decide on the best way to contact him. For example, when the user is fully available, he could be called directly or be contacted using IM. The presence status is stored on the server. Access to other people's presence is based on presence subscription. When user A wants access to the presence information of user B, user A requests a subscription to that presence information, and user B can then choose whether to grant user A access to his presence status.



When a user comes online, he receives a roster (similar to a contact list) in which all his contacts and their presence statuses are shown. When a presence status changes, the roster is updated through the push mechanism that is built into the XMPP protocol.

3.2.5 Stream
XMPP is in essence a streaming XML protocol that uses stanzas to communicate. When the negotiation of the XML stream is complete, stanzas are used to exchange messages. There are three types of stanzas:

<message> A message stanza is used to send an IM message; a message is pushed to the other party. There are five different types of messages:
Normal: similar to an email message, where a reaction might be given.
Chat: near-real-time message communication.
Groupchat: communication in a chat room.
Headline: used for alerts and notifications.
Error: for error notification.

<presence> Presence stanzas are used to indicate the presence of the client. They also offer the possibility of including a standard status indication like Away or Available, or personal information, e.g. "I'm on the train". Example:
   <presence from="alice@wonderland.lit/pda">
     <show>xa</show>
     <status>down the rabbit hole!</status>
   </presence>

<iq> The Info/Query stanza is used to request and send information. A request may be, for example, a roster query. A dialog between a client and a server would look like this (C = client, S = server):
   C: <stream:stream>
   C: <presence/>
   C: <iq type="get">
        <query xmlns="jabber:iq:roster"/>
      </iq>
   S: <iq type="result">
        <query xmlns="jabber:iq:roster">
          <item jid="alice@wonderland.lit"/>
          <item jid="madhatter@wonderland.lit"/>
          <item jid="whiterabbit@wonderland.lit"/>
        </query>
      </iq>
   C: <message from="queen@wonderland.lit" to="madhatter@wonderland.lit">
        <body>Off with his head!</body>
      </message>
   S: <message from="king@wonderland.lit" to="party@conference.wonderland.lit">
        <body>You are all pardoned.</body>
      </message>
   C: <presence type="unavailable"/>
   C: </stream:stream>
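Since every stanza is a small, complete XML element, a client can process it with any XML parser. The sketch below reads the roster result from the dialog above using Python's standard `xml.etree.ElementTree` module. The helper name `roster_jids` is hypothetical, and the stanza is parsed as a standalone document without the enclosing stream and its default namespace, which a real client would have to take into account.

```python
import xml.etree.ElementTree as ET

# The server's roster result from the dialog above, as a standalone document.
ROSTER_RESULT = """
<iq type="result">
  <query xmlns="jabber:iq:roster">
    <item jid="alice@wonderland.lit"/>
    <item jid="madhatter@wonderland.lit"/>
    <item jid="whiterabbit@wonderland.lit"/>
  </query>
</iq>
"""

# ElementTree writes namespaced tags as "{namespace}tag".
ROSTER_NS = "{jabber:iq:roster}"


def roster_jids(stanza_xml):
    """Return the contact addresses listed in a roster result stanza."""
    iq = ET.fromstring(stanza_xml)
    if iq.tag != "iq" or iq.get("type") != "result":
        raise ValueError("expected an <iq type='result'> stanza")
    query = iq.find(ROSTER_NS + "query")
    if query is None:
        return []
    return [item.get("jid") for item in query.findall(ROSTER_NS + "item")]


print(roster_jids(ROSTER_RESULT))
```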

The difference between an XML streaming protocol like XMPP and an SMTP- and HTTP-inspired protocol like SIP is that XMPP sets up a long-lived TCP connection, which is better suited for near-real-time IM communication. This is in contrast to SIP, which handles every information exchange as a separate request and response.

3.2.6 Security
One of the requirements stated by the IETF in order to ratify XMPP as an RFC was that it must have security built into its design. As a result, TLS and SASL have been incorporated into the specifications of the core XMPP RFC. Communication from server to client and from server to server can therefore be secured by TLS, and credentials can be checked using SASL. This does not make XMPP secure from end to end, because messages are unencrypted as they pass through the server. Work is in progress to make XMPP end-to-end secure; however, this would make the messages unreadable at the server itself. End-to-end security can have its drawbacks because it obscures the stream, making it hard to control and audit. Another security feature is the option to use CAPTCHA. CAPTCHA can be used to mitigate SPIM (spam over IM networks). When an XMPP account requests the addition of another XMPP account to the domain, the server can send a CAPTCHA as a data form to verify that the user is a real person.

3.2.7 XMPP Jingle
XMPP is a text-based protocol, so it is well suited to sending and receiving text messages. It is not well suited to sending binary data like a file or voice communication: such data must first be converted to Base64, which makes the process inefficient. Another issue is that XMPP uses a client-server model, so the data is sent indirectly via a path through the server. This is also a reason why sending a large amount of data, or data that requires QoS, is more efficient using a different protocol. So there was a need for an extension to solve these problems.
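The cost of the Base64 conversion is easy to quantify: every 3 bytes of binary data become 4 text characters, an increase of about a third before any XML wrapping is added. The sketch below measures this with Python's standard `base64` module. The 160-byte frame is an assumed example, corresponding to 20 ms of 8 kHz audio at one byte per sample.

```python
import base64


def base64_overhead(payload):
    """Return the ratio of encoded size to raw size for a binary payload."""
    return len(base64.b64encode(payload)) / len(payload)


# Assumed example: 160 bytes, i.e. 20 ms of 8 kHz audio at 1 byte per sample.
frame = bytes(160)
encoded = base64.b64encode(frame)

print(len(frame), "raw bytes ->", len(encoded), "Base64 characters")
print("size ratio:", base64_overhead(frame))
```

With one such frame every 20 ms, the overhead recurs fifty times per second, on top of the detour through the server.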
When Google launched Google Talk in 2005 with voice support over XMPP, the XMPP community became serious about developing a voice extension to XMPP. With the help of Google engineers, Jingle was born. Jingle is a protocol extension that provides clients with a way to set up a data channel, either as a direct peer-to-peer connection or mediated through a proxy server. Jingle was designed to interwork with SIP, which shows in its strong resemblance to SIP, but Jingle uses the XMPP framework. The extensions describing Jingle are XEP-0166 (Jingle) and XEP-0167 (Jingle RTP Sessions).

An offer is two-sided:
Application type: States the type of data that is going to be exchanged and the protocol used, e.g. voice data over the Real-time Transport Protocol (RTP).
Transport method: States how the data is to be sent and which IP address it is going to, e.g. UDP on port 4043.

Each Jingle stanza has an action type, quite similar to the different action types of SIP:
session-initiate
session-accept
session-terminate
session-info: Used to give additional information during the session.

There are some additional action types that can be sent through Jingle:
content-add: Can be used to add another content type, like video or voice, to the stream.
content-remove: The opposite of content-add.
content-modify: Changes the direction of the media exchange, e.g. to sender-only or receiver-only.
description-info: Additional information, e.g. suggested height and width.
transport-replace: Suggests a change in transport method, e.g. IP address or port. This can be accepted or rejected by the other party.

When an offer is made by the initiator, it starts a process that generates a large amount of XMPP traffic, sent back and forth to negotiate the details of the offer such as codecs, IP addresses and port numbers. The process, as copied from the source, is as follows:
1. The initiator sends an offer to the responder.
2. The offer consists of one or more application types (voice, video, file transfer, screen sharing etc.) and one or more transport methods (UDP, ICE, TCP, etc.).
3. The parties negotiate further parameters related to the application type(s) and work to set up the transport(s).
4. The responder either accepts or declines the offer.
5. If the offer is accepted, the parties exchange data related to the application type(s) using the negotiated transport method(s).
6. If needed, the parties can modify certain parameters during the life of the session (e.g., by adding video to a voice chat or switching to a better transport candidate).
Eventually, the session ends and the parties break contact.

This is what a Jingle negotiation looks like. First there is a session-initiate:
   <iq from="alice@wonderland.lit/rabbithole"
       id="v73hwcx9"
       to="sister@realworld.lit/home"
       type="set">
     <jingle xmlns="urn:xmpp:jingle:1"
             action="session-initiate"
             initiator="alice@wonderland.lit/rabbithole"
             sid="a73sjjvkla37jfea">
       <content creator="initiator" name="voice">
         <description xmlns="urn:xmpp:jingle:apps:rtp:1" media="audio">
           <payload-type id="96" name="speex" clockrate="16000"/>
           <payload-type id="97" name="speex" clockrate="8000"/>
           <payload-type id="0" name="PCMU"/>
           <payload-type id="8" name="PCMA"/>
         </description>
         <transport xmlns="urn:xmpp:jingle:transports:raw-udp:1">
           <candidate component="1" generation="0" id="a9j3mnbtu1" ip="10.1.1.104" port="13540"/>
         </transport>
       </content>
     </jingle>
   </iq>

The offered payloads are copied from the profile offering of SDP. This makes XMPP Jingle compatible with SIP/SDP. Jingle is mostly used for voice or audio, but it can also be used for setting up other streams, like gaming or app sharing. If everything goes well, the responder answers with a session-accept stanza:
   <iq from="sister@realworld.lit/home"
       id="b18dh29f"
       to="alice@wonderland.lit/rabbithole"
       type="set">
     <jingle xmlns="urn:xmpp:jingle:1"
             action="session-accept"
             initiator="alice@wonderland.lit/rabbithole"
             responder="sister@realworld.lit/home"
             sid="a73sjjvkla37jfea">
       <content creator="initiator" name="just-an-example">
         <description xmlns="urn:xmpp:jingle:apps:stub:0"/>
         <transport xmlns="urn:xmpp:jingle:transports:stub:0"/>
       </content>
     </jingle>
   </iq>
It could be that error conditions occur or that further negotiations are required. These conditions are handled as described in the XMPP Jingle specification, XEP-0166.
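To show what a responder has to read from a session-initiate before it can answer, the sketch below extracts the action, session ID, offered codecs and transport address from a stanza like the one in the example. It uses Python's standard `xml.etree.ElementTree` module with the lowercase namespaces of the Jingle specifications. The function name `summarize_offer` and the returned dictionary layout are illustrative, and the sketch assumes a single content element with a raw UDP transport.

```python
import xml.etree.ElementTree as ET

SESSION_INITIATE = """
<iq from="alice@wonderland.lit/rabbithole" id="v73hwcx9"
    to="sister@realworld.lit/home" type="set">
  <jingle xmlns="urn:xmpp:jingle:1" action="session-initiate"
          initiator="alice@wonderland.lit/rabbithole" sid="a73sjjvkla37jfea">
    <content creator="initiator" name="voice">
      <description xmlns="urn:xmpp:jingle:apps:rtp:1" media="audio">
        <payload-type id="96" name="speex" clockrate="16000"/>
        <payload-type id="97" name="speex" clockrate="8000"/>
        <payload-type id="0" name="PCMU"/>
        <payload-type id="8" name="PCMA"/>
      </description>
      <transport xmlns="urn:xmpp:jingle:transports:raw-udp:1">
        <candidate component="1" generation="0" id="a9j3mnbtu1"
                   ip="10.1.1.104" port="13540"/>
      </transport>
    </content>
  </jingle>
</iq>
"""

# Prefixes are local to this sketch; only the namespace URIs matter.
NS = {
    "j": "urn:xmpp:jingle:1",
    "rtp": "urn:xmpp:jingle:apps:rtp:1",
    "udp": "urn:xmpp:jingle:transports:raw-udp:1",
}


def summarize_offer(stanza_xml):
    """Collect the parts of a session-initiate that the responder must evaluate."""
    iq = ET.fromstring(stanza_xml)
    jingle = iq.find("j:jingle", NS)
    payloads = jingle.findall("j:content/rtp:description/rtp:payload-type", NS)
    candidate = jingle.find("j:content/udp:transport/udp:candidate", NS)
    return {
        "action": jingle.get("action"),
        "sid": jingle.get("sid"),
        "codecs": [(p.get("id"), p.get("name")) for p in payloads],
        "address": (candidate.get("ip"), int(candidate.get("port"))),
    }


print(summarize_offer(SESSION_INITIATE))
```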

Bypassing NAT
The problem with setting up a real-time media stream is the inherent difficulty of bypassing Network Address Translation (NAT). NAT hides the IP address of the receiving party, which makes it hard for the sender to route the stream to it. XMPP Jingle supports all the traditional ways of bypassing NAT, e.g. TURN and STUN. The negotiation of NAT bypassing is done by ICE. Recently an extension called Jingle Relay Nodes (XEP-0278) has been added to XMPP. It is still in the experimental phase but is already the most widely supported NAT bypass technique in XMPP software. It works by relaying the real-time stream. When a client notices that it is located behind a NAT device, it performs a service discovery to find any clients, or servers for that matter, that support Jingle relay nodes and have direct access to the Internet without NAT. When an appropriate relay node is found, a request for a Jingle relay channel is made. The Jingle relay node responds (if all goes well) with a channel accept and parameters like the maximum kbps, public IP address and port. The client includes the IP address and port number in the Jingle negotiation process. The sending client does not have to support Jingle relay nodes. It only has to transmit the stream to the IP address and port given to it; the relay node forwards the stream to the receiving party.

3.2.8 Advantages of XMPP
XMPP is an easy-to-understand protocol. At its core is a set of principles that are formalized in a very straightforward, standard and explicit way. Due to the strictness of the protocol, the risk of it degenerating into separate dialects is small. One of the principles of XMPP is that it should be easy to add extensions to the protocol, which makes it easy to add functionality to any existing XMPP infrastructure. Another distinct advantage is the way it uses URIs to identify resources. A user can have several clients logged in at the same time.
The URIs would look something like this: user@domain.nl/mobile and user@domain.nl/laptop. Every client receives an 8-bit priority number between -128 and 127. This enables the user to select as default the client on which he prefers to be addressed. When needed, the conversation can be moved to a different client and changed to video mode, e.g. a chat with user@domain.nl/laptop could be moved to a video conversation with the client user@domain.nl/video. Security is also incorporated into the design through the use of TLS and SASL.
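The addressing scheme and the priority mechanism can be illustrated together. The Python sketch below splits an address into name, domain and resource, and then picks the connected client with the highest priority. The function names and priority values are hypothetical. One assumption is made explicit in the code: clients with a negative priority are skipped, following the rule in the core XMPP specification that such resources never receive messages addressed to the user in general.

```python
def parse_jid(jid):
    """Split 'name@domain/resource' into (name, domain, resource)."""
    bare, _, resource = jid.partition("/")
    name, _, domain = bare.rpartition("@")
    return (name or None, domain, resource or None)


def preferred_resource(priorities):
    """Return the full address of the client with the highest priority.

    `priorities` maps full addresses to priority values (-128..127).
    Assumption: negative priorities are never selected as the default.
    """
    for jid, priority in priorities.items():
        if not -128 <= priority <= 127:
            raise ValueError(f"priority out of range for {jid}: {priority}")
    eligible = {jid: p for jid, p in priorities.items() if p >= 0}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


connected = {
    "user@domain.nl/mobile": 5,
    "user@domain.nl/laptop": 10,
    "user@domain.nl/video": -1,
}

print(parse_jid("user@domain.nl/mobile"))
print(preferred_resource(connected))
```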



One big difference with SIP is that XMPP is connection-oriented. A user needs to log in to an XMPP domain first. SIP is connectionless; it uses the server only to set up the connection. Most current SIP servers have advanced features that allow for more control, but the SIP architecture does not provide this functionality natively. The IETF tried to extend the SIP functionality by developing SIMPLE, but up to now SIMPLE has not been standardized and it is considered to be too complex. The fact remains that connection-oriented traffic is easier to control and manage. For example, XMPP needs no firewall bypassing technique because it uses a constant TCP connection. However, a constant TCP connection could be problematic when it comes to resource exhaustion.

3.3 Technology to guarantee the smooth delivery of real-time data


As previously mentioned in this document, setting up a voice, video or any other real-time stream is a two-sided process: there is a negotiation stream for setting up the real-time stream, and there is the real-time stream itself. The negotiation stream can be considered trivial and can be either SIP with SDP or XMPP with Jingle. The difficult part is setting up and handling the real-time stream itself. Real-time media like voice are easily affected when unfavorable conditions such as jitter or latency occur on a connection. This is due to the very nature of the Internet, which is packet switched instead of circuit switched. Data is broken up into smaller parts, placed into packets, transmitted over the next connection and rebuilt at the receiving end. On their way, the packets pass through many routers and switches, which do nothing but retransmit each packet to the next hop. At any point in the path, when the transmission line to the next station becomes congested, the forwarding station buffers incoming packets in a queue and handles them according to the FIFO (first in, first out) principle. When the buffers eventually become full, packets are discarded and the quality of the data signal deteriorates as a result. However, there are techniques available to mitigate these problems.

3.3.1 Overprovisioning
The simplest way of guaranteeing smooth delivery of video or voice data is overprovisioning. This technique makes more network bandwidth available for transmission than can be consumed at peak times. The extra capacity prevents transmission delays and guarantees on-time delivery of data. However, overprovisioning is an expensive way of preventing delays.

3.3.2 QoS
Another way of guaranteeing acceptable delivery times for real-time media is applying Quality of Service to the data. Quality of Service is a generic term that covers a wide spectrum of techniques used to guarantee the timely delivery of real-time media.
The most common strategy is to tag each packet with a code that corresponds to the priority given to the data. This is done at the network layer through DSCP, or at the data link layer with the protocols 802.1q and 802.1p. When such tagged packets arrive at a network device, the device gives packets with a higher priority tag precedence over ordinary traffic when transmitting. The network device places the packets in separate transmission queues. Simple traffic scheduling mechanisms service the queues according to the round-robin principle. There are more complex mechanisms that allow for a better handling of priority packets.
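The effect of tagging can be illustrated with a small scheduler that always transmits the highest-tagged packet first. This is a sketch of strict priority scheduling, one of the mechanisms beyond plain round-robin, and not a model of any particular device. The class and packet names are hypothetical, and treating a numerically higher DSCP value as a higher priority is a simplification: in practice the mapping from DSCP values to queues is set by policy. The helper `tos_byte` relies on the fact that the DSCP field occupies the upper six bits of the IP header byte formerly known as Type of Service.

```python
import heapq
from itertools import count

DSCP_EF = 46           # Expedited Forwarding, commonly used for voice
DSCP_BEST_EFFORT = 0   # default, untagged traffic


def tos_byte(dscp):
    """Place the 6-bit DSCP value in the upper bits of the 8-bit header field."""
    return dscp << 2


class StrictPriorityScheduler:
    """Always transmit the packet with the highest tag; FIFO within one tag."""

    def __init__(self):
        self._heap = []
        self._arrival = count()

    def enqueue(self, packet, dscp):
        # heapq is a min-heap, so the tag is negated to serve high values first.
        # The arrival counter keeps packets with equal tags in arrival order.
        heapq.heappush(self._heap, (-dscp, next(self._arrival), packet))

    def dequeue(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)


scheduler = StrictPriorityScheduler()
scheduler.enqueue("web-1", DSCP_BEST_EFFORT)
scheduler.enqueue("voice-1", DSCP_EF)
scheduler.enqueue("web-2", DSCP_BEST_EFFORT)
scheduler.enqueue("voice-2", DSCP_EF)

order = [scheduler.dequeue() for _ in range(len(scheduler))]
print(order)
```

The voice packets leave before the web packets even though they arrived later. The price of strict priority is that ordinary traffic can be starved while high-priority traffic keeps arriving, which is one reason why more elaborate schedulers exist.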


QoS in Wi-Fi
Wi-Fi has become an important part of the network architecture within companies, but transmitting real-time media over Wi-Fi has proven to be very problematic. This is because Wi-Fi is a shared medium: devices connected to an Access Point (AP) have to share the entire available bandwidth. Wi-Fi can be compared to an assembly line. There is only a limited number of packets that the Wi-Fi assembly line can process each second. The 802.11e standard, or WMM, can provide QoS over Wi-Fi, but understanding 802.11e requires an understanding of how Wi-Fi works. 802.11 has two modes of operation: the Point Coordination Function (PCF) and the Distributed Coordination Function (DCF). In PCF, a beacon is broadcast by the AP every 0.1 seconds. Between two beacons there is a Contention-Free Period (CFP) and a Contention Period (CP). The CFP comes first; in this period the AP sends contention-free poll frames to each station to allocate the station a window in which to send its data. In the CP, DCF is used, which is a free-for-all mode for sending data to the AP; needless to say, this decreases reliability. In DCF, each transmission is followed by a fixed interframe space (IFS), after which stations are allowed to transmit data to the AP. To prevent stations from transmitting at the same time, a random backoff timer is used. The backoff time is a random integer, drawn from the range between zero and the contention window (CW), multiplied by the slot time. In a conventional 802.11 network where 802.11e is not applied, there are no mechanisms in place to favor high-priority real-time data. 802.11e defines two mechanisms which replace the old PCF and DCF functions. Enhanced Distributed Channel Access (EDCA) replaces DCF. EDCA uses a shorter arbitration interframe space (AIFS) and a lower CW value for real-time data. As a result, video and voice data are favored, and they are given a longer time period (TXOP) in which to transmit. The 802.11e alternative for PCF is HCF Controlled Channel Access (HCCA). With HCCA, a CFP can be initiated whenever there is a need for it, in contrast to PCF where the CFP and CP are fixed. HCCA allows fine-grained QoS due to its ability to apply QoS to sessions rather than to each station. HCCA works by allocating larger and more frequent transmission windows to real-time streams.
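To make the effect of a shorter interframe space and a smaller contention window concrete, the following sketch computes the waiting time before a transmission attempt per traffic category. The timing constants (9 µs slot time, 16 µs short interframe space) and the per-category parameters are commonly cited defaults for an OFDM-based 802.11 network and are assumptions here, as are all function names.

```python
import random

SLOT_TIME_US = 9   # assumed slot time in microseconds
SIFS_US = 16       # assumed short interframe space in microseconds

# Assumed default parameters per category: (AIFSN, CWmin, CWmax)
EDCA_PARAMS = {
    "voice": (2, 3, 7),
    "video": (2, 7, 15),
    "best_effort": (3, 15, 1023),
    "background": (7, 15, 1023),
}


def aifs_us(category):
    """Arbitration interframe space: SIFS plus AIFSN slot times."""
    aifsn, _, _ = EDCA_PARAMS[category]
    return SIFS_US + aifsn * SLOT_TIME_US


def backoff_us(category, retries=0, rng=random):
    """Random backoff: an integer in [0, CW] multiplied by the slot time.

    CW starts at CWmin and roughly doubles per retry, capped at CWmax.
    """
    _, cw_min, cw_max = EDCA_PARAMS[category]
    cw = min((cw_min + 1) * 2 ** retries - 1, cw_max)
    return rng.randint(0, cw) * SLOT_TIME_US


def max_first_attempt_wait_us(category):
    """Worst-case idle time a station waits before its first attempt."""
    _, cw_min, _ = EDCA_PARAMS[category]
    return aifs_us(category) + cw_min * SLOT_TIME_US
```

Under these assumed values a voice frame waits at most 61 µs of idle medium before its first attempt, while best-effort traffic may wait up to 178 µs, which is why voice statistically wins access to the medium more often.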

3.3.1 Codecs
Codecs (COder-DECoder) are algorithms used to encode analogue audio and video signals into a digital form for transmission. There are many codecs for audio and video streams, varying in complexity and serving different purposes. The increase in network bandwidth and computing power in the last few years has made it possible to employ advanced codecs which deliver higher-quality video and audio. Until recently, there was no provision in place to make codecs more robust and resilient against unfavorable network conditions. A newer extension of H.264, known as H.264 Annex G or H.264 SVC and standardized by ITU-T, is specially designed to produce an acceptable video quality over slow and error-prone links. This is achieved by splitting the signal into several layers.

Figure 4 h.264 SVC from nojitter.com
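The layering shown in Figure 4 can be sketched as a small model. It assumes that each enhancement layer depends on all layers below it, so a receiver can decode only up to the first incomplete layer, and that a conferencing server can forward just the layers a device is able to render. The layer names, resolutions and frame rates are hypothetical and are not taken from the H.264 SVC specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    name: str
    height: int            # vertical resolution in pixels
    fps: int               # frame rate delivered up to this layer
    complete: bool = True  # False when packets of this layer were lost


def decodable_layers(layers):
    """Return the layers that can be decoded, base layer first.

    Decoding stops at the first incomplete layer because every layer
    builds on the ones below it.
    """
    usable = []
    for layer in layers:
        if not layer.complete:
            break
        usable.append(layer)
    return usable


def layers_for_device(layers, max_height):
    """Select only the layers a receiving device is able to render."""
    return [layer for layer in layers if layer.height <= max_height]
```

With a 240p base layer and 480p and 720p enhancement layers, losing data of the 720p layer leaves a decodable 480p stream, and a device limited to 480p is simply never sent the top layer.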

The base layer of the stream delivers a low-resolution, low-frame-rate video. The layers on top of this base layer supplement it by enhancing the quality of the video that the base layer provides. When the data of one of the enhancement layers is incomplete, the resolution and frame rate of the video drop for a very short period to those of the next lower layer. The base layer has modest bandwidth requirements compared to the upper layers, so it is less likely that reconstruction of the base layer will fail due to incomplete data. Another advantage of SVC is seen with multiparty video conferencing. With this type of video conferencing, multiple streams are sent to several devices that might differ in their ability to render video, for example mobile phones that can only handle 480p. Instead of recoding the original video for that one device type, only the layers that the device supports have to be forwarded.
Another way of making the stream more resilient is applying Forward Error Correction (FEC). When parts of the data in packets are damaged along the way, the original information can be recreated from the added FEC information, provided there is enough data to reconstruct the missing parts. FEC has the disadvantage of adding overhead to the data stream, but when it is used in combination with SVC and applied only to the base layer, overhead is kept to a minimum while still providing a video stream that is resilient enough to withstand unfavorable network conditions.

3.3.2 Call admission control
Call Admission Control is a last resort for guaranteeing acceptable quality in the data stream. With Call Admission Control active, capacity in the network is closely monitored and guarded. When there is a danger of running out of bandwidth, Call Admission Control steps in and prohibits the setup of another (video) call.
Call Admission Control is not incorporated in the XMPP protocol because it is regarded as a function to be dealt with outside the scope of XMPP. Currently there is no XMPP server software which supports Call Admission Control, due to the fact that the primary function of these XMPP servers is to provide IM communication, and Jingle support is considered less important.
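The admission decision itself is simple bookkeeping. The sketch below assumes a single shared link with a fixed capacity and a configurable safety reserve; the class name, the units and the idea of a reserve are illustrative assumptions and do not describe any existing XMPP server feature.

```python
class CallAdmissionControl:
    """Hypothetical bandwidth bookkeeping for one shared link."""

    def __init__(self, capacity_kbps, reserve_kbps=0):
        self.capacity_kbps = capacity_kbps
        self.reserve_kbps = reserve_kbps  # headroom kept free for other traffic
        self.calls = {}                   # call id -> bandwidth in kbit/s

    @property
    def used_kbps(self):
        return sum(self.calls.values())

    def admit(self, call_id, required_kbps):
        """Admit the call only if it fits in the remaining budget."""
        budget = self.capacity_kbps - self.reserve_kbps
        if self.used_kbps + required_kbps > budget:
            return False  # refuse the setup of another call
        self.calls[call_id] = required_kbps
        return True

    def release(self, call_id):
        """Free the bandwidth of a call that has ended."""
        self.calls.pop(call_id, None)
```

In a Jingle deployment such a check would have to sit in a component that sees the session negotiation, for example a Jingle-aware server or a relay node, which is an assumption about placement rather than an existing capability.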


4. Examples of possible XMPP use in practice
In the long term it is likely that XMPP/Jingle will become more dominant at the expense of SIP/SIMPLE, with SIP used only in limited cases. To support this claim, this chapter works out an example of what XMPP use in an enterprise UC environment would look like. The necessary functionality is described with references to the relevant XMPP extensions. Where XMPP lacks needed functionality, this is also described and a possible solution is offered. First, a general UC network infrastructure is designed. Second, three cases are described to give a better impression of how the infrastructure can be used. The first case demonstrates the setup of a multimedia stream with XMPP Jingle, showing the negotiation between two clients in the same domain. The second case is the setup for two clients located in different domains. The third case is an example of how XMPP could be used in retail banking to provide customers with new ways to contact their bank.

4.1 The UC infrastructure.


At the heart of Unified Communications is the UC manager, which in this case is an XMPP server. It keeps track of clients, maintains a roster with presence status and communicates with other domains based on s2s (server-to-server) XMPP. Clients can initiate real-time media streams on their own by using Jingle for setup. The server does not have to be Jingle-aware; it only relays the Jingle negotiation instead of actively taking part in it. Ideally, however, the XMPP server is Jingle-aware so that more control and monitoring can be put into place. A new experimental extension, Jingle Relay Nodes, which is meant for NAT traversal, can be used as a basis for a multimedia gateway. The RTP stream then passes through the gateway and can be recoded if needed. The examples and illustrations that follow show a logical XMPP architecture rather than a physical one. That means multiple devices can be represented by one function, shown by a symbol in the form of a colored dot.

Intradomain Jingle call
In a simple case, User 1 wants to start a video conference with User 2. Both are within the corporate network, either on site or through a VPN. The User 1 client sends a Jingle request to the server, and the server forwards it to the User 2 client. The RTP stream is set up directly between the clients. If a user is logged in with multiple resources, he can choose which resource he prefers, for example a video screen or a mobile phone.
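As an illustration of the Jingle request that the User 1 client sends, the sketch below builds a session-initiate stanza and a matching session-terminate stanza with Python's standard XML library. The namespaces are those of XEP-0166, XEP-0167 and the ICE-UDP transport; the addresses, session identifier and payload types are illustrative assumptions, and the transport element is left empty although a real client would add its ICE credentials and candidates.

```python
import xml.etree.ElementTree as ET

NS_JINGLE = "urn:xmpp:jingle:1"
NS_RTP = "urn:xmpp:jingle:apps:rtp:1"
NS_ICE = "urn:xmpp:jingle:transports:ice-udp:1"

# Illustrative offer: (media, [(payload id, codec name, clock rate)])
OFFER = (
    ("audio", [("96", "speex", "16000")]),
    ("video", [("98", "theora", "90000")]),
)


def session_initiate(initiator, responder, sid, stanza_id="jingle1"):
    """Build the IQ stanza that offers an audio and a video stream."""
    iq = ET.Element("iq", {"from": initiator, "to": responder,
                           "type": "set", "id": stanza_id})
    jingle = ET.SubElement(iq, f"{{{NS_JINGLE}}}jingle",
                           {"action": "session-initiate",
                            "initiator": initiator, "sid": sid})
    for media, payloads in OFFER:
        content = ET.SubElement(jingle, f"{{{NS_JINGLE}}}content",
                                {"creator": "initiator", "name": media})
        description = ET.SubElement(content, f"{{{NS_RTP}}}description",
                                    {"media": media})
        for payload_id, name, clockrate in payloads:
            ET.SubElement(description, f"{{{NS_RTP}}}payload-type",
                          {"id": payload_id, "name": name,
                           "clockrate": clockrate})
        # Empty placeholder; candidates are omitted in this sketch.
        ET.SubElement(content, f"{{{NS_ICE}}}transport")
    return iq


def session_terminate(sender, receiver, sid, stanza_id="jingle3"):
    """Build the IQ stanza that ends the session normally."""
    iq = ET.Element("iq", {"from": sender, "to": receiver,
                           "type": "set", "id": stanza_id})
    jingle = ET.SubElement(iq, f"{{{NS_JINGLE}}}jingle",
                           {"action": "session-terminate", "sid": sid})
    reason = ET.SubElement(jingle, f"{{{NS_JINGLE}}}reason")
    ET.SubElement(reason, f"{{{NS_JINGLE}}}success")
    return iq
```

Calling `ET.tostring(session_initiate("user1@example.com/laptop", "user2@example.com/videoscreen", "a73sjjvkla37jfea"), encoding="unicode")` yields the XML that the server would forward unchanged to the addressed resource of User 2.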

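[Diagram: the domain Example.com with its XMPP server and the relay node Jinglerelay.Example.com; the clients user1@Example.com\laptop, user2@Example.com\mobile and user2@Example.com\videoscreen; signalling steps 1, 2 and 3; the media stream between the clients.]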

Figure 5: intradomain Jingle

1: XMPP Jingle request with offer
2: XMPP Jingle accept
3: XMPP termination
Extensions used: XEP-0166 Jingle, XEP-0167 Jingle RTP Sessions

Interdomain Jingle call
When setting up an interdomain video call, NATs on both sides have to be traversed. Different techniques can be applied to overcome this. An interesting one is the Jingle relay node. A Jingle relay node is a Jingle client which is connected to the Internet with a public IP address and which has direct access to the Jingle client it is serving, with no NAT in the path between them. When the client of User 1 wants to set up a Jingle call with User 3, who is in a different domain, and notices that it is behind a NAT, it initiates a service discovery to search for an XMPP Jingle relay node. When an XMPP Jingle relay node is found, the client asks for an available channel to carry an RTP stream. If there is enough capacity available, a channel is provided. The RTP stream of the external client is then redirected through the Jingle relay node.
For multiparty video or audio conferencing there is currently no agreed standard. A company named Peoplelink has based its video MCU product on XMPP. At this moment, it is not clear whether the product uses XMPP with an open standard or a proprietary solution on top of XMPP. It should be relatively easy to add a multiparty video conferencing standard to XMPP. The possibility to redirect RTP to a different destination is included in Jingle, and the Jingle relay node specification could be used to provide such functionality. At this moment the author of XMPP Jingle Relay, Thiago Camargo, is working on a standard for MCUs which will use the fundamentals of XMPP Jingle Relay.
[Diagram: the domains test.com and Example.com; the clients user3@test.com\videoscreen and user1@Example.com\laptop; signalling steps 1, 2 and 3; the media stream between the two domains.]

Figure 6: Interdomain Jingle

The process is quite similar to the first example of intradomain communication, with a few extra steps involved:
1: search for an XMPP Jingle relay node that is accessible
2: request an XMPP Jingle relay channel from an available XMPP Jingle relay node
3: send the XMPP Jingle relay channel details
Extensions used: XEP-0166 Jingle, XEP-0167 Jingle RTP Sessions, XEP-0278 Jingle Relay Nodes
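The three extra steps can be modeled with a small sketch of the relay side. It is a simulation of the bookkeeping only and does not reproduce the XEP-0278 wire format; the class names, the port numbering scheme and the addresses (taken from the documentation address range) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RelayChannel:
    host: str     # public address of the relay node
    port_a: int   # port the internal client sends its RTP to
    port_b: int   # port the external client sends its RTP to


@dataclass
class RelayNode:
    jid: str
    public_host: str
    max_channels: int
    first_port: int = 40000
    channels: list = field(default_factory=list)

    def request_channel(self) -> Optional[RelayChannel]:
        """Step 2: hand out a channel if there is capacity left."""
        if len(self.channels) >= self.max_channels:
            return None
        base = self.first_port + 2 * len(self.channels)
        channel = RelayChannel(self.public_host, base, base + 1)
        self.channels.append(channel)
        return channel


def find_relay(discovered_nodes):
    """Step 1: pick the first discovered relay node with free capacity."""
    for node in discovered_nodes:
        if len(node.channels) < node.max_channels:
            return node
    return None
```

Step 3 then consists of passing the host and the two ports of the returned channel to the remote party as the transport details of the Jingle session, so that both RTP streams meet at the relay node.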

Specific banking application


The design of XMPP makes it well suited as a basis for custom applications. This example shows how XMPP can be used to develop an application for the retail banking sector. In this case an XMPP application is specified with the goal of enabling secure conversations with customers. It starts with a customer logging in to his personal banking page, which contains an integrated web client. If the customer would like to receive customer support, he simply has to start typing and this action activates the web client. The web client is based on XMPP and logs in with a user name unique to the customer, on a domain which is dedicated to customers. Tier-one customer support answers the call and tries to fulfill the service request. If necessary, an extra customer service employee who is specialized in a particular subject can be added to the conversation. This can even be a regular employee from a local branch if the communication infrastructure of the organization and the contact centre are linked. When the service request is complete, the customer service employee or the customer ends the conversation. If any requests made by the customer need to be confirmed, the customer service employee can send a data form in which all requests made by the customer are listed. The customer signs the agreement by using his security token. All conversations are secured and can be logged if needed to remain compliant with banking regulations. If the infrastructure allows it, video and voice could be added to the solution, so that an instant video conversation can be set up whenever the customer wants one. Another possibility is to allow the customer to add customer service to his IM contact list. In this way the customer can contact the bank directly and the threshold for reaching the bank's services is lowered. Authentication of the user can still be done based on current XMPP specifications.
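The confirmation step could be carried by an XEP-0004 data form. The sketch below builds such a form with Python's standard XML library; the `jabber:x:data` namespace and the field types come from XEP-0004, while the field names, labels and the idea of entering a token code in a private text field are illustrative assumptions about how a bank might model the confirmation.

```python
import xml.etree.ElementTree as ET

NS_DATA = "jabber:x:data"


def confirmation_form(requests):
    """Build a data form that lists the customer's requests for confirmation."""
    form = ET.Element(f"{{{NS_DATA}}}x", {"type": "form"})
    ET.SubElement(form, f"{{{NS_DATA}}}title").text = "Confirm your requests"
    ET.SubElement(form, f"{{{NS_DATA}}}instructions").text = (
        "Tick each request you agree with and enter your token code.")
    for index, text in enumerate(requests, start=1):
        # One checkbox per request, unticked by default.
        entry = ET.SubElement(form, f"{{{NS_DATA}}}field",
                              {"type": "boolean", "var": f"request-{index}",
                               "label": text})
        ET.SubElement(entry, f"{{{NS_DATA}}}value").text = "0"
    # Hypothetical field for the code generated by the security token.
    token = ET.SubElement(form, f"{{{NS_DATA}}}field",
                          {"type": "text-private", "var": "token-code",
                           "label": "Security token code"})
    ET.SubElement(token, f"{{{NS_DATA}}}required")
    return form
```

The customer's client would return the same fields in a form of type `submit`, after which the bank can verify the token code and archive the signed list of requests.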


[Diagram: the customer and the web server, connected as customerID@customercenter.com/applicationserver to the domain Customercenter.com; the domain Example.com with the relay node Jinglerelay.Example.com and the client user1@Example.com\laptop.]

Figure 7: Example XMPP contactcenter

One of the choices made in this design is to separate the XMPP domains: one for the web site, which holds the user accounts of the customers, and one for the company. The reason for this separation is that it increases the manageability of the solution. This way multiple sites can be managed by one company and vice versa.
Extensions used: XEP-0004 Data Forms, XEP-0166 Jingle, XEP-0167 Jingle RTP Sessions, XEP-0278 Jingle Relay Nodes


5. Final recommendations
Capgemini is often involved in projects regarding the implementation of UC systems in customers' infrastructure. In past years, the ability to communicate in an ad-hoc fashion between different companies using different UC systems was often not considered a viable function when designing UC systems. This is due to the lack of a common language that could make this feasible. But looking at the current adoption rate of XMPP by companies as a base for their Internet services, it is highly likely that XMPP, in combination with Jingle, will serve as the lingua franca for UC systems in the future. Capgemini could benefit from this development by incorporating XMPP in its approach towards UC systems. But XMPP is not only well suited as a standard for the UC industry. Other systems previously mentioned in this thesis could be designed with XMPP as the foundation for communication. A concrete example of a service that could be enhanced with XMPP is Capgemini's Immediate cloud service, which offers companies a social media platform to communicate with customers. The banking sector could benefit from XMPP by using it to open up new ways of communicating directly with the customer. Because it is flexible and an open standard, XMPP allows setting up a communication architecture that is ubiquitous, so that different channels can be used to communicate with the customer from all points in the organization.
In its research report (Take Four Steps to Prepare for XMPP Becoming an Universal Standard, 2010), Gartner gives four recommendations with regard to XMPP becoming a universal standard. Most of them are fairly self-evident. The advice is:
1: Take an inventory of communication and collaboration technologies, including those deployed by the business, those to be implemented in the future, and those used by partners and customers.
2: Consider what the communication and collaboration environment should look like in three to five years: which technologies should work together, and what level of integration they should have.
3: IT leaders should push vendors for real interoperability with other vendors' XMPP-based technologies.
4: Ask vendors what their road map is, and make support for XMPP a key consideration before choosing communication and collaboration products for strategic initiatives.

The advice given by Gartner is sound. In addition to this advice, which is fully supported by the author of this thesis, the capabilities of the XMPP multimedia extension Jingle should be taken into account when developing a communication architecture. The additional advice is to consider XMPP Jingle, when and where possible, as the protocol of choice for interdomain multimedia communication such as voice or video conferencing with partners or customers. Another important part of UC systems is the transmission of real-time traffic. Real-time traffic is very prone to suffer from unfavorable network conditions. Usually adaptations to the network infrastructure are made to mitigate this problem, e.g. implementing QoS. But often key parts of the network, e.g. wireless, are neglected when it comes to implementing QoS. To ensure smooth delivery of real-time traffic, measures should be taken that do not affect only a certain part of the network but improve the transmission of real-time traffic across the whole network. This can be described as a more holistic approach. Some of the measures that can be taken to improve quality are described in chapter three. When developing a UC infrastructure, the whole network should be evaluated and redesigned if needed.


List of Figures
Figure 1: Nimbuzz on Nokia
Figure 2: SIP architecture (Understanding SIP)
Figure 3: SIP call flow (Internet Communications using SIP, 2006)
Figure 4: H.264 SVC, from nojitter.com
Figure 5: Intradomain Jingle
Figure 6: Interdomain Jingle
Figure 7: Example XMPP contact center


Bibliography
10-years-of-SIP-dominance. (n.d.). Retrieved from http://blog.tekelec.com/blog/bid/13136/10-years-of-SIP-dominance
3GPP.org. (n.d.). Retrieved from 3GPP.org
Saint-Andre, P. (2009). XMPP: Lessons Learned from Ten Years of XML Messaging. IEEE Communications Magazine.
Baird, S. (2009). Government Role and the Interoperability Ecosystem. I/S: A Journal of Law and Policy for the Information Society, 222-286.
Bakas, A. (2010). Het Nieuwe werken gaat van AU. TeleCom, jaargang 7, editie 43, 58-60.
Camargo, T. (n.d.). Xmppjingle.blogspot.com - random articles. Retrieved from Xmppjingle.blogspot.com
Camargo, T. (n.d.). XMPP Jingle. (R. v. Wiel, Interviewer)
Dewing, H. (2009). Market Overview: Sizing Unified. Forrester.
Gartner Says Distinctions in Unified Communications and Collaboration Will Disappear by 2013. (n.d.). Retrieved from gartner.com: http://www.gartner.com/it/page.jsp?id=1212613
Hilster, D. (n.d.). Banks & UC. (R. v. Wiel, Interviewer)
Ingo1, H. Session Initiation Protocol (SIP) and other VOIP protocols.
Instant messaging helping Mass. bank build report. (2006). American Banker.
Jay Lassman, B. H. (2010). Predicts 2011: Adoption of Unified Communications Creates New Sourcing and Deployment Challenges. Gartner.
Johnston, H. S. (2006). Internet Communications using SIP. Wiley.
King, B. (2010). Bank 2.0. Marshall Cavendish.
Knotter, R. B. (2010). Het nieuwe werken ontrafeld. Gorcum b.v., Koninklijke Van.
Nojitter.com - random articles. (n.d.). Retrieved from Nojitter.com
Percy, A. D. (n.d.). SIP interoperability. Retrieved from blog.tmcnet.com: http://blog.tmcnet.com/sip-invite/sip/sip-interoperability---why-is-it-so-hard-to-achieve-part-i.html
Saint-Andre, P. (2009). XMPP: The Definitive Guide. O'Reilly.
Schmidt, M. (2009). Financial Advice Online. Universitaire Pers Maastricht.
Schulzrinne, A. R. (2007). Real-world SIP Interoperability: Still an elusive quest. Columbia University.

Smith, D. M. (2010). Take Four Steps to Prepare for XMPP Becoming an Universal Standard. Gartner.
Ubiquity. (n.d.). Understanding SIP. Ubiquity.
Unified Communication Interoperability Forum. (n.d.). Retrieved from UCIF.com
Vogel, K. R. (2009). eCollaboration: On the nature and emergence of communication and collaboration technologies. Institute of Information Management, University of St. Gallen.
XMPP.org. (n.d.). Several extensions. Retrieved from XMPP.org
