Video Conferencing System
with
Multimedia Capabilities
Janet Adams
April 2005
BACHELOR OF ENGINEERING
IN
TELECOMMUNICATIONS ENGINEERING
Supervised by Dr. Derek Molloy
Acknowledgements
I would like to thank Dr. Derek Molloy, who supervised me on this project, for his
enthusiasm and guidance. I would also like to thank Edward Casey, whom I
collaborated with on certain areas of the project, and his supervisor, Dr. Gabriel
Muntean, for his support and advice. My thanks also go to my friends Edward Casey,
Edel Harrington and Hector Climent for listening to me and guiding me through my
initial presentation. I would like to dedicate this project to my parents, who have
supported me throughout all my time in college and especially during this, my final
year.
Declaration
I hereby declare that, except where otherwise indicated, this document is entirely my
own work and has not been submitted in whole or in part to any other university.
Signed: ......................................................................
Date: ......................................
Abstract
This document will describe the development of a video conferencing system with
multimedia capabilities. The concept of multicasting will be explored, as it was used
in the development of the video conferencing feature. Other concepts used in the
development of the system, such as the Java Media Framework, the Real-time Transport
Protocol and a number of encoding schemes, will also be investigated.
The design of the system will explain how each of the features was planned and
developed, and will provide the reader with an understanding of video conferencing,
client-server communications, motion detection and much more. The implementation
section reads like a user manual. On completion of this section, the reader should be
able to make full use of all of the features within the application and should
understand the depth to which each of the features can be used.
When this document has been read, the reader will understand both how the system
was developed and how it can be used, as well as the technical background needed to
understand how the different features work.
Table of Contents
ACKNOWLEDGEMENTS.........................................................................................II
DECLARATION..........................................................................................................II
ABSTRACT ................................................................................................................ III
TABLE OF CONTENTS........................................................................................... IV
TABLE OF FIGURES............................................................................................ VIII
TABLE OF TABLES...................................................................................................X
1   INTRODUCTION ....................................................................................... 1
    1.3.3   Laptop ........................................................................................... 2
2   TECHNICAL BACKGROUND
    2.1.1   Introduction .................................................................................. 3
    2.1.6   Alternatives to JMF .................................................................... 15
    2.1.7   Summary ..................................................................................... 15
    2.2.1   Introduction ................................................................................ 16
    2.2.6   Summary ..................................................................................... 25
    2.3.1   Introduction ................................................................................ 26
    2.3.2   Encoder Principles ..................................................................... 26
    2.3.3   Decoder Principles ..................................................................... 27
    2.3.5   Summary ..................................................................................... 28
    2.4.1   Introduction ................................................................................ 28
    2.4.4   Summary ..................................................................................... 29
    2.6     MULTICASTING ......................................................................... 32
    2.7     SUMMARY ................................................................................... 34
3   DESIGN OF THE SYSTEM
    3.4     CONFERENCING ........................................................................ 42
    3.6.1   Login ........................................................................................... 47
    3.6.3   Call Teardown ............................................................................ 49
    3.6.4   Logout ......................................................................................... 50
    3.7     OTHER FEATURES WITHIN THE APPLICATION ................. 50
4   IMPLEMENTATION OF THE SYSTEM
    4.1     INTRODUCTION ......................................................................... 51
    4.2     LOGGING IN ............................................................................... 51
    4.3     CALLS .......................................................................................... 52
    4.4     MESSAGES .................................................................................. 58
    4.4.3   Videomail Messages ................................................................... 63
    4.5.2   Adaption ..................................................................................... 64
5   TESTING
6   CONCLUSIONS AND FURTHER RESEARCH
7   APPENDIX 1 ............................................................................................ 79
8   APPENDIX 2 ............................................................................................ 89
Table of Figures
FIGURE 2.1 - MEDIA PROCESSING MODEL .......................................................................4
FIGURE 2.2 - SYSTEM PROCESSING MODEL .....................................................................5
FIGURE 2.3 - JMF BASIC SYSTEM MODEL.......................................................................6
FIGURE 2.4 - RTP AND THE OSI MODEL .......................................................................17
FIGURE 2.5 - RTP PACKET HEADER FORMAT ...............................................................20
FIGURE 2.6 - RTCP SENDER REPORT STRUCTURE ........................................................24
FIGURE 2.7 - RTCP RECEIVER REPORT STRUCTURE .....................................................25
FIGURE 2.8 - G.723.1 ENCODER ....................................................................................27
FIGURE 2.9 - G.723.1 DECODER .....................................................................................27
FIGURE 2.10 - H.263 BASELINE ENCODER ....................................................................29
FIGURE 2.11 - MACROBLOCKS WITHIN H.263 ...............................................................31
FIGURE 2.12 - MOTION PREDICTION ..............................................................................31
FIGURE 2.13 - ORIGINAL CONFERENCING PLAN ............................................................33
FIGURE 2.14 - MULTICASTING THROUGH ROUTER ........................................................34
FIGURE 3.1 - CLIENT TO SERVER COMMUNICATION ......................................................35
FIGURE 3.2 - CLIENT TO CLIENT COMMUNICATION ......................................................37
FIGURE 3.3 - SERVER CLASS DIAGRAM .........................................................................38
FIGURE 3.4 - EXAMPLE OF PUSH PULL MESSAGE SETUP ...............................................39
FIGURE 3.5 - CLIENT CLASS DIAGRAM ..........................................................................41
FIGURE 3.6 - ALLOCATING A CONFERENCE POSITION ...................................................43
FIGURE 3.7 - MESSAGE SEQUENCE CHART FOR CONFERENCE CALL .............................44
FIGURE 3.8 - CONFERENCING SETUP .............................................................................45
FIGURE 3.9 - IMAGE OBSERVATION AVERAGES.............................................................46
FIGURE 3.10 - MESSAGE SEQUENCE CHART FOR LOGIN ................................................47
FIGURE 3.11 - MESSAGE SEQUENCE CHART FOR CALL SETUP ......................................48
FIGURE 3.12 - MESSAGE SEQUENCE CHART FOR CALL TEARDOWN ..............................49
FIGURE 3.13 - MESSAGE SEQUENCE CHART FOR LOGOUT.............................................50
FIGURE 4.1 - LOGIN SCREEN .........................................................................................52
FIGURE 4.2 - HOME SCREEN ..........................................................................................53
FIGURE 4.3 - MAKING A P2P CALL ...............................................................................54
FIGURE 4.4 - DURING A CALL........................................................................................55
FIGURE 4.5 - CALL ACCEPT /REJECT .............................................................................56
FIGURE 4.6 - INITIATING A CONFERENCE CALL .............................................................57
FIGURE 4.7 - CONFERENCE REQUEST ............................................................................58
FIGURE 4.8 - MMS SCREEN ..........................................................................................59
FIGURE 4.9 - ATTACH BUTTON FILE CHOOSER .............................................................60
FIGURE 4.10 - MMS SCREEN READY TO SEND ..............................................................61
FIGURE 4.11 - UNIFIED INBOX SCREEN .........................................................................62
Table of Tables
TABLE 2.1 - JMF COMMON VIDEO FORMATS ...............................................................10
TABLE 2.2 - JMF COMMON AUDIO FORMATS...............................................................11
TABLE 5.1 - TESTING SCENARIOS: LOGIN / LOGOFF ......................................................70
TABLE 5.2 - TESTING SCENARIOS: MAKING A CALL ...................................................71
TABLE 5.3 - TESTING SCENARIOS: SENDING A MESSAGE ..............................................72
TABLE 5.4 - TESTING SCENARIOS: CONFERENCE CALL .................................................73
TABLE 5.5 - OTHER TESTING SCENARIOS ......................................................................74
Chapter 1
1 Introduction
Almost all organisations, for example office blocks, colleges and shopping centres,
have telephone systems installed. These telephone systems provide features such as
call forwarding, call divert, voicemail, free extension dialling to other users within
the same network, and so on. Another item found in almost all of these facilities is
the computer, usually one per user. Therefore, in the majority of establishments, every
employee has both a telephone handset and a computer. A cost-effective and
space-saving idea is to combine these two everyday utilities so that the computer can
also be used as a phone. People want their lives and work to be as simple and
time-efficient as possible, and one way to achieve this is to have a software-based
telephony system on their computers. Why do they need a physical telephone handset
when it is possible to attain all the same features on their computers, cutting out the
expense of the handset?
1.3.3 Laptop
Testing was difficult as very few of the features could be tested alone. Almost all
testing required two computers. For this reason, it was most efficient to use two
laptops connected to two webcams.
Chapter 2
2 Technical Background
In this chapter, the various standards used in the design of this system will be
discussed. The standards chosen were based on what is supported by the Java Media
Framework. There were possibly more suitable options available, for example among
the encoding schemes, but the choice was limited to what is supported by the Java
Media Framework and the Real-time Transport Protocol. The standards discussed
within this chapter were the basic building blocks on which this project was built.
Pull data source: Here the data flow is initiated by the client and the data flow
from the source is controlled by the client.
Push data source: Here the data flow is initiated by the server and the data flow
from the source is controlled by the server.
Several data sources can be combined into one. So if you are capturing a live scene
with two data sources: audio and video, these can be combined for easier control.
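As an illustration of combining sources, the following minimal sketch (not taken from the project code) merges an audio and a video capture DataSource with JMF's Manager. The capture locators "vfw://0" and "javasound://44100" are typical JMF examples and are assumptions about the local machine.

// A minimal sketch of combining an audio and a video DataSource into one,
// assuming two capture locators that will differ from machine to machine.
import javax.media.Manager;
import javax.media.MediaLocator;
import javax.media.protocol.DataSource;

public class MergedSourceSketch {
    public static void main(String[] args) throws Exception {
        DataSource video = Manager.createDataSource(new MediaLocator("vfw://0"));
        DataSource audio = Manager.createDataSource(new MediaLocator("javasound://44100"));

        // Combine the two capture sources so a single Player or Processor can control both.
        DataSource combined = Manager.createMergingDataSource(new DataSource[] { video, audio });
        System.out.println("Merged content type: " + combined.getContentType());
    }
}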
Capture Device
A capture device is the piece of hardware that you would use to capture the data,
which you would connect to the DataSource. Examples would be a microphone or
a webcam. The captured media can then be sent to the Player, converted into
another format or even stored to be used at a later stage.
Like DataSources, capture devices can be either a push or a pull source. If a
capture device is a pull source, then the user controls when to capture the image, if it is
a push source, then the user has no control over when the data is captured, it will be
captured continuously.
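The sketch below, again assumed rather than taken from the project, shows how a capture device is typically located through JMF's CaptureDeviceManager and turned into a DataSource.

// A minimal sketch, assuming JMF is installed and at least one capture device
// is registered: list the devices JMF knows about and build a DataSource from
// the first audio device found.
import java.util.Vector;
import javax.media.CaptureDeviceInfo;
import javax.media.CaptureDeviceManager;
import javax.media.Manager;
import javax.media.format.AudioFormat;
import javax.media.protocol.DataSource;

public class CaptureDeviceSketch {
    public static void main(String[] args) throws Exception {
        // Ask JMF for every device that can capture raw (linear) audio.
        Vector devices = CaptureDeviceManager.getDeviceList(new AudioFormat(AudioFormat.LINEAR));
        if (devices.isEmpty()) {
            System.out.println("No audio capture device registered with JMF");
            return;
        }
        CaptureDeviceInfo info = (CaptureDeviceInfo) devices.get(0);
        System.out.println("Using device: " + info.getName());

        // The device locator is what connects the capture hardware to a DataSource.
        DataSource source = Manager.createDataSource(info.getLocator());
        source.connect();
    }
}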
Player
As mentioned above, a Player takes a stream of data and renders it to an output
device. A Player can be in any one of a number of states. Usually, a Player would
go from one state to the next until it reaches the final state. The reason for these states
is so the data can be prepared before it is played. JMF defines the following six states
for the Player:
Unrealized: In this state, the Player object has just been instantiated and
does not yet know anything about its media.
Realizing: A Player moves from the unrealized state to the realizing state
when the Player's realize() method is called. In this state, the Player is
in the process of determining its resource requirements
Realized: Transitioning from the realizing state, the Player comes into the
realized state. In this state the Player knows what resources it needs and has
information about the type of media it is to present. It can also provide visual
components and controls, and its connections to other objects in the system are
in place. A player is often created already in this state, using the
createRealizedPlayer() method.
Prefetching: Entered when the prefetch() method is called. The Player is
acquiring the exclusive-use resources it needs and is filling its buffers with
media data.
Prefetched: The state where the Player has finished prefetching media data
- it's ready to start.
Started: This state is entered when you call the start() method. The
Player is now ready to present the media data.
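A minimal sketch of these state transitions, using a placeholder media file rather than anything from the project, might look as follows.

// A minimal sketch of walking a Player through the states listed above.
// The file name is a placeholder.
import javax.media.ControllerEvent;
import javax.media.ControllerListener;
import javax.media.Manager;
import javax.media.MediaLocator;
import javax.media.Player;
import javax.media.PrefetchCompleteEvent;

public class PlayerStatesSketch {
    public static void main(String[] args) throws Exception {
        // createRealizedPlayer() blocks until the Player has reached the realized state.
        final Player player = Manager.createRealizedPlayer(new MediaLocator("file:clip.mov"));

        player.addControllerListener(new ControllerListener() {
            public void controllerUpdate(ControllerEvent event) {
                // Report the remaining transitions as they happen.
                if (event instanceof PrefetchCompleteEvent) {
                    System.out.println("Prefetched - ready to start");
                }
            }
        });

        player.prefetch();   // realized -> prefetching -> prefetched
        player.start();      // prefetched -> started; media is rendered
    }
}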
Processor
A Processor is a type of Player, which has added control over what processing is
performed on the input media stream. As well as the six aforementioned Player
states, a Processor includes two additional states that occur before the
Processor enters the realizing state but after the unrealized state:
Configuring: Entered when configure() is called; the Processor is connecting
to its DataSource and gathering information about the format of its input data.
Configured: The Processor has connected to its DataSource and determined the
format of its input data.
Format
A Format object describes the exact type of a piece of media data. JMF provides
format subclasses such as:
AudioFormat
VideoFormat
H261Format
H263Format
IndexedColorFormat
JPEGFormat
RGBFormat
YUVFormat
As will be discussed in more detail later on in this report, the formats that were chosen
for this project were H.263 for the video and G.723.1 mono for the audio.
Manager
A manager, an intermediary object, integrates implementations of key interfaces that
can be used seamlessly with existing classes. JMF offers four managers:
Manager: Use Manager to create Players, Processors, DataSources and DataSinks.
PackageManager: Maintains a registry of the packages that contain JMF classes,
such as custom Players, Processors, DataSources and DataSinks.
CaptureDeviceManager: Maintains a registry of the capture devices available on
the system.
PlugInManager: Maintains a registry of available JMF plug-in processing
components, such as multiplexers, demultiplexers, codecs, effects and renderers.
Looking at Table 2.2, for the audio, it can be seen that there are two formats that meet
the low bandwidth requirement. These are GSM and G.723.1. Of these two, the former
has low quality while the latter has medium quality. It therefore made more sense to
choose G.723.1. The chosen encoding schemes are marked with an asterisk in the
tables below.
Table 2.1 - JMF Common Video Formats

Format      Content Type               Quality   CPU Requirements   Bandwidth Requirements
Cinepak     AVI, QuickTime             Medium    Low                High
MPEG-1      MPEG                       High      High               High
H.261       AVI, RTP                   Low       Medium             Medium
H.263 *     QuickTime, AVI, RTP        Medium    Medium             Low
JPEG        QuickTime, AVI, RTP        High      High               High
Indeo       QuickTime, AVI             Medium    Medium             Medium

Table 2.2 - JMF Common Audio Formats

Format               Content Type                Quality   CPU Requirements   Bandwidth Requirements
PCM                  AVI, QuickTime, WAV         High      Low                High
Mu-Law               AVI, QuickTime, WAV, RTP    Low       Low                High
ADPCM (DVI, IMA4)    AVI, QuickTime, WAV, RTP    Medium    Medium             Medium
MPEG-1               MPEG                        High      High               High
MPEG Layer3          MPEG                        High      High               Medium
GSM                  WAV, RTP                    Low       Low                Low
G.723.1 *            WAV, RTP                    Medium    Medium             Low
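As a hedged illustration of how this choice is applied, the sketch below configures a Processor to encode its tracks as H.263/RTP and G.723/RTP. It is an assumed example, not the project's code, and the crude wait loop stands in for proper state handling.

// A minimal sketch of asking a Processor to encode its video track as H.263/RTP
// and its audio track as G.723/RTP. "source" stands for a merged capture
// DataSource such as the one built earlier.
import javax.media.Manager;
import javax.media.Processor;
import javax.media.control.TrackControl;
import javax.media.format.AudioFormat;
import javax.media.format.VideoFormat;
import javax.media.protocol.ContentDescriptor;
import javax.media.protocol.DataSource;

public class RtpEncodingSketch {
    public static Processor buildProcessor(DataSource source) throws Exception {
        Processor processor = Manager.createProcessor(source);
        processor.configure();
        while (processor.getState() < Processor.Configured) {
            Thread.sleep(50);   // crude wait for the configured state
        }

        // Output will be raw RTP-ready data rather than a file format.
        processor.setContentDescriptor(new ContentDescriptor(ContentDescriptor.RAW_RTP));

        for (TrackControl track : processor.getTrackControls()) {
            if (track.getFormat() instanceof VideoFormat) {
                track.setFormat(new VideoFormat(VideoFormat.H263_RTP));
            } else if (track.getFormat() instanceof AudioFormat) {
                track.setFormat(new AudioFormat(AudioFormat.G723_RTP));
            }
        }

        processor.realize();
        return processor;
    }
}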
media streams that have been captured from a local capture device using a capture
DataSource or that have been stored to a file using a DataSink. Similarly, JMF can be
extended to support additional RTP formats and payloads through the standard plugin
mechanism.
[Java™ Media Framework API Guide,
http://java.sun.com/products/java-media/jmf/2.1.1/guide/index.html, November 19]
Session Statistics: The session manager maintains statistics on all of the RTP
and RTCP packets sent and received in the session. The session manager
provides access to two types of global statistics:
o GlobalReceptionStats: Maintains global reception statistics for the
session.
o GlobalTransmissionStats: Maintains cumulative transmission statistics for the
session.
RTP Events
RTP-specific events are used to report on the state of the RTP session and its streams.
To receive notification of RTP events, you implement the appropriate RTP listener and
register it with the session manager. A ReceiveStreamListener, for example, reports on
the state of each stream that is being received; you can implement it to be told when a
new receive stream arrives.
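A minimal sketch of this pattern, with placeholder addresses and ports, might register a ReceiveStreamListener with an RTPManager as follows; the project's own session handling may differ.

// A minimal sketch of a two-party RTP session: register a listener so the
// application is told when a new stream arrives, then point the session at a
// placeholder remote address.
import java.net.InetAddress;
import javax.media.Manager;
import javax.media.Player;
import javax.media.rtp.RTPManager;
import javax.media.rtp.ReceiveStream;
import javax.media.rtp.ReceiveStreamListener;
import javax.media.rtp.SessionAddress;
import javax.media.rtp.event.NewReceiveStreamEvent;
import javax.media.rtp.event.ReceiveStreamEvent;

public class RtpSessionSketch {
    public static void main(String[] args) throws Exception {
        RTPManager manager = RTPManager.newInstance();

        // Fired for RTP events on received streams; other listeners cover
        // session, send-stream and remote events.
        manager.addReceiveStreamListener(new ReceiveStreamListener() {
            public void update(ReceiveStreamEvent event) {
                if (event instanceof NewReceiveStreamEvent) {
                    ReceiveStream stream = ((NewReceiveStreamEvent) event).getReceiveStream();
                    try {
                        Player player = Manager.createRealizedPlayer(stream.getDataSource());
                        player.start();   // render the incoming audio/video
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                }
            }
        });

        SessionAddress local = new SessionAddress(InetAddress.getLocalHost(), 22222);
        SessionAddress remote = new SessionAddress(InetAddress.getByName("192.168.0.2"), 22222);
        manager.initialize(local);
        manager.addTarget(remote);
    }
}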
2.1.7 Summary
As can be seen from the above sections, JMF is a very powerful tool. It is very easy to
work with and the best way to understand it is to use it. It is fair to say that there is a
lot of information, such as forums, help-sites etc. on the World Wide Web regarding
this subject. However, there is not a lot of information on using JMF for projects
similar to this one. Perhaps one of the best features of JMF is that it does not require
one to learn everything about it before using it. With a basic understanding of Java, it
is possible to teach yourself as you go along.
The real-time transport protocol (RTP), to carry data that has real-time
properties,
The RTP control protocol (RTCP), to monitor the quality of service and to
convey information about the participants in an on-going session.
The diagram shown in Figure 2.4 below shows how RTP is incorporated into the OSI
model. RTP fits into the session layer of the model, between the application layer and
the transport layer. RTP and RTCP work independently of the underlying transport
layer and network layer protocols.
Information in the RTP header tells the receiver how to reconstruct the data and
describes how the codec bit streams are packetized.
RTP payload: The data transported by RTP in a packet, for example audio
samples or compressed video data.
RTP packet: A data packet consisting of the fixed RTP header, a possibly
empty list of contributing sources, and the payload data. Some underlying
protocols may require an encapsulation of the RTP packet to be defined.
Typically one packet of the underlying protocol contains a single RTP packet,
but several RTP packets may be contained if permitted by the encapsulation
method.
though all the audio packets contain the same SSRC identifier (that of the
mixer).
Mixer: An intermediate system that receives RTP packets from one or more
sources, possibly changes the data format, combines the packets in some
manner and then forwards a new RTP packet. Since the timing among multiple
input sources will not generally be synchronized, the mixer will make timing
adjustments among the streams and generate its own timing for the combined
stream. Thus, all data packets originating from a mixer will be identified as
having the mixer as their synchronization source.
X is the Extension bit, when set, the fixed header is followed by exactly one
header extension with a defined format.
CSRC count contains the number of CSRC identifiers that follow the fixed
header.
M is the Marker bit. Its interpretation is defined by a profile; it is intended to
allow significant events such as frame boundaries to be marked in the packet
stream.
Payload type - Identifies the format of the RTP payload and determines its
interpretation by the application. A profile specifies a default static mapping of
payload type codes to payload formats. Additional payload type codes may be
defined dynamically through non-RTP means.
Sequence number increments by one for each RTP data packet sent, and may
be used by the receiver to detect packet loss and to restore packet sequence.
Timestamp reflects the sampling instant of the first octet in the RTP data
packet. The sampling instant must be derived from a clock that increments
monotonically and linearly in time to allow synchronization and jitter
calculations.
SSRC is an identifier that is chosen randomly, with the intent that no two
synchronization sources within the same RTP session have the same SSRC
identifier.
CSRC identifies the contributing sources for the payload contained in this
packet. This is another layer of identification for sessions that have the same
SSRC number, but the data in the stream needs to be differentiated further.
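To make the layout concrete, the following small parser (illustrative only, not part of the project) extracts these fixed header fields from the first twelve bytes of an RTP packet.

// A small illustrative parser for the fixed RTP header fields described above.
public class RtpHeaderSketch {
    public static void parse(byte[] packet) {
        int b0 = packet[0] & 0xFF;
        int version   = b0 >>> 6;           // V: 2 bits
        boolean pad   = (b0 & 0x20) != 0;   // P
        boolean ext   = (b0 & 0x10) != 0;   // X
        int csrcCount = b0 & 0x0F;          // CC

        int b1 = packet[1] & 0xFF;
        boolean marker  = (b1 & 0x80) != 0; // M
        int payloadType = b1 & 0x7F;        // PT

        int sequence = ((packet[2] & 0xFF) << 8) | (packet[3] & 0xFF);

        long timestamp = ((packet[4] & 0xFFL) << 24) | ((packet[5] & 0xFFL) << 16)
                       | ((packet[6] & 0xFFL) << 8)  |  (packet[7] & 0xFFL);

        long ssrc = ((packet[8] & 0xFFL) << 24) | ((packet[9] & 0xFFL) << 16)
                  | ((packet[10] & 0xFFL) << 8) |  (packet[11] & 0xFFL);

        System.out.println("V=" + version + " PT=" + payloadType
                + " seq=" + sequence + " ts=" + timestamp + " SSRC=" + ssrc
                + " marker=" + marker + " pad=" + pad + " ext=" + ext
                + " CSRC count=" + csrcCount);
        // csrcCount further 32-bit CSRC identifiers would follow the fixed header.
    }
}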
[Figure 2.5 - RTP Packet Header Format: V, P, X, CC, M, PT, sequence number,
timestamp, SSRC and CSRC list, followed by the payload]
by having each participant send its control packets to all the others, each can
independently observe the number of participants and this number is used to
calculate the rate at which the packets are sent,
Functions 1-3 are mandatory when RTP is used in the IP multicast environment, and
are recommended for all environments. RTP application designers are advised to avoid
mechanisms that can only work in unicast mode and will not scale to larger numbers.
RTCP Packet Format
As mentioned above, RTCP packets are sent periodically to all participants as well as
the data packets. There are a number of types of RTCP packets:
Sender Report
Receiver Report
Source Description
Bye
Application-specific
All participants in a session send RTCP packets. A participant that has recently sent
data packets issues a Sender Report (SR). The sender report contains the total number
of packets and bytes sent as well as information that can be used to synchronize media
streams from different sessions. The structure of the RTCP SR is shown in Figure 2.6
below. It consists of three sections, possibly followed by a fourth profile-specific
extension section if defined.
The first section, the header, is 8 octets long, with the following fields:
The version (V) is 2 bits and identifies the version of RTP, which is the same
in RTCP packets as in RTP data packets.
The padding (P) is 1 bit, if the padding bit is set, this RTCP packet contains
some additional padding octets at the end which are not part of the control
information. The last octet of the padding is a count of how many padding
octets should be ignored.
The reception report count (RC) is 5 bits and represents the number of
reception report blocks contained in this packet.
The packet type (PT) is 8 bits and contains the constant 200 to identify this as
an RTCP SR packet.
The length is 16 bits, the length of this RTCP packet in 32-bit words minus one
including the header and any padding.
The SSRC is 32 bits and is the synchronization source identifier for the
originator of this SR packet.
The second section, the sender information, is 20 octets long and is present in every
sender report packet. It summarizes the data transmissions from this sender and has the
following fields:
The NTP timestamp is 64 bits and indicates the wallclock time when this
report was sent so that it may be used in combination with timestamps returned
in reception reports from other receivers to measure round-trip propagation to
those receivers.
The RTP timestamp is 32 bits and corresponds to the same time as the NTP
timestamp (above), but in the same units and with the same random offset as
the RTP timestamps in data packets.
The sender's packet count is 32 bits and is the total number of RTP data
packets transmitted by the sender since starting transmission up until the time
this SR packet was generated. The count is reset if the sender changes its
SSRC identifier.
The sender's octet count is 32 bits and is the total number of payload octets
(i.e., not including header or padding) transmitted in RTP data packets by the
sender since starting transmission up until the time this SR packet was
generated. The count is reset if the sender changes its SSRC identifier. This
field can be used to estimate the average payload data rate.
The third section contains zero or more reception report blocks depending on the
number of other sources heard by this sender since the last report. Each reception
report block conveys statistics on the reception of RTP packets from a single
synchronization source. Receivers do not carry over statistics when a source changes
its SSRC identifier due to a collision. These statistics are:
The SSRC_n (source identifier) is 32 bits and is the SSRC identifier of the
source to which the information in this reception report block pertains.
The fraction lost is 8 bits and is the fraction of RTP data packets from source
SSRC_n lost since the previous SR or RR packet was sent, expressed as a fixed
point number with the binary point at the left edge of the field.
The cumulative number of packets lost is 24 bits and is the total number of
RTP data packets from source SSRC_n that have been lost since the beginning
of reception. This number is defined to be the number of packets expected less
the number of packets actually received, where the number of packets received
includes any which are late or duplicates.
The extended highest sequence number received is 32 bits. The low 16 bits
contain the highest sequence number received in an RTP data packet from
source SSRC_n, and the most significant 16 bits extend that sequence number
with the corresponding count of sequence number cycles.
The last SR timestamp (LSR) is 32 bits and is the middle 32 bits out of 64 in
the NTP timestamp received as part of the most recent RTCP sender report
(SR) packet from source SSRC_n. If no SR has been received yet, the field is
set to zero.
The delay since last SR (DLSR) is 32 bits and is expressed in units of 1/65536
seconds, between receiving the last SR packet from source SSRC_n and
sending this reception report block. If no SR packet has been received yet from
SSRC_n, the DLSR field is set to zero.
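As a worked example of what the LSR and DLSR fields enable, a sender can estimate the round-trip time to a receiver as arrival - LSR - DLSR, all in units of 1/65536 seconds. The sketch below assumes the arrival time has already been converted to that fixed-point format.

// A short worked example of estimating round-trip time from LSR and DLSR.
public class RoundTripSketch {
    // lsr, dlsr and arrival are 32-bit "middle of NTP timestamp" values (1/65536 s units).
    public static double roundTripSeconds(long arrival, long lsr, long dlsr) {
        long rtt = (arrival - lsr - dlsr) & 0xFFFFFFFFL;   // wrap-safe 32-bit arithmetic
        return rtt / 65536.0;                              // convert to seconds
    }

    public static void main(String[] args) {
        // Example: the report arrives 250 ms after the SR it refers to,
        // and the receiver held the report for 100 ms before sending it.
        long lsr = 0;
        long arrival = (long) (0.250 * 65536);
        long dlsr = (long) (0.100 * 65536);
        System.out.println(roundTripSeconds(arrival, lsr, dlsr) + " s");   // about 0.15 s
    }
}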
[Figure 2.6 - RTCP Sender Report Structure: header (V, P, RC, PT = SR = 200, length,
SSRC of sender), sender information, reception report blocks (one per source),
profile-specific extensions]

[Figure 2.7 - RTCP Receiver Report Structure: header (V, P, RC, PT, length, SSRC of
sender), reception report blocks, profile-specific extensions]
2.2.6 Summary
The Real-time Transport Protocol is far more expansive than described above.
However, for the purposes for which it was used within this project, the detail given
above is more than adequate. It is important to understand the different packet
structures that are shown, as these form the basis on which all data within the system
was sent.
actual time spent processing the data in the encoder and decoder,
[Figure 2.8 - G.723.1 Encoder]

[Figure 2.9 - G.723.1 Decoder]
2.3.5 Summary
This format was a good choice as it is ideal for its purpose within this application,
which is essentially the voice part of the video conferencing. Although it is possible to
go very deep into the workings of the coder and decoder, that is not necessary for this
project. It is sufficient to know the basics of how it works and what it is suitable for.
2.4.4 Summary
H.263 can be used for compressing the moving picture component of audio-visual
services at low bit rates. It is ideal for use in video conferencing, as there is not much
movement involved and low bit rates are used. This makes it the ideal encoding
scheme for this application.
Fixed Size Block Matching: each image frame is divided into a fixed number
of blocks. For each block in the frame, a search is made in the reference frame
over an area of the image for the best matching block, to give the least
prediction error.
After examining the specification for H.263, it was discovered that there were motion
detection and compensation algorithms built into it. This meant that the algorithm did
not have to be coded, it was already there and available to use. RTCP reports were
used to show the byte rate of the video stream, which was then used to implement the
image observation.
The way that the above was used for the image observation is as follows. When a
frame hasn't changed, a reference to a previous frame is sent. Basically, the image
observation feature exploits the temporal redundancy inherent in a video sequence.
The redundancy is larger when the camera is focused on a scene that does not contain
a lot of movement, which is the case when a user leaves the shot. This redundancy is
reflected in a reduced RTCP byte rate.
Displaying a reference frame requires a lower byte rate than sending a new frame. The
RTCP reports monitor the byte rate of the video stream. If the byte rate drops, and
stays dropped for a certain period of time, then the call is ended. The procedure to end
the call is explained in more detail in section 3.5.
2.6 Multicasting
For the conferencing feature of this application, multicasting was used. All of the
participants within the conference transmit to a multicast address.
Multicast addresses fall within the Class D range 224.0.0.0 to 239.255.255.255. The
diagram in Figure 2.14 shows how the data is distributed to all members of the group.
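A minimal sketch of joining such a group, assuming the conference address used later in this project (224.122.122.122, section 3.4) and a placeholder port, is shown below.

// A minimal sketch of joining a multicast group and reading the datagrams sent to it.
import java.net.DatagramPacket;
import java.net.InetAddress;
import java.net.MulticastSocket;

public class MulticastJoinSketch {
    public static void main(String[] args) throws Exception {
        InetAddress group = InetAddress.getByName("224.122.122.122");
        MulticastSocket socket = new MulticastSocket(22224);

        socket.joinGroup(group);   // the router now forwards the group's traffic to us

        byte[] buffer = new byte[2048];
        DatagramPacket packet = new DatagramPacket(buffer, buffer.length);
        socket.receive(packet);    // blocks until a conference packet arrives
        System.out.println("Received " + packet.getLength() + " bytes from " + packet.getAddress());

        socket.leaveGroup(group);
        socket.close();
    }
}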
2.7 Summary
The information contained within this chapter has been an invaluable asset in
developing this application. A firm understanding of all the standards was required
before coding could even begin. JMF placed a lot of restrictions on the standards that
could be used. JMF does provide the ability to implement custom packetizers and
custom encoders; however, to do so would have been time consuming and unnecessary
for this application.
Chapter 3
3 Design of the System
3.1 System Architecture
The system as it stands consists of two different communication architectures. One is
client to server and the other is client to client. The reason that there are two different
methods is to make the system as efficient as possible. There was the possibility of
using client to server for all communication; however, it was felt that this would be
inefficient, as the server did not need to be part of a call between two clients and its
involvement would have been an unnecessary use of system resources. For this reason,
calls between clients are peer to peer and all other communication goes through the
server.
The connections between the server and the clients are bidirectional TCP connections.
It was not necessary to use RTP here as they are not real time connections. RTP is
described in section 2.2 as being ideal for real time communication. The messages that
are sent between the clients and server will include login, logoff, messages to be sent,
calls to be made etc. which are not time dependent. The server plays an integral part in
the system. Basically, all communication between any two clients must first go
through the server. So if a client wishes to call another client, they must send a call
request to the server. The server will then proceed to set up the call between the
clients. The code for this is shown in Appendix 1 in section 7.1. Also included in
Appendix 1 are the code extracts for login request (section 7.2), logoff request (section
7.3), call end request (section 7.4), conference setup request (section 7.5), request to
add a participant to a conference (section 7.6), request to end a conference (section
7.7), request to send a message (section 7.8) and request to receive a message (section
7.9).
The purpose of including these code extracts is to show that the server really does
control everything that the clients want to do. It will be the server that will check if the
other party is online and available, and the server that will set up the call. If a client is
unavailable when a message is sent, the server will store the message until they
become available and will then forward it on. Some might ask why a server is
required; why not just let the clients communicate directly? This was basically a design
choice. It was the opinion of the developer that direct client to client communication
for all tasks would place quite a large load on the clients, which was unnecessary. If it
were up to the clients to do everything, then the system would be slowed down
significantly. The server acts as a centre point, where clients can
contact each other. Without the server, the clients would have difficulty in contacting
another client. It was also a lot more efficient to let the server take some of the load
and leave all administration to the server, leaving the clients free to partake in calls,
send messages etc. It also meant that messages could be sent while clients are on calls
because the server can store the message, and messages can also be sent when the
receiving client is offline and stored until their next login, something that would not
have been possible without a server.
pull message, and both these messages can be sent or received. There are push and
pull links on all clients and on the server. The reason is that normally in a client server
application, only the client can start communication with the server, but by using push
and pull either can initiate communication. A push message is sent by either the client
or server, depending on who initialises communication, and the response to a push
message is a pull message. The person who sends a push will receive back a push, and
the person who sends a pull will receive a pull. An example is shown below, in Figure
3.4. This type of communication would be used in a situation where the user presses a
button on the client side that initializes communication with the server. However,
within this application, there will usually only be one send and one receive per task
(request and confirmation / error), as opposed to two of each, as shown in the diagram.
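The following sketch illustrates the push/pull idea under the assumption of a plain TCP connection carrying serialized objects; the Message class here is only a stand-in for the project's MessageObject, and the address, port and payload type code are placeholders.

// A minimal sketch of one push/pull exchange over a TCP connection.
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.net.Socket;

public class PushPullSketch {
    static class Message implements Serializable {
        final int senderID, destinationID, payloadType;
        final Object payload;
        Message(int senderID, int destinationID, int payloadType, Object payload) {
            this.senderID = senderID;
            this.destinationID = destinationID;
            this.payloadType = payloadType;
            this.payload = payload;
        }
    }

    public static void main(String[] args) throws Exception {
        // The side that initiates (here, the client) sends a push message...
        Socket socket = new Socket("192.168.0.1", 22220);
        ObjectOutputStream out = new ObjectOutputStream(socket.getOutputStream());
        ObjectInputStream in = new ObjectInputStream(socket.getInputStream());

        out.writeObject(new Message(1001, 2002, 300, "example payload"));
        out.flush();

        // ...and the reply that comes back is the corresponding pull message
        // (a confirmation or an error, as described in the text).
        Message reply = (Message) in.readObject();
        System.out.println("Reply payload type: " + reply.payloadType);
        socket.close();
    }
}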
The next child class is the ServerSideUtilities. This class is responsible for
all the requests which were discussed in section 3.1.1 and whose code extracts are
included in Appendix 1. So basically this class is responsible for all of the things that
the server does. All calls made or received, all messages sent or received and all
system requests such as login and logoff, will go through this class. It is also within
ServerSideUtilities that messageObjects, profileObjects and
mappingObjects are compiled. There will only ever be one of these classes
created for any server.
The final child class of the server is ServerSideStorage. Here the server will
store anything that needs to be stored. Examples include messageObjects,
conferenceObjects, mappingObjects etc. It is here that messages will be
stored if the receiver is unavailable and it is this class that will store conference
information, for example how many people are involved in a particular conference.
Once again, there will only ever be one ServerSideStorage class related to any
one server.
The client, in turn, has four child classes, including ClientHandle and
MediaManager.
senderID: This is the phone number of the client who sent the message
destinationID: This is the phone number of the client that the message is
being sent to
payloadType: This is the type of message that is being sent. It identifies the
purpose of the message and/or the payload object
3.4 Conferencing
The conferencing capabilities built into this application allow up to four people to take
part in one call at the same time, using multicasting, which is described above in
section 2.6.2. There is also a limit of four conferences taking place at one time. The
reason behind these limitations was so the feature could be tested and these limits
could be extended without difficulty.
The conference function is based around a conferenceObject, which consists of
the following fields:
conferenceID: this is the ID of the conference and will be the same as the
phone number of the client who initiated the conference
participantPortBase: this will be the port that the individual participants will
transmit from, and the port that they will not have to listen to
conferenceAddress: this will be the multicast address that the conference will
be transmitted to, which will always be 224.122.122.122
The setup for the conference is quite simple. All media will be sent to a multicast
address, which will be the same for all conferences. The destinationID field of
the conference object will be set to this multicast address. A reference port will be set
for each conference, which can be found in the conferencePortBase field of the
conference object. From this reference it is known that this and the next seven ports
will be used for this particular conference. Each user is allocated a port to transmit to;
this is in the participantPortBase field of the conference object. The user will
then know that it does not have to listen to this port base, only to the other six, as it
does not have to listen to itself. This is shown in detail in Figure 3.8 below.
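A small sketch of the port arithmetic is given below. It is not taken from the project code, and it assumes each participant's base port covers an RTP/RTCP pair, which would reconcile the eight conference ports with the six ports that are listened to.

// A minimal sketch of working out which conference ports a participant listens to.
import java.util.ArrayList;
import java.util.List;

public class ConferencePortsSketch {
    public static List<Integer> portsToListenTo(int conferencePortBase, int participantPortBase) {
        List<Integer> listen = new ArrayList<Integer>();
        for (int port = conferencePortBase; port < conferencePortBase + 8; port++) {
            // Skip our own transmit pair (RTP + RTCP); listen to the other six ports.
            if (port != participantPortBase && port != participantPortBase + 1) {
                listen.add(port);
            }
        }
        return listen;
    }

    public static void main(String[] args) {
        // Example: conference ports 22300-22307, this participant transmits on 22302/22303.
        System.out.println(portsToListenTo(22300, 22302));   // [22300, 22301, 22304, 22305, 22306, 22307]
    }
}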
[Figure 3.9 - Image Observation Averages: byte rate (0 to 35,000) plotted against
sample number (0 to 100), with a series showing someone talking]
The threshold was set at 77% of the running average, a midrange value between the
two averages in the above graph. If the byte rate was less than 77% of the average,
then it was not used in the calculation of the next average. A count was incremented
each time the byte rate fell below the threshold. The call was cut off when the count
reached ten. The code for the image observation is shown in Appendix 2. If the count
increments without the person leaving the conversation, it is unlikely to get anywhere
near ten, and usually drops back to zero after one or two.
3.6.4 Logout
The users can logout at any time, once they are logged in, by selecting the logout
option from the file menu.
Chapter 4
4 Implementation of the System
4.1 Introduction
This chapter aims to give a graphical and textual description of how to use each of the
features within the system. On completing this chapter, the reader should be familiar
with all of the available features and should be able to make full use of these features.
The chapter is written in a step-by-step manner, beginning with the login, including
all possible actions that can be carried out once logged in, and ending with the logoff
process. Images are included to illustrate graphically what is described in the text.
4.2 Logging In
Once the application has been started, the first thing that can be seen is the login
screen. There are username and password fields, which must be correctly completed in
order to login. The username and password must be obtained from the system
administrator in advance of attempting to login. Once the username has been allocated,
it will be the same for every subsequent login. The server IP and server port fields
should not, in normal usage, need to be changed. However, if these fields do need to be
altered, then the IP must be padded, i.e. any of the sections with fewer than three digits
must be padded with preceding zeroes. Figure 4.1 below shows the login screen as it
will be seen when the application is first run. As can be seen, the username and
password fields are empty, and the server IP and port fields are hard coded.
If an invalid username or incorrect password is entered, an error message will pop up
informing the user of the exact reason for the failed login. If for some reason the
server cannot be contacted, an error message will also inform the user of this. If all
information is entered correctly and there are no problems connecting to the server,
the user will be brought to the home screen.
4.3 Calls
4.3.1 Making a Peer to Peer Call
Once the user is correctly logged in, they will see the home screen, which is shown in
Figure 4.2. From here, they can now access all of the screens, except for the login
screen. If at any time, they wish to logout, this can be done from the file menu. It is
important to logout of the system correctly, before exiting.
4.4 Messages
4.4.1 Sending an MMS Message
To send a message, first go to the MMS screen, which is shown below in Figure 4.8.
If a picture is desired, then do not press the send button yet. Press the attach button and
a file chooser will open as follows,
The required file can then be chosen and will be attached to the message. A preview of
the picture can then be seen in the image preview box, in the top right hand corner of
the window. Once all information has been entered correctly and the desired image has
been attached, the message can be sent as described above. The MMS screen, ready to
send a message with an image attached, is shown below in Figure 4.10.
message, the message will be moved from the inbox. If the user selects to open,
(which can also be done by double clicking), then a window will open showing the
content of the message, including the image. This window is shown in Figure 4.12.
4.5.2 Adaption
Another feature that is part of this application is automatic adaption. This will not
really affect the user as such, in that they do not have to do anything and the changes
made will not disrupt calls or messages. Basically, if packet loss (congestion) is
detected, steps are taken to counteract it and the image will therefore be clearer.
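A hedged sketch of one way such adaption could be driven is shown below. It assumes packet loss is read from JMF's per-stream ReceptionStats and that the outgoing Processor exposes a QualityControl; the project's own mechanism, based on RTCP reports, may differ in detail.

// A minimal sketch: lower the encoder quality when new packet loss is seen,
// and creep it back up otherwise.
import javax.media.Processor;
import javax.media.control.QualityControl;
import javax.media.rtp.ReceiveStream;
import javax.media.rtp.ReceptionStats;

public class AdaptionSketch {
    private int lastLost = 0;

    // Called periodically with the stream being monitored and the local encoder.
    public void adapt(ReceiveStream stream, Processor encoder) {
        ReceptionStats stats = stream.getSourceReceptionStats();
        int lost = stats.getPDUlost();

        QualityControl quality =
                (QualityControl) encoder.getControl("javax.media.control.QualityControl");
        if (quality == null) {
            return;   // this encoder does not expose a quality knob
        }

        if (lost > lastLost) {
            // Congestion detected: trade image quality for a lower bit rate.
            quality.setQuality(Math.max(0.1f, quality.getQuality() - 0.1f));
        } else {
            // No new loss: raise the quality again gradually.
            quality.setQuality(Math.min(1.0f, quality.getQuality() + 0.05f));
        }
        lastLost = lost;
    }
}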
Chapter 5

5 Testing

Table 5.1 - Testing Scenarios: Login / Logoff (Number, Scenario Details, Results /
Comments). Scenarios covered invalid logins, logins with an incorrect password, login
from another location and logoff; most scenarios displayed the expected error message,
and results included both passes and failures, one failure noting that code is needed on
the server to detect a destroyed handle.
Table 5.2 - Testing Scenarios: Making a Call (Number, Scenario Details, Results /
Comments). Scenarios covered calls being sent to Videomail, calls to invalid numbers
and call timeouts.
Table 5.3 - Testing Scenarios: Sending a Message (Number, Scenario Details, Results).
Table 5.4 - Testing Scenarios: Conference Call (Number, Scenario Details, Results /
Comments).
Table 5.5 - Other Testing Scenarios (Number, Scenario Details, Results / Comments).
Chapter 6
6 Conclusions and Further Research
6.1 The Benefits of this Project
The obvious benefit of this project is that it provides advanced communication over a
network. It enables users to have video conferences with one or more people, send
and receive messages and send Videomail, all over a simple LAN connection. The
users do not need to be connected to the internet and, once set up, there is no cost for
calls or messages. The advanced image observation and adaption features bring
characteristics to the application that have not been seen in similar applications that
have been developed. The application is coded using the Java programming language.
There are not really any similar applications that have been coded in Java; the majority
use C++. This project provides a good learning tool by showing the extent to which
Java and the Java Media Framework (JMF) can be used for telephony and video
conferencing applications. The application is portable, in that it can be used in
different network situations (although it will perform best on a wired network, it will
also work on a wireless one). In addition, the application can be moved between
different platforms, an inherent benefit of Java.
the play. The likes of the adaption and the image observation could be made optional
at the higher levels.
7 Appendix 1
7.1 Call Setup Request
protected synchronized void processCallSetupRequest(CallObject tempCallObject,
        ServerHandle tempServerHandleSender) {
    MappingObject tempMappingObject;
    ProfileObject tempProfileObject;
    MessageObject tempMessageObject;
    ServerHandle tempServerHandleDestination;
    int tempDestinationID = tempCallObject.getDestinationID();
    int tempSenderID = tempCallObject.getSenderID();

    // Look up the mapping for the dialled number; a null mapping means the number does not exist.
    tempMappingObject = this.server.serverSideStorage.getMapping(tempDestinationID);
    if (tempMappingObject == null) {
        this.updateSystemLog("Call setup failed");
        tempMessageObject = this.compileMessageObject(0000000, tempSenderID, 302,
                "The number you dialled is incorrect, please try again!");
        tempServerHandleSender.sendPullMessage(tempMessageObject);
        return;
    }

    // A status other than 1 means the destination client is not free to take a call.
    if (tempMappingObject.getClientStatus() != 1) {
        this.updateSystemLog("Call setup failed");
        tempMessageObject = this.compileMessageObject(0000000, tempSenderID, 302,
                "The number you dialed cannot be reached at this time, please try again later");
        tempServerHandleSender.sendPullMessage(tempMessageObject);
        return;
    }

    // The destination exists and is available: forward the call request (payload type 300) to it.
    tempProfileObject = tempMappingObject.getClientProfile();
    tempServerHandleDestination = tempMappingObject.getServerHandle();
    tempCallObject.setDestinationInetAddress(tempProfileObject.getInetAddress());
    tempMessageObject = this.compileMessageObject(0000000, tempDestinationID, 300, tempCallObject);
    tempServerHandleDestination.sendPushMessage(tempMessageObject);

    // Wait for the destination's answer and relay it to the caller.
    tempMessageObject = tempServerHandleDestination.receivePushMessage();
    if (tempMessageObject.getPayloadType() == 301) {
        // Call accepted: mark both clients as busy (status 2).
        this.updateSystemLog("Call setup complete");
        this.server.serverSideStorage.updateMapping(tempDestinationID, 2);
        this.server.serverSideStorage.updateMapping(tempSenderID, 2);
        tempServerHandleSender.sendPullMessage(tempMessageObject);
        return;
    } else if (tempMessageObject.getPayloadType() == 302) {
        // Call rejected: simply forward the rejection to the caller.
        this.updateSystemLog("Call setup complete");
        tempServerHandleSender.sendPullMessage(tempMessageObject);
        return;
    }
}
7.2 Login Request

        return;
    } else {
        // Password did not match: send an error pull message back and log the failure.
        tempMessageObject = this.compileMessageObject(0000000, 0000000, 102,
                "ERROR: Password incorrect");
        tempServerHandle.sendPullMessage(tempMessageObject);
        this.updateSystemLog("Login Failed");
        return;
    }
}
            this.server.serverSideStorage.updateMapping(
                    tempConferenceParticipant.getParticipantID(), 1);
            this.updateSystemLog("Participant" + tempConferenceParticipant
                    + " removed from conference");
        } else {
            this.updateSystemLog("Participant removal error");
        }
    }

    // Fetch the participant record for this conference position, mark that client as
    // available again (status 1) and send a type 313 message back through the requesting handle.
    tempConferenceParticipant = this.server.serverSideStorage.getConferenceParticipant(
            tempConferencePosition, null);
    this.server.serverSideStorage.updateMapping(
            tempConferenceParticipant.getParticipantID(), 1);
    tempMessageObject = this.compileMessageObject(0000000,
            tempConferenceParticipant.getParticipantID(), 313, tempConferenceParticipant);
    tempServerHandleSender.sendPullMessage(tempMessageObject);
    }
}
8 Appendix 2
8.1 Image Observation Code
int imageObservationAverage = 0;
this.imageObservationLastSampleTime = System.currentTimeMillis();
System.out.println("Bytes Sent: " + bytesSent);
System.out.println("Byte Rate: " + byteRate);
System.out.println("timeDelay Rate: " + timeDelay);
this.streamTotalBytesSent = byteSentTotal;

// Average the byte rates of the recent samples held in imageObservationArray.
for (int j = 0; j < imageObservationArray.length; j++) {
    imageObservationAverage = imageObservationArray[j] + imageObservationAverage;
}
imageObservationAverage = (int) (imageObservationAverage / imageObservationArray.length);

// The threshold is 77% of the running average (see section 3.5).
int temp = (int) (imageObservationAverage * 0.77);
if (byteRate < temp) {
    // Byte rate has dropped below the threshold: count it, but do not let the
    // low sample drag the running average down.
    imageObservationMarkCount++;
} else {
    // Normal sample: shift the window along, store the new byte rate and reset the count.
    System.arraycopy(imageObservationArray, 1, imageObservationArray, 0,
            imageObservationArray.length - 1);
    imageObservationArray[3] = byteRate;
    imageObservationMarkCount = 0;
}
System.out.println("AV: " + imageObservationAverage + " Mark: "
        + imageObservationMarkCount + " Temp: " + temp);

// Ten consecutive low samples are taken to mean the user has left the shot: end the call.
if (imageObservationMarkCount > 10) {
    this.client.clientSideUtilities.processCallTeardown();
    this.imageObservationMarkCount = 0;
}