
Evaluation of Two SCTP

Implementations in OSE and Linux

















SOPHIA LI
































Master of Science Thesis
Stockholm, Sweden 2006





Evaluation of Two SCTP
Implementations in OSE and Linux









SOPHIA LI




Master's Thesis in Computer Science (20 credits)
at the School of Computer Science and Engineering
Royal Institute of Technology year 2006
Supervisor at CSC was Olof Hagsand
Examiner was Stefan Arnborg

TRITA-CSC-E 2006:056
ISRN-KTH/CSC/E--06/056--SE
ISSN-1653-5715





Royal Institute of Technology
School of Computer Science and Communication

KTH CSC
SE-100 44 Stockholm, Sweden

URL: www.csc.kth.se









Abstract
This thesis compares the performance of the Stream Control Transmission
Protocol (SCTP) in two different operating systems, OSE and Linux, by
measuring CPU load.
SCTP traffic was sent between two regional processors in the OSE
environment, and the same traffic scenarios were tested between another two
regional processors in the Linux environment. The same algorithm was
used to calculate the CPU load of SCTP in OSE and Linux, and the
measurement results were then compared and analyzed.
The final conclusion is that the SCTP performance in OSE is better than that
in Linux at low bit rates, but worse than that in Linux at high bit rates.














Evaluation of Two SCTP Implementations in OSE and Linux

Summary
The task of this thesis project is to compare the CPU load of SCTP (Stream
Control Transmission Protocol) in two operating systems, OSE and Linux.
SCTP traffic is sent between two regional processors in OSE. The same
traffic scenario is also tested between two processors in Linux. An algorithm
is designed so that the CPU load of SCTP in OSE and Linux can be
calculated. The measurement results are then compared and analyzed.
The final result is that SCTP in OSE has a lower CPU load than in Linux at
lower bit rates and a higher CPU load than in Linux at higher bit rates.















Acknowledgement
This is a master's thesis in Computer Science at Nada, the Royal Institute of
Technology (Kungliga Tekniska Högskolan). The supervisor at Nada is Olof
Hagsand and the examiner is Professor Stefan Arnborg. The thesis work was
carried out at Ericsson AB in Älvsjö, where the supervisors are Anders
Magnusson and Stefan Mattson.
I would like to express my gratitude to Anders Magnusson for his excellent
guidance. His advice has been of vital importance during my work.
I also wish to thank my partner Qi Zhang at IMIT, KTH for his
companionship and countless fruitful discussions, and all the colleagues at
UZ/DK in Ericsson AB for their companionship, with special thanks to
Stefan Mattson, Richard Tham and Per Holmgren for their kind help during
this work.
Last but not least, I wish to thank my family: my husband, Min Qiu, and my
two kids, Katarina and Henrik, for their love and support.














Contents

1. Introduction .......................................................................................1
1.1 Overview of the thesis......................................................................1
1.2 Problem .......................................................................................2
1.3 Assigner .......................................................................................2
1.4 Assignment and objectives...............................................................2
1.5 Tasks .......................................................................................3
1.6 Limitations .......................................................................................3
2. SCTP .......................................................................................4
2.1 Introduction of SCTP .......................................................................4
2.2 Message Format................................................................................5
2.3 Multihoming.....................................................................................6
2.4 Multistreaming.................................................................................7
2.5 Conclusion of SCTP and TCP..........................................................8
3. Operating System....................................................................................9
3.1 General .......................................................................................9
3.2 Processes .......................................................................................9
3.3 Scheduling .....................................................................................10
3.4 Memory Management.....................................................................10
3.5 Inter-process Communication (IPC)...............................................11
3.6 Process Synchronization.................................................................12
3.7 Profiler .....................................................................................12
4. IP Stack .....................................................................................15
4.1 TIP stack .....................................................................................15
4.2 Linux IP stack.................................................................................17
5. Test Application....................................................................................19
5.1 General .....................................................................................19
5.2 Traffic sending format....................................................................20
5.3 The CPU load of test applications..................................................21
6. Hardware Environment.........................................................................22
7. CPU Load Measurement Algorithms....................................................24
8. Measurement Results............................................................................25
8.1 Measurement results of SCTP in OSE and Linux..........................25
8.2 Discussions about the measurement results...................................27
8.3 Conclusion .....................................................................................27
9. Analysis .....................................................................................28
9.1 Buffer Copying...............................................................................28
9.2 Context Switch...............................................................................28
9.3 Chunk bundling for SCTP..............................................................31
9.4 Other Aspects.................................................................................32
10. Final Conclusion.................................................................................33
11. Future studies .....................................................................................33
REFERENCES .....................................................................................34
APPENDIX A Terms and Abbreviations................................................36
APPENDIX B Test Procedure in OSE....................................................38
APPENDIX C Implementation of Memory profiler in OSE ..................41
APPENDIX D SCTP Measurement Data for Linux...............................42
APPENDIX E TCP Measurement Data in Linux...................................46


1 Introduction
This report is a master's thesis report in Computer Science at Nada, the Royal
Institute of Technology. The work was carried out at Ericsson AB in Älvsjö.
The supervisors at Ericsson are Anders Magnusson and Stefan Mattson, and
the supervisor at Nada is Olof Hagsand. Detailed background literature and a
description of how the work was carried out are provided in the report. After
reading this report, the reader will have a general idea of the performance of
SCTP in OSE and Linux.
1.1 Overview of the thesis
The whole report is divided into 11 sections.
Section 1 is the introduction part. It contains the disposition of the report and
the general information about the assigner and this thesis work.
An introduction to the Stream Control Transmission Protocol (SCTP) is
provided in section 2. The basic and outstanding features of SCTP, such as
the message format, multihoming and multistreaming, are discussed by
comparing SCTP with TCP.
In section 3, the two operating systems used in this thesis work, OSE Delta
and MontaVista Linux 3.1, are introduced and compared from the aspects of
processes, scheduling, memory management, inter-process communication,
process synchronization and profilers.
The basic structures of the two IP stacks, the TIP stack and the Linux IP
stack (the Linux kernel TCP/IP stack and OpenSS7), are described
in detail in section 4.
In section 5, general information about the two test applications, TEIP in
OSE and Iperf in Linux, is given. The traffic-generating format, which is the
same in TEIP and Iperf, is also presented in this section.
The hardware environment for OSE and Linux is very similar so that the
results are comparable. In section 6 the hardware environment is presented
with the help of figures.
In section 7 the algorithm to calculate the CPU load is explained in detail.
The measurement results and some discussions of the results are shown in
section 8.



Section 9 analyzes buffer copying, context switching, chunk bundling, the
fast path and related factors in OSE and Linux. These factors usually have a
large influence on the CPU load.
Section 10 draws conclusions from the whole project.
Section 11 outlines possible future work.
1.2 Problem
OSE Delta is an operating system kernel produced by ENEA Systems.
Ericsson has a large number of applications that are built to run on OSE
Delta. However, there are alternatives, such as Linux, on which to build
tailor-made applications, especially within the IP domain. Linux has a
mature and standardized IP stack with a lot of open source code available.
Ericsson AB therefore wants to weigh the open-source advantages of Linux
against the real-time characteristics of OSE Delta. This thesis work,
Evaluation of the SCTP performance in OSE and Linux, is a major part of
that analysis.

1.3 Assigner
The assigner of this master's thesis is Ericsson AB and the contact person is
Anders Magnusson in section UZ/DK in Älvsjö. UZ/DK is responsible for
protocol handling. They use the IP stack from Ericsson Telebit in Denmark
and produce their own adaptation layer to port the IP stack to GARP (the
Regional Processor).
1.4 Assignment and objectives
The scope of this thesis work is to measure and compare the CPU load of the
SCTP/IP stack on GARP (the Regional Processor) for both OSE and Linux.
On the two platforms the hardware is identical while the software differs
considerably: not only are the SCTP/IP stack, the OS and the test application
different, but so is the system design of the stack. There are also differences
in buffer handling and copying of data. Note that these differences are to a
large extent not implied by the OS itself but by the actual
implementation.



The assignment consists of two parts. The first part is to send SCTP and TCP
traffic between two GARPs (Regional Processors) running OSE and then
send the same traffic between two GARPs running Linux. The CPU load of
the IP stacks is measured. TCP is not really interesting for this project in
itself. However, TCP is already a mature protocol in OSE while SCTP is just
a prototype. Ericsson wants to estimate how a mature SCTP in OSE would
compare with SCTP in Linux. It is believed that the comparison result for
SCTP should resemble that for TCP, since SCTP and TCP are two similar
protocols, so the results from the TCP tests are good input to this estimate.
The project tries to make the traffic scenario as similar as possible for both
OSE and Linux so that the results are comparable.
The second part is to analyze the results from all the tests. The main goal of
this part is to find why the CPU load is different for the SCTP in OSE and in
Linux.
1.5 Tasks

1. Design Test Application to generate SCTP and TCP traffic.
2. Run TCP and SCTP traffic.
3. Design an algorithm to measure the CPU load of SCTP in OSE and
Linux.
4. Prototype any ideas that surface during the tests.
5. Analyze and present the results.

1.6 Limitations
This thesis was carried out by Qi Zhang at KTH and me. Qi is responsible for
the Linux part and I am responsible for the OSE part. As a result, this report
mainly focuses on OSE.
For business confidentiality reasons, the measurement data for OSE are not
included in this report.




2 SCTP

The Stream Control Transmission Protocol (SCTP) [13, 14] is a new standard
for general-purpose transport proposed by the Internet Engineering Task
Force. SCTP addresses application and security gaps left open by its
predecessors, TCP and UDP. This section provides an overview of SCTP,
focusing on its outstanding features by comparing SCTP with TCP.

2.1 Introduction of SCTP

With the exponential growth of the Internet, IP technology has established
itself as a cornerstone of modern digital communication. Until now, there
have been two general-purpose transport protocols widely used for
applications over IP networks: the User Datagram Protocol (UDP) and the
Transmission Control Protocol (TCP). Each provides a set of services that
caters to certain classes of applications. However, the services provided by
TCP and UDP are disjoint and together do not ideally satisfy the needs of all
network applications. SCTP is designed to bridge the gap between UDP and
TCP, and addresses shortcomings of both. SCTP was originally developed to
carry telephony signalling messages over IP networks for
telecommunications and e-commerce systems. With continued work, SCTP
evolved into a general-purpose transport protocol. Today it is a proposed
Internet Engineering Task Force standard (RFC 2960) [14, 15].
Like TCP, SCTP provides a reliable, full-duplex connection and mechanisms
to control network congestion. Unlike both TCP and UDP, however, SCTP
offers new delivery options that are particularly desirable for telephony
signaling and multimedia applications. Table 1 compares SCTP's services
and features with those of TCP and UDP.







Table 1: Comparison of SCTP services and features with those of TCP and
UDP [21]
An SCTP connection, called an association, provides novel services such as
multihoming, which allows the end points of a single association to have
multiple IP addresses, and multistreaming, which allows for independent
delivery among data streams. SCTP also features a four-way handshake to
establish an association, which makes it resistant to blind denial-of-service
attacks and thus improves overall protocol security.
2.2 Message Format

TCP provides a byte-stream data delivery service, whereas SCTP provides a
message-oriented data delivery service. SCTP packets are structured to
provide a message-oriented service, and allow flexible message bundling.
Figure 1 illustrates a generalization of the SCTP packet format.







Figure 1: SCTP packet format. The common header is followed by one or
more concatenated chunks containing either control or data information [13].



An SCTP packet always begins with the SCTP common header. The common
header is a minimal structure that provides three basic functions:

1. Source and destination ports. Together with the IP addresses in the IP
header, the port numbers identify the association to which an SCTP
packet belongs.
2. Verification tag. Vtags ensure that the packet belongs to the current
incarnation of an association.
3. Checksum. This computed value maintains the integrity of the entire
packet's data.
The remainder of an SCTP packet consists of one or more concatenated
building blocks called chunks that contain either control or data information.
This format differs from TCP and UDP packets, which include control
information in the header and offer only a single optional data field. SCTP
control chunks transfer information needed for association functionality,
while data chunks carry application layer data.

SCTP is extensible, allowing new control chunk types to be defined in the
future. Each chunk has a chunk header that identifies its length, type, and any
special flags the type needs. SCTP has the flexibility to concatenate different
chunk types into a single data packet. The only restriction is on packet size,
which cannot exceed the destination path's maximum transmission unit
(MTU) size.

2.3 Multihoming




SCTP has the functionality of multihoming. A multihomed host is accessible
through multiple IP addresses. If one of its IP addresses fails, the host can
still receive data through an alternative interface.

Currently, SCTP uses multihoming only for redundancy but not for load
balancing. Each end point chooses a single primary destination address for
sending all new data chunks during normal transmission. Continued failure to
reach the primary address ultimately results in failure detection, at which
point the end point transmits all chunks to an alternate destination until the
primary destination becomes reachable again.

SCTP keeps track of each destination address's reachability through two
mechanisms: ACKs of data chunks and heartbeat chunks, i.e. control chunks
that periodically probe a destination's status. Any chunk that requires a
response can be used to determine a destination's reachability. Usually,
DATA chunks represent the majority of such chunks, but DATA chunks are
generally sent to only one destination. Regardless of the chunk sent, if an ack
is received, the sender can conclude that the address the chunk was sent to is
reachable. If an ack is not received, however, the sender cannot conclude
immediately that the destination is unreachable. Instead, the sender credits
that destination address with a loss. If a significant number of consecutive
chunks are lost to the same destination, the sender concludes that the
destination is unreachable, and an alternate destination IP address is chosen
dynamically [15].

Because the SCTP/IP stack in OSE is currently only a prototype without
multihoming support, this function is outside the scope of this thesis work.

2.4 Multistreaming

One of the unique features that SCTP brings to the transport layer is
multistreaming. An SCTP stream is a unidirectional logical data flow within
an SCTP association. Streams are specified during association setup and exist
for the life of the association. Each stream is allocated independent send and
receive buffers. In Figure 3, Hosts A and B have a multistreamed association:
during association setup, Host A requested three streams to Host B
(numbered 0-2), while Host B did not request multiple streams to Host A, so
Host B maintains only one stream to Host A (numbered 0).








Figure 3. SCTP multistreamed association. Streams are unidirectional
logical data flows that the SCTP end points negotiate during association
setup [13].

Within streams, data order and reliability are preserved with the use of
Stream Sequence Numbers (SSNs) for each DATA chunk. However, between
streams, only partial data order will be preserved to address head-of-line
blocking. In TCP, when a sender transmits multiple TCP segments and the
first of these segments is lost, the remaining segments must wait in the
receiver's queue until the first segment is retransmitted and arrives correctly.
This blockage will delay the delivery of data to the application, which in
signalling and some multimedia applications is unacceptable. In SCTP,
however, if data on Stream 1 is lost, only Stream 1 is blocked at the receiver
while awaiting retransmissions. The data on the remaining streams is
deliverable to the application.

An endpoint's multiple streams are logically independent from an endpoint's
multiple interfaces. Data from any stream may potentially travel over any
path. Also, streams themselves do not have sending restrictions. Instead, they
conform to the restrictions placed on the association as a whole, such as the
receiver's advertised window and the destinations' congestion window.
2.5 Conclusion of SCTP and TCP

SCTP is designed to bridge the gap between TCP and UDP and addresses
some shortcomings of both. In this thesis work, it is expected that SCTP's
performance is similar to TCP's, but with some improvement.





3 Operating System

OSE Delta and MontaVista Linux 3.1 are the two operating systems that are
used in this thesis work. OSE Delta is a real-time operating system kernel
provided by ENEA OSE Systems AB, Sweden. MontaVista Linux
3.1 (based on the GNU Linux 2.4.20 kernel) is a pre-emptive real-time OS
kernel provided by MontaVista Software Inc [23].

In this section, these two systems are introduced and compared from the
aspects of processes, scheduling, memory management, inter-process
communication and process synchronization.
3.1 General

OSE
OSE Delta [11] is a proprietary RTOS designed to satisfy the specific needs
of the tele- and data-communication industries. OSE's main application areas
are wireless systems, network infrastructure and safety-critical systems [24].

Linux
Linux is a UNIX-like general-purpose operating system. It is available in
source code and some distributions are available free of charge. Linux is
designed for high throughput and high average performance. Linux is
considered to be a very reliable general purpose OS with good support for
communication protocols.
3.2 Processes

OSE
There are different types of processes in OSE. Interrupt processes are activated in
response to a hardware interrupt or a software event. Timer Interrupt
processes behave in the same way except that the system timer activates
them. All interrupt processes run from start to end unless interrupted by an
interrupt process with higher priority. A prioritised process is the most
common process. It runs as long as no interrupts are received, or until it
blocks on a resource, or until a prioritised process with higher priority is
ready to run. Background processes are run in a time sharing mode (round-
robin) at a priority level below the prioritised processes, i.e. they run only
when no other processes are ready.

Linux
A Linux system allocates CPU time through the use of processes and threads.
A process is independent from its creator with its own memory area and its
own process identification (PID). A new thread of execution has its own
stack (local variables) but shares global variables and file descriptors with its
creator. However, the overhead of creating a new thread is less than that of
creating a new process. Further, switching between threads is more efficient



than switching between processes. Both the processes and the threads come
in two types, normal and real-time. The difference between them is in how
they are scheduled.
3.3 Scheduling

OSE
OSE supports pre-emptive priority scheduling [18]. Processes can be
scheduled in a priority-based, a timer-based and a time sharing way.
Regardless of its priority, a process may be pre-empted at any time. After that
the kernel may either return control to the pre-empted process or give control
to a new process.

Linux
Linux has three different scheduling policies: one for normal
applications (SCHED_OTHER) and two for real-time applications
(SCHED_FIFO and SCHED_RR).

SCHED_OTHER uses a time-sharing, credit-based algorithm for fair pre-
emptive scheduling among multiple processes/threads. The more credits a
process has, the more likely it is to run. The process credits, or dynamic
priority, are based on the process's historical running time.

The other scheduling algorithms are designed for real-time applications
where static priorities are more important than fairness. Processes/threads
scheduled with SCHED_FIFO always pre-empt any currently running
SCHED_OTHER process. If two real-time processes are ready to run at the
same time, the scheduler chooses the one with the highest static priority.
3.4 Memory Management

OSE
In an OSE system there is always a system pool of memory [11]. The system
pool is crucial to the kernel and is located in kernel memory. All processes
can allocate their memory from the system pool, with one major
disadvantage: if the system pool is corrupted the whole system will crash. A
better alternative is to group processes into blocks and to dedicate a
local pool to each block. If a local pool gets corrupted, it only affects
the processes in that block. Another advantage is that many system calls
work on entire blocks, i.e. all processes in a block can be killed or started
with one single system call. It is also possible to send signals to a block; the
block then acts as a router and forwards the signal to the right process inside
the block. One or several blocks can be placed in a single segment. However,
the only way to get full security is to isolate each segment with the MMU.






Linux
Linux has a virtual memory mechanism that allows demand paging and
process swapping. According to much of the literature, virtual memory
should not be used in a real-time system due to the lack of determinism.
However, in Linux both the process swapping and demand paging can be
turned off with the swapoff and mlock system calls. The memory
management will still have some of the benefits of using a virtual memory
mechanism, such as the protection and lack of external fragmentation, but not
the downside when it comes to real-time performance.

3.5 Inter-process Communication (IPC)

OSE
The way to pass data between processes in an OSE system is by signals. In an
OSE system a signal is a message that is sent directly from one process to
another. A message contains an ID, and the addresses of the sender and the
receiver, as well as data. Once a signal is sent, the sending process can no
longer access it, i.e. ownership of a signal is never shared. The receiving
process may specify what signal it wants to receive at any particular moment.
The process can wait or poll the signal queue for a specific signal.

The kernel manages the ownership of signals. The known ownership makes it
possible for a system debugger to keep track of the communication between
processes, e.g. it makes it possible to set a break point on a specific signal
during system debugging.

Normally when transferring signals, the sending process only sends a pointer
to a signal buffer. The receiving process uses this pointer to access the signal
buffer. This has the advantage of making the system fast, but there is also the
danger of the receiving process destroying the pool of the sending process.
However, if a signal is sent between processes that are located in different
segments the user can choose between having only the pointer transferred or
having the signal buffer copied. Signals will die with the owning process and
be cleaned up by the OS itself.

Linux
Linux supports pipes, FIFOs, message queues and shared memory as ways of
passing data between processes or threads.

Pipes are unidirectional byte streams that connect the standard output of
one process to the standard input of another. Under Linux, pipes appear as
just another type of inode to the virtual file system. To synchronise the
reader and the writer, each pipe has a pair of wait queues. Signals can also be
used for synchronisation. Pipes can only be used between related processes,
i.e. processes that have been started from a common ancestor process.




For communication between unrelated processes Linux supports named
pipes, known as FIFOs because they operate on a first-in, first-out basis.
Unlike pipes, FIFOs are not temporary objects; they are entities in the file
system and are created using the mkfifo command. Linux itself will not
delete a FIFO if the communicating processes die. Message queues allow
one or more processes to write messages that can be read by one or more
reading processes. The lengths of these queues are limited, so some protection
against deadlocks is needed. Linux itself will not delete a message queue if
one of the communicating processes dies, and it is difficult for the application
to find out that the queue should be deleted.

Shared memory is a fast way to communicate large or small amounts of data.
It allows two unrelated processes to access the same logical memory.
However, it must be used with semaphores or some other synchronisation
mechanism since it cannot provide any synchronisation by itself.
3.6 Process Synchronization

OSE
The recommended way to synchronise two processes is by signals (see
above).

Linux
The standard Linux mechanism for informing a process that an event has
occurred is a signal. The most important difference between an OSE signal
and a Linux signal is that the Linux signal cannot carry information. Only the
kernel and the superuser can send signals to any other process. Normal
processes can only send signals to processes with the same user-id and
group-id or to processes in the same process group.

3.7 Profiler
Profilers are tools used to collect run-time statistics.
OSE
Two profilers in OSE were used in this project: the process profiler and the
memory profiler.
The process profiler is used to measure statistics for different processes. It is
a "sample-based" profiler.



The basic theory of the process profiler is shown in Figure 5. A time interval
T is selected in order to analyze the CPU load of processes. T is divided into
N equal sampling intervals. At the end of each sampling interval (at t0, t1,
t2...tn), a timer interrupt is invoked, and the ISR (Interrupt Service Routine)
finds the currently running process and increments the counter associated
with that process. At the end of the interval T, each process has a counter
indicating how many times it was found running when an interrupt occurred.
From this data, the CPU load can be estimated. One problem is that a process
(C) may run for less than one sampling interval and then stop (e.g. because it
finished or was pre-empted), after which another process (A) starts and
continues running until the sampling interval expires. Then only the counter
of (A) is incremented, not that of (C). Therefore this is not a perfect way to
measure the CPU load of processes, but it is easy to implement and therefore
widely adopted. The precision of the measurement can be increased by
shortening the sampling interval; that is, for the same T, the bigger N is, the
higher the precision. On the other hand, when N increases, the cost of the
timer interrupts also increases.

Figure 5: Sample-based profiling
The memory profiler in OSE divides the memory into a number of equally
sized slices. A counter for a slice is incremented each time an interrupt
occurs and the Program Counter points to an area within the slice. By looking
at the counters for each slice, it is possible to determine which part of the
code runs most frequently.



The memory profiler is implemented as a timer interrupt process, reading the
register SRR0 every 2 milliseconds. In the tests every second system tick is
used by another timer interrupt process, which makes it possible to collect
usable data only on the remaining ticks. The effect is that the usable data
collected by this profiler is collected every 4 milliseconds (every second
system tick). The address of the tick process will hence take half of the hits
of the profiler.
8 Mbytes of memory are used to store the hit counters. Each counter is 32
bits, so the profiler can divide the memory into at most 2M (2097152) slices.
Both the process profiler and the memory profiler were used in the
experiments to get the pure CPU load of the TIP stack.

Linux
No Linux profilers were used in this project.





4 IP Stack
The IP stack used for OSE is the TIP stack produced by Ericsson Telebit in
Denmark. The TCP/IP stack used in Linux is the kernel-implemented TCP/IP
stack of Linux kernel 2.4.20. The SCTP/IP stack used for Linux is the
OpenSS7 SCTP, which is an open-source project [22].
In this section, the basic structures and functions of the two IP stacks, the
TIP stack and the Linux IP stack (the Linux kernel TCP/IP stack and
OpenSS7), are presented in detail.
4.1 TIP stack
The TIP stack in OSE supports raw IP, UDP, TCP and SCTP. It mainly
consists of INETL and INETR, see Figure 6. INETL is the unit that contains
the interface library: header files, OSE signal definitions and a socket
library to be linked into the application. The services provided by INETL
can be divided into two parts:
- Stack configuration and management
- Traffic and socket configuration
INETR is the IP stack for GARP. INETR consists of the TIP core and an RP
adaptation layer (glue). The TIP core is the generic (OS- and HW-independent)
IP stack. It includes a number of protocol modules (RAW, UDP, TCP and so on)
as well as a scheduler which is used to prioritize the different threads (IP
datagrams, configuration requests, ...). Because the TIP core is HW- and
OS-independent, it needs a supporting glue layer for everything related to
the HW and OS. The glue layer provides not only the IP stack with access to
OS functions such as timers and memory, but also a configuration interface
towards applications and OS monitor commands such as ifconfig, route and
ping. Additionally, the glue layer attaches the interface drivers to the IP
stack.
An application interfaces the stack through socket (function) calls to INETL
(for sending and receiving IP datagrams) and through OSE signals to the TIP
stack. The socket calls are forwarded from INETL to INETR by means of OSE
signals.





[Figure: an application process (linked with the RAZOR library and the INETL
socket library) sits on top of INETR, which consists of the socket glue, OSI
glue and OM glue around the TIP core (socket layer, SCTP, TCP, UDP, IP and
lower layers) and the Ethernet Distributor (ED), all running on the OSE Delta
kernel.]
Figure 6. This picture shows the structure of the TIP stack




Note that TIP SCTP is currently only a prototype. Many functions are not
yet implemented, such as fragmentation, fast path and retransmission [1, 20].
4.2 Linux IP stack
4.2.1 Linux native TCP/IP Stack
The TCP/IP stack used in Linux is the one implemented in the Linux 2.4.20
kernel. See Figure 7.
[Figure: the application process makes socket calls into the kernel socket
layer; below it, OpenSS7 SCTP sits alongside the native TCP and UDP modules,
on top of IP, the lower layers and the other Linux kernel components.]
Figure 7. OpenSS7 SCTP and the Linux TCP/IP stack




4.2.2 OpenSS7 SCTP
OpenSS7 SCTP is distributed as a kernel patch for various versions of the
Linux kernel. Since MontaVista Linux 3.1 is based on the GNU/Linux 2.4.20
kernel, the patch kernel-2.4.20-28.9-sctp-0.2.19.patch was used in the
experiments in this thesis work.
In order to use OpenSS7 SCTP, the following steps were taken:

1 Apply the OpenSS7 SCTP patch to the source code of the MontaVista Linux
kernel.
2 Re-compile the kernel into a kernel image using the MontaVista kernel
development tools.
3 Load the image onto the GARP via PBOOT and run the kernel.














5 Test Application
TEIP is the test application in OSE. Because TEIP is specially designed for,
and only suitable in, OSE, Iperf was chosen as the test application for
Linux. To make the results comparable, Iperf was modified in this project to
use the same algorithm as TEIP, so that exactly the same traffic scenarios
are generated.
This section consists of two parts. First, general information about TEIP
and Iperf is given. Then follows a description of the traffic-generating
algorithm used in both TEIP and Iperf.
5.1 General
5.1.1 TEIP
TEIP[9] is the test application used in OSE. It generates different kinds of
traffic and measures the efficiency of the IP stack. TEIP supports RAW,
UDP and TCP. As part of this thesis work, TEIP was modified to support
SCTP.
To be able to send and receive IP traffic, a user has to start up TEIP
Traffic Processes (TTPs). These processes can be configured to open many
sockets and handle different protocols. A user can then initiate the sending
of traffic. Statistics are stored in the TTP and can be fetched with a
command.
TEIP can generate acknowledgement-based traffic and time-based traffic. In
acknowledgement-based traffic, a new burst of packets is sent when the
receiver has acknowledged that it received the last burst. In time-based
traffic, a new burst of packets is sent when a specified time has passed.
All the test scenarios in this project are time-based.
Each started traffic scenario is measured in two ways. First, the time it
takes to execute the traffic scenario is measured. Then the idle load is
measured. This is done with a process on priority 31 called
TEIP_IDLE_COUNT, which automatically starts counting loops when the first
TTP process is created. The number of loops passed during a traffic scenario
and the time it takes are then stored in the socket structure and can be
fetched with a command.
5.1.2 Iperf
Iperf is the test application in Linux. Iperf[7] is an open source network
bandwidth measurement tool for TCP and UDP. OpenSS7 has modified Iperf
to support OpenSS7 SCTP.



Iperf was modified in this project so that it uses the same algorithm as
TEIP to generate traffic and measure the CPU load of the IP stacks.
5.2 Traffic sending format
The message sending format is shown in Figure 8:
[Figure: packets 1..N of a burst are sent back to back during the Traffic
Time; the remainder of the Burst Timeout is the DelayTime before the next
burst starts.]
Figure 8. Traffic sending format.
The following terms are used in Figure 8:
Traffic Time: the exact time in which all packets in one burst are sent.
Burst Timeout: the predefined time allotted for sending one burst.
DelayTime: the time the sender waits before sending the next burst.
These are calculated from the bit rate, the burst size and the packet size
according to the following formulas:
Burst Timeout = Packet Size * Burst Size * 8 / Bit Rate
DelayTime = Burst Timeout - Traffic Time
The traffic generated in both TEIP and Iperf is based on bursts. The sender
sends the packets in one burst without any delay. As shown in Figure 8,
packet 1, packet 2, ..., packet N belong to one burst and are sent without
any delay. After all the packets in one burst have been sent, the traffic
sender stops sending for DelayTime and then begins to send the next burst.
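The two formulas can be wrapped in a small helper that makes the units explicit (a sketch; the packet size, burst size and bit rate below are taken from one of the test scenarios, while the traffic time is invented):

```python
def burst_timing(packet_size, burst_size, bit_rate, traffic_time):
    """Return (burst_timeout, delay_time) for one burst.

    packet_size  -- bytes per packet
    burst_size   -- packets per burst
    bit_rate     -- target bit rate in bit/s
    traffic_time -- measured time to actually send the burst, in seconds
    """
    burst_timeout = packet_size * burst_size * 8 / bit_rate
    delay_time = burst_timeout - traffic_time
    return burst_timeout, delay_time

# 400-byte packets, burst size 5, 26 Mbit/s; assume sending took 0.1 ms.
timeout, delay = burst_timing(400, 5, 26_000_000, traffic_time=0.0001)
print(f"burst timeout {timeout * 1e3:.3f} ms, delay {delay * 1e3:.3f} ms")
```

With these numbers the burst timeout is about 0.62 ms, so the sender idles roughly 0.52 ms between bursts.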









5.3 The CPU load of test applications

TEIP
The process profiler can be used directly to fetch the CPU load of TEIP. See
section 3.7 for detailed information about the process profiler.

Iperf
No suitable profiler was found in Linux to get the CPU load of Iperf
directly. Therefore the CPU load of Iperf was estimated in the following way:
Iperf contains many functions, such as getSystemTime, that are not related
to the traffic itself. Two versions of Iperf were therefore used: the normal
version, and a modified version in which all the non-related operations run
twice. The CPU load of the IP stack is calculated for the two versions
respectively (see section 7 for details on calculating the CPU load of the
IP stack). The difference between the two is then the CPU load of Iperf.

According to the test results, the CPU load of Iperf is always less than 1%
and is therefore ignored.






6 Hardware Environment
The hardware environment for OSE and Linux is identical, including the test
target (GARP) and the local network equipment (Ethernet interfaces, switch,
Ethernet cables, etc.). The only difference is the configuration, as shown
in Figures 9(a) and 9(b). However, this configuration difference does not
influence the test results.





Figure 9(a). Hardware environment for OSE





Figure 9(b). Hardware environment for Linux











7 CPU Load Measurement Algorithms
The algorithm to calculate the CPU load is explained in this section.
The idle process is created in both OSE and Linux as a process with a low
priority level that does nothing but count loops. This process always stays
in a ready state and runs whenever no process with higher priority is
running.
The measurement algorithm is designed in the following way:
1. Measure how many loops the idle process can run per second in a silent
system, S. A silent system refers to a system without traffic. Reading the
value of the counter in the idle process at times T1 and T2 gives the values
N1 and N2, respectively:
S [loops/second] = (N2 - N1) / (T2 - T1)
2. Measure how many loops the idle process runs while there is traffic, L.
The traffic starts at time T3 and finishes at time T4, and the counter of
the idle process has the value N3 at T3 and N4 at T4:
L [loops] = N4 - N3
3. The total capacity of the silent system during the time when there is
traffic, C:
C = S * (T4 - T3)
4. The CPU load of the idle process during the time when there is traffic,
N:
N = L / C
5. The CPU load of the traffic, Ytraffic:
Ytraffic = 1 - N
Ytraffic includes not only the CPU load of the IP stack, but also the CPU
load of the test application. However, the impact of the test application
(Iperf) in Linux is so small that it is negligible (see section 5.3). The
pure CPU load of the IP stack in Linux, Ylinux, is therefore:
Ylinux = Ytraffic
6. In OSE, the CPU load of the test application (TEIP) is not negligible and
must be subtracted from Ytraffic. The CPU load of TEIP, Yteip, can be
fetched directly from the process profiler. However, Yteip includes part of
the CPU load of the IP stack, because TEIP calls functions from INETL, which
also belongs to the TIP stack. In other words, part of the CPU load of the
TIP stack, Ylib, has also been subtracted from Ytraffic, so Ylib must be
added back. Ylib can be fetched with the help of the memory profiler. The
pure CPU load of the IP stack in OSE, Yose, is therefore:
Yose = Ytraffic - Yteip + Ylib
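The six steps can be condensed into a short sketch (Python for illustration only; the variable names follow the text and the counter values are invented):

```python
def idle_loop_rate(n1, n2, t1, t2):
    """Step 1: loops per second of the idle process in a silent system, S."""
    return (n2 - n1) / (t2 - t1)

def stack_cpu_load(s, n3, n4, t3, t4, y_app=0.0, y_lib=0.0):
    """Steps 2-6: pure CPU load of the IP stack during a traffic run.

    s      -- idle loop rate of the silent system [loops/s]
    n3, n4 -- idle counter values when the traffic starts (t3) and ends (t4)
    y_app  -- CPU load of the test application (0 for Iperf, Yteip for TEIP)
    y_lib  -- stack load hidden inside the application (Ylib, OSE only)
    """
    l = n4 - n3                        # loops run under load (step 2)
    c = s * (t4 - t3)                  # capacity of a silent system (step 3)
    n = l / c                          # idle fraction (step 4)
    y_traffic = 1.0 - n                # traffic load, stack + app (step 5)
    return y_traffic - y_app + y_lib   # step 6

# Silent system: 10 million loops in 10 s -> S = 1e6 loops/s. Under a
# 60 s traffic run the idle process only manages 36 million loops.
s = idle_loop_rate(0, 10_000_000, 0.0, 10.0)
print(stack_cpu_load(s, 0, 36_000_000, 0.0, 60.0))  # 0.4 -> 40% load
```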



8 Measurement Results
This section consists of three parts. In the first part the measurement results
of SCTP in OSE and Linux are presented with the help of several figures.
Then follows a discussion about some problems that lie in the measurement
results. The last part of this section is the conclusion.
The CPU loads of SCTP and TCP were tested and measured on both the sender
and receiver sides in both the OSE and Linux environments. Burst sizes 1, 5
and 20 were chosen for the tests. Packet sizes range from 25 to 1600 bytes
(in the Linux tests, 10 bytes was also tested), and bit rates range from 1
Mbit/s to 60 Mbit/s (in some tests higher bit rates were also tested).
However, due to a limitation of SCTP in OSE, 1400 bytes is the largest
packet size there. The measurement data for SCTP and TCP in Linux is given
in Appendix D and Appendix E, respectively.
Note:
- For business reasons, the measurement data for SCTP and TCP in OSE is not
shown in this report.
- For business reasons, all the figures in this section are only sketches
showing the tendency of the comparison between SCTP or TCP in OSE and Linux.
They aim to give an overview of the comparison results; the data in the
figures deviate from the real values.
8.1 Measurement results of SCTP in OSE and Linux
Figures 10 and 11 show the performance of SCTP in OSE and Linux for both the
sender and receiver sides.







[Figure: CPU load (%) plotted against bit rate (Mbit/s) for the SCTP sender,
packet size 100 bytes, with one curve for OSE and one for Linux.]
Figure 10. The CPU load of the SCTP sender

[Figure: CPU load (%) plotted against bit rate (Mbit/s) for the SCTP
receiver, packet size 100 bytes, with one curve for OSE and one for Linux.]
Figure 11. The CPU load of the SCTP receiver





The figures above show that the CPU load of SCTP in OSE increases much more
quickly than in Linux.
8.2 Discussions about the measurement results
Linux
According to the CPU load measurement algorithm, the bias of the CPU load
should be zero when the bit rate is close to zero. This holds in OSE, but in
Linux the bias of the CPU load is around 10% instead of zero.
It is believed that this is caused by the polling mechanism. In the Linux IP
stack, the lower layer polls the upper layer to see if there are packets to
be sent (and, likewise, the upper layer polls the lower layer for packets to
be received). Polling is efficient for high bit-rate traffic, where the
checks find packets more often, but not for low bit-rate traffic.
OSE
The above comparison results are of limited value because SCTP in OSE is
just a prototype, and it is unclear whether the comparison would look the
same if SCTP in OSE were a complete protocol. Therefore a comparison of TCP
in OSE and Linux was also made, since TCP in OSE is already a complete
protocol, and it is believed that the differences between TCP and SCTP
performance in OSE are not that big regarding CPU load.
According to the test results, the comparison results for TCP in OSE and
Linux are very similar to Figures 10 and 11 (the detailed figures for the
TCP performance are not shown here).
8.3 Conclusion
The following conclusions can be drawn:
1. When the bit rate increases, the CPU load of SCTP in OSE increases much
faster than that in Linux.
2. When the bit rate is low, SCTP in OSE is better than in Linux because the
CPU load in OSE is smaller. When the bit rate is high, SCTP in Linux is
better than in OSE. The crossover bit rate depends on the packet size: the
bigger the packet size, the higher the crossover bit rate.
3. According to the test results, the CPU load of traffic with small packets
changes more rapidly with the bit rate than that of traffic with big
packets. This is not visible in the figures above, but it is clear from the
measurement data.




9 Analysis
There are many differences in the software environments between OSE and
Linux which result in different behaviors. But it is difficult to analyze and list
all the differences of the OSs, stacks and test applications. Therefore only
some key aspects are highlighted and analyzed in this section.
9.1 Buffer Copying
OSE
For SCTP and TCP, the protocol stack makes two copies of each packet. On the
sender side, INETR copies each packet received from INETL into the TIP
internal memory; when the packet has been processed by the stack, it is
copied into a signal buffer before being passed on to the driver. On the
receiver side, packets are likewise copied twice.
LINUX:
Each packet is managed with the kernel data structure sk_buff, which is
dynamic and flexible: the payload and the various headers are added to the
sk_buff at different times without copying. However, when the user data
(payload) is first placed into the sk_buff, a user-to-kernel copy is
performed.
There are two execution modes in Linux: user mode and kernel mode. User mode
provides the context for all user processes. Kernel mode provides the
context for the kernel threads (bottom halves, tasklets, softirqs, etc.) and
all kernel functions. The two modes have different memory spaces, so if a
task in kernel space needs data that lives in user space, the data must
first be copied from user space to kernel space.
The Linux IP stack lies in the Linux kernel. All the stack related data
structures and functions are in the kernel space. Therefore, the user space
payload must be copied to kernel space sk_buff before the stack can use the
payload.
9.2 Context Switch
9.2.1 Overview of context switch in OSE and Linux
Context switching is the switching of the CPU from one process to another.
OSE
Signals are the mechanism used in OSE on GARP that triggers context
switches. TEIP, INETR (the TIP stack on GARP) and the idle process are the
processes involved in traffic sending in OSE. TEIP and INETR run on priority
16 while the idle process runs on priority 31. All three are regarded as
user processes from the OS kernel's point of view. See Figure 12.








[Figure: the OSE kernel performs context switches between the TEIP, TIP
stack and idle-counter processes, all of which run as processes outside the
kernel.]
Figure 12. Context switching in OSE
Linux
In Linux, the IP stack (kernel) runs on behalf of Iperf and provides its
services to the Iperf user process.
As mentioned in section 9.1, there are two modes in Linux: user mode and
kernel mode. The switch between the two modes is called a mode switch rather
than a context switch, and it is made with the help of system calls. See
Figure 13.









[Figure: Iperf enters the MontaVista Linux kernel (OpenSS7 SCTP and the
Linux TCP/IP stack) via system calls, i.e. mode switches, while context
switches occur between Iperf and the idle-counter process.]
Figure 13. Context switch in Linux

9.2.2 The latency of one context switch in OSE and Linux.
OSE:
A process, P, with priority 16 was created to perform context switches with
TEIP. The process P sends a signal to TEIP; TEIP receives the signal and
sends another signal back to P. In more detail, the sender process P
allocates a signal buffer, attaches a U32 number to it, and sends the signal
to the receiver process (TEIP). TEIP receives the signal, frees the signal
buffer, and then acts as a sender (doing all the work a sender process needs
to do) to send a signal back to P, after which P sends a signal back to TEIP
again. The start and end times are saved in the program so that the total
context-switching time can be calculated.
A new shell command, tipContextSwitchTest, was added to implement the test.
During the test the processes do nothing but context switch; 10,000 context
switches happen without any delay. The test result is:
Test time: 252681 us
Number of context switches: 10,000



Mean latency of one context switch [test time / number of context switches]:
25 us
Note that this result is not the pure cost of a context switch; it also
includes the cost of allocating and freeing the signal buffers.
LINUX:
LMBench is an open-source tool that measures various system latencies
(including context-switch latency) for UNIX and Linux. It was chosen to
measure the context-switch latency in Linux in this project.
The latency of a context switch for MontaVista Linux (2.4.20) on GARP is
1.10-1.12 us.
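The ping-pong technique used by both the OSE test and LMBench can be sketched with a pair of pipes. This is only an illustration of the method: the Python interpreter's overhead dominates, so the measured number is a loose upper bound and not comparable to the 1.10 us figure (requires a Unix-like system for os.fork):

```python
import os
import time

def pingpong_latency(rounds=2000):
    """Bounce one byte between two processes over a pair of pipes and
    return the mean time per process switch (two switches per round trip)."""
    r1, w1 = os.pipe()            # parent -> child
    r2, w2 = os.pipe()            # child -> parent
    pid = os.fork()
    if pid == 0:                  # child: echo every byte straight back
        for _ in range(rounds):
            os.read(r1, 1)
            os.write(w2, b"x")
        os._exit(0)
    start = time.perf_counter()
    for _ in range(rounds):
        os.write(w1, b"x")        # wakes the child; the parent then blocks...
        os.read(r2, 1)            # ...in read until the child answers
    elapsed = time.perf_counter() - start
    os.waitpid(pid, 0)
    return elapsed / (2 * rounds)

print(f"{pingpong_latency() * 1e6:.2f} us per switch (upper bound)")
```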
9.3 Chunk bundling for SCTP
The bundling performance of SCTP in Linux and OSE was tested for three
scenarios, shown in Tables 5 and 6. In the tables, the following terms are
used:
Np = number of packets sent from the application to the IP stack.
Np2 = number of packets actually sent on the Ethernet.
Average bundling rate = Np/Np2.

Bit rate (Mbit/s)   Burst size   Packet size (Byte)   Np        Np2      Average bundling rate
30                  20           100                  2400000   165000   14.5
26                  5            400                  600000    165000   3.6
20                  1            50                   2400000   82000    29.3
Table 5. SCTP bundling performance for Linux









Bit rate (Mbit/s)   Burst size   Packet size (Byte)   Np        Np2      Average bundling rate
30                  20           100                  2400000   320000   7.5
26                  5            400                  600000    396000   2.4
20                  1            50                   2400000   200000   11.8
Table 6. SCTP bundling performance for OSE

For OSE, there is a bundling timeout in TIP SCTP, defined as 200 ms: TIP
SCTP tries to bundle more chunks into an SCTP packet within 200 ms as long
as the MTU is not exceeded.
For Linux, OpenSS7 SCTP bundles data chunks as long as the MTU is not
exceeded and the send queue of the socket layer is not empty.
According to the tables above, Linux bundles about twice as many chunks into
each SCTP packet as OSE does.
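The average bundling rates in the tables follow directly from the definition above; checking the first scenario of each table:

```python
def bundling_rate(np_app, np_wire):
    """Average bundling rate = packets handed to the stack / packets on the wire."""
    return np_app / np_wire

# First scenario (30 Mbit/s, burst size 20, 100-byte packets):
linux = bundling_rate(2_400_000, 165_000)   # Table 5
ose = bundling_rate(2_400_000, 320_000)     # Table 6
print(round(linux, 1), round(ose, 1), round(linux / ose, 1))  # 14.5 7.5 1.9
```

The ratio of about 1.9 is what the text summarizes as Linux bundling roughly twice as many chunks per SCTP packet as OSE.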
9.4 Other Aspects
As mentioned above, there are many differences in the software environments
of OSE and Linux, and it is difficult to find and analyze all the
differences that cause the different behaviors.
Other aspects that may influence the results include the different
implementations of TEIP and Iperf, and of TIP, OpenSS7 and the Linux IP
stack, as well as OS-dependent differences such as task management and
scheduling, memory management, the Ethernet driver and so on.





10 Final Conclusion
The goal of this thesis was to perform a practical comparison of SCTP
performance in OSE and Linux. The two test applications, TEIP and Iperf,
were used in OSE and Linux, respectively. Since TEIP and Iperf use the same
algorithm to generate traffic, the measurement results are comparable.
According to the measurement results, the CPU load of SCTP in OSE
increases much faster than in Linux. When the bit rate is low, SCTP in OSE
performs better than in Linux. When the bit rate is high, SCTP in Linux
performs better than in OSE. The cross point of the bit rate depends on
packet size. The bigger the packet size is, the higher the cross point of the bit
rate is.
Further, the traffic scenarios are burst-based in both OSE and Linux.
However, according to the test results, the performance of SCTP does not
depend on the burst size: even for different burst sizes, the CPU load of
SCTP is the same as long as the same packet size and bit rate are chosen in
the traffic scenarios.
11 Future studies
The SCTP implementation in OSE is still a prototype; functions such as
retransmission have not been implemented yet, so a lot of work remains to
make it a complete protocol stack. Further, smarter techniques need to be
designed in the implementation of the SCTP protocol to get better
performance.



REFERENCES
1. Henrik Karlsson, (2003), General Build Support, Ericsson No. 198 17-
CAA 139 1318 Uen
2. William LeFebvre, Unix Top, http://www.groupsys.com/top/about.shtml
3. Michael Biebl, Debian Package: lksctp-tools(1.0.2-1),
http://packages.debian.org/testing/net/lksctp-tools
4. SS7 over IP Signaling Transport & SCTP
http://www.iec.org/online/tutorials/ss7_over/index.html
5. Testing tools http://lksctp.sourceforge.net/testing.html
6. Fredrick Beste, SS7 Introduction, Ericsson No., UAB/UKY/I-02:108 Uen
7. Mark Gates, Ajay Tirumala, Iperf User Docs,
http://dast.nlanr.net/Projects/Iperf/iperfdocs_1.7.0.ht
8. Solaris Performance Monitoring & Tuning iostat, vmstat & netstat
http://www.adminschoice.com/docs/iostat_vmstat_netstat.htm
9. Anders Magnusson, Block TEIP, Ericsson-Internal information
10. Stefan Mattson, INETR Performance Improvements in MSC R12, Ericsson
No. 095/159 41 FCPW 101 97/F Uen
11. A complete Real-time Operating System Environment for Embedded
System Applications, Ericsson No., UAB/D-00:193 Uen
12. Basic concept of real-time operating systems,
http://linuxdevices.com/articles/AT4627965573.ht
13. Armando L. Caro Jr., SCTP: A New Internet Transport Layer Protocol
14. RFC 3286 An Introduction to the Stream Control Transmission
Protocol(SCTP), http://www.faqs.org/rfcs/rfc3286.html
15. RFC 2960 Stream Control Transmission Protocol,
http://rfc.net/rfc2960.html
16. Miguel Rio, A Map of the Networking Code in Linux Kernel 2.4.20,
http://www.datatag.org



17. Hans Feldt, PBOOT MANAGER HOWTO , Ericsson No., 1/1551
CNZ 212 268 Uen
18. Jörgen Hansson, Dynamic Real-time Scheduling for OSEdelta, Technical
Report No: HS-IDA-TR-94-007
19. H. Jensen, TIP SCTP Feasibility/Prototype, Ericsson No. LMD/RI-05:006
Uen
20. Stefan Mattson, BLOCK INET, Ericsson No., 1551-CNZ 224 06 Uen
21. R. Stewart and Q. Xie, Stream Control Transmission Protocol (SCTP): A
Reference Guide. Addison Wesley, New York, NY, 2001












APPENDIX A Terms and Abbreviations
ACK Data Acknowledgement
ECN Explicit Congestion Notification
GARP Regional Processor
GPEXR Loadable Operating system on GARP
ICMP Internet Control Message Protocol
INET The loadable IP stack, consisting of INETL and INETR
INETL INET library
INETR The regional program in INET
IP Internet Protocol
ISR Interrupt Service Routine
MMS Memory Management System
MTU Maximum Transmission Unit
OpenSS7 An open-source organization providing an SS7 protocol stack
OSE A real-time OS kernel from ENEA
PROC A virtual file system for Unix/Linux
RP Regional Processor
SCTP Stream Control Transmission Protocol
SLAB A memory allocation method first introduced in Solaris 2.4
SSN Stream Sequence Number
RTOS Real Time Operating System
TCP Transmission Control Protocol
TIP Telebit IP stack



TEIP Test application from EAB/UZ/DK for TIP
TTP TEIP Traffic Process
UDP User Datagram Protocol






















APPENDIX B Test Procedure in OSE

The traffic was tested using gpexr_CXC146091_R1C01 as the OS,
INETR_CXC146091_R1C01 as the TIP stack and the modified TEIP_R1A06 as the
test application.
Test: Sending SCTP packet from RP17 to RP15
Environment: INET running on GARP
Purpose of Test: Get the CPU load of TIP stack
Preset variables: The following ttp variables are set on RP15 (the receiver):
tipttpconfig -1 sctp_mode server
tipttpconfig -1 measurement no_check
tipttpcreate 192.168.202.152 sync sctp 1
Calibrate the idle load algorithm:
tipidleloadtest 5000
The following ttp variables are set on RP17 (the sender):
tipttpconfig -1 sctp_mode client
tipttpconfig -1 connect yes
tipttpconfig -1 connect_ip 192.168.202.152
tipttpconfig -1 connect_port 1
Set the packet size to 400 bytes:
tiptrafficmix -1 400
tipttpcreate 192.168.202.172 sync sctp 2
tipidleloadtest 5000
The following command will send 120000 bursts of SCTP packets from socket 0
on the local IP address to socket 0 on port 1 on 192.168.202.152 with burst
size 5:



tiptraffictime 0 0 192.168.202.152 5 120000 X 1
The following command can be used on both the sender and receiver sides to
show the CPU load of the TIP stack, including the impact of TEIP:
tipttpprint m
sender: cpuload: 40.53%
receiver: cpuload: 52.63%

The following commands can be used on both the sender and receiver sides to
get the CPU load of TEIP:
Connect
set profile proc INETR teip_idle_count teip_sync0 eth2
enable profile
display load
Mean value of CPU load for TEIP(sender): 7%
Mean value of CPU load for TEIP(receiver): 17%

The following commands can be used to get the impact of the LIB in TEIP:
Before sending traffic, predefine the memory:
profile_start 0x21000000 0x21066c00 60 4
After the traffic has been sent, use the following command to see the hits
on the different memory slices:
Get the total hits for TEIP including TEIP_IDLE_COUNT.
profile_print top 0x21000000 0x21066c00
Total hits of TEIP(sender): 11174
Total hits of TEIP(receiver): 11420
Get the total hits for TEIP_IDLE_COUNT:
profile_print acc 0x2100f8b0 0x2100f954
Total hits of TEIP_IDLE_COUNT(sender): 10779



Total hits of TEIP_IDLE_COUNT(receiver): 10293
So the actual hits of TTP (sender) are: 11174-10779=395
So the actual hits of TTP (receiver) are: 11420-10293=1127
With the following commands the total hits for INETL can be obtained:
profile_print acc 0x2101c2b4 0x2101d227
profile_print acc 0x2101d228 0x2101ee39
profile_print acc 0x2101ee40 0x2102134b
profile_print acc 0x2102134c 0x21021c28
The total hits of INETL (sender): 182
The total hits of INETL (receiver): 102
The impact of the LIB in TEIP (sender): 182/395 = 46%
The impact of the LIB in TEIP (receiver): 102/1127 = 9%
The pure CPU load of the TIP stack (sender): 40.53% - (1-46%)*7% = 36.75%
The pure CPU load of the TIP stack (receiver): 52.63% - (1-9%)*17% = 37.16%











APPENDIX C Implementation of Memory profiler in OSE

In our tests the memory profiler is used to calculate the percentage of the
Lib (INETL) in TEIP. It is used in the following way:
1. With the help of the file TEIPR.abs.map, detailed information about the
memory address of each function in TEIP can be fetched, so the memory
addresses of INETL and of TEIP_IDLE_COUNT (the idle process) within TEIP
can be read off directly.
2. Run the command display -l to get the memory address range of the TEIP
process.
3. Immediately after the traffic sending starts, run the following command:

profile_start 0x21000000 0x2126ec00 60 4

This command divides the memory address range of TEIP (0x21000000 -
0x2126ec00) into memory slices of 16 bytes (2^4 = 16) and runs the memory
profiler for 60 seconds. The profiler is predefined to run for 60 seconds
because all the traffic scenarios are designed to run for 60 seconds.

From this command, the number of hits before, inside and after this address
range can be fetched, and thereby the total number of hits during the
traffic time, denoted H.

Note that if the total number of hits of the profiler is 30000, the traffic
actually accounts for half of them, that is 15000; the other half falls on
the tick process.

4. After the traffic has been sent, run the command profile_print top
0x21000000 0x2126ec00. With this command the total hits of TEIP can be
fetched, written Ht.
Note that even though the idle process is a separate process, it is built
into TEIP. To calculate the hits of the TEIP process, the hits in the
memory range of TEIP_IDLE_COUNT (the idle process) therefore need to be
subtracted. If Hidle denotes the hits of TEIP_IDLE_COUNT and Hteip the hits
of the TEIP process, then Hteip = Ht - Hidle, and the CPU load of TEIP is:
Yteip = Hteip/H.

5. Run the command profile_print acc 0x2101c2b4 0x21021c28. 0x2101c2b4 -
0x21021c28 is the memory range of INETL. With this command the total hits
in INETL can be fetched, written Hlib. The percentage of INETL in TEIP can
then be calculated: Plib = Hlib/Hteip.





APPENDIX D SCTP Measurement Data for Linux
In this appendix the SCTP measurement data for Linux is presented from
both the sender and receiver sides.
SCTP SENDER
BURST SIZE =1
Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.90 13.97 13.42 13.51 13.04 13.32 13.29 13.26
10 43.07 27.53 21.87 19.74 18.09 17.75 17.44 16.95
20 70.75 41.19 30.93 26.44 23.76 22.73 22.13 21.53
30 94.56 58.54 40.31 33.41 29.82 27.77 26.74 26.33
50 - 77.12 54.46 44.93 39.54 35.83 34.36 33.95
60 - 94.55 74.27 54.99 47.58 42.14 40.52 39.43

BURST SIZE =5

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.36 14.08 13.31 13.11 13.36 12.96 12.92 13.23
10 37.69 25.24 21.02 18.71 18.05 17.24 17.00 17.22
20 60.90 38.02 29.08 25.04 23.43 22.12 21.65 21.74
30 83.07 49.56 37.74 31.56 28.75 27.11 26.42 26.39
50 - 69.10 53.38 44.10 38.45 35.29 34.41 34.65
60 - 86.94 66.34 51.37 44.20 40.18 40.65 40.86






BURST SIZE =10

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 14.91 14.05 13.29 13.45 13.34 12.94 13.24 13.31
10 36.66 24.90 20.85 18.96 18.06 17.35 17.30 17.23
20 60.16 37.32 28.95 25.13 23.39 22.04 21.94 21.75
30 81.59 49.59 36.76 31.30 28.79 27.30 26.66 26.39
50 - 69.38 53.11 43.92 38.63 35.35 35.17 35.57
60 - 85.32 64.84 51.83 44.49 40.74 41.54 42.09

BURST SIZE =20

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.24 13.68 13.62 13.09 13.34 13.26 12.87 13.17
10 36.37 24.56 20.76 18.60 18.04 17.54 16.99 17.21
20 60.69 36.31 28.84 24.82 23.35 22.05 21.60 21.78
30 80.55 48.09 36.49 31.06 28.74 26.99 26.31 26.38
50 - 71.52 52.89 43.49 38.50 35.94 36.09 35.54
60 - 86.07 68.83 51.71 44.50 41.96 42.60 44.89











BURST SIZE =80

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.18 13.99 13.60 13.05 13.29 13.20 13.19 13.17
10 35.90 24.43 20.62 18.57 17.69 17.49 17.37 17.01
20 58.80 36.31 28.42 24.70 22.94 22.34 21.92 21.58
30 83.60 47.17 36.72 30.89 28.40 27.20 26.67 26.35
50 - 70.30 52.60 43.64 39.55 37.11 36.12 35.48
60 - 92.87 67.08 52.72 47.24 47.08 45.50 44.76

SCTP Receiver

BURST SIZE =1

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 14.52 14.32 12.32 12.51 13.34 13.12 13.49 13.00
10 43.43 28.48 20.43 18.02 18.23 17.29 17.99 16.46
20 69.35 41.18 31.02 26.02 24.71 22.20 22.18 21.28
30 93.50 57.58 41.22 33.91 29.82 27.43 26.38 26.33
50 - 75.79 55.82 44.10 39.54 35.32 34.90 33.31
60 - 94.38 71.90 54.29 47.58 42.93 40.30 39.23






BURST SIZE =5

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.54 14.10 13.63 13.37 13.27 13.17 13.16 13.15
10 39.13 25.29 21.00 17.56 17.38 16.73 16.40 16.22
20 65.66 37.80 29.49 24.23 21.95 20.67 19.95 19.65
30 89.45 50.07 37.50 29.31 25.71 24.60 23.32 23.29
50 - 68.34 53.60 39.69 33.81 31.94 29.70 29.94
60 - 87.30 66.01 47.44 38.09 36.88 34.04 33.56


BURST SIZE =20

Bit Rate (Mbit/s)   Packet Size (Bytes)

10 25 50 100 200 400 800 1600
1 15.32 14.32 13.63 13.37 13.27 13.11 13.16 13.15
10 38.23 25.21 21.00 17.43 17.38 16.72 16.40 16.12
20 65.39 35.32 29.43 24.01 21.32 20.01 19.76 19.05
30 87.43 50.04 37.50 29.31 25.41 24.90 23.32 23.31
50 - 68.39 53.60 39.69 33.80 31.94 29.43 28.12
60 - 87.98 65.34 46.02 36.09 36.43 33.89 33.01






APPENDIX E TCP Measurement Data in Linux
In this appendix the TCP measurement data for Linux is presented from
both the sender and receiver sides.
TCP Sender
BURST SIZE = 1

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 15.33 14.23 13.27 13.37 13.23 13.17 13.14 13.11
10 37.70 24.55 19.05 16.79 15.99 15.38 15.07 14.88
20 62.77 35.69 26.13 20.88 19.00 17.82 17.23 16.88
30 94.58 50.04 33.00 25.44 21.94 20.17 19.22 18.72
50 - 67.41 44.70 33.72 28.21 24.43 23.22 22.26
60 - 94.58 58.21 40.84 30.74 28.01 26.29 25.51

BURST SIZE = 5

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.96 13.55 13.53 12.94 13.20 12.79 13.09 13.02
10 31.64 21.37 17.89 16.18 15.59 14.88 14.97 14.81
20 51.56 29.88 23.25 19.86 18.36 17.10 16.98 16.84
30 70.36 38.96 28.42 23.32 20.96 19.54 19.14 18.80
50 - 56.63 38.58 30.45 26.64 23.96 22.79 22.28
60 - 72.17 49.46 36.18 29.62 27.18 25.93 24.83




BURST SIZE = 10

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.88 13.52 13.31 12.94 12.83 13.11 12.67 12.96
10 30.75 21.18 17.64 16.12 15.19 15.18 14.64 14.79
20 50.05 29.20 22.88 19.45 17.90 17.39 16.67 16.82
30 69.16 37.78 28.02 23.11 20.66 19.78 19.08 18.83
50 - 55.24 37.50 30.14 26.40 23.55 22.90 22.83
60 - 71.35 44.50 33.99 29.25 26.90 25.40 25.32


BURST SIZE = 20

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.40 13.47 13.50 12.95 13.15 13.04 12.98 12.95
10 30.08 20.81 17.89 16.06 15.53 14.83 14.92 14.41
20 47.54 29.04 21.64 19.39 18.16 17.08 17.01 16.82
30 66.02 37.00 27.82 23.19 20.55 19.40 18.77 18.84
50 - 52.64 37.33 30.07 25.60 24.00 23.07 23.06
60 - 69.42 44.08 32.87 29.47 27.52 25.81 25.47







BURST SIZE = 80

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.72 13.75 13.39 13.18 13.04 12.98 12.95 12.91
10 30.01 21.21 17.58 16.40 15.16 15.12 14.89 14.50
20 46.98 28.87 22.27 19.63 18.16 17.37 16.64 16.56
30 65.10 36.85 27.38 23.02 20.76 19.66 18.87 18.63
50 - 52.12 37.01 29.59 25.96 24.50 23.25 23.05
60 - 68.65 46.86 33.29 29.93 29.10 28.00 27.37

TCP Receiver
BURST SIZE = 1

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 15.65 14.14 13.63 13.37 13.26 13.21 12.10 13.12
10 38.09 24.34 19.79 17.50 16.37 15.77 15.42 14.13
20 61.44 35.63 26.53 22.04 19.77 18.52 17.86 17.41
30 94.59 49.62 33.91 26.91 23.85 21.95 20.47 19.72
50 - 65.69 46.11 34.93 29.89 25.15 23.99 22.98
60 - 94.49 60.05 42.45 33.13 29.50 28.16 26.64







BURST SIZE = 5

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.46 13.32 13.39 12.92 13.77 12.32 13.09 13.20
10 32.83 21.37 18.44 16.20 15.63 14.88 14.43 14.39
20 51.32 29.02 23.25 19.86 18.36 17.48 16.00 16.84
30 71.10 39.30 28.42 23.32 20.75 19.74 19.14 18.80
50 - 56.03 38.58 30.45 26.59 23.36 22.30 22.43
60 - 72.57 49.69 36.76 29.60 26.42 25.33 23.28




BURST SIZE = 20

Bit Rate   Packet Size (Bytes)
             10     25     50    100    200    400    800   1600
1 14.12 13.37 13.32 12.92 12.73 13.23 12.09 12.57
10 30.04 21.84 18.93 16.19 15.19 15.83 14.64 14.80
20 51.09 29.95 22.06 19.40 17.80 17.39 16.94 16.45
30 69.39 38.39 28.02 23.12 20.66 18.72 18.34 18.75
50 - 55.00 37.34 30.03 26.40 23.59 22.90 22.94
60 - 72.78 44.43 34.89 29.26 25.99 24.82 25.01









































