Agnostic AJAX: Asynchronous JavaScript and Data
Clinton W. Smullen III, Stephanie A. Smullen
University of Tennessee at Chattanooga, Chattanooga, TN 37403, USA

csmullen@utc.edu, Stephanie-Smullen@utc.edu

Introduction
Despite the name AJAX (Asynchronous JavaScript and XML), it is well known
that AJAX does not have to use XML as the data format for AJAX updates. This
effort studies the use of four different data formats for AJAX updates, along with
the use of gzip. The formats tested for AJAX updates are HTML, XML, JSON,
and CSV. Comparisons are made based on the data size required to present the
same information, the time needed by a browser to convert the data format to
HTML for display, and the number of instructions needed to deliver the response
to a query to an end user. These results should provide insight into the use of
AJAX, and encourage developers to select AJAX update data formats appropriate
to their applications, needs, and target clients.

The Application
The application studied in this paper is based on an existing application that
supplies real-time class information extracted from a university student
information system (SIS). The user specifies one or more selection criteria (such
as department, course/section, meeting days, start time/end time, location,
instructor) and the application returns a list of courses meeting the specified
criteria and additional information about each of the courses (including the title
and current enrollment). The application uses a three-tier model; the client
communicates with the web server, which communicates with the database
server. It is a production application, used daily by students and faculty, not a
“test” application. The web server is Apache, and the application uses PHP5 and
custom database code to connect with the legacy SIS database. All pages
returned are validated XHTML 1.1.

Figure 1. Initial HTML page for the production application.

The initial page loaded by a user (see Figure 1) contains the HTML form used to
prepare a query. There is a significant amount of “branding” overhead on this
page; all of the University’s pages use the same layout, navigation items, style
sheet, and graphics. These common elements consist of two graphical images, a
CSS style sheet, and JavaScript supporting the common page navigation links,
and total 15,573 bytes. These elements are linked to the HTML page and are
static. For most browsers, they are downloaded once and cached, rather than
being loaded with each query and response.

A typical user would first load the HTML page containing the query form (27KB)
and the common elements (15.2KB). The user prepares a query and submits the
query to a server application. The server application queries the SIS. The data
extracted from the SIS is formatted as XML. The server process then reads the
XML data and applies an XSLT transform to produce XHTML. The web server
returns the XHTML as the response to the client. The page returned as a
response to a query (see Figure 2) links to the common elements described above,
contains the HTML formatted list of courses in answer to the query (or a message
if no results are produced), and also contains the HTML form needed to make
another query. As a result, even a query that produces no results has a response
page of about 27KB (plus the linked common elements).

Figure 2. Results returned by standard production application.


Smullen and Smullen have compared the effects of using AJAX (with XML) to the
HTML-based application. In [SMU06] the effects on the client were investigated.
In [SMU07] the effects of using AJAX on the server and on the request service
times were studied. [SMU08] models the process, further analyzes the server
effects and the network impact, and estimates the improvement for a typical user.
For these studies, a test set of 122 queries was designed. These queries produce
a range of responses, with HTML page sizes ranging from 27KB (a query that
produced no results) to over 2MB (a query for all courses offered in the fall
semester). Data on 13,260 queries have been collected and analyzed, totaling
2.7GB. A subset of the set of 122 test queries was used in this study as well.

Agnostic AJAX
In Figure 2 the results of the query are displayed in a table (bounded by the gold
bars). The remainder of the page other than this table is always the same. If the
AJAX application implements the same look-and-feel as the HTML application,
then the only change is how the response data is retrieved, converted to HTML,
and displayed in the table. Hence for this study the other elements (the common
elements, graphics, navigation menu, etc.) will be ignored. This study focuses on
the retrieval of a response to a query and the display of the results in an HTML
table format to the user.

The format of the data sent by the server responding to an XMLHttpRequest does
not have to be XML. To study the effects of using different AJAX update data
formats, four different formats were selected: HTML, XML, JSON, and CSV. The
HTML format data used an HTML table that contained the results for the query;
the HTML table was extracted from the production application’s HTML response,
with all other HTML eliminated. It is just a table, and will not validate by itself.
The XML format data was the XML produced by the production application.
JSON refers to the JavaScript Object Notation format (MIME type
application/json; see http://json.org and RFC4627). CSV refers to the
comma-separated-value format (MIME type text/csv; see RFC4180). Data
adapters were written to convert the XML format data produced by the
production application to JSON and CSV formats. Unnecessary white space was
eliminated from each of the AJAX update formats to better standardize the size
comparisons. Another alternative was also studied: having the server return
gzipped data. The GZIP data (MIME type application/x-gzip) was
produced by gzipping the other data format files.
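
The data adapters themselves are not listed in the paper. As a minimal sketch of what such an adapter might look like, assuming the XML has already been parsed into an array of rows (each row an array of strings), the following JavaScript serializes the rows as JSON and as CSV in the style of RFC4180. The function names and the sample rows are hypothetical and are not taken from the production application or its data.

```javascript
// Hypothetical sample rows (department, course, title, enrollment); these
// values are invented for illustration and do not come from the SIS.
const sampleRows = [
  ["CPSC", "1100", "Intro to Computing", "28"],
  ["ENGL", "2010", "Prose, Poetry & Drama", "31"],
];

// JSON adapter: an array of rows, each row an array of strings.
function rowsToJson(rows) {
  return JSON.stringify(rows);
}

// Quote a field only when it contains a comma, a double quote, or a line
// break; embedded double quotes are doubled.
function csvField(text) {
  return /[",\r\n]/.test(text) ? '"' + text.replace(/"/g, '""') + '"' : text;
}

// CSV adapter: one record per line, records separated by CRLF.
function rowsToCsv(rows) {
  return rows.map((row) => row.map(csvField).join(",")).join("\r\n");
}

// Compare the serialized sizes (characters; equal to bytes for ASCII data).
console.log("JSON:", rowsToJson(sampleRows).length, "CSV:", rowsToCsv(sampleRows).length);
```

For regular tabular data such as this, CSV carries little more than the delimiters, while JSON adds brackets and quotation marks around every value, so the CSV form of the same rows is the smaller of the two.
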

Summary descriptive statistics for the comparative sizes of the responses to each
of the set of 122 queries are shown in Table 1. The full HTML size is labelled
HTMLf; the linked common element sizes are not included in this value. The
label HTMLgz represents the gzipped HTML table data. The other rows contain
values for the four uncompressed AJAX update formats (HTML, XML, JSON, and
CSV). Figure 3 displays the response sizes sorted by HTMLf size, graphically
comparing the sizes of the responses. The full HTML (HTMLf) and HTML AJAX
update format (HTML) track closely, while the other formats, in decreasing
size, are XML, JSON, CSV, and HTMLgz.

Format       Mean   Std Dev   Median     Min        Max
HTMLf     251,398   371,727   81,231  28,945  2,172,516
HTML      198,134   330,197   46,788     528  1,904,511
HTMLgz     12,722    19,210    3,241     224    105,443
XML       103,191   172,509   23,461     214    995,045
JSON       70,444   117,853   15,724      31    679,365
CSV        40,015    66,724    8,735      11    383,826

Table 1. Summary of response sizes in bytes for set of 122 queries.

A smaller response size is generally better. A smaller response lessens the
download time and the network impact, and means the client must process fewer
bytes. The impact on the server depends on whether the server has the smaller
responses stored or must generate them for each request. Generating a smaller
response may in fact negatively impact the server performance if the effort
required to do so is greater than the effort required to send a larger, easier to
obtain response.

Figure 3. Size in KB for six response formats for the 122 queries.

One measure of the effectiveness of a data format is the Byte Transfer Ratio
(BTR); see [SMU06]. BTR = (AJAX size / HTML size) * 100. It expresses the size
of the AJAX transfer as a percentage of the size of the full HTML page
displaying the same response (not including the linked common elements). A
Byte Transfer Ratio of 40% therefore means that the AJAX transfer is 40% of the
size of the full HTML page, a reduction of 60%. A small BTR is better; it means
the AJAX application is performing more efficiently than the HTML application.
Any value less than 100% represents a reduction in bytes transferred when using
AJAX. Figure 4 plots the value of BTR (as a percentage) versus the size of the
full HTML response page in KB. The BTR curves for the various formats are not
linear. These curves can be statistically fitted by functions of the form
$(1 - e^{-U})$; see [SMU08] for details.

It can be seen from Figure 4 that while small BTR values (hence large percentage
savings) can be found for the smaller HTML response sizes, the BTR levels off
for larger HTML response sizes. For this set of data, the BTR levels off at
approximately 88% for the HTML table format, 46% for XML, 31% for JSON, 18% for
CSV, and 5% for the HTMLgz format.
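
As a worked illustration of the BTR formula, the sketch below applies it to the byte counts for the largest of the eight test queries (u8-11) listed in Table 3. The function name byteTransferRatio is an assumption made for this example.

```javascript
// Byte Transfer Ratio: AJAX update size as a percentage of the full HTML size.
function byteTransferRatio(ajaxBytes, htmlBytes) {
  return (ajaxBytes / htmlBytes) * 100;
}

// Sizes in bytes for the largest test query (u8-11), copied from Table 3.
const fullHtmlBytes = 389211;
const updateBytes = {
  HTML: 320822,
  XML: 167498,
  JSON: 115234,
  CSV: 66999,
  HTMLgz: 21791,
};

// Print the BTR of each update format relative to the full HTML page.
for (const [format, bytes] of Object.entries(updateBytes)) {
  console.log(format, byteTransferRatio(bytes, fullHtmlBytes).toFixed(1) + "%");
}
```

The values obtained for this single query fall in the same order as the approximate levels quoted above for large responses.
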

Figure 4. Byte Transfer Ratio versus full HTML size in KB for the 122 queries.

Figure 5 plots the size in KB for various AJAX update possibilities against the size
in KB of the corresponding full HTML response page. These plots appear to be
straight lines; Table 2 contains the regression coefficients supporting this
conclusion. One way to interpret these results is as follows: if the size of the full
HTML response increases by 1 KB, then the size of the XML update will increase
by about 46% of that, JSON by about 32%, CSV by about 18%, and GZIP by about
5%. These results apply across the entire range of query sizes tested.

Format   Slope   Intercept (KB)   R^2
HTMLgz   0.05     -0.21           0.991
CSV      0.179    -4.98           0.9997
JSON     0.317    -9.04           0.9999
XML      0.464   -13.16           1
HTML     0.888   -24.59           1

Table 2. Regression coefficients for Figure 5.
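
The linear fits in Table 2 can be used to estimate the update size for a given full HTML page size. The following is a minimal sketch under stated assumptions: the function name predictUpdateKB is hypothetical, and 1 KB is taken to be 1024 bytes (consistent with the paper reporting 28,945 bytes as 28.3 KB). It is an illustration of the fitted lines, not the authors' code.

```javascript
// Slope and intercept (KB) for each update format, copied from Table 2.
const fits = {
  HTMLgz: { slope: 0.05, intercept: -0.21 },
  CSV: { slope: 0.179, intercept: -4.98 },
  JSON: { slope: 0.317, intercept: -9.04 },
  XML: { slope: 0.464, intercept: -13.16 },
  HTML: { slope: 0.888, intercept: -24.59 },
};

// Estimated update size in KB for a given full HTML page size in KB.
function predictUpdateKB(format, fullHtmlKB) {
  const { slope, intercept } = fits[format];
  return slope * fullHtmlKB + intercept;
}

// Assumption: 1 KB = 1024 bytes. 389,211 bytes is the largest HTMLf in Table 3.
const fullHtmlKB = 389211 / 1024;
for (const format of Object.keys(fits)) {
  console.log(format, predictUpdateKB(format, fullHtmlKB).toFixed(1), "KB");
}
```

Because the intercepts are negative, the fitted lines are only meaningful within the range of page sizes that was measured; for very small pages the estimate can drop below zero.
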

Figure 5. AJAX update size in KB versus full HTML size in KB for 122 queries.

Performance
The results presented so far are based on size. Downloading fewer bytes is clearly
an advantage – less work for the server, the network, and the client. However,
making a decision solely on size may miss the complexity of the code needed in
the client to reconstitute the HTML table from the update data format, and the
work needing to be done on each update.

To simplify the performance study, we focused on just the process of updating the
<div> section of the page; all other elements were ignored. Since these elements
were exactly the same for all of the variants considered, no comparative
performance information was lost by doing this. The data produced by the server
application is semantically tabular – a simple, regular structure containing
character data. No attributes or metadata were included with the data. Hence
the results should be, in some sense, a “best case” comparison for the options
considered.

Eight test queries were selected from the set of 122 for performance analysis.
These queries were posted against the production system and the complete
results (HTML and XML) were stored for each. Storing the results was necessary
to ensure uniformity; otherwise the use of live data drawn from the production
system could produce varying results for the same query done at different times.
The set of eight queries included a query that returned no search results, one that
returned five results, and one that returned 380 KB of data. The size of the
HTML page returned for each of the eight queries (not including the common
elements) is shown in the row labelled HTMLf in Table 3. Note that the first case
(HSRV) returns 28.3 KB even though no query results were found. Data files in
the AJAX update formats discussed above were generated for each of the eight
test queries. The comparative sizes for these are shown in Table 3. The gz format
has a mandatory data header that, for very small files, may actually force the gz
file to be larger than the original file; see the HSRV entries for JSON/JSONgz and
CSV/CSVgz. Other than these small files, every data format shows a reduction in
size from the full HTML (HTMLf) size.

Format    HSRV   FLNG   BUSA   BMKT   HIST    BIOL    ENGL   u8-11
HTMLf    28945  33472  40532  52896  74042  123390  213142  389211
HTML       528   4541  10818  21790  40500   84454  164549  320822
HTMLgz     224    761   1055   1965   2679    4803    8420   21791
XML        214   2117   5450  11190  20744   44424   88223  167498
XMLgz      181    561    826   1438   1870    3378    6006   16821
JSON        31   1353   3642   7526  13832   30320   61243  115234
JSONgz      61    419    678   1254   1599    2999    5414   15172
CSV         11    784   2083   4207   7433   17101   35814   66999
CSVgz       41    308    541   1064   1412    2656    4890   13657

Table 3. Comparison of response sizes in bytes for set of 8 queries.

An XHTML page was written containing a <div> area, along with JavaScript to
issue an XMLHttpRequest to fetch an AJAX update and retrieve the data. The
AJAX code called a display function to convert the AJAX update to an HTML
table and then display the table on the page in the <div> area. A separate display
function was coded for each of the different AJAX update formats. The exact
same XHTML page and JavaScript were used for each test, other than the display
function and the format of the data sent by the server. The code used is
experimental only, not of production quality. The code does not gracefully
handle exceptions, performs no data validation, and implements no user
interface. Since no user interface is implemented, the AJAX
XMLHttpRequest calls are all synchronous. The code requests the stored data
from the server, downloads the data and displays it. The code represents a
minimal test bed to measure the performance of the AJAX update alternatives.
The XHTML and fixed JavaScript required 941 bytes. The sizes of the display
code for the selected update formats in bytes are shown in Table 4. No special
JavaScript was needed to handle the gzipped data, as unpacking the data was not
performed by JavaScript. The browser handled unpacking the data and then
passed it to the JavaScript code, which then processed it. The maximum size for
any of the test bed pages, including all code, was less than 3 KB. Hence the
differences in code sizes among the alternatives do not appear to be important
performance factors in this case.
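
The test bed code is not reproduced in the paper. A minimal sketch of the fixed part described above might look like the following, assuming a <div> with the hypothetical id "results" and a caller-supplied update URL; the function names fetchAndDisplay and displayHtml are likewise assumptions. As in the paper's test bed, the request is synchronous and there is no error handling.

```javascript
// Fetch an AJAX update synchronously and hand the response text to a
// format-specific display function. The element id "results" is hypothetical.
function fetchAndDisplay(url, display) {
  const request = new XMLHttpRequest();
  request.open("GET", url, false); // third argument false = synchronous
  request.send(null);
  const target = document.getElementById("results");
  // The display function returns the HTML table markup for the <div> area.
  target.innerHTML = display(request.responseText);
}

// For the HTML update format the response is already a table, so the
// display function only has to pass it through.
function displayHtml(responseText) {
  return responseText;
}
```

In a browser this would be invoked as, for example, fetchAndDisplay("update.html", displayHtml), where the URL is a placeholder. Synchronous requests block the page while they run and are deprecated on the main thread in current browsers, so this form is suitable only for a measurement harness of this kind.
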

Size (bytes)   HTML    XML   JSON    CSV
Display code     18   1794   1511   1105
Total code      959   2735   2452   2046

Table 4. Test JavaScript code sizes.
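
To give a sense of what the format-specific display functions involve, the sketch below converts a JSON or CSV update into HTML table markup. It is not the authors' code: it assumes the JSON update is an array of rows (each an array of strings) and that the CSV update contains no quoted fields or embedded commas, and all function names are hypothetical.

```javascript
// Escape the characters that are unsafe in HTML text content.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build table markup from an array of rows, each an array of cell values.
function rowsToTable(rows) {
  const body = rows
    .map((cells) => "<tr>" + cells.map((c) => "<td>" + escapeHtml(c) + "</td>").join("") + "</tr>")
    .join("");
  return "<table>" + body + "</table>";
}

// Assumes the JSON update is an array of rows, each an array of strings.
function displayJson(responseText) {
  return rowsToTable(JSON.parse(responseText));
}

// Assumes simple CSV: one record per line, no quoted fields or embedded commas.
function displayCsv(responseText) {
  const lines = responseText.split(/\r?\n/).filter((line) => line.length > 0);
  return rowsToTable(lines.map((line) => line.split(",")));
}
```

Both functions share the table-building step and differ only in how the response text is split into rows and cells.
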

Two performance measures were investigated: (a) the time needed in the browser
for JavaScript to process the AJAX update, produce the HTML table, and assign
it to the <div> area, and (b) a count of the total number of instructions executed
by the processor.

The time needed by the JavaScript code running in the browser to convert the
AJAX update data format to an HTML table was measured using a JavaScript
profiler. Firefox 2.0.0.13 with Firebug 1.0 (http://www.getfirebug.com/) was used to
profile the JavaScript code. Ten runs for each of the eight test queries for each of
the eight data format possibilities were made, and the time required in each by
the JavaScript display function was recorded. The averages of the display times
(in milliseconds) for these ten runs are shown in Figure 6.

Figure 6. Average time in ms needed to convert data to HTML table
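
The measurements above were taken with the Firebug profiler. Where a profiler is not available, a rough substitute is to time the display function directly with performance.now() and average over several runs. The sketch below shows that substitute technique, not the authors' procedure; the function name averageDisplayTime is hypothetical.

```javascript
// Average wall-clock time, in milliseconds, of display(responseText) over a
// number of runs. This times the call directly rather than using a profiler.
function averageDisplayTime(display, responseText, runs = 10) {
  let totalMs = 0;
  for (let i = 0; i < runs; i++) {
    const start = performance.now();
    display(responseText);
    totalMs += performance.now() - start;
  }
  return totalMs / runs;
}

// Example with a trivial stand-in display function.
const exampleMs = averageDisplayTime((text) => text.split(","), "a,b,c", 10);
console.log("average ms:", exampleMs);
```

Direct timing of this kind includes timer resolution and any garbage collection that happens during the call, so small differences between formats should be read with caution.
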

Pin [PIN05] was used to collect instruction counts and thread counts. Pin is an
instrumentation package that allows a Pintool, written using Pin’s API, to collect
dynamic data about an application. Pin supports a wide range of investigations
into program performance and bug detection, including complete
program instruction traces and execution counts. For this work, Pin was used to
record a count of the total number of instructions executed for the following task:

load the Safari (for Windows) browser, download the test XHTML page and
associated JavaScript from the server, load that page in the browser, issue
the XMLHttpRequest for AJAX update data to the server, retrieve the data
from the server, convert the AJAX update to an HTML table and display the
table on the page in the <div> area, and close the browser.

The Pintool collects the total instruction count needed to open a browser and
deliver a response to the user. A modified version of the example Pintool
inscount2_mt (see http://rogue.colorado.edu/Pin/) was used to collect thread
and instruction counts for the eight test queries using each AJAX update
possibility. The Safari 3.1 for Windows browser (http://www.apple.com/safari/) was
used because the underlying WebKit browser (http://webkit.org/) is open source.
Future plans include modifying the WebKit browser’s code to allow the collection
of additional, more detailed information on the browser’s performance. Five
runs were made for each of the test possibilities. The average number of
instructions over the five runs for each possibility was used to produce Figure 7.

Figure 7. Instruction counts for AJAX update possibilities.


Conclusions
Table 2 and Figure 5 show that size reductions due to the use of different data
formats are predictable and scalable with high confidence levels over a wide
range of HTML response page sizes. The sample data shows reductions
exhibiting uniform scaling over a range of HTML response sizes. This range
covers a factor of more than 75 in size, from the smallest size to the largest.

Table 3 shows the savings in size due to the use of different AJAX update formats
and gzip. Viewed as percentages of the full HTML size (HTMLf), this data shows
that responses in size up to 12KB can be reduced by 98% or better. That is, a
different format for the data may produce a response that is only 2% of the
original size. The largest response size shows a reduction of over 96%, to 3.5% of
the original HTMLf size. Without the use of gzip, the size reductions from
choosing a different data format are up to 86% (using 14% or less of the original
HTMLf size) for responses up to 12KB in size, with the largest response size
showing a reduction of up to 83% (using 17% of the original HTMLf size). Hence
adopting AJAX with any of the update data formats studied can significantly
reduce the response sizes.

Query      CSV     JSON     XML    HTML
HSRV    372.7%   196.8%   84.6%   42.4%
FLNG     39.3%    31.0%   26.5%   16.8%
BUSA     26.0%    18.6%   15.2%    9.8%
BMKT     25.3%    16.7%   12.9%    9.0%
HIST     19.0%    11.6%    9.0%    6.6%
BIOL     15.5%     9.9%    7.6%    5.7%
ENGL     13.7%     8.8%    6.8%    5.1%
8-11     20.4%    13.2%   10.0%    6.8%

Table 5. Gzip file size as a percentage of the source file size.

Using gzip provides an additional reduction for all but the trivial files. Table 5
summarizes the size of the gzipped file compared to the original file. A value of
39% means the gz file is only 39% of the size of the original; hence the reduction
was 61%. As seen in Table 5, apart from the trivial HSRV files, the reduction
provided by gzip is at least 60% and can be up to 95%. The larger files, when
gzipped, can be as little as 5% of the original file size. The greatest
reductions take place in the larger files. Hence if the size of a response is
the primary consideration, then for any data format gzip can provide a reduction.
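
The percentages in Table 5 follow directly from the byte counts in Table 3. As a worked example, the sketch below recomputes the CSV column from the CSV and CSVgz rows; the function name gzPercent is an assumption made for this illustration.

```javascript
// Sizes in bytes for the CSV and CSVgz rows of Table 3, in query order.
const queries = ["HSRV", "FLNG", "BUSA", "BMKT", "HIST", "BIOL", "ENGL", "u8-11"];
const csvBytes = [11, 784, 2083, 4207, 7433, 17101, 35814, 66999];
const csvGzBytes = [41, 308, 541, 1064, 1412, 2656, 4890, 13657];

// Gzipped size as a percentage of the source size (the measure in Table 5).
function gzPercent(gzBytes, sourceBytes) {
  return (gzBytes / sourceBytes) * 100;
}

queries.forEach((name, i) => {
  const percent = gzPercent(csvGzBytes[i], csvBytes[i]);
  // The reduction is the complement; a negative value means gzip grew the file.
  const reduction = 100 - percent;
  console.log(name, percent.toFixed(1) + "%", "reduction " + reduction.toFixed(1) + "%");
});
```

The HSRV entry comes out above 100% because the 11-byte CSV file is smaller than the gzip header and trailer alone.
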

Figure 6 suggests that the time used by the JavaScript code in the browser
depends on the amount of work to be done; parsing XML takes more work, and
hence uses more time, than does processing CSV and JSON. In all formats, the
time needed to convert the gzip file for display is similar to, but slightly less than,
that needed to convert the same data format not gzipped. For most of the cases
examined the percentage difference here was about 1%. However, since the effort
for unpacking is not accounted for by the JavaScript profiler (the JavaScript code
does not perform the unpacking), there is an additional burden on the processor
not shown in Figure 6. That extra effort is reflected in the instruction counts
shown in Figure 7. The Pintool does show that the browser displaying the
gzipped data formats used additional threads; these additional threads may
account for the slightly better performance in JavaScript conversion times.
Careful instrumentation inserted into the code for a browser is needed to better
understand this.

In general, Figure 6 suggests that for small to moderate sized responses, the exact
AJAX data format does not affect the time needed to convert the data very much.
For the test queries, all formats performed similarly; the best time and the worst
time were both within about 11% of the AJAX HTML table format time. For the
moderate sized responses, the CSV format was the absolute fastest, while for the
larger queries the CSVgz format was the absolute fastest.

An examination of Figure 7 shows that when measured by instruction count,
typically the CSV code executes the fewest instructions, followed by JSON and
XML. Except for the largest files, the instruction counts are close together for a
given format, within 5%. Unless the number of instructions is a critical factor,
the update data formats CSV, JSON, and XML provide about the same
performance. For the largest files, XML uses about 10% more instructions than
does CSV, with JSON between these two, using about 5% more than does CSV.
The use of gz, however, does incur an execution penalty. Table 6 summarizes the
increase in the number of instructions executed in each case when the gz data
format is used. This shows that an average of 14% more instructions are needed
when a gz data format is used when compared to the instructions used for the
same data format when no gzipping is used. This penalty could be a factor when
moving AJAX to devices with reduced processing power.

Query     CSV    JSON     XML    HTML
HSRV    21.4%   15.4%   11.2%   16.5%
FLNG    13.9%    7.7%   18.1%    5.5%
BUSA    10.9%   11.3%   12.5%    7.5%
BMKT    13.3%   14.8%   13.9%   14.4%
HIST    14.2%   14.7%   10.8%   13.7%
BIOL    18.8%   20.8%   21.3%   21.0%
ENGL    11.6%   12.3%    8.7%   11.4%
8-11    16.4%   16.4%   13.4%   16.2%

Table 6. % increase in instructions due to use of gz.
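
The 14% average penalty quoted above can be checked against the entries of Table 6. The short sketch below averages the table per format and overall; the variable and function names are chosen for this illustration only.

```javascript
// Percentage increases in instruction count, copied from Table 6
// (query order: HSRV, FLNG, BUSA, BMKT, HIST, BIOL, ENGL, 8-11).
const gzPenalty = {
  CSV: [21.4, 13.9, 10.9, 13.3, 14.2, 18.8, 11.6, 16.4],
  JSON: [15.4, 7.7, 11.3, 14.8, 14.7, 20.8, 12.3, 16.4],
  XML: [11.2, 18.1, 12.5, 13.9, 10.8, 21.3, 8.7, 13.4],
  HTML: [16.5, 5.5, 7.5, 14.4, 13.7, 21.0, 11.4, 16.2],
};

// Arithmetic mean of a list of numbers.
function mean(values) {
  return values.reduce((sum, value) => sum + value, 0) / values.length;
}

for (const [format, values] of Object.entries(gzPenalty)) {
  console.log(format, mean(values).toFixed(1) + "%");
}

// Mean over all 32 entries of the table.
const overallPenalty = mean(Object.values(gzPenalty).flat());
console.log("overall", overallPenalty.toFixed(1) + "%");
```
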

One extension of this work involves modifications to the code in WebKit to allow
the collection of more detailed information about instruction counts and thread
usage. This may allow a more complete picture of how the browser is using the
processor resource during AJAX updates. Another extension is to use the Pin
ARM emulator on Ubuntu to collect instruction counts from Google's ARM
emulator, Android, for the test cases. This will allow analysis of the performance
of AJAX on ARM architectures.

Many readers will look for an answer to the question “Which data format should I
use?”. As might be expected, we found no answer to this question that can be
applied without regard to the circumstances. For client targets of desktop or
laptop systems using modern browsers and connected to high-speed networks,
the use of AJAX with a gzipped CSV or JSON update format provides
minimal response sizes without excessive execution penalty. If the target client
provides more limited processor resources, then the use of gzip should be
carefully considered.

Bibliography
[PIN05] Chi-Keung Luk, Robert Cohn, Robert Muth, Harish Patil, Artur Klauser,
Geoff Lowney, Steven Wallace, Vijay Janapa Reddi, Kim Hazelwood, "Pin:
Building Customized Program Analysis Tools with Dynamic Instrumentation,"
Proceedings of the ACM SIGPLAN 2005 Conference on Programming
Language Design and Implementation (PLDI), Chicago, Illinois, USA, June
2005, pages 191-200.

[SMU06] C.W. Smullen III and S.A. Smullen, "Modeling AJAX Application
Performance", 524-074, Proceedings of Web Technologies, Applications, and
Services, WTAS 2006, July 17-19, 2006, Calgary, Alberta, Canada, ed. J. T. Yao.
IASTED/Acta Press, Calgary, AB, Canada; ISBN 0-88986-575-2.

[SMU07] Smullen, C. and Smullen, S., "AJAX Application Server Performance",
Proceedings of the IEEE SoutheastCon 2007 (CH37882), March 22-25, 2007,
Richmond, Virginia, pp. 154-158; ISBN 1-880094-63-0.

[SMU08] Clinton W. Smullen III and Stephanie A. Smullen, "An Experimental
Study of AJAX Application Performance", pp. 30-37, Journal of
Software (JSW), ISSN: 1796-217X, Volume 3, Issue 3, March 2008.
