Introduction

SSL accelerators have long had their performance measured with a single number: RSA operations per second. This provided a tidy way to compare one accelerator or offloader product with another, but it did little to help users assess how useful a product would actually be. While RSA operations per second are certainly an important performance metric, they are only one-third of the performance picture that should be considered when evaluating accelerators. A complete evaluation should include:

    RSA Operations per Second
    Concurrent Connections
    Sustainable Throughput
Most accelerators, including first-generation offerings, perform a minimum of 200 RSA operations per second. By today's standards that does not seem capacious, but 200 operations per second equates to over 17 million operations per day, and even that modest number might seem excessive. How much more so when the average accelerator today performs 800 or more RSA operations per second, or over 69 million operations per day? While it is not inconceivable for a site to receive tens of millions of hits a day, it is difficult to imagine that such a site would not have a highly fault-tolerant configuration. Implicit in fault tolerance is redundancy, preferably at every level, including routers, switches, load-balancers, and SSL offloaders. Unlike offloading solutions that are integrated directly into content-switches, dedicated appliance-based offloaders can operate in an active-active mode, offering both fault tolerance and aggregated levels of performance.
Highlights

Company: Intensifynance.com
Industry/Market: Online Brokerage
Challenges:
    Increase performance of online services
    Maintain persistence of client sessions
    Redundancy and scalability for future growth
Solution:
    12 SonicWALL SSL-R Offloaders or 3 SonicWALL SSL-RX Offloaders
    Increased performance, guaranteed persistence, and scalability for future growth.
Based on today's performance levels and the RSA metric alone, just five of today's appliance-based offloaders could ostensibly support all of Amazon.com's 140 million hits per day and Schwab.com's 76 million hits per day. Clearly there is more to the complete picture.
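The daily figures quoted above follow from simple multiplication. The sketch below reproduces them; the function and variable names are illustrative only, and it assumes a unit runs flat out for a full 24-hour day, which is the same simplification the text makes.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def rsa_ops_per_day(ops_per_second: int, units: int = 1) -> int:
    """Theoretical daily RSA capacity of `units` offloaders running continuously."""
    return ops_per_second * SECONDS_PER_DAY * units


first_generation = rsa_ops_per_day(200)        # "over 17 million"
current_average = rsa_ops_per_day(800)         # "over 69 million"
five_units = rsa_ops_per_day(800, units=5)     # five current appliances

# Hit counts quoted in the text for the two example sites.
combined_hits = 140_000_000 + 76_000_000

print(f"200 ops/s  -> {first_generation:,} per day")
print(f"800 ops/s  -> {current_average:,} per day")
print(f"5 x 800    -> {five_units:,} per day vs {combined_hits:,} hits")
```

On the RSA metric alone, five units comfortably exceed the combined hit count, which is why the other two metrics are needed to explain real-world sizing.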
A White Paper by SonicWALL, Inc. Page 3
Analyzing Your Site's Traffic

The best way to recognize the three key performance metrics of SSL acceleration is to look at the nature of traffic on a high-volume, secure web-site. Take a small to moderately sized hypothetical e-business called Intensifynance.com. Intensifynance.com hosts and provides online brokerage services, as well as access to extensive financial reports. Additionally, they host servers providing banner advertisements that appear on all of their pages. The three common measurements of web-site activity are:

    User-Sessions - the actual number of unique visitors
    Impressions - the number of HTML pages loaded per user-session
    Hits - the number of requests made of the web-servers per impression
User Sessions Per Day                       150,000
Impressions Per User-Session                9
Hits (Elements) Per Impression              7
User Session Duration                       15 Minutes
Hours of Peak Engagement                    17 Hours
User Sessions/Hour (150,000/17)             8,824
User Sessions/Minute (8,824/60)             147
User Session Concurrency (147*15)           2,206
Burst User Session Concurrency (2,206*3)    6,618
Total Daily User Session Data Transfer      108 Gigabytes
Session Megabits/Sec                        14.2 Mbits
Ads Per Day (150,000*9*4)                   5.4 Million
Ad Hits/Hour (5.4 Million/17)               318,000
Ad Hits/Minute (318,000/60)                 5,294
Ad Hits/Second (5,294/60)                   88
Page 4 - SSL Performance and Capacity Planning
Ad Impressions/Minute (5,294/4)             1,324
Ad Impressions/Second (88/4)                22
# Daily Report Downloads                    15,000
Average Report Size                         5 Megabytes
Total Report Data Transferred               75 Gigabytes
Reports Megabits/Sec (1.23*8)               10 Mbits
# of Application Web Servers                20
# of Ad Servers                             10
Sustained Bandwidth Consumption             25 Mbits
Burst Bandwidth Consumption (25*3)          75 Mbits
Total Hits Per Day                          9.5 Million
Hits Per Minute (9.5M/17/60)                9,265
Hits Per Second                             154
Burst Hits Per Second                       462
Table 1. Intensifynance.com's Traffic Characteristics.

Intensifynance.com hosts approximately 150,000 unique user-sessions per day on 30 P4 1.6 GHz servers with 512MB RAM as their application servers, and 10 P4 1.4 GHz servers with 512MB RAM as their ad servers. Each user-session comprises an average of 9 impressions, or unique page loads, and each page contains an average of 7 elements (including 4 ad images). This equates to approximately 9.5 million hits per day (150,000 x 9 x 7 = 9,450,000). Each of the 9 pages contains an average of 20K of HTML, 40K of ad images, and 20K of other image data, for a total of 80K per page. Once a user has authenticated to their services, all user-sessions are HTTPS. 95% of their traffic occurs during their peak hours: the 17-hour stretch between 6am and 11pm PST. Burst traffic loads can reach up to 3 times average traffic loads.

The banner advertisements are characteristically very small, about 10K each, and an average of four ad images load with every page. Approximately 5.4 million banner ad images are served daily. To maintain contextual consistency with their secure pages, all banner ads are linked via HTTPS.

The online financial reports that they offer are in PDF format and range in size from 500K up to several megabytes. The average size is estimated to be 5 megabytes. Approximately 15,000 such reports are downloaded securely via HTTPS daily.

Finally, Intensifynance.com's core business, their online services, constitutes the vast majority of the site's 150,000 daily visits. The average brokerage-service user-session (combined average of quotes and trades) lasts approximately 15 minutes and results in approximately 720K of transferred data (9 pages x 80K per page). Cookie-based session persistence is employed to keep users attached to the same application server through the life of a user-session.
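Most of the derived rows in Table 1 can be recomputed from the handful of inputs described above. The following is a minimal sketch, not the authors' worksheet: the variable names are illustrative, and it assumes decimal units (1K = 1,000 bytes, 1 megabyte = 1,000,000 bytes) and that all daily traffic is spread evenly over the 17 peak hours, as the table's divisors imply.

```python
PEAK_HOURS = 17
BURST_FACTOR = 3
peak_seconds = PEAK_HOURS * 3600

# Inputs taken from the traffic description.
sessions_per_day = 150_000
pages_per_session = 9
elements_per_page = 7
ads_per_page = 4
session_minutes = 15
page_kb = 80
reports_per_day = 15_000
report_mb = 5

# User-session concurrency: arrivals per minute times session length.
sessions_per_minute = sessions_per_day / PEAK_HOURS / 60
concurrent_sessions = sessions_per_minute * session_minutes       # about 2,206
burst_concurrent = round(concurrent_sessions) * BURST_FACTOR      # 6,618

# Hits and ad hits.
hits_per_day = sessions_per_day * pages_per_session * elements_per_page
hits_per_second = hits_per_day / peak_seconds                     # about 154
burst_hits_per_second = round(hits_per_second) * BURST_FACTOR     # 462
ad_hits_per_day = sessions_per_day * pages_per_session * ads_per_page

# Bandwidth in megabits per second (decimal units assumed).
session_bytes = sessions_per_day * pages_per_session * page_kb * 1000
report_bytes = reports_per_day * report_mb * 1_000_000
session_mbits = session_bytes * 8 / peak_seconds / 1e6
report_mbits = report_bytes * 8 / peak_seconds / 1e6

print(f"concurrent sessions: {concurrent_sessions:.0f} (burst {burst_concurrent})")
print(f"hits/day: {hits_per_day:,}  burst hits/s: {burst_hits_per_second}")
print(f"ad hits/day: {ad_hits_per_day:,}")
print(f"session {session_mbits:.1f} + report {report_mbits:.1f} Mbit/s")
```

Under these assumptions the session bandwidth comes out near 14.1 Mbit/s and the report bandwidth near 9.8 Mbit/s, slightly below the 14.2 and 10 listed in Table 1; the sum of roughly 24 Mbit/s is consistent with the table's rounded 25 Mbit sustained figure.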
User-Sessions and Flows

Considering that most traffic today is HTTP/1.1 as opposed to HTTP/1.0, it is fair to assume that HTTP keep-alives will typically be used. HTTP keep-alives allow TCP sessions (or flows) to be kept open so that multiple pages and elements can be retrieved from a persistent server without having to establish a separate TCP session for each retrieval. Prior to keep-alive support, HTTP required that a separate flow be established for each element or hit. In an SSL environment, this could have meant (assuming no session reuse) a separate RSA operation for each hit. In the case of Intensifynance.com, that could have equaled nearly 10 million RSA operations per day.

With keep-alives, the real number of flows can more accurately be calculated first by multiplying the number of user-sessions by the number of impressions or pages (150,000 x 9 = 1,350,000). Next, since most web-browsers open multiple simultaneous connections (current versions of Microsoft IE and Netscape open 2 and 4, respectively), we must multiply by an averaged factor of 3 (3 x 1,350,000 = 4,050,000). With a combination of HTTP keep-alives and SSL session reuse, we can estimate 4 million daily flows and RSA operations.

Since Intensifynance.com uses dedicated servers for their ads, separate TCP sessions must be established to these ad servers, precluding the use of keep-alives or persistent sessions from the main HTML/app server. This essentially translates to a doubling of flows, since each page visited within each user-session will invoke an average of 3 browser connections to the ad server. Our previous estimate of 4 million now doubles to 8 million.

Intensifynance.com estimates that 10% of its clients do not use HTTP/1.1 compliant browsers or proxy servers, and instead connect via HTTP/1.0, generally without keep-alives. This led them to add 500,000 short-lived flows to the total estimate of flows and RSA operations they must support, arriving at a grand total of 8.5 million flows per day.
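The flow estimate above, and the per-hour, per-minute, and per-second rates derived from it in Tables 2 and 3, can be sketched as follows. The names are illustrative. Note that the unrounded arithmetic gives 8.6 million flows; the paper rounds the keep-alive figure down to 4 million before doubling, so the rate calculations below start from the paper's 8.5 million.

```python
PEAK_HOURS = 17
BURST_FACTOR = 3

sessions_per_day = 150_000
pages_per_session = 9
browser_connections = 3    # average of IE (2) and Netscape (4)
server_farms = 2           # application servers plus dedicated ad servers
http10_flows = 500_000     # the site's own allowance for HTTP/1.0 clients

# Flows with keep-alives and SSL session reuse, to one server farm.
keepalive_flows = sessions_per_day * pages_per_session * browser_connections

# Unrounded total across both farms, plus the HTTP/1.0 allowance.
exact_flows = keepalive_flows * server_farms + http10_flows

# The paper's rounded planning figure, used for Tables 2 and 3.
planning_flows = 8_500_000
flows_per_hour = planning_flows / PEAK_HOURS
flows_per_minute = flows_per_hour / 60
flows_per_second = flows_per_minute / 60

# Burst (peak) values apply the 3x multiplier to the averages.
peak_setups_per_second = flows_per_second * BURST_FACTOR
peak_concurrent_flows = flows_per_minute * BURST_FACTOR

print(f"keep-alive flows: {keepalive_flows:,}  unrounded total: {exact_flows:,}")
print(f"per hour {flows_per_hour:,.0f}, per minute {flows_per_minute:,.0f}, "
      f"per second {flows_per_second:,.0f}")
print(f"peak setups/s {peak_setups_per_second:,.0f}, "
      f"peak concurrent {peak_concurrent_flows:,.0f}")
```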
Flows Per Day                     8.5 million
Flows Per Hour (8.5M/17)          500,000
Flows Per Minute (500,000/60)     8,333
Flows Per Second (8,333/60)       139

Table 2. Flows (TCP Sessions).

The relationship between user-sessions and flows can be seen as follows:

                      User-Sessions    Flows
Per Day               150,000          8,500,000
Per Hour              8,824            500,000
Per Minute            147              8,333
Per Second            2                139
Concurrent            2,206            8,333
Peak Hour             26,471           1,500,000
Peak Minute           441              25,000
Peak Setup/Second     7                417
Peak Concurrent       6,618            25,000

Table 3. Relationship between User-sessions and Flows.

Relating Web Traffic to SSL Offloader Performance
With the site's raw statistical data translated into usable flow patterns, we can correlate these numbers directly to SSL offloaders and the applicability of their performance capabilities.

RSA operations occur when a new SSL session is established. Although session reuse helps to minimize the number of operations invoked, reuse is generally only employed by browsers within an existing TCP session; once the TCP session or flow closes (typically after 15 seconds of inactivity), a new session must be established, invoking another RSA operation. Since we know that hits (table 1) do not always translate to unique flows, we will not use hits to estimate our RSA requirements, but rather Peak Flow Setups per Second (table 3). In Intensifynance.com's case, the number of Burst Hits per Second (462) and the number of Peak Flow Setups per Second (417) deviate by only roughly 10%, but this similarity should not be taken for granted. Quite a few factors affect the relationship between these numbers, the most profound being the number of distinct servers to which a user can potentially connect during a session. In our case, that number is 2 (the one persistent user-session to the application server itself, and the one user-session to the image server per page). The greater the number of distinct servers, and the greater the frequency of connection to these servers, the greater the number of flows.

Concurrent connections correlate directly to Peak Concurrent Flows (table 3). The considerable difference we see between Peak Concurrent User-Sessions (6,618) and Peak Concurrent Flows (25,000) somewhat exaggerates the divergence between these two criteria, because a burst multiple of 3 is applied to the base (average) numbers of 2,206 and 8,333. Nonetheless, the two are very much incongruent, and no rule of thumb can be used to derive one from the other.
Concurrent connections on an offloader refer to the bi-directional, proxied connection between the user and the offloader, and between the offloader and the origin server. The number of concurrent sessions that an offloader can support is generally a function of memory management, the overall networking efficiency of the underlying operating system, and the SSL proxy code itself. Dedicated real-time operating systems have a considerable advantage over generic Unix-like operating systems in this area because they are specialized rather than generalized in their design, and they provide predictable process times, virtually eliminating unwanted process degradation.

Sustainable Throughput is the speed at which an offloader can encrypt and decrypt or move data. Unlike other networking equipment that can offer fixed-throughput measurements, SSL offloaders must offer variable measurements of sustainable throughput based upon the various cipher suites that can be negotiated by SSL. A simple 40-bit export cipher is generally significantly less computationally intensive than 168-bit DES3, and will result in higher throughputs. Intensifynance.com's Burst Bandwidth Consumption (table 1) is estimated to be 75 megabits, but they have little control over cipher negotiation because, for reasons of compatibility, they accept all ciphers. They therefore decided to use the most demanding cipher, DES3, for measuring sustainable throughput.

Here is where an appreciable difference is discernible between offloaders. Most offloaders use dedicated acceleration hardware for RSA operations, but perform bulk-cryptographic operations on their general-purpose or host CPU. A modern CPU (for example, a Pentium III 1GHz) can sustain DES3 at approximately 45 megabits per second at 100% utilization. In multi-purpose offloading appliances, such as caches or content-switches with shared-CPU integrated SSL processing, this means that when faced with high sustained rates of DES3, or even DES, traffic, they will either have to markedly degrade throughput or have no CPU availability left to perform their primary task of switching or caching.
Table 4. Relationship between DES3 and RC4 cipher throughputs.

Table 4 shows the throughput for DES3 with SHA1 hashing, and for a commonly negotiated domestic cipher, 128-bit RC4 with MD5 hashing. Performance figures for the SonicWALL SSL-R and SSL-RX, and for a 1GHz Pentium III, are shown. DES3-SHA1 is the strongest cipher currently available within the SSL framework, and is a FIPS (Federal Information Processing Standards 140-2) requirement for strong encryption in Government applications. It is also the recommended cipher for long-lived SSL sessions because of the relative theoretical complexity of cracking DES3 compared to other ciphers such as RC2 or RC4.
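To put the host-CPU figure quoted earlier (about 45 megabits per second of DES3 on a 1GHz Pentium III at 100% utilization) in perspective, the sketch below expresses Intensifynance.com's sustained and burst bandwidth from Table 1 as a share of one such CPU. The helper name is illustrative, and the calculation assumes the worst case in which every connection negotiates DES3.

```python
P3_DES3_MBITS = 45.0      # approximate DES3 rate of one 1GHz Pentium III
SUSTAINED_MBITS = 25.0    # Sustained Bandwidth Consumption (Table 1)
BURST_MBITS = 75.0        # Burst Bandwidth Consumption (Table 1)


def cpu_share(load_mbits: float, capacity_mbits: float = P3_DES3_MBITS) -> float:
    """Fraction of one host CPU consumed by bulk encryption alone."""
    return load_mbits / capacity_mbits


sustained_share = cpu_share(SUSTAINED_MBITS)
burst_share = cpu_share(BURST_MBITS)

print(f"sustained load: {sustained_share:.0%} of one CPU")
print(f"burst load:     {burst_share:.0%} of one CPU")
```

A share above 100% means a single shared CPU cannot keep up even before it does any switching or caching, which is the situation the text describes for multi-purpose appliances.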
Selecting Hardware

Intensifynance.com decided to employ an n+1 strategy in their site design, where n is the number of units they need to sustain peak load. This way they can survive the failure of one unit without degrading site performance. To recap their requirements:

    RSA Operations per Second: 462
    Concurrent Connections: 25,000
    Sustained Throughput (DES3): 75 Mbit
The SonicWALL SSL-R Offloader performs up to 200 RSA operations per second, supports up to 5,000 concurrent connections, and can sustain 7 Mbit of DES3 throughput. Plugged into their performance and n+1 design requirements, they would need 12 SonicWALL SSL-R Offloaders. Substituting a 50/50 mix of RC4/DES3 in this calculation would drop their requirement to 9 SonicWALL SSL-R Offloaders.

The SonicWALL SSL-RX Offloader supports up to 4,400 RSA operations per second and up to 30,000 concurrent connections, and sustains 54 Mbit of DES3 throughput. Plugged into their performance and n+1 design requirements, they would need 3 SonicWALL SSL-RX Offloaders.

In selecting 3 SonicWALL SSL-RX Offloaders, Intensifynance.com satisfied all of their performance and redundancy design goals without compromising any of their requirements or evaluation criteria. Additionally, by introducing SSL offloading and removing the burden of SSL from their servers, they were able to reduce the number of servers in their application and image farms from 30 and 10 to 20 and 4, respectively.

To learn how SonicWALL's SSL offloading solutions can meet your specific business needs, contact SonicWALL at (888) 557-6642 or visit us online at www.sonicwall.com.
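The unit counts in the Selecting Hardware section can be reproduced with a short sizing routine. This is a sketch of the n+1 rule as the text describes it, not a SonicWALL tool: the `Capacity` class and `units_needed` function are hypothetical names, and the routine assumes load divides evenly across units so that the most constrained metric alone determines n. The 50/50 RC4/DES3 case is omitted because the RC4 throughput figures from Table 4 are not reproduced in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Capacity:
    """Either a site requirement or a per-unit capability (illustrative type)."""
    rsa_ops: float        # RSA operations per second
    connections: float    # concurrent connections
    des3_mbits: float     # sustained DES3 throughput in Mbit/s


def units_needed(required: Capacity, unit: Capacity, spares: int = 1) -> int:
    """n+1 sizing: units for the most constrained metric, plus spare units."""
    per_metric = (
        required.rsa_ops / unit.rsa_ops,
        required.connections / unit.connections,
        required.des3_mbits / unit.des3_mbits,
    )
    n = max(math.ceil(ratio) for ratio in per_metric)
    return n + spares


# Requirements and per-unit figures as quoted in the text.
REQUIRED = Capacity(rsa_ops=462, connections=25_000, des3_mbits=75)
SSL_R = Capacity(rsa_ops=200, connections=5_000, des3_mbits=7)
SSL_RX = Capacity(rsa_ops=4_400, connections=30_000, des3_mbits=54)

print("SSL-R units: ", units_needed(REQUIRED, SSL_R))
print("SSL-RX units:", units_needed(REQUIRED, SSL_RX))
```

In both cases DES3 throughput, not RSA operations per second, is the metric that sets n (11 for the SSL-R, 2 for the SSL-RX), which illustrates the paper's point that the RSA figure alone is a poor sizing guide.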