Speed Web delivery with HTTP compression
A look at the page delivery effects of data compression in HTTP 1.1
Radhakrishnan Srinivasan (radhakr@onebox.com), Senior Architect, eBusiness, TATA Consultancy

Summary: HTTP compression, a recommendation of the HTTP 1.1 protocol specification for improved page download time, requires a compression feature implemented at the Web server and a decompression feature implemented at the browser. While popular browsers were able to receive the compressed data as early as three years ago, Web servers were not ready to deliver compressed content. The situation is changing, though, as server compression modules are introduced. Dr. S. Radhakrishnan dissects Web compression, examines the benefits of HTTP compression, offers several compression tools, and highlights the effectiveness of the technology in a case study.

Date: 22 Jul 2003
Level: Intermediate

Many Internet applications deliver data and content in the form of dynamically generated HTML. The HTML dynamic content is generated by a Web or application server using such technologies as Java Servlet, JavaServer Pages, Personal Home Pages (PHP), Perl scripts, or Active Server Pages (ASP). The speed with which these Web pages are available to the client browser on request mainly depends on two things:



- The Web or application server's ability to generate the content. This is related to the general performance characteristics of the application and the servers.
- The network bandwidth.

The performance of the Web application is determined by good design, tuning the application for performance, and if needed, by providing more hardware power for the servers. The network bandwidth available to the user, directly related to the page download time, is normally taken for granted. But for the user, it is the speed of Web page delivery that indicates the performance level, not how fast the application is executed on the server.

Therefore, to ensure a good user experience, the performance of the network and its bandwidth is considered an important part of the overall performance of the application. This becomes even more important when network speed is low, network traffic is high, or the size of the Web pages is large.

In the case of the Internet, the traffic may not be controllable, but the user's network segment (modem or other technology) and the server's connection to the Internet can be augmented. In the case of Web applications hosted and accessed in close premises through Local Area Networks (LANs), the bandwidth is usually sufficient for fast page download. In the case of Wide Area Networks (WANs), segments of the network may have low speed and high traffic. In this case, the user accessing the application might experience poor page download time.

Ideally, it is desirable to have increased bandwidth in the network; practically, it results in additional cost. However, you can get the effect of increased bandwidth without a substantial cash investment. If Web pages (containing mainly plain text documents and images) could be compressed and sent to the browser on request, the speed of page downloads improves without regard for the traffic or speed on the network. The user receives faster response time for an HTTP request.

In this article, I explore the intricacies of Web-based compression technology, detail how to improve Web page download times by compressing the Web pages from the Web server, highlight the current status of the technology, and provide a real-world case study that examines the particular requirements of a project. (Throughout the article, the term Web application refers to an application generating dynamic content, for instance, any content created on the fly.)

Now, look at the specifics of Web-related compression technology.


Types of compression
I first examine the following types and attributes of compression:
- HTTP compression. Compressing content from a Web server
- Gzip compression. A lossless compressed data format
- Static compression. Precompression, for when static pages are being delivered
- Content and transfer encoding. IETF's two-level standard for compressing HTTP contents

HTTP compression
HTTP compression is the technology used to compress contents from a Web server (also known as an HTTP server). The Web server content may be in the form of any of the many available MIME types: HTML, plain text, image formats, PDF files, and more. HTML and image formats are the most widely used MIME formats in a Web application.

Most images used in Web applications (for example, GIF and JPG) are already in compressed format and do not compress much further; certainly no discernible performance is gained by another incremental compression of these files. However, static or on-the-fly created HTML content contains only plain text and is ideal for compression.

The focus of HTTP compression is to enable the Web site to serve fewer bytes of data. For this to work effectively, a couple of things are required:

- The Web server should compress the data
- The browser should decompress the data and display the pages in the usual manner

This is obvious. Of course, the process of compression and decompression should not consume a significant amount of time or resources.

So what's the hold-up in this seemingly simple process? The recommendations for HTTP compression were stipulated by the IETF (Internet Engineering Task Force) while specifying the protocol specifications of HTTP 1.1. The publicly available gzip compression format was intended to be the compression algorithm. Popular browsers have already implemented the decompression feature and were ready to receive the encoded data (as per the HTTP 1.1 protocol specifications), but HTTP compression on the Web server side was not implemented as quickly nor in a serious manner.


Gzip compression
Gzip is a lossless compressed data format. The deflation algorithm used by gzip (also zip and zlib) is an open source, patent-free variation of the LZ77 (Lempel-Ziv 1977) algorithm.

The algorithm finds duplicated strings in the input data. The second occurrence of a string is replaced by a pointer (in the form of a pair: distance and length) to the previous string. Distances are limited to 32 KB and lengths are limited to 258 bytes. When a string does not occur anywhere in the previous 32 KB, it is emitted as a sequence of literal bytes. (In this description, string is defined as an arbitrary sequence of bytes and is not restricted to printable characters.)
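The effect of replacing duplicated strings with pointers can be seen in a small experiment. The following is a minimal sketch, not part of the original study, that uses Python's standard zlib module (the same deflate algorithm) on two inputs of equal size: a hypothetical, highly repetitive HTML table and random bytes with no repeated strings. The sample markup and the helper name saved_percent are illustrative assumptions.

```python
import os
import zlib


def saved_percent(data: bytes, level: int = 6) -> float:
    """Return the percentage of bytes saved by deflate at the given level."""
    compressed = zlib.compress(data, level)
    return 100.0 * (1 - len(compressed) / len(data))


# Repetitive markup: later rows can be encoded as pointers to earlier rows.
html = b"<table>" + b"<tr><td>order</td><td>status</td></tr>" * 200 + b"</table>"
# Random bytes: no earlier occurrence to point back to, so only literals are emitted.
noise = os.urandom(len(html))

# Lossless: decompression restores the input exactly.
assert zlib.decompress(zlib.compress(html)) == html

print(f"HTML saved:  {saved_percent(html):.1f}%")
print(f"noise saved: {saved_percent(noise):.1f}%")
```

Markup generated by templates tends to resemble the first input far more than the second, which is why HTML is such a good candidate for this algorithm.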

Static compression
If the Web content is pregenerated and requires no server-side dynamic interaction with other systems, the content can be precompressed and placed in the Web server, with these compressed pages being delivered to the user. Publicly available compression tools (gzip, Unix compress) can be used to compress the static files.

Static compression, though, is not useful when the content has to be generated dynamically, such as on e-commerce sites or on sites which are driven by applications and databases. The better solution is to compress the data on the fly.
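Precompression of this kind is usually a batch step run whenever the static files change. The sketch below is one possible way to do it with Python's standard gzip module; the function name precompress, the file suffixes, and the convention of writing index.html.gz next to index.html are assumptions made for illustration, not something prescribed by any particular Web server.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def precompress(root: Path, suffixes=(".html", ".txt")) -> list[Path]:
    """Write a .gz copy next to every matching file under root; return the new paths."""
    written = []
    for source in sorted(root.rglob("*")):
        if source.is_file() and source.suffix in suffixes:
            target = source.with_name(source.name + ".gz")
            with source.open("rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written


# Demonstration on a throwaway directory holding one hypothetical static page.
with tempfile.TemporaryDirectory() as tmp:
    page = Path(tmp) / "index.html"
    page.write_bytes(b"<html><body>" + b"<p>static</p>" * 100 + b"</body></html>")
    (gz_path,) = precompress(Path(tmp))
    original_size = page.stat().st_size
    compressed_size = gz_path.stat().st_size
    round_trip_ok = gzip.decompress(gz_path.read_bytes()) == page.read_bytes()

print(original_size, compressed_size, round_trip_ok)
```

The maintenance burden is visible even here: every change to a source file requires the matching .gz copy to be regenerated.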


Content and transfer encoding
The IETF's standard for compressing HTTP contents includes two levels of encoding: content encoding and transfer encoding. Content encoding applies to methods of encoding and compression that have been already applied to documents before the Web user requests them. This is also known as precompressing pages or static compression. This concept never really caught on because of the complex file maintenance burden it represents, and few Internet sites use precompressed pages.

On the other hand, transfer encoding applies to methods of encoding during the actual transmission of the data.

In modern practice the difference between content and transfer encoding is blurred since the pages requested do not exist until after they are requested (they are created in real time). Therefore the encoding has to be always in real time.

The browsers, taking the cue from IETF recommendations, implemented the Accept-Encoding feature by 1998-99. This allows browsers to receive and decompress files compressed using the public algorithms. In this case, the HTTP request header fields sent from the browser indicate that the browser is capable of receiving encoded information. When the Web server receives this request, it can:

1. Send precompressed files as requested. If they are not available, then it can:
2. Compress the requested static files, send the compressed data, and keep the compressed file in a temporary directory for further requests; or
3. If transfer encoding is implemented, compress the Web server output on the fly.

As I mentioned, precompressing files, as well as real-time compression of static files by the Web server (the first two points, above) never caught on because of the complexities of file maintenance, though some Web servers supported these functions to an extent.

The feature of compressing Web server dynamic output on the fly wasn't seriously considered until recently, since its importance is only now being realized. So, sending dynamically compressed HTTP data over the network has remained a dream even though many browsers were ready to receive the compressed formats.
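The negotiation described in this section can be sketched in a few lines. The code below is a simplified illustration, not the implementation of any server or module named in this article: the function names are hypothetical, wildcard codings are ignored, and the compressed response is labeled with a Content-Encoding: gzip header, which is an assumption about how the result is announced to the browser.

```python
import gzip


def accepts_gzip(accept_encoding: str) -> bool:
    """Return True if the Accept-Encoding value lists gzip with a non-zero q-value."""
    for item in accept_encoding.split(","):
        coding, _, params = item.strip().partition(";")
        if coding.strip().lower() != "gzip":
            continue
        q = 1.0  # a coding without an explicit q-value is fully acceptable
        for param in params.split(";"):
            name, _, value = param.strip().partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        return q > 0
    return False


def build_response(body: bytes, request_headers: dict[str, str]) -> tuple[dict[str, str], bytes]:
    """Compress the body on the fly only when the client advertises gzip support."""
    headers = {"Content-Type": "text/html", "Vary": "Accept-Encoding"}
    if accepts_gzip(request_headers.get("Accept-Encoding", "")):
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return headers, body


page = b"<html><body>" + b"<p>order status</p>" * 50 + b"</body></html>"
headers, payload = build_response(page, {"Accept-Encoding": "gzip, deflate"})
print(headers["Content-Length"], "bytes sent instead of", len(page))
```

A browser that sends no Accept-Encoding header receives the page unchanged, so older clients keep working without any special handling.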

The benefits of HTTP compression
Three independent studies, two conducted by the WWW Consortium (W3C) and one conducted for the Mozilla organization, highlight the benefits of HTTP compression. The first W3C study, reported in 1997, focused on testing the effects of HTTP 1.1 persistent connections, pipelining, and link-level document compression. The second W3C study, reported in 2000, looked at the possible benefits for performance using compression of HTML files over a LAN with composite HTML data (compressed) and image content (uncompressed). The Mozilla study, reported in 1998, observes the performance of content-encoded compression.

Following are brief summaries of the results of these studies, offered to highlight the benefits of HTTP compression. (The study results are not completely discussed in this article; readers may refer to the original studies for full discussion. For further details, check Resources for links to the original studies.)

W3C: On performance of HTTP 1.1
This study employed two Web servers, Jigsaw and Apache, and reports the savings in the number of packets sent (Pa) and download time in seconds (Sec). The study was conducted using a 28.8 kbps modem and an HTML file containing no images. Table 1 illustrates the compression ratios and download times achieved.

Table 1. Compression ratios and download times

Measurement | Jigsaw Pa | Jigsaw Sec | Apache Pa | Apache Sec
Uncompressed HTML | 67 | 12.21 | 67 | 12.13
Compressed HTML | 21.0 | 4.35 | 4.35 | 4.43
Saved using compression (percent) | 68.7 | 64.4 | 68.7 | 64.5

W3C: Effect of compression in a LAN
This study involves a mix of images and HTML content. The overall payload that is transferred in the uncompressed version of the download is a 42 KB HTML file with 41 inline GIF images totaling 125 KB. The compression decreases the size of the HTML page from 42 KB to 11 KB (73.8 percent compression), but the images are untouched. This means that the overall payload is decreased by 31 KB, or approximately 19 percent.

Table 2 reports the following:

Table 2. Compression ratios and download times with image/HTML mix
Measurement | Jigsaw Pa | Jigsaw Sec | Apache Pa | Apache Sec
Pipelining | 167.4 | 0.85 | 161.6 | 0.64
Pipelining and HTML compression | 140.6 | 0.62 | 137.4 | 0.49
Saved using compression (percent) | 16 | 27 | 15 | 23

The study author notes that:

The table shows that, for the Jigsaw server, compression provides a net gain of 15 percent less packets but as much as a 27 percent gain in time. Likewise, for Apache a packet gain of 16 percent is seen, but a time gain of 23 percent. The interesting thing is that the overall payload is decreased by 19 percent, which is more than the gain in TCP packets. From this perspective, compression gives a slightly worse "TCP packet usage". However, the gain in time is relatively better than the gain in payload. This indicates that the relationship between payload, TCP packets, and transfer time is non-linear and that the first packets on a connection are relatively more expensive than the rest.
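The payload percentages quoted for this study can be checked with a few lines of arithmetic. The sketch below is only a worked check, and it rests on one assumption: that the 125 KB figure covers the inline GIF images alone, so that the uncompressed download is 42 KB of HTML plus 125 KB of images. Under that reading the 19 percent payload saving follows directly.

```python
def percent_saved(before: float, after: float) -> float:
    """Percentage reduction when going from before to after."""
    return 100.0 * (before - after) / before


html_kb, html_gz_kb = 42, 11
images_kb = 125  # assumption: this figure is the size of the images only

html_saving = percent_saved(html_kb, html_gz_kb)
payload_saving = percent_saved(html_kb + images_kb, html_gz_kb + images_kb)

print(f"HTML alone:    {html_saving:.1f}%")
print(f"Whole payload: {payload_saving:.1f}%")
```

Because the images are already compressed, they dilute the saving: the more of the payload they occupy, the smaller the overall gain from compressing the HTML.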

Mozilla: Performance of content-encoded compression
The third study, reported for Mozilla, uses the Apache Web server version 1.3, which is capable of parsing the HTTP header for content encoding, Accept-Encoding: gzip, and can send precompressed HTML files to the browser.

Table 3 illustrates what happens when only plain HTML is sent with no images. It's clear that the improvement in download time is greater on the slower network.

Table 3. Mozilla and Apache with plain HTML

Connection | No GZIP | GZIP | Improvement
ISDN 64 kbits/sec | 105.1 | 83.2 | 21% faster
28.8 kbits/sec | 327.9 | 121.8 | 63% faster

The results for a mix of images and HTML are given in Table 4.

Table 4. A mix of HTML and images

Connection | No GZIP | GZIP | Improvement
ISDN 128 kbits/sec | 82.1 | 77.6 | 5.5% faster
28.8 kbits/sec | 264.7 | 184.4 | 30% faster

Reading the results
It is clear from these studies that good compression ratios are possible and the download time of Web content can be accelerated using HTTP compression. The studies that used a mixture of HTML and images, in such a way that the images occupied a significant portion of the payload, reported a 20 to 30 percent improved download time. When the payload consists only of HTML content, approximately a 65 percent improvement in download time is reported.

It is clear that for Web applications containing a ratio of fewer images (mostly a few buttons) and more HTML content, the overall improvement in download time is closer to 65 percent than 20 or 30 percent. These studies indicate strongly that employing HTTP compression in Web applications is beneficial to download time, and thus to a good user experience.

A side effect of HTTP compression is that the data passing between the Web server and the browser is no longer readable as plain text on the wire. This is obfuscation, not encryption: anyone who captures the stream can decompress it with standard tools, so it should not be counted on for security. Data sent from the browser to the server is not compressed in any case.

Tools for compression
While the benefits of HTTP compression have long been suspected, and the capability has been implemented in popular browsers as early as 1998, implementation of this technology in Web servers has truly lagged.

The Apache Web server 1.3 can deliver precompressed static data to the browser. And, the Microsoft Internet Information Server 5.0 (IIS) compresses a static page when it is requested for the first time and stores the compressed content in a cache directory. When the same page is requested again, the server delivers the page from the temporary directory instead of delivering it from the Web server document directory. Any newer version of the static content placed in the Web server whose compressed content is already available in the cache directory will be automatically compressed and the cache directory will be updated with the latest content. Also, with IIS 5.0, compressing dynamic content can be enabled.

But with most Web server vendors being more or less silent about introducing dynamic compression, other companies have started producing compression plug-ins for popular Web servers. Following is a list of some of the promising products.

mod_gzip
Remote Communications has introduced the first publicly available compression module for the Apache Web server, the most widely used Web server on the Internet. The module was built on Apache's Add module specifications by which third-party modules can be incorporated with Apache products. This module, named mod_gzip, uses the publicly available gzip algorithm to compress data in transit from the Web server.

Since the introduction of this module, which received widespread approval from the open source community of Web server users, newer versions and fixes have been introduced. Many devotees using Apache Web server report good compression ratios. Benchmark results for this product are also available.

Hyperspace
This is a commercial version of a compression module from the creators of mod_gzip. Unlike mod_gzip, the Hyperspace product compression module need not be integrated with the Web server and can be used with any Web server. This product interacts with the base Web server by using an additional port to which both the Web server and the compression product will listen.

Following are some of the features of the Hyperspace module:
- The product can be installed in a remote host, separated from the Web server host
- Customizable log entries for HTTP access and compression statistics
- A separate admin server for displaying real-time compression statistics that indicates total bytes sent and saved
- Ability to specify the content type to be compressed
- Image compression

An SSL version of this product is also available.

Vigos Website accelerator
A commercial product from Vigos AG, this software tool (the company also offers a hardware version) also compresses the Web server responses on the fly. Based on a proprietary SmartShrink technique, the Vigos accelerator can decide whether the browser is capable of accepting the compressed data and it will send the appropriate compressed or uncompressed data. This product, too, will act as a standalone unit and, therefore, can be used for any Web server. Benchmark results are available.

Some of the main features of the Vigos accelerator:
- The product can be used as a remote host, separated from the Web server
- Automatic determination of whether the browser accepts compressed files or not
- Customizable log entries for Access and Error logs and compression statistics

An SSL version of the product is available.

WebWarper
This is a free Web service through which the contents of a Web site can be accessed. While this service sounds interesting, the potential delay in IP forwarding and the necessity of a client-side plug-in (to change the URL entries from the Web page to be forwarded by the accelerator) make this unsuitable for a Web application. Still, general Internet users may benefit from the service.

The company also has a payware module written in Perl, designed for HTTP compression with both the IIS and Apache Web servers.

Note: HTTP compression for Apache is achievable using mod_gzip, an open source offering. However, for other Web servers which do not implement HTTP compression, a commercial product might be needed.

The following discussion presents a real-world case study using mod_gzip.

A detailed look at real-world compression
A major division of a large company (an important client for TCS) has a legacy application for which a browser-based user interface needs to be developed. The existing application logic resides in an OS/390-based system. Corporate IT chose a WebSphere application server with the IBM HTTP Server (and other load balancing and security products) for all the company's Web-based applications. This environment will be used to host any Web-based applications developed by each division of the company.

This particular application under development is a critical online module to be used by the sales and customer-care representatives of the company who are distributed all over the world. The representatives, while talking to customers over the phone, will need access to the application to receive and update information pertaining to the customer (such as order status, history, or ID). The application's response time needs to be very short: As stipulated, it is to be on the order of three seconds.

Exploiting more muscle power in the servers could enhance the application's performance: More servers and load balancing, more CPU power, and increased RAM. Similarly the application design can be tuned for performance: Fewer object creations, ongoing database refinements, and using database connection pooling. Let's assume that these considerations will be optimally handled by the server farm infrastructure and the application design.

However, the application will be accessed over a WAN with segments of it having as low a bandwidth as 8 kbits/second, so for the user, the slow transfer of the Web pages in the network offsets any improvements in performance made in the server.

What are the best options?
Given the necessity for a Web-based UI and a fast response time over somewhat uncontrollable network bandwidth and traffic conditions, the following available choices were winnowed down in this order:

1. Because a Web page is like an HTML file which contains both data and formatting information interlaced, the latter being larger than the former, only data can be sent downstream to the browser. This can be achieved by employing applets at the browser. However, for the following reasons the use of applets is discouraged:
   - Many client browsers run behind local firewalls which can restrict outside access. Configuring these locations for applets is beyond the authority held by the organization.
   - Many users do not like the look and feel of a thick client application.
   - Java support is required for executing the applets and should reside with the client machines, requiring installation and maintenance of appropriate JVMs.
2. Since using applets was not desirable, the remaining way to improve the page download speed is to send less data over the network, that is, to use HTTP compression.
3. Even using HTTP compression won't help if the bandwidth is despairingly low, so the company decided to upgrade or discard the network segments with speeds less than a particular limit. This value is decided to be 32 kbits/sec. Clients in these discarded segments will be advised to access the application directly from the Internet.

Simple arithmetic
A typical Web page for the Web application in consideration is 8 to 15 KB (excluding images). Some information pages might be 25 KB, but this would be a rare occurrence in the application. Taking as an example an 8 KB Web page, a single user, and a 32 kbps line, we find:

Download time = (8 * 1024 bytes) / (32 * 1000 / 8 bytes/second) = 2.048 seconds
(ignoring network latencies and any delay introduced by the Web server and browser)

Assuming that the processing done by the application server and the mainframe system takes about 1.5 seconds, the total comes to roughly 3.5 seconds, so the Web pages could not be delivered to the browser in a 3-second period. In addition, if many users are using the same line, the traffic will be high and no space is available in the line, which results in slower response time for all the users.

If the same page could be compressed by a factor of 50 percent, then the download time drops to half. Furthermore, other users can use the saved bandwidth.

Clearly, applying HTTP compression for this application will boost the performance from the user's perspective.
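The same arithmetic can be written as a small function so that other page sizes and line speeds are easy to try. This is a sketch of the calculation above, using the figures from the text (8 KB page, 32 kbps line, 1.5 seconds of server processing, 3-second target, 50 percent compression); the function and constant names are illustrative, and latency is ignored just as in the text.

```python
def download_seconds(page_bytes: float, line_kbps: float) -> float:
    """Transfer time for page_bytes over a line of line_kbps (1 kbps = 1000 bits/second)."""
    bytes_per_second = line_kbps * 1000 / 8
    return page_bytes / bytes_per_second


PAGE_BYTES = 8 * 1024
LINE_KBPS = 32
SERVER_SECONDS = 1.5   # processing time assumed in the text
TARGET_SECONDS = 3.0   # stipulated response time

plain = download_seconds(PAGE_BYTES, LINE_KBPS)
compressed = download_seconds(PAGE_BYTES * 0.5, LINE_KBPS)  # 50 percent compression

for label, transfer in (("uncompressed", plain), ("compressed", compressed)):
    total = transfer + SERVER_SECONDS
    verdict = "meets" if total <= TARGET_SECONDS else "misses"
    print(f"{label}: {transfer:.3f} s transfer, {total:.3f} s total, {verdict} the target")
```

With these inputs only the compressed page fits inside the 3-second budget, which is the point the calculation is meant to make.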

A list of desired behaviors
If HTTP compression is enabled in a Web server that is hosted in a complex networked server farm environment and accessed by established users, the following behaviors are desired from the compression product:
- The product should not demand any browser-side plug-ins.
- The product should have features to allow and disallow specific MIME types. Not all types of content should be compressed automatically. For example, some browsers may not properly interpret compressed JavaScript and Cascading Style Sheet (CSS) files. Similarly PDF documents and HTML help files may not be compressed.
- The product should not consume significant computational power and time from the server environment. A smaller footprint is always desired.
- The product should allow compression of files delivered from specific directories and URLs. This feature is important when more than one application is hosted in a networked environment and only content from specific applications needs to be compressed.
- The product should allow for a dynamic health check to determine whether the compression feature is behaving properly or not. Apart from log files, the ability to get a browser-based display of run-time statistics is necessary.
- The product should offer additional image compression features, even if the gain is small because the images used in the Web pages are already in compressed format.

And it goes without saying, your product manufacturer should provide good support.
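The MIME-type, URL, and file-size controls in the list above amount to a small decision rule that runs before any compression takes place. The sketch below shows one way such a rule could look. It is not the configuration syntax of any product discussed here, and every value in it (the URL prefix, the MIME patterns, the size limits) is a hypothetical placeholder.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class CompressionPolicy:
    """Hypothetical include/exclude rules applied before compressing a response."""
    include_mime: tuple = ("text/html", "text/plain")
    exclude_mime: tuple = ("image/*", "application/pdf", "text/css")
    url_prefixes: tuple = ("/sales/",)
    min_bytes: int = 300
    max_bytes: int = 1_000_000

    def should_compress(self, url: str, mime: str, size: int) -> bool:
        # Only content from the selected applications is considered at all.
        if not any(url.startswith(prefix) for prefix in self.url_prefixes):
            return False
        # Exclusions win over inclusions, so already-compressed types are skipped.
        if any(fnmatchcase(mime, pattern) for pattern in self.exclude_mime):
            return False
        if not any(fnmatchcase(mime, pattern) for pattern in self.include_mime):
            return False
        # Very small responses gain little; very large ones may cost too much CPU.
        return self.min_bytes <= size <= self.max_bytes


policy = CompressionPolicy()
for url, mime, size in [
    ("/sales/order", "text/html", 8192),
    ("/sales/logo.gif", "image/gif", 8192),
    ("/hr/page", "text/html", 8192),
]:
    print(url, mime, policy.should_compress(url, mime, size))
```

Keeping the rule separate from the compression step makes it easy to report, for a health check, why a given response was or was not compressed.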

Next step: Finding the right product
Weighing the pros and cons, the company decides to use HTTP compression with the Web server for the Web application. The server of choice, the IBM HTTP Server in this case, comes with the WebSphere environment. But there is one problem: By itself, the IBM HTTP Server does not support HTTP compression.

Since the IBM HTTP Server is an Apache server clone, it was thought possible to use the freely available mod_gzip module. This didn't work because apparently a header file (core.h) used in the compiled binary of IBM HTTP Server is different from the one used in the original Apache header file. Because of this incompatibility, the mod_gzip binary does not work with this HTTP server. (Further along in the article, though, you'll find a workaround for this problem.)

A features study is a handy tool when trying to decide on which product or version of technology to implement for your project. I carried out a features study to weigh the comparative benefits of the mod_gzip module (with an Apache Web server) with two other commercial products (using the IBM HTTP Server). I discovered that the commercial products offer comparable compression ratios, but do not offer any overwhelming benefit for the project's goals. So I offer the details of this study on the features of the mod_gzip module (with observations related to the commercial products whenever necessary).

First, I made a list of the availability of features. Table 5 is a list of mod_gzip features:

Table 5. Features study for mod_gzip

Desired feature | Supported?
What Web servers are supported? | Apache.
Can it listen to a remote Web server? | No.
Is SSL support available? | Yes.
What's the source? | Free download.
What's the footprint? | Small.
What platforms are supported? | Win9x, NT, 2000, Linux, FreeBSD, Unix, and others.
What additional browser plug-ins are required? | None.
What browser settings are required? | IE: Set "Use HTTP 1.1."
Can it compress contents from a specific Web server directory? | Yes.
Can it compress contents coming from specific URL strings? | Yes.
Does it include/exclude specific MIME types? | Yes.
Does it include/exclude specific browser types? | Yes.
Can it specify minimum and maximum file size for content that should be compressed? | Yes.
Can an additional port be specified in Web server configurations? | None.
Where are compression statistics available? | Log file/HTML screens.
What are the logging details? | Custom log format.
Are error logs present? | Yes (part of the Apache error log).
Does it have a disable compression feature? | Yes.
Does it offer image compression? | No.
Is an HTML-based screen available for a quick glance of run-time compression statistics? | Yes.
Can you specify numbers for threads/pools (simultaneous users)? | No (this can be done in Apache Web server).
Number of concurrent users supported? | A feature of the Web server.
Can you specify compression level? | No.
Can you start and stop the compression product? | Web server should be stopped.
What is the ease of installation and configuration? | Good.

Note: The versions of commercial products studied were found to offer many of the above features and some variations. Some important variations to note:
- Many commercial products essentially stand outside the HTTP server and therefore the compression features can be turned on and off without restarting the Web server.
- Administrators can specify various compression levels.
- One of the desired important features which was missing from the commercial products studied was compression of content from a specific URL. This means that either all the content from the server is compressed or none is compressed. However, it should be noted that the vendors may implement the features in their next versions or for an additional fee if the requirements are stated.

Next I looked at the compression ratios. For evaluating and presenting the compression ratios, I set up a separate environment (a Windows NT workstation with 256 MB RAM and a 1 GHz Pentium 4 processor). The products were installed with default settings as provided by the vendors.

The following types of files were used for testing:
- HTML files resembling some typical application output served by the application server.
- Dynamic output (Servlet and JSP output) from WebSphere sample applications.

The files used are indicated in Table 6. (To view a few of the screens used in the samples, see the first entry in Resources.)

Table 6. Files used for testing

Sample number | Size (bytes)* | File type
1 | 822 | WebSphere dynamic output
2 | 864 | WebSphere dynamic output
3 | 1370 | WebSphere dynamic output
4 | 1523 | WebSphere dynamic output
5 | 4588 | WebSphere dynamic output
6 | 5248 | WebSphere dynamic output
7 | 6201 | A typical application JSP output
8 | 6443 | A typical application JSP output
9 | 6760 | A typical application JSP output
10 | 7915 | WebSphere dynamic output
11 | 9563 | WebSphere dynamic output
12 | 13382 | A typical application JSP output
13 | 14717 | WebSphere dynamic output
14 | 15211 | A typical application JSP output
15 | 27815 | A typical application JSP output

*The file size is determined by noting the entry for output data sent from Web server logs, and the size represents only the HTML content and not any images.

The graph in Figure 1 depicts the compression ratio observed from the log files.

Figure 1. Compression ratios observed from the log files


I made the following observations from the data on the compression ratios:
- The compression ratios are quite good for the application. The benefit is more pronounced when the size of the file is larger.
- White space elimination is not included; however, this is likely to be incorporated in the commercial versions on request.
- The compression ratios were noted from individual products' log files. No attempt is made to limit the bandwidth between the client and server machines and observe (or simulate) the download time.
- All three products show almost equal compression ratios, probably because the underlying algorithm used was the same.
- The computational efficiency (speed and amount of server resources used) and the products' behavior with multiple concurrent access were not studied since this was not the goal. I did observe the products' general behavior with multiple users in the development environment in which 20 people used the application while under development. No abnormal utilization of CPU or time was noticed.
- Sample application files compressed contained some inline JavaScript which executed normally in the browser.
- Testing with SSL has not been done.
- None of the products compress pages cached by the Web server, since the products cannot access the cache regions of the Web server, so it appears that careful evaluation needs to be made about servicing typical static Web pages, focusing on whether to enable caching in the Web server or enable compression.

About that last point: Choosing compression for large Web sites serving static pages is not likely to produce any performance improvements except for the first time the Web page is accessed, since these Web sites may maintain dedicated caching servers. However, when the same static pages are delivered from a Servlet engine dynamically, caching does not occur and, hence, compression was possible. Since for a Web application most of the pages will be generated only on request, caching will not happen and hence compression is suited for these types of applications.
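The first observation in the list above, that larger files benefit more, can be reproduced without the original test files. The sketch below does not use the study's data: it generates synthetic HTML tables of increasing size (the markup and the make_page helper are invented for illustration) and measures the bytes saved by gzip for each.

```python
import gzip


def make_page(rows: int) -> bytes:
    """Build a synthetic HTML table with the given number of rows (hypothetical content)."""
    body = "".join(
        f"<tr><td>{i}</td><td>Customer {i * 7919 % 1000}</td><td>Order pending</td></tr>\n"
        for i in range(rows)
    )
    return f"<html><body><table>\n{body}</table></body></html>".encode("ascii")


def saved_percent(page: bytes) -> float:
    """Percentage of bytes saved when the page is gzip-compressed."""
    return 100.0 * (1 - len(gzip.compress(page)) / len(page))


for rows in (5, 50, 500):
    page = make_page(rows)
    print(f"{len(page):6d} bytes -> saved {saved_percent(page):.1f}%")
```

Two effects work in the same direction here: the fixed gzip header and trailer weigh less on a large page, and a long page gives the algorithm more earlier text to point back to.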

The recommendation
I observed that all the compression products investigated render more or less equal compression ratios. Therefore, other features of a server farm environment become more important aspects of the decision (like the ability of the product to compress only specific types of contents or contents from specific URLs).

Since HTTP compression technology (at least on the server side) is still in its nascent stage, the probability of unforeseen problems occurring in a complex networked environment is still high, making post-sale vendor support of such technologies crucial for the successful implementation.

In this case, because the IT department chose IBM as the vendor providing most of the hardware, software, installation, and support for this project, it only makes sense to evaluate a compression solution from IBM before looking to other products. But what about that support problem?

IBM's Web server, a close cousin of the Apache Web server for which mod_gzip compression support is available, does not officially support mod_gzip, as we mentioned earlier. But IBM research teams have developed a patch for support of mod_gzip compression. The patch is not likely to be incorporated in the current version of the Web server given that this type of compression is slated to be incorporated with later versions of the server.

With this in mind, the company made a request of IBM to support compression for its server farm as a special case, which the IBM team agreed to. And the company went with mod_gzip.

In conclusion
I hope this article has demonstrated clearly that HTTP compression for Web applications is a must for a satisfying user experience. Whenever users and servers are connected to the Internet over low-speed connections or on high-traffic routes, HTTP compression can keep the lines of effective communication open.

In addition, when integrated with the Web server, as in the case of mod_gzip for Apache, compression provides a more pleasant user experience. The Web server directly serves less data and hence the overall throughput of the Web server will improve.

Add-on compression products which take the output from the Web server and compress the data through software or hardware may not directly improve the Web server performance. But these products also offer such benefits as serving contents from cached servers and serving multiple Web servers in a load-balanced situation.

The comparative study offered a set of steps to approach integrating compression into your existing systems, including identifying features, their usability in a server farm environment, and the general benefits of HTTP compression. The author would like to note that the comparative study:

...was carried out with specific requirements in mind and with evaluation copies obtained from respective vendors. The results presented are as observed by the author at the time of trial runs. The author acknowledges that the products presented herein performed well during trial runs and will not be responsible for failing to present any additional features or any minor features presented inaccurately. The author has no motives, financial or otherwise, other than technical, while evaluating the tools. The information provided in this paper is for knowledge sharing only and any commercial gain/loss for decisions made based on this study may not be attributed to the author.


Resources

- View screens from the following sample files from the tests in this article: Sample 5, Sample 7, Sample 9, Sample 12, Sample 14.
- Learn from IBM IT architect Brian Goodman as he details GZIP encoding over an HTTP transport for improving the performance of Web applications in "Squeezing SOAP" (developerWorks, March 2003).
- For more on the protocol specifications of HTTP 1.1 (including HTTP compression), see "Hypertext Transfer Protocol: HTTP/1.1" (RFC 2616, Network Working Group, R. Fielding et al, June 1999).
- Read P. Deutsch's "GZIP file format specification version 4.3" (RFC 1952, Network Working Group, May 1996) for details on the publicly available gzip compression format.
- Check out "HTTP Compression Speeds up the Web" as Peter Cranstone deals with the IETF's standard for compressing HTTP contents through two levels of encoding.
- Find the 1997 W3C study that focused on testing persistent connections, pipelining, and link-level document compression in HTTP 1.1 in "Network Performance Effects of HTTP/1.1, CSS1, and PNG" (June 1997, W3C NOTE-pipelining-970624, Henrik Frystyk Nielsen et al).
- Read the 2000 W3C study's examination of possible performance benefits with compression of HTML files over a LAN; the data is available in "The Effect of HTML Compression on a LAN" (June 2000, Henrik Frystyk Nielsen).
- Find the results of the 1998 Mozilla study (that observes the performance of content-encoded compression) in "Performance: HTTP Compression" (September 1998, John Giannandrea and Eric Bina).
- Get all the information on compression algorithm mod_gzip.
- Review benchmark results for mod_gzip on Apache.
- Check out the commercial Hyperspace compression module, a stand-between that you need not integrate into a Web server.
- Look at the commercial compression module from Vigos, AG. Based on proprietary technology, it can also be a stand-between, and comes in a software and hardware version.
- Try WebWarper, a free Web service designed to enhance download speeds.
- Try the IBM HTTP Server version 2.0 with its new enhancements, including a Fast Response Cache Accelerator for Windows and support for HTTP compression.

About the author
Radhakrishnan Srinivasan is a senior architect in Chennai, India in the eBusiness practice of TATA Consultancy Services, a global software and services company based in India. His primary focus is to analyze, define, and implement architectures for enterprise applications, and he has been responsible for creating high-volume, mission-critical applications and for providing guidance on architecture issues to a multitude of international customers. The author holds a PhD in computer vision/image processing from Indian Institute of Technology, Chennai. His active professional interests include Web and distributed architectures and application integration. The opinions expressed in this article are his own and are not a reflection of those of his employer. He can be contacted at sradhakrishnan@chennai.tcs.co.in or radhakr@onebox.com.
