
Comparison of Brotli, Deflate, Zopfli, LZMA, LZHAM and Bzip2 Compression Algorithms

Jyrki Alakuijala, Evgenii Kliuchnikov, Zoltan Szabadka, and Lode Vandevenne
Google, Inc.

Abstract
This paper compares six compression techniques and, based on the results,
proposes that brotli could be used as a replacement for the common deflate algorithm.
We evaluated the performance of brotli by measuring the compression ratio and speed,
as well as decompression speed, on three different corpora: the Canterbury compression
corpus, an ad hoc crawled web content corpus, and enwik8. On all three corpora we
show performance superior to that of deflate. Further, we show that Zopfli, LZMA,
LZHAM and bzip2 use significantly more CPU time for either compression or
decompression and could not always work as direct replacements of deflate.

Introduction
Much of the practical lossless data compression is done with the deflate algorithm, not only
because it is well supported by existing systems, but also because it is relatively simple and fast
to encode and decode. In 2013 we launched Zopfli [1], a compression algorithm that allows for
denser compression while remaining compatible with the deflate format. While Zopfli is now
well accepted in the field, there were opinions expressed that we should move on from the
deflate file format to a modern solution. Brotli [2] is our attempt at building a compression format
and an example implementation of this format that is fundamentally more efficient than deflate.
In this paper we measure the performance of our implementation and compare it with deflate
and a few other compression algorithms.

Methods
The tests were run with a 22-bit window size for brotli, LZMA and LZHAM, and a 15-bit window
size for deflate and zopfli. We used a 22-bit window size because past experience showed that
larger windows can be slower to decode. Larger window sizes tend to give a higher
compression ratio at the expense of decoding speed. For deflate and zopfli we used the
maximum size allowed by the format. The versions of the algorithms tested are:

brotli version 0.2.0 [2],
deflate algorithm from zlib 1.2.8 [3],
Zopfli version from GitHub 2015-09-01 [4],
LZMA implementation in 7-zip 9.20.1 [5],
LZHAM 1.0 stable1 [6], and
bzip2 1.0.6, 6 Sept 2010 [7].

The test computer we used is an Intel Xeon CPU E5-1650 v2 running at 3.5 GHz with six
cores and six additional hyper-threading contexts. We ran Linux 3.13.0. All codecs were
compiled using the same compiler, GCC 4.8.4, at the -O2 optimization level. All tests were run
single-threaded on an otherwise idle computer.

The compression corpora we used in the testing are the Canterbury compression corpus [8], an
ad hoc crawled web content corpus (1285 files, 70611753 bytes total), and enwik8, a single-file
corpus that is used in the Hutter prize [9]. The average file size on the web content corpus is
only 55 kB, so the larger window size advantage of advanced algorithms over deflate mostly
disappears there.
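
As a quick check of the sizes involved, the following Python sketch converts the window widths into bytes and recomputes the average file size of the web content corpus from the totals quoted above. It assumes that a w-bit window spans 2^w bytes and ignores any small format-specific offsets; the function name is illustrative.

```python
def window_bytes(bits: int) -> int:
    """Bytes spanned by a backward reference window of the given width (assumed 2**bits)."""
    return 1 << bits

deflate_window = window_bytes(15)  # window used for deflate and zopfli
large_window = window_bytes(22)    # window used for brotli, LZMA and LZHAM

# Web content corpus totals quoted in the text.
total_bytes = 70611753
file_count = 1285
average_file = total_bytes / file_count

print(deflate_window, large_window, round(average_file))
# prints: 32768 4194304 54951
```

Under these assumptions the average web document is less than twice the size of the 15-bit window, so only a small part of a typical file lies beyond the reach of deflate.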

We measured the compression ratio, compression speed and decompression speed for
selected algorithms and compression levels. The compression and decompression speeds of
each algorithm were measured with the same benchmark program, which called the compression
and decompression routines of each algorithm from statically linked libraries.
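
The three measurements can be sketched for a single codec as follows. This is only an illustration using Python's standard zlib bindings, not the statically linked benchmark program described above; the helper name, the repeat count, the sample data, and the choice to express both speeds in uncompressed megabytes per second are assumptions.

```python
import time
import zlib

def benchmark(data: bytes, level: int, repeats: int = 3):
    """Return (compression ratio, compression MB/s, decompression MB/s) for zlib."""
    best_c = best_d = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        packed = zlib.compress(data, level)
        t1 = time.perf_counter()
        unpacked = zlib.decompress(packed)
        t2 = time.perf_counter()
        assert unpacked == data  # the round trip must be lossless
        best_c = min(best_c, t1 - t0)  # keep the fastest run to reduce timing noise
        best_d = min(best_d, t2 - t1)
    megabytes = len(data) / 1e6
    ratio = len(data) / len(packed)
    return ratio, megabytes / best_c, megabytes / best_d

# Hypothetical, highly repetitive sample input; real corpora behave differently.
sample = b"The quick brown fox jumps over the lazy dog. " * 20000
for level in (1, 9):
    ratio, c_speed, d_speed = benchmark(sample, level)
    print(f"deflate:{level} ratio={ratio:.3f} "
          f"comp={c_speed:.1f} MB/s decomp={d_speed:.1f} MB/s")
```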

We limited the selection of algorithms to those that generally have a higher compression ratio
than that of deflate. For this reason we excluded algorithms like lz4 and zstd from this study.

Unlike other algorithms compared here, brotli includes a static dictionary. It contains 13504
words or syllables of English, Spanish, Chinese, Hindi, Russian and Arabic, as well as common
phrases used in machine-readable languages, particularly HTML and JavaScript. The total size
of the static dictionary is 122784 bytes. The static dictionary is extended by a mechanism of
transforms that slightly change the words in the dictionary. A total of 1633984 sequences,
although not all of them unique, can be constructed by using the 121 transforms. To reduce the
amount of bias the static dictionary gives to the results, we used a multilingual web corpus of 93
different languages where only 122 of the 1285 documents (9.5%) are in languages supported
by our static dictionary.
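
The dictionary figures above are consistent with each other, as the short Python check below shows. It assumes that every transform can be applied to every dictionary word; the average word length is derived here and is not stated in the text.

```python
words = 13504
transforms = 121
dictionary_bytes = 122784

sequences = words * transforms           # every word combined with every transform
average_word = dictionary_bytes / words  # derived value, about 9.09 bytes per word

supported_docs = 122
total_docs = 1285
supported_share = 100 * supported_docs / total_docs  # percent of documents

print(sequences, round(average_word, 2), round(supported_share, 1))
# prints: 1633984 9.09 9.5
```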

In averaging over the results of individual files and over the corpora we chose to use the geometric
mean instead of the more common arithmetic mean. The geometric mean gives a bit more
weight to poor performance, i.e., if a particular algorithm compresses one file type extremely
fast or densely, that result will not propagate into the overall results as strongly as with an arithmetic mean.
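
A small Python illustration of this effect, using made-up speeds rather than measured data: one extremely fast file dominates the arithmetic mean but moves the geometric mean far less.

```python
from statistics import geometric_mean, mean

# Hypothetical per-file speeds in MB/s: two ordinary files and one outlier.
speeds = [10.0, 10.0, 1000.0]

arithmetic = mean(speeds)           # pulled strongly towards the outlier
geometric = geometric_mean(speeds)  # cube root of 10 * 10 * 1000

print(f"arithmetic={arithmetic:.1f} geometric={geometric:.1f}")
# prints: arithmetic=340.0 geometric=46.4
```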

Results
Tables 1, 2 and 3 show the results for the three different corpora. Figure 1 shows the decompression
speed vs. compression ratio on the Canterbury corpus, showing graphically some of the results
from Table 1. We can see from the tables that brotli at quality setting 1 (shorthand notation in
this document: brotli:1) compresses and decompresses at roughly the same speed as deflate:1, but
offers a 12-16% higher compression ratio. Brotli:9 is again roughly similar in speed to deflate:9 on the
Canterbury and web content corpora, but gives a speed increase of 28% in decoding of enwik8,
and a compression ratio increase of 13-21%. Brotli:11 is significantly faster in compression
than zopfli and gives a 20-26% higher compression ratio.
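
The percentage ranges quoted above can be recomputed from the compression ratios listed in Tables 1 to 3. The Python sketch below does so; the data layout and function names are illustrative.

```python
# Compression ratios copied from Tables 1 (Canterbury), 2 (web) and 3 (enwik8).
ratios = {
    "Canterbury": {"brotli:1": 3.381, "brotli:9": 3.965, "brotli:11": 4.347,
                   "deflate:1": 2.913, "deflate:9": 3.371, "zopfli": 3.580},
    "web":        {"brotli:1": 5.217, "brotli:9": 6.253, "brotli:11": 6.938,
                   "deflate:1": 4.666, "deflate:9": 5.528, "zopfli": 5.770},
    "enwik8":     {"brotli:1": 2.711, "brotli:9": 3.308, "brotli:11": 3.607,
                   "deflate:1": 2.364, "deflate:9": 2.742, "zopfli": 2.857},
}

def gain_percent(corpus: str, codec: str, baseline: str) -> float:
    """Percentage by which codec's ratio exceeds the baseline's on one corpus."""
    table = ratios[corpus]
    return 100 * (table[codec] / table[baseline] - 1)

def gain_range(codec: str, baseline: str):
    """Smallest and largest gain over the three corpora."""
    gains = [gain_percent(corpus, codec, baseline) for corpus in ratios]
    return min(gains), max(gains)

for codec, baseline in [("brotli:1", "deflate:1"),
                        ("brotli:9", "deflate:9"),
                        ("brotli:11", "zopfli")]:
    low, high = gain_range(codec, baseline)
    print(f"{codec} vs {baseline}: {low:.1f}% to {high:.1f}%")
```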

Brotli gives slightly faster decompression than deflate for the tested corpora, while other
advanced algorithms (LZMA, LZHAM and bzip2) are slower than deflate. The geometric means
for all reported decompression speeds in the tables are 342.2 MB/s for brotli and 323.6 MB/s for
deflate, a 5.7% advantage for brotli.
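
These two means can be reproduced from the decompression speed columns of Tables 1 to 3. In the sketch below the deflate group includes the zopfli rows, since zopfli emits deflate streams; that grouping is an assumption made here because it matches the reported 323.6 MB/s, and the text does not state it explicitly.

```python
from statistics import geometric_mean

# Decompression speeds in MB/s, in the order Canterbury, web, enwik8.
brotli_speeds = [334.0, 354.5, 289.5,   # brotli:1, :9, :11 on Canterbury
                 508.4, 508.7, 441.8,   # on the web content corpus
                 228.6, 279.4, 257.4]   # on enwik8

# Assumption: deflate:1, deflate:9 and zopfli are all counted as deflate.
deflate_speeds = [323.0, 347.3, 342.1,
                  434.8, 484.1, 460.1,
                  211.7, 217.4, 227.7]

brotli_mean = geometric_mean(brotli_speeds)
deflate_mean = geometric_mean(deflate_speeds)
advantage = 100 * (brotli_mean / deflate_mean - 1)

print(f"brotli={brotli_mean:.1f} deflate={deflate_mean:.1f} advantage={advantage:.1f}%")
```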

In compression, brotli:1 is similarly 5.7% faster than deflate:1, but brotli:9 happens to be 32.3%
slower than deflate:9. However, one should not compare compression speed simply by the
quality setting. A more useful comparison is to consider compression speed for a target
compression ratio. Often brotli:1 is close to deflate:9 and sometimes even exceeds its
compression ratio. For example, when compressing the Canterbury corpus to a ratio of 3.3,
one could use either brotli:1 at 98.3 MB/s or deflate:9 at 15.5 MB/s.
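
Comparing by target ratio can be expressed as a simple selection over the Canterbury results in Table 1: keep the settings that reach the target and take the one that compresses fastest. The Python sketch below is one way to write this; the function name is illustrative.

```python
# (compression ratio, compression speed in MB/s) from Table 1.
canterbury = {
    "brotli:1": (3.381, 98.3), "brotli:9": (3.965, 17.0), "brotli:11": (4.347, 0.5),
    "deflate:1": (2.913, 93.5), "deflate:9": (3.371, 15.5), "zopfli": (3.580, 0.2),
    "lzma:1": (3.847, 10.2), "lzma:9": (4.240, 3.9),
    "lzham:1": (3.836, 3.9), "lzham:4": (3.952, 0.5),
    "bzip2:1": (3.757, 11.8), "bzip2:9": (3.869, 12.0),
}

def fastest_for_ratio(table: dict, target: float) -> str:
    """Setting with the highest compression speed among those reaching the target ratio."""
    candidates = {name: speed for name, (ratio, speed) in table.items() if ratio >= target}
    return max(candidates, key=candidates.get)

best = fastest_for_ratio(canterbury, 3.3)
speedup = canterbury[best][1] / canterbury["deflate:9"][1]
print(best, f"{speedup:.1f}x faster than deflate:9")
# prints: brotli:1 6.3x faster than deflate:9
```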

LZMA can compress enwik8 with a 2.5% higher compression ratio, but that comes with a
penalty of 3.5 times longer decompression time. LZHAM:4 is somewhat similar in performance
to brotli:11: a 1% higher compression ratio at a cost of 25% slower decompression speed. On
shorter files, like the Canterbury corpus and web documents, brotli's compression ratios are
unmatched by LZMA and LZHAM.
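
The enwik8 figures in this paragraph follow from the Table 3 rows for brotli:11, lzma:9 and lzham:4. A brief Python check, with an illustrative helper name:

```python
# (compression ratio, decompression speed in MB/s) on enwik8, from Table 3.
enwik8 = {
    "brotli:11": (3.607, 257.4),
    "lzma:9": (3.696, 71.8),
    "lzham:4": (3.643, 192.2),
}

def versus_brotli(codec: str):
    """Ratio gain in percent and decompression time factor relative to brotli:11."""
    ratio, speed = enwik8[codec]
    base_ratio, base_speed = enwik8["brotli:11"]
    gain = 100 * (ratio / base_ratio - 1)
    time_factor = base_speed / speed  # time is inversely proportional to speed
    return gain, time_factor

lzma_gain, lzma_time = versus_brotli("lzma:9")
lzham_gain, lzham_time = versus_brotli("lzham:4")
lzham_slowdown = 100 * (1 - 1 / lzham_time)  # percent lower decompression speed

print(f"lzma:9  +{lzma_gain:.1f}% ratio, {lzma_time:.2f}x decompression time")
print(f"lzham:4 +{lzham_gain:.1f}% ratio, {lzham_slowdown:.0f}% slower decompression")
```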

Bzip2 was able to compress some text files rather well, but in the overall results it falls behind.

Table 1.
This table shows the results of compression algorithms on the Canterbury corpus. The
Canterbury corpus contains 11 files, and we show the geometric mean for the measured
attributes: compression ratio, compression speed and decompression speed.

Algorithm:        Compression   Compression     Decompression
quality setting   ratio         speed [MB/s]    speed [MB/s]
brotli:1          3.381         98.3            334.0
brotli:9          3.965         17.0            354.5
brotli:11         4.347         0.5             289.5
deflate:1         2.913         93.5            323.0
deflate:9         3.371         15.5            347.3
zopfli            3.580         0.2             342.1
lzma:1            3.847         10.2            70.0
lzma:9            4.240         3.9             71.7
lzham:1           3.836         3.9             116.0
lzham:4           3.952         0.5             117.7
bzip2:1           3.757         11.8            40.4
bzip2:9           3.869         12.0            40.2

Figure 1.
The decompression speed vs. compression ratio of the Canterbury corpus as a
scatter plot. For the decompression speed vs. compression ratio, brotli:9 and brotli:11 form the
Pareto-optimal front.
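
The Pareto-optimal front named in the caption can be recomputed from the Table 1 values. In the Python sketch below a point is treated as dominated when another point is at least as good in both compression ratio and decompression speed and strictly better in one of them; the function name is illustrative.

```python
# (compression ratio, decompression speed in MB/s) from Table 1.
canterbury = {
    "brotli:1": (3.381, 334.0), "brotli:9": (3.965, 354.5), "brotli:11": (4.347, 289.5),
    "deflate:1": (2.913, 323.0), "deflate:9": (3.371, 347.3), "zopfli": (3.580, 342.1),
    "lzma:1": (3.847, 70.0), "lzma:9": (4.240, 71.7),
    "lzham:1": (3.836, 116.0), "lzham:4": (3.952, 117.7),
    "bzip2:1": (3.757, 40.4), "bzip2:9": (3.869, 40.2),
}

def pareto_front(points: dict) -> list:
    """Names of the points that no other point dominates in both dimensions."""
    front = []
    for name, (ratio, speed) in points.items():
        dominated = any(
            r >= ratio and s >= speed and (r > ratio or s > speed)
            for other, (r, s) in points.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front

print(pareto_front(canterbury))
# prints: ['brotli:9', 'brotli:11']
```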

Table 2.
Results of the compression algorithms on a sample of documents crawled from the
Internet. The sample consists of 1285 HTML documents, with 93 different languages.

Algorithm:        Compression   Compression     Decompression
quality setting   ratio         speed [MB/s]    speed [MB/s]
brotli:1          5.217         145.2           508.4
brotli:9          6.253         30.1            508.7
brotli:11         6.938         0.6             441.8
deflate:1         4.666         146.9           434.8
deflate:9         5.528         32.9            484.1
zopfli            5.770         0.2             460.1
lzma:1            5.825         7.9             100.5
lzma:9            6.231         4.4             102.2
lzham:1           5.580         4.7             168.7
lzham:4           5.768         0.2             172.7
bzip2:1           5.710         11.0            52.3
bzip2:9           5.867         11.1            52.3

Table 3.
Results of different compression algorithms on the enwik8 file.

Algorithm:        Compression   Compression     Decompression
quality setting   ratio         speed [MB/s]    speed [MB/s]
brotli:1          2.711         78.3            228.6
brotli:9          3.308         5.6             279.4
brotli:11         3.607         0.4             257.4
deflate:1         2.364         70.8            211.7
deflate:9         2.742         18.1            217.4
zopfli            2.857         0.6             227.7
lzma:1            3.106         9.8             60.6
lzma:9            3.696         3.44            71.8
lzham:1           3.335         2.4             177.9
lzham:4           3.643         0.4             192.2
bzip2:1           3.007         12.3            30.8
bzip2:9           3.447         12.4            30.3

Discussion and Conclusions
These comparisons are done with a fixed-width (22 bits) backward reference window. Other
algorithms could possibly benefit from different widths. LZMA and LZHAM are commonly
applied with a larger window. In this study we are looking for a replacement candidate algorithm
for the deflate algorithm, and a larger window could slow down encoding and decoding as well
as use more memory during decoding. Applying the same backward reference window size in
LZMA, LZHAM and brotli removes one complication from the comparison.

Brotli uses a static dictionary that can be helpful for compressing short files. Other algorithms
could be easily modified to do the same, and they would obtain slightly better compression
ratios. For a long file like enwik8 a static dictionary is not very helpful. The Canterbury corpus
contains short documents in English, and there brotli's static dictionary might be giving it an
unfair advantage.
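
To illustrate how a dictionary helps on short inputs, the Python sketch below uses zlib's preset dictionary feature for deflate streams. This is not brotli's built-in static dictionary and has no transforms; the dictionary contents, the sample page and the helper names are hypothetical, and the decoder must be given the same dictionary.

```python
import zlib

def deflate(data: bytes, dictionary: bytes = b"") -> bytes:
    """Compress with zlib at level 9, optionally priming it with a preset dictionary."""
    if dictionary:
        comp = zlib.compressobj(level=9, zdict=dictionary)
    else:
        comp = zlib.compressobj(level=9)
    return comp.compress(data) + comp.flush()

def inflate(packed: bytes, dictionary: bytes = b"") -> bytes:
    """Decompress a zlib stream, supplying the preset dictionary if one was used."""
    if dictionary:
        decomp = zlib.decompressobj(zdict=dictionary)
    else:
        decomp = zlib.decompressobj()
    return decomp.decompress(packed) + decomp.flush()

# Hypothetical dictionary of common HTML boilerplate and a short sample page.
html_dictionary = b"<html><head><title></title></head><body><p></p></body></html>"
page = b"<html><head><title>Hello</title></head><body><p>Hello</p></body></html>"

plain = deflate(page)
primed = deflate(page, html_dictionary)
assert inflate(primed, html_dictionary) == page  # lossless with the shared dictionary

print(len(page), len(plain), len(primed))
```

On such a short input most of the markup can be coded as references into the dictionary, which is why the primed stream is smaller; on a long file the same dictionary would contribute comparatively little.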

Our results indicate that brotli, and only brotli out of all the benchmarked algorithms, would be a
good replacement for the common use cases of the deflate algorithm in all three aspects:
compression ratio, compression speed, and decompression speed.

References
1. https://zopfli.googlecode.com/files/Data_compression_using_Zopfli.pdf
2. https://github.com/google/brotli/releases/tag/v0.2.0
3. http://www.zlib.net/
4. https://github.com/google/zopfli/commit/89cf773beef75d7f4d6d378debdf299378c3314e
5. http://www.7zip.org/history.txt
6. https://github.com/richgel999/lzham_codec/releases/tag/v1_0_stable1
7. http://www.bzip.org/
8. http://corpus.canterbury.ac.nz/
9. http://prize.hutter1.net/
